[R] Links to vignettes in the base packages

2015-01-14 Thread Richard Cotton
Contributed packages have copies of their vignettes on CRAN (in their
package page at cranmirror/web/packages/packagename).

Since base packages no longer have a page here, I can't find a web link to them.

I'm aware that I can find the vignette via browseVignettes() or
vignette(vignettename, package = packagename).  There are two use
cases where a web link is more useful.  If someone is asking for help
and I want to direct them to a vignette, having a link to a vignette
rather than getting them to open R and type a command increases the
chance that they will actually bother to look at the thing. Secondly,
if I want to read a vignette on a machine without R (ebook reader,
phone, whatever), then a web link is more convenient.

Are the base vignettes available online?  If so, where are they?  If
not, is it feasible to make them available?

-- 
Regards,
Richie

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

2011-05-23 Thread Richard . Cotton
 LIST OF CONVENTIONS/STYLES FOR R:
 
 [1] R coding standards in the R Internals manual
 http://www.cran.r-project.org/doc/manuals/R-ints.html#R-coding-standards
 
 [2] Bioconductor coding standards
   http://wiki.fhcrc.org/bioc/Coding_Standards
 
 [3] Google R style
 [http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html]
 
 [4] R style guide by Hadley Wickham (based on [3])
   http://had.co.nz/stat405/resources/r-style-guide.html
 
 [5] R Coding Conventions (RCC) - a draft by Henrik Bengtsson
   http://aroma-project.org/developers/RCC
 
 [6] The Aroma R Coding Conventions (Aroma RCC) by Henrik Bengtsson
 (based on [5])
   http://aroma-project.org/developers/AromaRCC

It seems that I'm a little late to this conversation, but I'd like to add 
my own guide to the list.  It contains links to all the other mentioned 
style guides and another one from Colin Gillespie.

http://4dpiecharts.com/r-code-style-guide/

Regards,
Richie.

Mathematical Sciences Unit
HSL
4D Pie Charts




ATTENTION:

This message contains privileged and confidential inform...{{dropped:22}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How is the relation between Frequency and Counts in hist/density defined?

2011-05-23 Thread Richard . Cotton

In order to do this I can use the relation between count and density, but
   I
would like to know if there is a way for me to predict it upfront.
   In the code for hist.default, you'll see the line
   dens - counts/(n * diff(breaks))
Here is an example:
   
set.seed(242)
z = rnorm(30)
hist_z - hist(z)
hist_z$counts / hist_z$density # the relation is 15
   In your example, n is 30, and the breaks are evenly spaced every 0.5.
   Regards,
   Richie.
   Mathematical Sciences Unit
   [1]HSL
   [2]4D Pie Charts
 _

   ATTENTION:



   This message contains privileged and confidential information intended for
   the addressee(s) only. If this message was sent to you in error, you must
   not disseminate, copy or take any action in reliance on it and we request
   that you notify the sender immediately by return email.



   Opinions expressed in this message and any attachments are not necessarily
   those held by the [3]Health and Safety Laboratory or any person connected
   with the organisation, save those by whom the opinions were expressed.



   Please note that any messages sent or received by the [4]Health and Safety
   Laboratory  email system may be monitored and stored in an information
   retrieval system.
 _

   Think before you print - do you really need to print this email?
 _
 _

   Scanned by MailMarshal - Marshal's comprehensive email content security
   solution. Download a free evaluation of MailMarshal at [5]www.marshal.com
 _

References

   1. http://www.hsl.gov.uk/contact-us.htm
   2. http://4dpiecharts.com/
   3. http://www.hsl.gov.uk/
   4. http://www.hsl.gov.uk/
   5. http://www.marshal.com/
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qbeta

2011-02-22 Thread Richard . Cotton
  Does any body know how I can see the code behind qbeta function?
 
 As the code seems to be internal, you'll need to download the r-source
 code and find it in there. In my copy of R it is here:
 R-2.11.1/src/nmath/qbeta.c

An alternative is to view the source code online.  The code for qbeta is 
in
http://svn.r-project.org/R/trunk/src/nmath/qbeta.c

In general, you can find internal code by searching for
site:http://svn.r-project.org/R/trunk/src function name

Regards,
Richie.

Mathematical Sciences Unit
HSL
4D Pie Charts



ATTENTION:

This message contains privileged and confidential inform...{{dropped:22}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Implementing R's recycling rule

2010-10-19 Thread Richard . Cotton
 x - c(1, 2, 3)
 n - 10
 ## so using the recycling rules, I would like to get from FUN(x, n)==1
 ## I am doing:
 xRecycled - rep(x, length.out=n)[n]
 
 This works, but it seems to me that I am missing something really basic 
here
 - is there  more straightforward way of doing this?

x[n %% length(x)] gives you the same answer as rep(x, length.out=n)[n], 
without having to create the longer vector.

Regards,
Richie.

Mathematical Sciences Unit
HSL
4D Pie Charts




ATTENTION:

This message contains privileged and confidential inform...{{dropped:22}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] table, sum, cat function

2009-08-25 Thread Richard . Cotton

   the second step in my exercice is to calculate the sum of the amout for each
   class et not the frequency
   i have this vector
   x   y
   1  100
   2  1500
   3  3250
   4  6250
   5  2000
   6   450
   i want to use the function table and cat to calculate the sum of the amount
   in each class [0-1000], [1000-3000],[ 3000]
   You want to use cut, not cat.  Take a look at the examples on the cut help
   page - they will help you.
   Regards,
   Richie.
   Mathematical Sciences Unit
   [1]HSL
 _

   ATTENTION:



   This message contains privileged and confidential information intended for
   the addressee(s) only. If this message was sent to you in error, you must
   not disseminate, copy or take any action in reliance on it and we request
   that you notify the sender immediately by return email.



   Opinions expressed in this message and any attachments are not necessarily
   those held by the [2]Health and Safety Laboratory or any person connected
   with the organisation, save those by whom the opinions were expressed.



   Please note that any messages sent or received by the [3]Health and Safety
   Laboratory  email system may be monitored and stored in an information
   retrieval system.
 _

   Think before you print - do you really need to print this email?
 _
 _

   Scanned by MailMarshal - Marshal's comprehensive email content security
   solution. Download a free evaluation of MailMarshal at [4]www.marshal.com
 _

References

   1. http://www.hsl.gov.uk/contact-us.htm
   2. http://www.hsl.gov.uk/
   3. http://www.hsl.gov.uk/
   4. http://www.marshal.com/
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mixed normal distriburtion

2009-08-25 Thread Richard . Cotton
 I'm trying to draw the density function of a mixed normal distribution
 in the form of:
 .6*N(.4,.1)+ .4*N(.8,.1)
 At first I generate a random sample with size 200 by the below code:
 means = c(.4,.8)
 sds = sqrt(c(.1,.1))
 ind = sample(1:2, n, replace=TRUE, prob=c(.6,.4))
 x=rnorm(n,mean=means[ind],sd=sds[ind])
 
 Then I use the below code for drawing the graph:
 plot(density(x))
 
 The plot doesn't seem to be belonging to the desired distribution,
 because there is just one mode in it (I've seen the real graph of this
 mixed normal in a paper, it has two clear distinct modes). Even the
 hist() doesn't draw a plot similar to the real graph. I think the
 generation code isn't correct. Is it? (I've asked the generation code
 here!)

The code is fine - the reason you can't see two peaks is that the two 
distributions overlap a lot.  Set 
means - c(.4, 10) 
to see double peaks more clearly.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:22}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] formats

2009-08-25 Thread Richard . Cotton
 what do you mean by %d-%b-%y. is it reading format or writing format.

%d-%b-%y is a date format - see the help page for strptime.

Example usage:
strptime(01-Jan-84, %d-%b-%y)
strftime(Sys.time(), %d-%b-%y)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:22}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] HELP: BRUGS/WinBUGS/RBUGS Response is a combination of random variables

2009-07-20 Thread Richard . Cotton
 Is there anyone know if BUGS language allows the combination of 
 variables as response

 It seems doesn't work in my model. The problem is between two ##.

   modelCompile(numChains=1)
 multiple definitions of node bm[1]

  ###
  bm[iter] -  inprod(biomarker[iter,1:4], bta[1:4])
  bm[iter] ~ dnorm(bmu[iter], sdbm.precision);
  ###

The problem seems to be that you've defined bm[iter] as a logical node 
(-), then redefined it as a stochastic node (~).  If you need more help, 
try the BUGS mailing list.  (See 
http://www.mrc-bsu.cam.ac.uk/bugs/overview/list.shtml)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:22}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regression for loop test HELP! URGENT!

2009-07-20 Thread Richard . Cotton
 I'm new to R, and I've sent this message as a non-member, but since it's
 pretty urgent, I'm sending it again now I'm on the mailing list (Thanks
 Daniel for your suggestion nevertheless).
 
 I have calculated a regression in the form of M ~ D + O + S, and I would
 like to take this regression and test it with other samples, 5 sets of 
M, D,
 O, and S at a time(I actually have 2000 sets, so it's probably not 
efficient
 to make each a separate set and then index). Since I'll need to test the
 regression for 400 groups, I thought a for loop might be necessary. I've 
put
 everything into a data frame already. Can anyone tell me how to write 
the
 code? I'm especially not sure about how to do the for loop.
 And then how would I calculate the error of how well the test samples 
fit
 the original regression?
 This is for my internship, so it's very urgent.

Take a deep breath, and think calm thoughts.  Take a look at the posting 
guide (http://www.r-project.org/posting-guide.html) - it has useful ideas 
on thinking through your problem.  If you can provide some code then we 
can see what you want more clearly.

Show us how you've done your regression what form your data is in.  Tell 
us which tests you'd like to do on the samples.

If you are stuck with for loops, then take a look at section 9.2.2 in the 
Intro to R guide that comes with R.  (Click Help - Manuals - an 
Introduction to R in RGui.)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:22}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Histograms on a log scale

2009-07-20 Thread Richard . Cotton
 I would like to be able to plot histograms/densities on a semi-log or 
 log-log scale.

 # Get a random log-normal distribution
 r - rlnorm(1000)
 
 # Get the distribution without plotting it using tighter breaks
 h - hist(r, plot=F, breaks=c(seq(0,max(r)+1, .1)))
 
 # Plot the distribution using log scale on both axes, and use
 # blue points
 plot(h$counts, log=xy, pch=20, col=blue,
 main=Log-normal distribution,
 xlab=Value, ylab=Frequency)
 
 This is very close to what I need, but how can I have filled rectangles 
 in the plot, so that it looks more like a traditional histogram?

You can use type=h to specify histogram-like plotting.  You probably 
also want to use a linear y-scale to make frequencies easier to compare. 
Change the last line to:

plot(h$mids, h$counts, log=x, pch=20, col=blue,
main=Log-normal distribution,
xlab=Value, ylab=Frequency, type=h)

Alternatively, plot a histogram of the log data, and draw your own axes:

logr - log(r)
par(las=2)
h2 - hist(logr, axes=FALSE, breaks=seq(min(logr)-1, max(logr)+1, .5))
Axis(side=2)
Axis(side=1, at=h2$mids, labels=format(exp(h2$mids), digits=1), lty=NULL)

Or, even better, install the ggplot2 package and try something like:

qplot(r, geom=histogram) + scale_x_continuous(trans=log10)

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:22}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rle

2009-07-07 Thread Richard . Cotton
 I have an other problem, I have this vector signData with an alternation 
of
 1 and -1 that corrispond to the duration of two different percepts. I
 extracted the durations like this:
 
 signData- scan(dataTR10.txt)
 dur-rle(signData)$length

I think that last line should be 
dur-rle(signData)$lengths

 Now I would like to extract only the positive duration, e.g. 
 
 signData - c(1,1,1,1,-1,-1,-1,1,1,-1,-1)
 posduration - c(4,2)

If you know that the first element of signData will always be 1, then you 
can simply extract the first, third, fifth etc values from signdata, like 
so:

posduration - dur[c(TRUE, FALSE)]

Otherwise you need to test to see if you are extracting odd or even 
elements.

if(signData[1]==1)
{
   index - c(TRUE, FALSE)
} else
{
   index - c(FALSE, TRUE)
}
posduration - dur[index]

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to apply a self-written function to a data frame

2009-07-06 Thread Richard . Cotton
 I have written a function in order to analyse gaze paths. It works 
 with the test data but when I try to apply the function to a data 
 frame that stores the real data in columns I receive the error 
 message that the 
 
  In if (pp  1) { :
   condition has length  1 only the first element will be used
 

This means that pp is a vector (and hence pp  1 is also a vector), but 
'if' requires a scalar logical value.  Try, e.g.
if(1:10  5) message(foo)
for a simpler way of recreating this error message.

 I interpret this error message as saying that only the first element
 of pp is used. However, I'd like to analyse each row of the data 
 frame, row by row. Using 
 
 apply(final, 1, abst, gx,gy,tx,ty,p_pos) (with gx - p_pos being 
 columns of the data frame final) 

By the looks of things, when you call apply(final, 1, abst, 
gx,gy,tx,ty,p_pos), the value of pp (inside the abst function) is being 
taken as the value of p_pos, which is all the relevant columns of the data 
frame.  Are you sure that this is the value you meant for pp?

There is a vectorised version of if, namely the ifelse function.  Perhaps 
the line if (pp  1) {px = -px } should be
px - ifelse(pp  1, -px, px)

or even more simply, 
px[pp  1] - -px[pp  1]

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] variable driven csv file names?

2009-06-01 Thread Richard . Cotton
 Is it possible to have variable driven csv file names?  Such as:
 
 ds.name-bob.csv
 
 write.table(
   distribution.data,
   file = ~//Documents/Research/Distribution Analysis/ds.name,
   sep = ,,
   col.names = FALSE,
   qmethod = double)

Yes.  You just need to construct your filename string using paste. Replace 
the file argument with

paste(~//Documents/Research/Distribution Analysis/, ds.name, sep=)

Also take a look at ?file.path for platform independent file paths.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R help

2009-05-28 Thread Richard . Cotton
 I am interested in modeling hydrological extreme events. I found 
 MSClaio2008 very interesting function. In this function four 
 criterions for choosing distributions. Can we call these criterions 
 as model selection techniques or goodness of fit techniques or both?
 Because goodness of fit techniques are usually performed after modle
 selection. 

What is MSClaio2008?  I don't quite understand this bit.  If you provide 
code examples, then questions are often easier to answer.

  Can I found chi-square, kolmogrov-sminov and cramer-von mises tests
 for testing goodness of fit for proposed distributions?

Yes, you can do these things in R.  The functions you want are: 
chisq.test for the chi-square test
ks.test  for the kolmogorov-smirnoff test.

You can view the help pages for these functions by typing a question mark 
before the function name, e.g. ?chisq.test.

For the Cramer-von Mises tests, I didn't know, so I typed 
RSiteSearch(cramer von mises test), from which there are several 
suggestions.  In particular, look at the CvM2SL1Test and CvM2SL2Test 
packages.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sample unique pairs from a matrix

2009-05-28 Thread Richard . Cotton
 I have a matrix of both negative and positive values that I would like
 to randomly sample with the following 2 conditions:
 
 1. only sample positive values
 2. once a cell in the matrix has been sampled the row and column of
 that cell cannot be sampled from again.
 
 #some dummy data
 set.seed(101)
 dataf - matrix(rnorm(36,1,2), nrow=6)
 
 I can do this quite simply if all the values are positive by using the
 sample function without replacement on the column and row indices.
 
 samrow - sample(6,replace=F)
 samcol - sample(6,replace=F)
 values - numeric(6)
 for(i in 1:6){
values[i] - dataf[samrow[i], samcol[i]]
 }
 
 However, I am not sure how to include the logical condition to only
 include postitive values

You could create a new variable containing only the positive values of 
dataf, and sample from that, e.g.
dataf - matrix(rnorm(36,1,2), nrow=6)
posdata - dataf[dataf  0]
sample(posdata, 6, replace=FALSE)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot error

2009-05-28 Thread Richard . Cotton
 I want to plot data such that the 3 time points(a,b,c) lie on the X-axis 
and
 the values of these times points are on Y-axis for n samples (e.g.100).
 
 So, I have an object x, dim 100 4, it is a dataframe (when checked the
 class)
 x = 
 name   a   b  c
 10.11  1.11   0.86
 2   .  .   .
 3   .  .   .
 .
 .
 .
 100
 
 so when i say:
 
  plot(1:3, x[,2:4], type=l) - I get the error below
 
 Error in xy.coords(x, y, xlabel, ylabel, log) : 
(list) object cannot be coerced to type 'double'
 
 However if I do:
  plot(1:3, x[1,2:4], type=l) -- It works for the 1st row, and 
each
  individual row BUT NOT ALL ROWS
 
 Please could someone explain what is happening here?
 
 I wonder if I need to use 'lines' for the remaining, BUT I have another
 dataset y with same dimensions as x, which I want to plot on the same
 graph/plot to see the difference between x and y.

Your data looks like this:
x - data.frame(name=sample(letters, 10), a=runif(10), b=rnorm(10), 
c=rlnorm(10))

The problem is that the subset x[,2:4] is also a data frame, not a matrix.
class(x[,2:4])  #[1] data.frame

The simplest thing is probably to use lines, as you say.
row - seq_len(nrow(x))
xx - x[,2:4]
plot(row, xx$a, ylim=range(xx), type=l)
lines(row, xx$b, col=blue)
lines(row, xx$c, col=green)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] r-plot

2009-05-27 Thread Richard . Cotton
 I want to make a simple plot, here is my code: 
http://gist.github.com/118550
 Unfortunately, the annotation of both the x- and y-axis are not correct, 
as
 you can see in the following picture: 
 http://www.nabble.com/file/p23739356/plot.png 
 I am not an expert of R, so maybe someone can point me to the solution 
of
 this problem, i.e. both of the axes should start and end at the min / 
max
 values of the two vectors.

From the help page on par:

'xaxs' The style of axis interval calculation to be used for the
  x-axis.  Possible values are 'r', 'i', 'e', 's',
  'd'.  The styles are generally controlled by the range of
  data or 'xlim', if given. Style 'r' (regular) first extends
  the data range by 4 percent at each end and then finds an
  axis with pretty labels that fits within the extended range.
  Style 'i' (internal) just finds an axis with pretty labels
  that fits within the original data range.

You've explicitly set xaxs=r, when you really want xaxs=i.  You can 
also explicitly set the axis limits using xlim/ylim parameters in the call 
to plot.

Compare these examples:
#Ex 1
plot(1:10)  #implicitly uses par(xaxs=r, yaxs=r) unless you've 
changed something

#Ex 2
oldpar - par(xaxs=i, yaxs=i)
plot(1:10)
par(oldpar)

#Ex 3
plot(1:10, xlim=c(-5, 15), ylim=c(-100, 100))

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] XML parse error

2009-05-21 Thread Richard . Cotton
I am trying to parse XML file ( binary hex) but get an error. 
 Code I am using is:
 xsd = xmlTreeParse(system.file(exampleData, norel.xsd, package =
 XML), isSchema =TRUE) doc = xmlInternalTreeParse(system.
 file(exampleData, LogCallSummary.bin, package = XML)) Start 
 tag expected, '' not found
 
  xmlParse command results in same error as well: 
 f = system.file(exampleData, LogCallSummary.bin, package = 
 XML)  doc = xmlParse(f)Start tag expected, '' not found
 I am at beginner level with XML and will appreciate any help with 
 this error or general guidance. 
 Thanks 
 Kulwinder Banipal
 
 file is:
 000: 0281 0001 0201 0098 c1d5 c000    
 010: 000a c0a8 db35 0055 6000 00af 0001 0001  .5.U`...
 020: 5f00 2200 4530  4411 2233 4455 0f08  _..E0..D.3DU..
 030: 0123 4567 8901 2340  04d2    .#eg...@
 040:   0002 0100 0001 0003 0303   
 050:   0100  6400  0100   d...
 060: 6401 0300 0900 00fe fe00 012f 0001   d../
 070: 0101 0001   0001  2200 0033  ..
 3080: 3306   0022  1100    3...33.
 090: 0033 3400 2300 0011  0001  3335  .34.#.
 350a0: 0024  1100  0200 0033 3600 2500  .$.36.%.
 0b0: 0011  0003  3337 0026  1100  37.
 0c0:  0400 0033 3800 2700 0011  0005  .38.'...
 0d0: 5504 7700 8800 0044 4406  2323 0099  U.wDD...##..
 0e0: 0100 0200   0023 2400 9901 0002  .#$.
 0f0:  0001  2325 0099 0100 0200   ..#%
 100: 0200 0023 2600 9901 0002  0003   ...#...
 110: 2327 0099 0100 0200  0400 0023 2800  #'...#(.
 120: 9901 0002  0005 0102 0008 0100 0066  ...
 f130: 6600 0055 5533   3400  0a35  f..UU34
 5140:  0014 3600  1e37  0028 3800  67...(8.
 150:  3239  003c 3a00  463b   ..29...:...F;..
 160: 0050 3c00  5a00 0088 8800 0077 7744  .P...Z..
 wwD170:   4500  0a46  0014 4700  EF
 G.180:  1e48  0028 4900  324a   ...H...(I...2J..
 190: 003c 4b00  464c  0050 4d00   .K...FL...PM...
 1a0: 5a02 2207 7766 6604 0500  1100 0088  Z..wff.
 1b0: 8800  0106  0011  8889   
 1c0: 0011 0700  1100 0088 8a00  2108  ..!.
 1d0:  0011  888b  0031 0405   ...1
 1e0: 0022  0044  0001 0600  2200  D.
 1f0:  4500  1107  0022  0046  ..E...
 F200:  0021 0800  2200  4700   ...!...G...
 210: 3106  0001 0002 0003 0004 0005 0200  1...
 220: 0033 4400 0055 6609 0101 0202 0303 0404  .3D..Uf.
 230: 0505 0606 0707 0808 0909 0405  0011  
 240:  0044  0022  0088 0500   ...D...
 250: 1200  4500  2300  8905   E...#...
 260: 0013  0046  0024  008a 0500  .F...$..
 270:  1400  4700  2500  8bfa 
..G...%.280: ae

Um, this isn't an XML file. An XML file should look something like this:

?xml version=1.0 encoding=utf-8 ?
tag
   subtagvalue/subtag
/tag

The wikipedia entry on XML gives a reasonable intro to the format.  
http://en.wikipedia.org/wiki/Xml

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [newbie] how to do a 3d plot of bivariate density?

2009-05-21 Thread Richard . Cotton
 I am new to R. Yesterday I passed the afternoon reading the
 introduction and language reference, but I could'nt find a way to do a
 3d plot of the density of a data table of size 2.
 I am trying with:
 
  plot(density(t(t2)))
 
 but it mixes the two columns and calculate the density like it is a
 1-dimensional casual variable.

I presume what you mean by a data table of size 2 is something like this:

data - matrix(c(3,53,36,17), nrow=2)

3D plots are almost never the best solution.  (You get all sorts of issues 
with perspective problems and some bits of the plot being obscured by 
other bits.)  Would something like this to the trick?

image(data)

If you want something different, then please provide an example of your 
data (nothing too big, so we can reproduce it), and a description of what 
you would like to show.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need help on ploting Histograms

2009-05-21 Thread Richard . Cotton
 this is the command i made for a normal distribution, but when i try to 
plot
 the histograms, i dont know why the bars don't stick on the line...
 
 nsamples-1000
 sampsize-15
 Samples-matrix(rnorm(nsamples*sampsize,0,1),nrow=nsamples) 
 a-apply(Samples,1,var)
 NC14-a*14
 x-0:40
 plot(x,dchisq(x,14),type='h') 
 hist(NC14,freq=F,add=T)

Set the histogram bins to match the plot.  Change the last line of your 
code to, e.g.

hist(NC14,freq=F,add=T, breaks=x, border=transparent, col=#)

Note that the histogram contains sample values, while the plot shows a 
theoretical distribution, so they won't match exactly.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] arrangement of crowded labels

2009-05-20 Thread Richard . Cotton
 I'm looking for algorithms that assist in spreading out crowded labels, 
e.g. 
 labels of points in a scatter plot, in order to obtain a nicer visual 
 appearance and better legibility. 
 
 I'm probably just stuck because I didn't find the right key words for a 
 successful search on the R websites or in the mailing list archives.

Try thigmophobe.labels in the plotrix package.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where to find a changelog for the survival package

2009-05-20 Thread Richard . Cotton
 since some days I try to use the versions 2.35-4 of the survival 
 package instead of versions 2.31, I had installed until now. Several 
 changes in print.survfit, plot.survfit and seemingly in the structure 
 of ratetabels effect some of my syntax files.
 Is there somewhere a documentation of these changes, besides the code 
itself?

It's in the repository on R-Forge.  The latest version is here:
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/survival/Changelog.09?rev=11234root=survivalview=markup

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to google for R stuff?

2009-05-20 Thread Richard . Cotton
 One thing I find most frustrating about R is how difficult it is to use
 Google (or any other search tool) to look for answers to my R-related
 questions.  With languages with even slightly more distinctive names 
like
 Perl, Java, Python, Matlab, OCaml, etc., usually including the name of 
the
 language in the query is enough to ensure that the top hits are 
relevant.
  But this trick does not work for R, because the letter R appears by 
itself
 in so many pages, that the chaff overwhelms the wheat, so to speak.

There are loads of ways of finding information.

Use the function RSiteSearch, or
The R mail archive  
http://www.googlesyndicatedsearch.com/u/newcastlemaths
RSeek   http://www.rseek.org/
R Searchhttp://www.dangoldstein.com/search_r.html  

The R Graph Gallery http://addictedtor.free.fr/graphiques/
R Help Wiki http://wiki.r-project.org/rwiki/doku.php
R manuals   http://cran.r-project.org/manuals.html
FAQshttp://cran.r-project.org/faqs.html
Task Views  http://cran.r-project.org/web/views/
News
http://www.r-project.org/doc/Rnews/index.html
Books   
http://www.r-project.org/doc/bib/R-books.html
Cranberries http://dirk.eddelbuettel.com/cranberries/

R-Forge (http://r-forge.r-project.org/) and Bioconductor (
http://www.bioconductor.org/GettingStarted) also have their own search 
tools.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help on Nan error

2009-05-15 Thread Richard . Cotton
 When i want to do ANOSIM i get an NaN error message. What is wrong?
  (lots of other code)
  iwithin=rep(0,(N*(N-1)/2) )
  r.w=sum(r*iwithin)/sum(iwithin)

iwithin is a vector of zeroes and so is its sum.  r*iwithin is also a 
vector of zeroes, and so is its sum.  Thus r.w=sum(r*iwithin)/sum(iwithin) 
is zero divided by zero, which is not defined.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding root using Newton's method

2009-05-14 Thread Richard . Cotton
 I want generate R code to determine the real root of the polynomial
 x^3-2*x^2+3*x-5. Using an initial guess of 1 with Newton's method.


 Homework? - see your instructor! Otherwise, provide minimal 
self-contained
 code and show us where you are stuck.

It gets a little suspicious when there are several people asking to 
determine the root of the exact same polynomial using the same method.  Is 
it really so hard to RSiteSearch for an appropriate keyword?

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] decimal troubles ?

2009-05-12 Thread Richard . Cotton
 I have some trouble with the number of decimals in R (currently R 
 2.9.0). For instance:
 
   options()$digits
 [1] 3
 
 let me hope that I will get three digits where useful when a number is 
 printed. BUT:
 
   44.25+31.1+50
 [1] 125
 
 No way to get the right result 125.35
 
 Can anybody tell me what's happens ?

The digits option specifies the number of significant figures, not the 
number of decimal places.  (The help documentation on the options page 
doesn't make this clear at the moment, though it does point you to 
print.default, which describes it as setting significant digits.)

Also note that the true value is being stored, so you can retrieve it with 
explicit formatting, e.g.

x - 44.25+31.1+50
x   # 125
print(x, digits=5)  # 125.35

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2009-05-12 Thread Richard . Cotton
 if i have the following function,
 
 f - function(x) x^3-2*x^2+3*x-5
 
 i need a simple function for the derivative of this with respect to 
 'x', so that i can then sub in values to the the derivative 
 function, and use Newtons method of finding a root for this.

You could take a look at ?deriv, but since the example you presented is 
just a polymonial, you can probably learn about calculus (
http://en.wikipedia.org/wiki/Derivative#Derivatives_of_elementary_functions
) and differentiate it in your head in less time than it took you to write 
this help question.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lattice histogram for multiple variables : adjusting x axis

2009-05-12 Thread Richard . Cotton
 I have a large data frame and I want to look at the distribution of 
 each variable very quickly by plotting an individual histogram for 
 each variable.
 I'd like to do so using lattice.
 
 Here is a small example using the iris data set:
 
 histogram(as.formula(paste(~,paste(colnames(iris[,!sapply(iris,is.
 factor)]),collapse=+))),data=iris[,!sapply(iris,is.factor)])
 
 However in this graphic, the x-axis and the breaks are the same for 
 all 4 variables. I would like them to be adjusted to the data in 
 each panel separately. I have tried using the scales argument to no 
avail:
 
 histogram( as.formula(paste(~,paste(colnames(iris[,!sapply(iris,
 is.factor)]),collapse=+))) ,data=iris[,!sapply(iris,is.factor)], 
 scales=list(x=list(relation=free)) )
 
 
 How can I individualize the x-axis and breaks for each individual panel?

You're on the right track.  If you don't specify a breaks argument, then 
histogram calculates a common axis and breaks across all the panels. 
Declaring breaks=NULL will force it to calculate them for each panel.

histogram( 
 
as.formula(paste(~,paste(colnames(iris[,!sapply(iris,is.factor)]),collapse=+))),
 

   data=iris[,!sapply(iris,is.factor)], 
   scales=list(x=list(relation=free)),
   breaks=NULL
)

You can also specify a hist-style method, e.g. breaks=Sturges to get 
what you want.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how the break is calculated by R?

2009-05-12 Thread Richard . Cotton
 As to hist,the help file says: R's default with equi-spaced breaks 
 (also the default) is to plot the counts in the cells defined by 
breaks. 
 I wanna know how the break is calculated by R? 
 
 In other words: break = (max - min)/(number of group) but how the 
 number of group is calculated by R? Thanks!

If you look closely at the help page for hist, you'll see that in the 
Usage section it says breaks=Sturges.  Then in the Arguments section it 
says that when breaks is a character string, it means it is the name of a 
method to compute the number of cells.  Finally, in the Details section, 
it tells you to look at the nclass.Sturges help page for details of 
Sturges' algorithm.

Wikipedia has a good description of common alorgithms for calculating the 
number of breaks.
http://en.wikipedia.org/wiki/Histogram#Number_of_bins_and_width

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xyplot: no ticks for a factor scale?

2009-05-12 Thread Richard . Cotton
 I would like to have no ticks on a scale that represents a factor. The
 tick.number argument from scales does not work in such a situation, as
 the help page as well as this simple (fairly stupid) code show:
 
 require(lattice)
 fact-gl(4,1,labels=LETTERS[1:4])
 y-c(1,4,3,2)
 xyplot(y~fact,scales=list(x=list(tick.number=0)))
 
 I want to leave out the ticks at the x-axis (removing labels is easy,
 of course). What can I do with that?

Replacing tick.number with tck will do the trick.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] working with groups of labels?

2009-05-11 Thread Richard . Cotton
 I have a graph with groups of variables.  I have include the group 
 names as variables so that I can have them positioned correctly. 
 Unfortunately this means that the group names have to follow all of 
 the same rules as the variables within the groups.  I would rather 
 have those group names left justified and bolded.
 
 I know that I can do this in an mtext() function using font=2 and 
 adj=0, but I can never get the labels lined up exactly right.
 
 Is there a simpler way of getting one group of labels right justified 
 and another group of labels left justified and still have them all 
 automatically positioned correctly?

mtext accepts vector inputs, so you can specify different justifications 
for different labels, e.g.

plot(1:10)
mtext(c(foo, bar, baz), adj=c(0, 0.5, 1), side=1)

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What does it mean by skip=2 and skip=7?

2009-05-11 Thread Richard . Cotton
 Can anyone tell me what is skip=2, skip =7

From ?read.csv:

skip: integer: the number of lines of the data file to skip before
  beginning to read data.

 and %in% mean here?

%in% matches values; see ?'%in%', and example('%in%')

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] working with groups of labels?

2009-05-11 Thread Richard . Cotton
It seems that the structure of your data is that you have two groups 
(Real Bad Stuff and Other Crazy Things) which are then subdivided into 
further categories.  I'd be tempted to set your data up like this:

dfr - data.frame(
   score=c(23, 14, 17, 8, 43, 13),
   group=rep(c(Real Bad Stuff, Other Crazy Things), each=3),
   variable=c(Real,Bad,Stuff,Other,Crazy,Things))

Then you can use a lattice plot to emphasise the different groups, and 
avoid your mtext worries.

library(lattice)
barchart(
   variable~score|group, 
   data=dfr, 
   horizontal=TRUE,
   layout=c(1,2),
   scales=list(y=list(relation=free)),
   prepanel=function(x,y,...)
   {
  y - y[,drop=TRUE]
  prepanel.default.bwplot(x,y,...)
   }, 
   panel=function(x,y,...)
   {
  y - y[,drop=TRUE]
  panel.barchart(x,y,...)
   }
)

(The scales, prepanel and panel arguments are just to get rid of the 
variables that appear in one group but not the other.)  I realise it isn't 
exactly what you asked for, but I thought you might be interested in the 
alternate approach.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting questions

2009-05-08 Thread Richard . Cotton
 1. How to plot several lines in a figure? Suppose I have several sets of
 points (xi,yi), where xi and yi are equal-length vector. plot(x1,y1) 
will
 give a line connecting these points. Another plot(x2,y2) will erase what
 plot before and plot the new line. Can I have these lines all drawn in 
the
 same figure?

#Draw your plot
plot(seq(0,1,length.out=20))

#Add lines to the existing plot
lines(runif(20))

#Add points to the existing plot
points(runif(20), col=red)

 2. How to open another figure window? Repeating plot will redraw in the 
same
 window instead of opening another one.

On Windows, windows() will open a new figure window; quartz() on Mac OSX 
and x11() on Linux do the same.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] I'm offering $300 for someone who know R-programming to do the assignments for me.

2009-05-08 Thread Richard . Cotton
 There are six assignments in total. It won't take you long if you were
 familiar with R. For those who are interested, please send me an email
 with your profile (your experience with R, how long and how often have
 you been using it.) I will be paying through paypal. Thanks!

Now see, you made three mistakes here.

First, you asked for help cheating with homework on a mailing list filled 
with lecturers.  Second, you offered a pitiful amount to do something that 
unethical.  And thirdly, you violated the rules of the mailing list by not 
including commented, minimal, self-contained, reproducible code.

If you can't even cheat properly, you deserve to fail.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] howto find x value where x=max(x)

2009-05-08 Thread Richard . Cotton
 fp is a data frame like this
 
 ,[ fp ]
 |Frequenz  AmpNorm
 | 1  3322 0.0379490639
 | 2  3061 0.0476033058
 | 3  2833 0.0592954124
 | 4  2242 0.1275510204
 `
 
 i want to find the Frequenz where AmpNorm is max.

Use which.max.

fp - data.frame(Freqenz=c(3322,3061,2833,2242), 
AmpNorm=c(0.0379,0.0476,0.0592,0.1276))
index.of.max - which.max(fp$AmpNorm)
fp$Freqenz[index.of.max]

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create Pie chart from .csv file

2009-05-07 Thread Richard . Cotton
 Ive found out a way around my problem. I was trying to plaot a histogram 
of
 strings, but I had to change it into integers. I ran an sql query on the
 original DB that I got the CSV file from and used COUNT to get the 
number of
 each unique item in a given column. I then used these numbers to create 
a
 histogram in R. It's a round about way of doing it, but it will have to
 suffice.

Glad you found a solution to your problem.  For future reference, an 
easier way to convert strings to numbers is to use as.numeric or 
as.integer.

as.numeric(1.23)  # 1.23
as.integer(4) # 4

If the process of reading in your data from the CSV file converts the 
strings to factors, then see the FAQ on R (
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f).

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Show name of dataset in graph

2009-05-07 Thread Richard . Cotton
 I?ve written a script to run several multivariate statistical analysis
 automatically.
 As one result a biplot and screeplot is produced.
 Now I?d like to display the name of the inputdatset as part of the title 
of
 these graphics and I do not want to enter it each time I run the script.
 How can I extract the name of a dataset?
 An (shortened) extraction from the script:
 
 Test1 - function(pri, sec, Pca, Ca, Dca){
 #pri  - primary Parameters (biological Dataset (abundances))
 #sec ? secondary Parameters 
 #Pca, Ca, Dca ? Default=F, If set to TRUE these Methods are active
 ?several mathematical Operations?
 biplot(pri, pc.biplot=T, main=paste(?PCA - ?, ???)?.
 
 Instead of the ??? I? d like to have a variable.
 For example if 
 Test1(dataset.typ5, sec.typ5, Pca=TRUE)
 then the title of the plots should be 
 ?PCA ? dataset.typ5?

This demonstrates the basic idea:

dfr1 - data.frame(x=1:10, y=runif(10))
dfr2 - data.frame(x=11:20, y=runif(10))

library(lattice)
myplot - function(dfr)
{
   xyplot(y~x, 
   data=dfr, 
   main=as.character(substitute(dfr))) 
}

myplot(dfr1)
myplot(dfr2)

Regards,
Richie.

Mathematical Sciences Unit
HSL


ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create Pie chart from .csv file

2009-05-05 Thread Richard . Cotton
 I am looking to create a pie chart from a given column in a .csv file. 
 
 My class variables are as follows:
 
 entry_type,   uniquekey,   types, title,url, abstract, journal, 
 author, month,
 year, howpublished
 
 So say I want to export a pie chart that groups together all  entries 
under
 'types' e.g. 3 x statistics 2x education etc. Im looking to have a 
piechart
 represent this graphically that shows which type of entry is in most
 frequently. Preferably I'd like to export to a PDF chart and while I can 
do
 this by typing variables directly into the R console, I cannot manage it
 from a .csv file.
 
 If you cannot help me with this specific problem, just knowing how to 
create
 a generic pie chart would be a great help.

Split the problem up into different stages.

1. Read in your data form the csv file.
?read.csv

2. Plot something.
?pie if you really must, but a bar chart will almost certainly be better. 
See ?barplot and ?barchart (lattice package).

3. Export the graph to PDF.
?pdf

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] displaying percentage in bar plot

2009-04-24 Thread Richard . Cotton
 I have a following data
 
  AIS LEvel
  1  23
 body regionA  10  15   20
  B  15  25  15
 
 Now I want to plot a barplot and in each bar (corresponding a body 
region),
 I need a percentage of AIS level 1 displayed in the plot. Is there an 
easy
 way to do this ?

Try this:
x - matrix(c(10,15,15,25,20,15), ncol=3)
pct - apply(x, 1, function(x) 100*x[1]/sum(x))
custompanelfn - function(...)
{
   panel.barchart(...)
   panel.text(x=c(5,5), y=c(1,2),paste(format(pct, digits=3),%, sep=))
}
library(lattice)
barchart(x, panel=custompanelfn)

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plots - several pages per pdf - quality/size issue

2009-04-23 Thread Richard . Cotton
 I hope that question will not be too redundant (sorry if it is) but 
 i don't seem able to find the answer i need in the archives...
 
 I try to create a file which would have 1.several pages and 2.
 several plots by page. I know how to make a pdf file this way, but 
 my problem is that the pdf size gets way too big and cannot be open.
 
 So my question is:
 - is there a way to diminish the pdf size/quality?
 - does any other file format exists that would be of smaller quality
 but still be able to print plots on several pages for the same file?

You could try postscript instead. e.g.

p1 - xyplot(runif(1)~rnorm(1)|sample(letters[1:3], 1, 
replace=TRUE), layout=c(1,1))

pdf(test.pdf)
print(p1)
dev.off()
# File size 623kb

postscript(test.ps)
print(p1)
dev.off()
# File size 257kb

The other alternative is to print to a raster format, e.g. png.  this 
creates different files for different pages, but you could always find a 
way to view multiple plot files at once.  Some possibilities:
1. Use, e.g.,  Picasa to browse your plot files.
2. Create a simple webpage containing each image

# R code
png(test%d.png)
print(p1)
dev.off()

# Web page code
!DOCTYPE html PUBLIC -//W3C//DTD XHTML 1.0 Strict//EN 
http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd;
html xmlns=http://www.w3.org/1999/xhtml;
head
/head
body
img alt=first plot src=test1.png / 
img alt=second plot src=test2.png /
img alt=third plot src=test3.png /
/body

# File sizes 3*13kb + 1kb

Open and save this file in a word processor of your choice, and you can 
probably drop the file size even further.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with a defined function which cannot access a function defined outside of the function

2009-04-16 Thread Richard . Cotton
  i have a problem with a function that i defined:
 
  the function needs to use a function which is defined outside the 
function
 but
  i realised that this is not working. a friend told me that this must be 
a
 problem with
  hidden parameters.
 
  a workaround works when i just define all parameters instead of using 
the
 function
  itself. now the problem is that i want to integrate the function and 
i'm
 not
  sure how to do it with this workaround since i have a plot of the 
function
 and
  a big vektor with all points of the plot on the x-axis only.

I suggest you read up a little on the search path and how R does lexical 
scoping (sections 6.3.5 and 10.7 in R-Intro).  The short version of that 
is that unless you specifically name the package that the function is 
from, R will look inside the current function, then in the global 
environment, then some other places. 

If you have problems with names of things being hidden, a quick hack is to 
rename the hidden item.

If you want a less abstract answer then please post a commented, minimal, 
self-contained, reproducible code example.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] geometric mean to handle large number and negative values

2009-04-15 Thread Richard . Cotton
 I have created two functions to compute geometric means. Method 1 can
 handle even number of negative values but not large number, vice versa
 for method 2. How can I merge both functions so that both large number
 and negative values can be handled ?
 
  geometric.mean1 - function(x) prod(x)^(1/length(x))
  geometric.mean2 - function(x) exp(mean(log(x)))
 
  geometric.mean1(1:1000)
 [1] Inf
  geometric.mean2(1:1000)
 [1] 3678798
 
  geometric.mean1(c(-5,-4,4,5))
 [1] 4.472136
  geometric.mean2(c(-5,-4,4,5))
 [1] NaN
 Warning message:
 In log(x) : NaNs produced

Geometric mean is usually restricted to positive inputs, because otherwise 
the answer can have an imaginary component.  If you really want the 
geometric mean of negative inputs, use the second method but convert the 
input to be a complex number first.

comp.x - as.complex(c(-5,-4,4,5))
geometric.mean2(comp.x)
# [1] 0+4.472136i

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] geometric mean to handle large number and negative values

2009-04-15 Thread Richard . Cotton
   geometric.mean1 - function(x) prod(x)^(1/length(x))
   geometric.mean2 - function(x) exp(mean(log(x)))
   geometric.mean1(c(-5,-4,4,5))
  [1] 4.472136
   geometric.mean2(c(-5,-4,4,5))
  [1] NaN
  Warning message:
  In log(x) : NaNs produced

 comp.x - as.complex(c(-5,-4,4,5))
 geometric.mean2(comp.x)
 # [1] 0+4.472136i

Obviously, there's a discrepancy between the answer of geometric.mean1 and 
geometric.mean2 with complex inputs.  Having thought about it a little 
more, I think the problem is with my solution.

The log of a complex number decomposes as log(z) = log(abs(z)) +1i*Arg(z). 
 When you sum the second components, you need to take the answer modulo 
2*pi, since the phase angles wrap around.

Here's an alternative geometric mean function that takes that into 
account.

geometric.mean3 - function(x) 
{
   a - mean(log(abs(x)))
   b - 1i/length(x) 
   c - sum(Arg(x))%%(2*pi)
   exp(a+b*c)
}

geometric.mean3(comp.x)
# [1] 4.472136+0i

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extending a vector to length n

2009-04-15 Thread Richard . Cotton
 In general, how can I increase a vector of length m ( n) to length n
 by padding it with m - n missing values, without losing attributes?
 The two approaches I've tried, using length- and adding missings with
 c, do not work in general:
 
  a - as.Date(2008-01-01)
  c(a, NA)
 [1] 2008-01-01 NA
  length(a) - 2
  a
 [1] 13879NA
 
 
  b - factor(a)
  c(b, NA)
 [1]  1 NA
  length(b) - 2
  b
 [1] aNA
 Levels: a

You can save the attributes to a new variable, increase the length, then 
reapply the attributes, e.g.

extend - function(x, n)
{
   att - attributes(x)
   length(x) - n
   attributes(x) - att
   x
}

a - as.Date(2008-01-01)
extend(a, 2)
# [1] 2008-01-01 NA

b - factor(a)
extend(b, 2)
# [1] aNA
# Levels: a

It would, perhaps, be nicer if length had an option to preserve 
attributes.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lattice xyplot with two y axis

2009-04-14 Thread Richard . Cotton
 I have some data which needs to be plotted with lattice.
 
 library(lattice)
 cars - c(0.1, 0.3, 0.6, 0.4, 0.9)
 trucks - c(0.2, 0.5, 0.4, 0.5, 0.1)
 drivers-c(121,145,167,200, 210)
 year-c(2005,2006,2007,2008,2009)
 type-c(local,local,foreign,foreign,foreign)
 xyplot(cars+trucks~year|type, data=df3, type=o)
 
 Basically, I need to show drivers as well on a secondary y-axis. Is 
this
 possible with lattice and xyplot?

The trick is to use a custom y-scale component to draw the second axis. 
You also need to manually make space between panels using the between 
argument.

#Your data, as before
library(lattice)
cars - c(0.1, 0.3, 0.6, 0.4, 0.9)
trucks - c(0.2, 0.5, 0.4, 0.5, 0.1)
drivers-c(121,145,167,200, 210)
year-c(2005,2006,2007,2008,2009)
type-c(local,local,foreign,foreign,foreign)

#Custom y-scale component
myyscale.component - function(...)
{
  ans - yscale.components.default(...)
  ans$right - ans$left
  foo - ans$right$labels$at
  ans$right$labels$labels - as.character(200*foo)  #200 is the 
scale factor difference between axes, adjust as necessary
  ans
} 

#The plot
xyplot(cars+trucks+drivers/200~year|type, 
   type=o,
   scales=list(
  x=list(alternating=FALSE), 
  y=list(relation=free, rot=0)),  # relation=free separates 
the panels
   yscale.component=myyscale.component,
   between=list(x=2),   #between creates more space 
between the panels - you may need to adjust this value
   par.settings=list(layout.widths=list(right.padding=6)), #this creates 
more space on the right hand side of the plot
   ylab=NULL, 
   key=simpleKey(text=c(cars, trucks, drivers))) 


Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] default print format for large numbers

2009-04-08 Thread Richard . Cotton
 Numbers like ``1239178547.653775 is inserted into a vector. I print
 the vector:
 
  route_9_80_end
  [1] 1239178522 1239178526 1239178524 1239178524 1239178524 
 1239178523 1239178524 1239178522 1239178521 1239178565 1239178566 
1239178566
 [13] 1239178565 1239178566 1239178566 1239178565 1239178566 1239178566
 
 and the decimals are suppressed. While R is wonderful and 
 wonderfully documented, I can only find some quasi relavent 
 information on options(digits) or some such as might apply here. How
 can I force R to print some part of the decimal i.e. back to 
 1239178547.653775?

When you create the vector, each number is stored as a double-precision 
floating point number.  When you type the name of the variable the print 
function is called, which displays the numbers on the console.

You can also manually call the print function, and set the digits argument 
to a high number to show all digits.

If you want to see all the digits every time any number is displayed, use 
options(digits).

For example:
x - 1239178547.653775
x
# [1] 1239178548
print(x)
# [1] 1239178548
print(x, digits=20)
# [1] 1239178547.653775
options(digits=20)
x
# [1] 1239178547.653775

Note that the accuracy of the floating point numbers is limited to 
.Machine$double.eps, which is typically 2.2e-16.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R and .net/C#

2009-04-07 Thread Richard . Cotton

There seems to be a way for calling R from .net. However, is there anyway
for calling .net/C# code from R? Something similar to the RJava package
   for
.net?
   To the best of my knowledge, there isn't at the moment.  The latest release
   of MATLAB does though, so in the spirit of competitiveness, we really ought
   to include this feature in R.  (With support for Mono, for the Linux users
   too, while I'm making requests.)
   See this video for its use in MATLAB:
   http://www.mathworks.com/support/2009a/matlab/7.8/demos/New-MATLAB-External-
   Interfacing-Features-in-R2009a.html
   Does anyone have any idea how much effort it would take to implement this
   feature?
   Regards,
   Richie.
   Mathematical Sciences Unit
   [1]HSL
 _

   ATTENTION:



   This message contains privileged and confidential information intended for
   the addressee(s) only. If this message was sent to you in error, you must
   not disseminate, copy or take any action in reliance on it and we request
   that you notify the sender immediately by return email.



   Opinions expressed in this message and any attachments are not necessarily
   those held by the [2]Health and Safety Laboratory or any person connected
   with the organisation, save those by whom the opinions were expressed.



   Please note that any messages sent or received by the [3]Health and Safety
   Laboratory  email system may be monitored and stored in an information
   retrieval system.
 _
 _

   Scanned by MailMarshal - Marshal's comprehensive email content security
   solution. Download a free evaluation of MailMarshal at [4]www.marshal.com
 _

References

   1. http://www.hsl.gov.uk/contact-us.htm
   2. http://www.hsl.gov.uk/
   3. http://www.hsl.gov.uk/
   4. http://www.marshal.com/
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difference in client vs. server graphics defaults

2009-03-20 Thread Richard Cotton


lawnboy34 wrote:
 
 I am having trouble with the difference between default graphic settings
 on
 my client machine and the instance of R on our company's server. I created
 a
 script locally that output graphs, but when I run it on the server the
 output graphs have titles running past the margins, legends improperly
 placed, etc. I checked the default par() settings and found differences
 between my machine and the server's R instance in 6 parameters. I
 attempted
 to set the server-side par() settings to mimic my client machine, but I
 cannot change the three read-only settings of cra, cxy, and din, and the
 graphs still look distorted. Is there another way to make the two sets of
 defaults equal each other?
 
 I am using R 2.6.1 on the server and 2.7.0 on my client computer, but I am
 doubtful that this would have an effect. Any ideas?
 

Since we can't reproduce what you've done, it's hard to pinpoint the problem
exactly.  I suggest you start with a very simple plot, and create that on
your machine and the server.  Then gradually build up the plot until it
looks like the one you want.  That way you should be able to find the
offending line of code.

Also, if you write the graphs to file (using png, or whatever), then you
don't need to worry about the device or character  sizes, since you can
explicitly define them.

Finally, get your copy of R updated.  You should eliminate version issues
for your own sanity at least.

-
Regards,
Richie.

Mathematical Sciences Unit
HSL
-- 
View this message in context: 
http://www.nabble.com/Difference-in-client-vs.-server-graphics-defaults-tp22609493p22617103.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] who can give me some hint?

2009-03-12 Thread Richard . Cotton
  Hi All,
 
 
  act_2
  DateDtime Hour Min Second   Rep
 51 2006-02-22 14:52:18   14  52 18  useractivity_act
 52 2006-02-22 14:52:18   14  52 18 4
 55 2006-02-22 14:52:49   14  52 49 4
 57 2006-02-22 14:52:51   14  52 51 4
 58 2006-02-22 14:52:52   14  52 52 3
 60 2006-02-22 14:54:42   14  54 42 useractivity_idle
 
 I want to change act_2 to 
  DateDtime Hour Min Second   Rep
 
 51 2006-02-22 14:52:18   14  52 18  useractivity_act
 
 52 2006-02-22 14:52:18   14  52 18 4
 58 2006-02-22 14:52:52   14  52 52 3
 60 2006-02-22 14:54:42   14  54 42 useractivity_idle
 
 in other word, I want to keep 1st if there are many repeated value, 
 I made the program as:
 
 
 rm_r-function(act_2){
  dm-dim(act_2)[1]-1
  for(i in 2:dm){
 
  if(act_2$Rep[i+1]==act_2$Rep[i]){
act_2-act_2[-(i+1),]
}else{
act_2-act_2
}
  }
 return(act_2)
 }
 
 when it moved one row on 1st loop, i should still start 2 but it 
 become 3 at 2nd loop, if I add i-i-1, then i go to 1
 seems not reasonbale. How should I modify it`?

Please don't repeatedly post the same question - it is irritating, and 
you're not likely to get a favourable response.  Try explaining your 
problem more clearly.  What is the condition that you want to use to keep 
rows?  (In your example, each of the rows is different, yet you kept some 
and discarded others.)

If you just want to discard some rows form a data frame, you don't need a 
loop, e.g.

dfr - data.frame(x=1:10, y=runif(10))
dfr[c(1,3,5,5),]

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with loop

2009-03-12 Thread Richard . Cotton
 I'm trying to write a loop to sum my data in the following way:
 (the second - the first) + (the third - the second) + (the fourth - 
 the third) + ...
 for each column.

This is just sum(diff(x)), or even x[length(x)] - x[1].

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help

2009-03-09 Thread Richard . Cotton
 I am trying to excess the inbuit .Fortran and .C codes of R. Can any one
 help me in that. For example in kmeans clustering the algorithms are 
written
 in .Fortran I want to access them and  see the .Fortran syntax of the 
codes.
 Can any one help me how can I do that?

Download the R source code from your nearest CRAN mirror.  The Fortran/C 
code is in the src subfolder.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Normal Probability Plot - Time series

2009-03-09 Thread Richard . Cotton
 I have some time-series data and wish to plot a normal probability plot 
in
 R. How do I go about this?

In R, they are known as quantile-quantile plots.  Check out ?qqplot and 
?qqnorm.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding text to both grid and base graphs

2009-03-09 Thread Richard . Cotton
 I generate graphs using both the grid system (with lattice) and the base
 system. I'd like to be able to identify these graphs later on with a bit 
of
 identifying text (e.g. a date and some comments). Adding text to these
 graphs cannot be done using a common system if you want to save them as 
emf
 files. I now use:
 
 mtext(labelling
 text,outer=TRUE,at=0.5,adj=0.5,line=-1.1,side=1,cex=0.7,col=darkgrey)
 
 for base graphs and 
 
 grid.text(labelling 
text,x=0.5,y=0.005,just=bottom,gp=gpar(fontsize=8,
 col=darkgrey))
 
 for grid graphs.
 
 Unfortunately if I were to use mtext on a grid graph and copy to emf, I 
get
 the error:
 Error in dev.copy(win.metafile, pltfile) : invalid graphics state
 
 Is there a way to detect the current graphics state so I can choose 
which
 command to give automatically? (or is there an easier way to go about 
this?)

The easier way: use win.metafile to save your plots, e.g.

win.metafile(base.emf)
plot(runif(50))
mtext(labelling 
text,outer=TRUE,at=0.5,adj=0.5,line=-1.1,side=1,cex=0.7,col=darkgrey)
dev.off()

win.metafile(grid.emf)
xyplot(runif(50)~1:50)
grid.text(labelling text,x=0.5,y=0.005,just=bottom,gp=gpar(fontsize=8, 
col=darkgrey))
dev.off()

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dummy variable encoding

2009-03-06 Thread Richard . Cotton
  The best encoding depends upon which language you would like to 
manipulate 
  the variable in.  In R, genders are most naturally represented as 
factors. 
   That means that in an external data source (like a spreadsheet of 
data), 
  you should ideally have the gender recorded as human-understandable 
text 
  (male and female, or M and F).  Once the data is read into R, 
by 
  default R will convert the string to factors (keeping the human 
readable 
  labels).  This way you avoid having to remember that 1 means male (or 
  whatever).
 
  If you were manipulating the data in a different language that didn't 
have 
  factors, then it might be more appropriate to use an integer.  Which 
  integers you use doesn't matter, you need to have a look-up table to 
know 
  what each number refers to, whatever you choose.
 
 Yes, that's what I thought. However somebody told me that it is better
 to use 1/2 rather than 0/1 for a 2 level factor such as gender, and I've
 no idea why. I told them it didn't matter, but have since seen quite a
 few examples where they use 1/2 (admittedly in SPSS).

The only benefit that I can see of using 1/2 instead of 0/1 is fairly 
minor.

If you have cases where there are missing values, and you are working in a 
language that doesn't support NA values for integers (or factors; I'm 
thinking of something like C), then you could encode your genders as

0: not recorded
1: female
2: male

Then you can include logic like

if(gender)
{ 
   do something
}

The alternative encoding of 0/1, would be something like

-1: not recorded
0: female
1: male

This makes the code slightly less pretty.

if(gender != -1)
{ 
   do something
}

Again, none of this really applies to R, since you should be using factors 
for this sort of variable.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About warnings for non-matched items

2009-03-06 Thread Richard . Cotton
 I have many files in my directory. I want to transfer each data into
 one which is readable.
 They have so many possibilities, i have collected(manually and 
 visually) all possibilities and represent them as different numbers.
 
 Rep[grep('context_log',log1$Remain[1:length(log1$Date)]),]-2
 Rep[grep('gs',log1$Remain[1:length(log1$Date)]),]-5
 Rep[grep('ClockApp',log1$Remain[1:length(log1$Date)]),]-6
 Rep[grep('mce',log1$Remain[1:length(log1$Date)]),]-7
 ..
 I manually collect all possibilities contained in all files!!!
 (manually and visually: this process is so time-consuming, can i 
 have better ideal to collect all possibilities by computer rather 
 than by myself?)
 
 The programs are fine, but each file doesn't match all 
 possibilities, whenever there are non-matched items with above, then
 warning information come ups:
 31: In max(i) ... : no non-missing arguments to max; returning -Inf
 32: In max(i) ... : no non-missing arguments to max; returning -Inf
 33: In max(i) ... : no non-missing arguments to max; returning -Inf
 they returns non-matched items.
 
 How to vanish those warnings?

The error means that the input you provided to the max function is missing 
or NULL or of length 0.  At a guess, what has happened is that you've 
called grep, which didn't match anything (and so returned integer(0)), 
then used that as an input to max.

To get rid of the warning, check the input to max first.  e.g.

x1 - c(foo, bar, foo)
g1 - grep(foo, x1)
if(length(g1)) max(g1)

g2 - grep(baz, x1)
if(length(g2)) max(g1)

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dummy variable encoding

2009-03-05 Thread Richard . Cotton
can anyone tell me why an encoding of 1/2 for a dummy variable for
two groups (e.g. gender) seems to be preferred over 0/1?
It's been bugging me for a while, 0/1 seems more natural, but I have
been told (without explanation) that 1/2 is better. Why?

The best encoding depends upon which language you would like to manipulate 
the variable in.  In R, genders are most naturally represented as factors. 
 That means that in an external data source (like a spreadsheet of data), 
you should ideally have the gender recorded as human-understandable text 
(male and female, or M and F).  Once the data is read into R, by 
default R will convert the string to factors (keeping the human readable 
labels).  This way you avoid having to remember that 1 means male (or 
whatever).

If you were manipulating the data in a different language that didn't have 
factors, then it might be more appropriate to use an integer.  Which 
integers you use doesn't matter, you need to have a look-up table to know 
what each number refers to, whatever you choose.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The Origins of R AND CALCULUS

2009-02-05 Thread Richard . Cotton
 Does any student, or teacher for that matter care whether Newton or 
Leibntiz
 invented calculas.

Students or teachers may not care, but Newton and Leibniz themselves were 
pretty bitter about who should get credit for what.
http://en.wikipedia.org/wiki/Newton_v._Leibniz_calculus_controversy

I think this whole debate has gotten rather silly.  The original article 
was targetted at non-techies, and provided at reasonable summary of what R 
does for that audience.  The followup clarified on the history of R.  As 
for Ashley Vance's journalism, well, he's more qualified to write about 
software than most, being the former editor of The Register.  Hopefully, 
this storm-in-a-teacup will blow over soon.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Equivalent of Hold On MatLab Command

2009-01-30 Thread Richard . Cotton
 Does R have a graphic command equivalent of MatLab Hold On ?
 I am trying to sabve on a pdf file a composite drawing. I first 
 declare the canvas size, then I define the layout, finally I 
 generate the 4 plots according to layout order. Eventually I close 
 the pdf file (dev.off()).
 The resulting PDF file contains 2 pages. The first one shows only 
 the layout. The second one shows the whole
 composite drawing.
 MatLab has the command Hold On to work around similar 
 synchronization problems.
 Does R have have an equivalent command ?

If you are using base graphics (i.e. not lattice/ grid graphics), then you 
can mimic MATLAB's hold on feature by using the points/lines/polygons 
functions to add additional details to your plots without starting a new 
plot.  In the case of a multiplot layout, you can use par(mfg=...) to pick 
which plot you add things to.

#open the pdf device
pdf(test.pdf)

#setup the layout
par(mfrow=c(2,2))

#plot at each location
for(i in 1:4) plot(runif(1:10))

#return to the first plot
par(mfg=c(1,1))

#add a line to this plot (without replacing anything)
lines(seq(0.1,1,0.1))

#close the device to complete the plot
dev.off()

If you have an extra page in you PDF file, it's probably because you've 
called plot.new() somewhere where you didn't mean to (either explicitly or 
implicitly).  Without a reproducible example, it's hard to help further.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Text Outside Lattice Plot

2009-01-21 Thread Richard . Cotton
 I created the graph at the bottom using xyplot in the lattice package. I
 added a title using the main=Title  command in xyplot, however it is
 plotted too close to the legend for my liking.  To remedy this I 
increased
 the upper margin of the plot using plot(data, position = c(0,0,1,.9)) 
and
 attempted to move SNA upwards and to the right.  I have tried using a
 variety of text functions such as:
 
 trellis.focus(panel, 1, 1) 
 panel.text(x=11, y=10, labels=SNA) 
 trellis.unfocus() 
 
 panel.xyplot(...) 
 panel.text(x=11, y=10, labels=SNA) 
 
 library(grid) 
 ltext(grid.locator(), label='SNA') 
 
 The first two of these functions work but the text disappears once I 
specify
 a y coordinate  ymax.  The last function appears to work but requires 
me to
 click on the plot to specify the location (I need this to be 
pre-defined). 
 Does anyone know how I can do this?

A simple way to get more space for the title is to change the layout 
height parameter.  Here's an adaptation of the OrchardSprays example in 
?xyplot.

xyplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
type = a,
auto.key =
list(space = top, points = FALSE, lines = TRUE),
main=Orchard sprays example,
par.settings=list(layout.heights=list(main=4)))

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filling blanks with NA

2009-01-21 Thread Richard . Cotton
 I do have a data set with some missing values that appear as blanks. I 
want
 to fill these  blanks with an NA. How can this be done? Thanks for your 
help

Please help us to help you.  What form are the data in?  Are they in a 
text file, or are they in R already?  What do you mean by 'blanks'?  If we 
can reproduce your problem then we can solve it more easily.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] WinBUGS with R

2009-01-20 Thread Richard . Cotton
 I am having some problems using R with WinBUGS using the R2WinBUGS
 package. Specifically, when I try to run bugs() I get the following
 message.
 
 Error in FUN(X[[1L]], ...) :
 .C(..): 'type' must be real for this format
 
 
 To give a little more context, my bugs() command (for a multilevel
 ordinal logit  similar to Gelman and Hill, Data Analysis Using
 Regression and Multilevel/Hierarchical Models p. 383 is:
 
 Wednesbury.data - list (n.judge, n, n.cut, y judge, ct,
 ra, lg)
 
 Wednesbury.inits - function(){
   list(C=matrix(0,39,2))
}
 Wednesbury.parameters - c(C, b1, b2, b3)

Debugging your BUGS model or dataset via R is a bit of a pain.  I find 
that the best way (or maybe least worst way) to weed out the problems when 
you get an error like this is to find the files (model/data/inits) that 
R2WinBUGS has created and open them in WinBUGS itself.  Run the Model 
Specification tool and you can more easily determine which part of the 
file the problem lies in.

Just looking at your Wednesbury.inits variable, you don't need to define 
it as a function
Wednesbury.inits - list(list(C=matrix(0,39,2)))
will do.

Also, I'm not sure if WinBUGS understands matrix data types (though I may 
be wrong).

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Deleting columns where the frequency of values are too disparate

2009-01-19 Thread Richard . Cotton
 Please consider the following toy data matrix example, called x 
 for simplicity. There are 20 different individuals  (ID), with 
 information about the alleles (A,T, G, C) at six different loci 
 (Locus1 -  Locus6) for each of these 20 individuals. At any 
 single locus (e.g., Locus1 or Locus2, ... or Locus6), the 
 individuals have either one allele (from the set of A,T,C,G) or one 
 other allele (from the set of A,T,C, G). For example, at Locus1 
 individuals have have either the A or T allele only; at Locus2 the 
 individuals can have either C or G only; at Locus3 the individuals 
 can have either T or G only.
 
 IDLocus1Locus2Locus3Locus4Locus5Locus6
 1AGTAAC
 2AGGACC
 3ACGGCC
 4ACGGCC
 5AGGGAC
 6TGGGCC
 7TCGGCC
 8TCGGAC
 9TGGGCC
 10TCGGCC
 11AGGGAC
 12ACGGCC
 13AGGGCC
 14AGGGAC
 15ACGGCC
 16TCGGCC
 17TGGGAC
 18TGGGCC
 19TGGGCC
 20TCGGAC
 
 I want to delete any columns from the dataset where the rarer of the
 two alleles has a frequency of ten percent or less. In other words, 
 I would like to delete Locus3, Locus4, and Locus6 in this data 
 matrix, because the frequency of the rare allele is not greater than
 ten percent (and conversely, the frequency of the common allele is 
 not less than ninety percent). Please note that the frequency of the
 rare allele in Locus6 is equal to zero (conversely, the frequency of
 the common allele is equal to one hundred percent).
 
 Would one of you know of simple way to write this sort of code? (In 
 my real dataset, there are 1096 loci, so this cannot be done easily by 
eye.)

Most of the problem is just organising the data into a sensible form.

# read in data
data - readLines(tc - textConnection(1AGTAAC
2AGGACC
3ACGGCC
4ACGGCC
5AGGGAC
6TGGGCC
7TCGGCC
8TCGGAC
9TGGGCC
10TCGGCC
11AGGGAC
12ACGGCC
13AGGGCC
14AGGGAC
15ACGGCC
16TCGGCC
17TGGGAC
18TGGGCC
19TGGGCC
20TCGGAC)); close(tc)

# retrieve the useful bit
loci - sub([[:digit:]]{1,2}, , data)

# you may also want this
ID - grep([[:digit:]]{1,2}, data)

# find out how many of each base occurs at each locus
freqs - list()
n - length(ID)
for(i in 1:6)
{
   assign(paste(locus, i, sep=), factor(substring(loci,i,i), 
levels=c(A,C,G,T)))
   freqs[[i]] - summary(get(paste(locus, i, sep=))) 
}
freqs

# remove loci with 90% or more cases of same base
loci.to.remove - sapply(freqs, function(x) any(x0.9*n))

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] LCA (e1071 package): error

2009-01-15 Thread Richard . Cotton
 I will use the lca method in the e1071 package. But I get the following 
error:
 Error in pas[j, ] - drop(exp(rep(1, nvar) %*% log(mp))) :
   number of items to replace is not a multiple of replacement length
 
 Does anybody know this error and knows what this means?

The error means that you are trying to assign a variable of one size to a 
variable of another fixed size.  An example that recreates it is:
x - matrix(1:6, nrow=3)
x[1,] - 1:10

x[1,] cannot be resized from 3 to 10 without affecting the rest of x, so 
an error is thrown.

In your example, the jth row of the matrix pas is a different size from 
drop(exp(rep(1, nvar) %*% log(mp))).

Since you haven't provided a reproducible example (tut tut, read the 
posting guide) you'll have to do the debugging yourself.

To get you started, type traceback() to see where the problem occurs.

Now try options(error=recover), and call the lca function again.  Now you 
can examine the values of nvar, mp and pas to see what is going wrong.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R2WinBUGS stopping execution

2009-01-12 Thread Richard . Cotton
Apologies if this isn't acceptable for the general help list.

I'm running OpenBUGS model via the R2WinBUGS package interface, under 
Windows.  Is it possible to terminate running models, short of using the 
Windows Task Manager to forcibly exit the program?

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Saving plots as byte streams

2009-01-09 Thread Richard . Cotton
Is it possible to save plots as byte streams?  For example, if I want the 
bytes for a PNG plot, I could use

#Write the plot to a PNG file
png(test.png)
plot(1:10)
dev.off()

#Read the bytes back in from the file
plotbytes - readBin(test.png, raw, n=2000)

Ideally, I'd like to avoid having to bother writing to the file in the 
first place, and simply store the bytes in a variable.  How do I do this?

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] for loop and if problem

2009-01-06 Thread Richard . Cotton
 I'm heaving difficulties with a dataset containing gene names and 
positions
 of those genes.
 Not such a big problem, but each gene has multiple exons so it's hard to 
say
 where de gene starts and where it ends. I want the starting and ending
 position of each gene in my dataset.
 Attached is the dataset:
 http://www.nabble.com/file/p21312449/genlistchrompos.csv 
genlistchrompos.csv 
 Column 'B' is the gene name, 'G' is the starting position and 'H' is the
 stop position.
 You can load the dataset by using: data-read.csv(genlistchrompos.csv,
 sep=;)
 I hope someone can help me, it's giving me headaches for a week now:-((.

which(diff(as.numeric(data$Gene))!=0)

will give you a vector of the last row in each gene.  The start position 
is obviously the next row after the previous end.

Also take a look at 

split(data, data$Gene)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Borders for rectangles in lattice plot key

2008-12-23 Thread Richard . Cotton
Hopefully an easy question.  When drawing a rectangles in a lattice plot 
key, how do you omit the black borders?

Here is an example adapted from one on the xyplot help page:

bar.cols - c(red, blue)
key.list - list(
   space=top,
   rectangles=list(col=bar.cols),
   text=list(c(foo, bar))
)

barchart(
   yield ~ variety | site, 
   data = barley,
   groups = year, 
   layout = c(1,6),
   ylab = Barley Yield (bushels/acre),
   scales = list(x = list(abbreviate = TRUE, minlength = 5)),
   col=bar.cols,
   border=transparent,
   key=key.list
)

Notice the black borders around the rectangles in the key. 

I checked to see if there was an undocumented border component for the 
rectangles compoenent of key that I could set to transparent or FALSE, 
but no luck.  I also tried setting lwd=0 on the rectangle component but 
that didn't change anything either.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with Surv

2008-12-15 Thread Richard . Cotton
 I'm trying to estimate a tobit model with  survrec function. I use 
 the following code : 
 reg-survreg(Surv(crs_prod,crs_prod=1)~SOLVA+log(AF088)
 +HHI+ACTIONNA,data=dat,dist=gaussian)
 
 I get this error message with R 2.7.2 
 Error in survreg(Surv(crs_prod, crs_prod = 1) ~ SOLVA + log(AF088) +  : 

   Response must be a survival object

Take a look at Surv(crs_prod, crs_prod = 1).  Your example isn't 
reproducible (tut tut), but if I had to guess, I'd say that this is NULL, 
because crs_prod is column in dat, and Surv can't find it.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Legend and Main Title positioning

2008-12-05 Thread Richard . Cotton
 layout(matrix(c(1,2,3,4), nrow = 2, byrow = TRUE))
 plot(rnorm(100))
 plot(rnorm(200))
 plot(rnorm(300))
 plot(rnorm(400))
 
 Now, I'd like to create a legend below each plot and generate a common 
title.
 How can I do that?

If you are laying plots out in grids like this then lattice graphics are 
generally the way to go, but here's a solution based upon base graphics. 
The trick is to include extra potting space in your layout for the 
legends.  The code is messy, since it requires you to manually specify 
which cell of the layout to plot into, but I'm sure guven some thought you 
can automate this.

#4 space for plots, 4 for legends
layout(matrix(1:8, nrow = 4, byrow = TRUE), heights=rep(c(3,1),4))

#Check the layout looks suitable
layout.show(8)

#Avoid clipping problems, and create space for your title
par(xpd=TRUE, oma=c(0,0,2,0))

#First plot
plot(rnorm(100))

#Move down and plot the first legend
par(mfg=c(2,1))
legend(0,0, legend=foo, pch=1)

#Repeat for the other plots and legends
par(mfg=c(1,2))
plot(rnorm(200))
par(mfg=c(2,2))
legend(0,0, legend=bar, pch=1)

par(mfg=c(3,1))
plot(rnorm(300))
par(mfg=c(4,1))
legend(0,0, legend=baz, pch=1)

par(mfg=c(3,2))
plot(rnorm(400))
par(mfg=c(4,2))
legend(0,0, legend=quux, pch=1)

#Title for all the plots
title(main=4 plots, outer=TRUE)


Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] levels update

2008-12-05 Thread Richard . Cotton
 I hope this question is not too stupid. I would like to know how to 
update 
 levels after subsetting data from a data.frame.
 
 df - data.frame(factor(c(a,a,c,b,b)), c(4,5,6,7,8), 
c(9,1,2,3,4))
 names(df) - c(X1,X2,X3)
 
 my.sub - subset(df, X1 == a | X1 == b)
 levels(my.sub$X1)
 
 # still gives me a,b,c, though the subset does not contain entries 
with 
 c anymore

Two questions in one afternon; aren't I good to you!

levels(my.sub$X1[,drop=TRUE])
[1] a b
levels(factor(my.sub$X1))
[1] a b

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sink does not send graphs to sink file

2008-12-05 Thread Richard . Cotton
 I am using sink() to send the results of my analyses to a text file.
 Unfortunately my graphs do not become part of the file. Is there 
 anyway that I can have both the text and graphic output of my 
 analyses appear in a file?

You can create a latex document with text, graphs and R-code using 
?Sweave.  If you prefer to write to Open document format , there is an 
Odfweave package.

The other alternative is simply writing your graphs to files (or one file 
with all the graphs), e.g.
pdf(test.pdf)
plot(1:10)
hist(rnorm(100))
dev.off()

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems with lattice-histograms or png within loops

2008-12-05 Thread Richard . Cotton
 I have a question concerning the use of lattice plots within for-loops.
 I want to create a png file containing a lattice histogram which works 
out
 fine (part 1).
 When I loop the whole code, the graphic file does not contain anything 
(part
 2). 
 I can fix it by wrapping the histogram function into a print command 
(part
 3).
 Why is that so? Why can I not loop it directly? 
 
 TIA,
 Mark
 
 attach(iris)
 
 ### part 1
 png(filename = graphic_1.png)
   histogram( ~ Sepal.Length | Species, data = iris)
 dev.off()
 
 ### part 2
 for (i in c(1))
 {
 png(filename = graphic_2.png)
   histogram( ~ Sepal.Length | Species, data = iris)
 dev.off()
 }
 
 ### part 3
 for (i in c(1))
 {
 png(filename = graphic_3.png)
   print(histogram( ~ Sepal.Length | Species, data = iris))
 dev.off()
 }

This is FAQ 7.22:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-lattice_002ftrellis-graphics-not-work_003f

The help page on ?histogram also mentions that the function returns an 
object of class 'trellis' and the 'print' method (usually called by 
default) will plot it on an appropriate plotting device

It's a good idea to get into the habit of calling print any time you 
create a lattice graph.  If you reorder your code like

h1 - histogram( ~ Sepal.Length | Species, data = iris)
png(filename = graphic_h1.png)
  print(h1)
dev.off()

then you won't end up with an open device if drawing the plot fails.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] question involving loops from intro level R programming class

2008-11-28 Thread Richard . Cotton
 a. Write a R function zerdiag.v1(m) using loop to output a square matrix
 whose diagonal elements are zero and the other elements are filled in by
 consecutive integers from 1 to m row-wise.
 
 For example,
 zerdiag.v1(6) =  [0, 1, 2]
   [3, 0, 4]
   [5, 6, 0]
 
 This function should have error checking ability. If the input m cannot 
form
 a square matrix, then the function will return an error message: Input
 number is incorrect.
 
 b. Write a R function zerdiag.v2(m) to produce the same output as in 
part
 (a) without using a loop.
 
 c. Test your functions in part (a) and (b) using m=12 and m=14 
respectively.
 
 I'd appreciate any help with this problem... I've spent a lot of time
 staring at it, and I'm still not sure where to start. Thanks!

As the posting guide (http://www.r-project.org/posting-guide.html) says
Basic statistics and classroom homework: R-help is not intended for 
these. 

Also, I find it a little bit cheeky that you''ve asked for help on 3 parts 
of your homework, and haven't provided any evidence that you've even 
attempted the problem yourself.  (If you've been staring at the problem 
for hours, you must have written something.)  Read the help pages ?matrix 
and ?for and take a look in the Intro to R manual.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] construct a vector

2008-11-26 Thread Richard . Cotton
 I have an unkown number of vectors (=2) all of the same length. Out
 of these, I want to construct a new one as follows:
 having vectors u,v and w, the resulting vector z should have entries:
 z[1] = u[1], z[2] = v[1], z[3] = w[1]
 z[4] = u[2], z[5] = v[2], z[6] = w[2]
 ...
 i.e. go through the vector u,v,w, take at each time the 1st, 2sd, ...
 elements and store them consecutively in z.
 Is there an efficient way in R to do this?

u - 1:5
v - (1:5) + 0.1
w - (1:5) + 0.2
as.vector(rbind(u,v,w))
# [1] 1.0 1.1 1.2 2.0 2.1 2.2 3.0 3.1 3.2 4.0 4.1 4.2 5.0 5.1 5.2

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] basic information defining functions

2008-11-25 Thread Richard . Cotton
 i am looking from some insights to define own R functions. so far i 
 found most basics in documentations that are around on the web. except 
 for one thing:
 
 I´d like to define some function, say:
 
 #assume my data matrix contains vectors like data$myColumn1,data 
 $myColumn2 etc.

Do you really mean data matrix, or do you mean data frame?

 getMyColumn - function (columnid){
 
 x-data$MyColumn?columnid?[data$indexone=1  data$index2=5]
 
 return(x)
 
 }

It's not terribly clear what you want to do with this function.  Question 
marks don't mean anything in the middle of a statement, so I don't know 
what 
data$MyColumn?columnid?[data$indexone=1  data$index2=5] 
is supposed to do.

Also, you don't need the return keyword - the last line of a function is 
the return value.

 Do I need to use assign or eval first ? I tried to use paste to 
 combine something like: paste(data$MyColumn,columnid,sep=) which 
 did not work.

paste returns a string, not a variable.  You can evaluate the contents of 
the string using 
eval(parse(text=paste(data$MyColumn,columnid,sep=)))
This is however a method of last resort; there is almost always a better 
way to do things than using eval.
 
 I am happy to get any help with the problem, but also thankful for 
 some useful link or guide on how to define own functions properly, 
 especially the dynamic naming and return part

You probably don't need to bother with dynamic naming, but take a look at 
FAW on R, q7.21.

Regards,
Richie.

Mathematical Sciences Unit
HSL


ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lattice contourplot background covers inward-facing ticks

2008-11-25 Thread Richard . Cotton
 I wish to have inward-pointing ticks on my contourplot graph, but the 
 colored background produced by the region=TRUE statement covers the 
 ticks up, is there any way around this? Sample code below.  --Seth
 
 library(lattice)
 
 model - function(a,b,c,d,e, f, X1,X2)  # provide model function 

 for contour plot
  {J - a + (b*X1) + (c*X2) + (d*X1*X2) + e*(X1^2) + f*(X2^2)
   pp - exp(J)/(1+exp(J))
   return(pp)}
 
 g - expand.grid(X1= seq(0.3,0.9,0.001), X2 = seq(0.3,1, 0.001))
 g$z - model(-29, -14, 52, 80, -3, -56, g$X1, g$X2) # Create 

 variable z using gridded data, model, and variables
 
 contourplot(z ~ X1 * X2, 
  data = g,
  region = TRUE, # Adds color to background
  cuts = 10, # Number of contour intervals...(and 
color 
 intervals!)
  scales = list(tck = c(-1,-1))  # ticks go inward 
  )   ### END

The easiest way is to make the contours semi-transparent by adding 
alpha.regions=0.7 (or whatever number you prefer) to the call to 
contourplot.  This has a couple of poosible disadvantages though:
1. It will change the colours you used
2. You can't print to some devices (windows metafile isn't supported)

Other solutions involve a lot of messing about.  You could manually draw 
ticks using grid.xaxis (and grid.yaxis) in the grid package, or stop 
plotting to the edges of the panel.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating sum of letter values

2008-11-24 Thread Richard . Cotton
 Thanks, that's almost exactly what I need...theres just a slight 
 difference with my requirement, in that I am looking for the actual 
 index value in the alphabetical sequence, so that instead of:
 
 as.numeric(factor(unlist(strsplit(XYZ,
 [1] 1 2 3
 
 I would expect to see
 
 [1] 24 25 26

A minor modeification of Mark's solution works in this case:

as.numeric(factor(unlist(strsplit(XYZ, )), levels=LETTERS))
# [1] 24 25 26

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R course in Scotland

2008-11-21 Thread Richard Cotton


pzs wrote:
 
 Several people have suggested that I just pick up R and give it a try. 
 My reluctance to do this is that I am already very familiar with my 
 current working method (Python + Numpy) and I worry that without a 
 course I will work in a Python-centric way, which won't be optimal.
 

I know this isn't what you asked, but as a sidenote, if you are used to
working with Python and Numpy, then have you considered Sage (
http://sagemath.org/ http://sagemath.org/ )?  It's a Python environment that
you can run R commands from (amongst other things).  That way you can keep
your current working style but let R do the hard stats for you.

-
Regards,
Richie.

Mathematical Sciences Unit
HSL
-- 
View this message in context: 
http://www.nabble.com/R-course-in-Scotland-tp20601268p20623414.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dequantizing

2008-11-20 Thread Richard . Cotton
 I have some data measured with a coarsely-quantized clock.  Let's say
 the real data are
 
   q- sort(rexp(100,.5))
 
 The quantized form is floor(q), so a simple quantile plot of one
 against the other can be calculated using:
 
   plot(q,type=l); points(floor(q),col=red)
 
 which of course shows the characteristic stair-step.  I would like to
 smooth the quantized form back into an approximation of the underlying
 data.
 
 The simplest approach I can think of adds a uniform random variable of
 the size of the quantization:
 
   plot(q,type=l); points(floor(q),col=red);
 points(floor(q)+runif(100,0,1),col=blue)
 
 This gives pretty good results for uniform distributions, but less
 good for others (like exponential).  Is there a better
 interpolation/smoothing function 

I'm not convinced that adding a random amount to the floor values to 'make 
up' the underlying data is very meaningful. 

If you know what the underlying distribution is, then you are best off 
using this distribution to generate plots and extra pretend data.

If you know the distribution is exponential, then you can estimate the 
rate by treating the true values as interval censored data, somewhere 
between floor and floor+1.

library(survival)
q - sort(rexp(100,.5))
qq - floor(q)
qq[qq==0] - 0.1 #survreg doesn't like values that are exactly 
zero
ss - Surv(qq, qq+1, type=interval2) 
model - survreg(ss ~ 1, dist=exponential)
summary(model)
rate - 1/exp(model$coefficients[(Intercept)]); rate   #hopefully 
something close to 0.5

If you don't know the underlying distribution either, then things get 
trickier.  Try a histogram/kernel density plot/boxplot to see what it 
looks like.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sequencially merge multiple files in a folder

2008-11-18 Thread Richard . Cotton
 Here I have a folder with more than 50 tab-delimited files. Each 
 file has a few hundreds of  thousands rows/subjects, and the number 
 of columns/variables of each file varies.The 1st row consists of all
 the variable names.
 
 Now I would like to merge all the files into one tab-delimited file 
 by a common column named Ident
 Is there any good way to sequencially merge all of them together?
 Here when I say  sequencially I mean merging file_1  and file_2 
 first and then merge the resulting data frame and file_3, and keep 
 going on and on till all files are merged.

Read each of the tab-delimited files into R using read.delim

dfr1 - read.delim(file1.txt)
dfr2 - read.delim(file2.txt)
dfr3 - read.delim(file3.txt)

(If the files are nicely named then you could do this in a loop.)

It's not entirely clear what you want to do next, since you haven't 
provided an example of your data.  Either

1. Use rbind to concatenate the data frames if each frame has the same 
columns

masterdfr - rbind(dfr1, dfr2)
masterdfr - rbind(masterdfr, dfr3)

2. Use merge to merge the data frames if they have different columns

masterdfr - merge(dfr1, dfr2)#you'll neeed to specify some other 
arguments

If you want a clearer answer, you'll have to read the posting guide and 
provide more details about your data.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Accessing Results from cenmle function in NADA package

2008-11-12 Thread Richard . Cotton
 I figured it out. In case anyone else ever has this question -- given 
the
 following output from cenmle:
 
 fit.cen - cenmle(obs, censored, groups)
  fit.cen
Value Std. Errorz p
 (Intercept)  1.19473 0.0772  15.4695  5.58e-54
 groups1  0.00208 0.0789   0.0263  9.79e-01
 Log(scale)  -0.40221 0.0168 -23.9034 2.82e-126
 
 Scale= 0.669 
 
 Log Normal distribution
 Loglik(model)= -3908.9   Loglik(intercept only)= -3908.9 
 Loglik-r:  0.0006265534 
 
 Chisq= 0 on 1 degrees of freedom, p= 0.98 
 Number of Newton-Raphson Iterations: 1 
 n = 1766 
 
 The p-value is (for example):
 
 p.value - summary(fit.cen)[[9]][[11]]

Or slightly more transparently
summary(fit.cen)$table[,'p']

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Barplot Labels Problem

2008-11-05 Thread Richard . Cotton
 I’m using barplot function. I pretend to create a horizontal barplot 
with
 two different information (side by side) for a species list. Well I can
 generate the graph easily, but the problem is that the labels with the
 species names are cut by device window!! I’ve tried lots of par 
functions
 none seems to work properly.
 
 Below are the script.
 
 
 
 Height is a matrix of species and two column of numerical data. I used
 par(las=) function to make the labels horizontal. When I do this they 
are
 cut by the device window. If I don’t do that the label stay vertical and
 make nonsense with the graph.
 
 
 
 library(xlsReadWrite)
 
 Spp-read.xls('SppPrincipaisFVeFT.xls',sheet=1,rowNames=T)
 
 Spp-as.matrix(Spp)
 
 
barplot(t(Spp),axis.lty='solid',horiz=T,beside=T,las=1,col=c('lightgray','bl
 ack'),main='Main Speceis')
 
 legend('topright',c('Living Fauna','Dead
 Fauna'),fill=c('lightgray','black'),bty='n')

I can't recreate your code, since you haven't made the file 
SppPrincipaisFVeFT.xls available.  I guess it's something like this:

x = c(monkey=25, giraffe=13, rhinocerous=46, squirrel=5, sting ray=36, 
cheetah=2, penguin=69, chicken=43)
par(las=1)
barplot(x)

On my machine, not all the animal names are shown when the device window 
opens.  You have several options to solve this.

1. Make the device bigger.  Either drag the corner of the window or, if 
you are writing directly to a file, try something like:
png(animal barplot.png, width=1200, height=800)
par(las=1)
barplot(x)
dev.off()

2. Make the axis text smaller, e.g.
barplot(x, cex.names=0.5)

3. Abbreviate the names, e.g.
barplot(x, names.arg=abbreviate(names(x)))

4. You can also get a little extra room by decreasing the plot margins and 
the amount of space between bars, e.g.
par(las=1, mar=c(2.5,3,0.5, 0.3))
barplot(x, space=0.05)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential information intended
for the addressee(s) only. If this message was sent to you in error,
you must not disseminate, copy or take any action in reliance on it and
we request that you notify the sender immediately by return email.

Opinions expressed in this message and any attachments are not
necessarily those held by the Health and Safety Laboratory or any person
connected with the organisation, save those by whom the opinions were
expressed.

Please note that any messages sent or received by the Health and Safety
Laboratory email system may be monitored and stored in an information
retrieval system.



Scanned by MailMarshal - Marshal's comprehensive email content security
solution. Download a free evaluation of MailMarshal at www.marshal.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] counting run lengths

2008-10-27 Thread Richard . Cotton
  It works, but the for (i in ...) loop slows down the simulation a 
lot.
 
  Any suggestion on how to avoid this loop? (or in general, to speed up 

  this part of the simulation)

 Actually, I have not  specified the following:  i want to consider only
 the most recent sequence of zeros, that is the last part of the time
 series.
 
 That is, If  I have:
 
 [,1] [,2] [,3] [,4]
 [1,]0101
 [2,]1111
 [3,]1101
 [4,]1101
 [5,]1101
 [6,]0101
 [7,]0101
 [8,]0101
 [9,]0101
 
 I want to store the values (4 0 7 0) in unSpells. That is the values
 that represent the last sequence of zeros (that is I want to ignore
 other possible zeros like the ones I have inserted above).

This runs pretty quickly:
unSpells - nrow(Atr) - apply(Atr,2,function(x) max(which(x==1))) 
#c(4,0,7,0)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Arrays of Trellis plots

2008-10-27 Thread Richard . Cotton
 the example below does not work. (i know it's not supposed, but it makes 
it
 clear what i'm trying to achieve)
 
 par(mfrow=c(2,1))
 xyplot(y~x2|x1,data=dataframe1,pch=20)
 xyplot(y~x2|x1,data=dataframe2,pch=20)
 
 i know i could probably merge the two datasets and do something like
 xyplot(y~x2|x1+dataset,data=merged)

par is a base graphics command, and doesn't work with grid/lattice 
graphics.  While it is possible to merge grid and base graphics using for 
example the gridBase package, I suspect what you want is to draw two 
lattice plots on the same figure. For this, you need to read up on 
viewports, and try an example like this:

pushViewport(viewport(layout=grid.layout(2,1)))
pushViewport(viewport(layout.pos.row=1))
topplot = xyplot(Sepal.Length ~ Petal.Length | Species, data = iris)
print(topplot, newpage=FALSE)
upViewport()
pushViewport(viewport(layout.pos.row=2))
bottomplot = xyplot(Sepal.Width ~ Petal.Width | Species, data = iris)
print(bottomplot, newpage=FALSE)
popViewport(2)

See also section 5.5 in Paul Murrell's book ('R Graphics').

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] convert matrix to dataframe with repeating row names

2008-10-20 Thread Richard . Cotton
  The row names on a data frame should be unique.  You can try
  as.data.frame(xx, row.names=FALSE) to convert zz to be a data frame. 
If
  you need the row name information, add it as a column in the data 
frame,
  e.g. mydataframe$rnames - rownames(zz).  (Note to R-Core: the
  documentation for as.data.frame doesn't mention the usage of
  row.names=FALSE to ignore row names, but it seems to work 
consistently.
  Does the help page for as.data.frame need updating?)
 
 No.  row.names=FALSE is not intended to work, and did you check every 
 single as.data.frame() method?
 
 It just so happens that for the matrix method invalid input for 
 'row.names' results in setting default row names.  Other methods may 
 differ.

row.names=FALSE seems a natural way of supressing existing row names to 
me, since it corresponds nicely to using row.names=FALSE in write.csv. 
Currently it seems that if a matrix has duplicate row names, then 
converting it to be a data frame requires

rnames - rownames(mymatrix)
rownames(mymatrix) - NULL
as.data.frame(mymatrix)
rownames(mymatrix) - rnames 

Ideally, three of these lines of code shouldn't really need to be there.

If you disagree that allowing row.names=FALSE is a good idea, or you don't 
want to change the function interface, then perhaps having as.data.frame 
check for duplicates and throwing a warning (rather than an error) would 
be preferable behaviour.  I do realise that there are dozens of 
as.data.frame methods, and the documentation does state that Few of the 
methods check for duplicated row names, but it would be beneficial from a 
user standpoint.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] convert matrix to dataframe with repeating row names

2008-10-20 Thread Richard . Cotton
 I have a matrix x with repeating row names. 

 zz-matrix(0,4,4)
 
 rownames(zz)=c(a,a,b,b)
 
 data.frame(zz) (?)

The row names on a data frame should be unique.  You can try 
as.data.frame(xx, row.names=FALSE) to convert zz to be a data frame.  If 
you need the row name information, add it as a column in the data frame, 
e.g. mydataframe$rnames - rownames(zz).  (Note to R-Core: the 
documentation for as.data.frame doesn't mention the usage of 
row.names=FALSE to ignore row names, but it seems to work consistently. 
Does the help page for as.data.frame need updating?)

 lm(as.formula(paste(final_dat[,5]~,paste(colnames(x),collapse=+))),x
 )
 
 this gives me a error

 Error in model.frame.default(formula =
 as.formula(paste(final_dat[,5]~,  : 
 
   'data' must be a data.frame, not a matrix or an array

I suspect that if you try class(x), it will be a matrix, not the requisite 
data frame.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using key.opts in Ecdf/labcurve (Hmisc package)

2008-10-17 Thread Richard . Cotton
I'm presumably missing something very obvious, but how does one use the 
key.opts argument in labcurve (via Ecdf)?

In this example, I want the key to be big and have a blue background, but 
it isn't and doesn't.

ch - rnorm(1000, 200, 40)
sex - factor(sample(c('female','male'), 1000, TRUE))
Ecdf(~ch, group=sex, label.curves=list(keys=c(f, m), 
key.opts=list(cex=3, background=blue)))

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to store lme/lmer fit result

2008-10-10 Thread Richard . Cotton
 I am building a hierarchical model on a large data set. It can take 
quite
 some time to finish one fit, I was just wondering whether it is possible 
to
 store the fit object (the result) to a file for later (offline) 
analysis. 
See ?save.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Space between bars in barplot

2008-10-10 Thread Richard . Cotton
 I played around with your suggestions to change the appearance of my 
graph,
 but there is still a problem I could not fix.
 The vector, which I want to plot, contains 60 entries. After 3 bars I 
want
 to have a large gap
 between the next 3 bars. But what always happens is, that R took at the
 beginning just 2 bars
 and then always 3 bars ending up with one bar at the end.
 So I have created a new vector v - c(0,0,vectorToPrint) plotting this
 vector with the following
 command: barplot(v, space=rep(c(0,0,2)) ). It works but getting always 
some
 warning messages.
 Instead of 0,0 I tried as well NA, NA but same result. So is there a way 
to
 achieve that without
 warning messages?

Seeing as it's late on a friday afternoon, could you give us a fighting 
chance of helping you by 
1) providing enough code to reproduce the problem, and
2) telling us what the warning said.
(As per the posting guide.)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Space between bars in barplot

2008-10-09 Thread Richard . Cotton
 with the space parameter it is possible to change the gap / distance
 between the bars, but is it also
 possible to change the space after each 6th bar?
 So for example you have bars from 1 to 6 then a large gap and then the 
next
 six bars from 7 to 12

Try, for example

y - runif(13)
y[7] - NA
barplot(y)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] histogram loses top row with alpha transparency under Windows

2008-10-09 Thread Richard Cotton


k.ponting wrote:
 
 Hello all. 
 
 Trying to use transparency for overlaid histogram plots I have come
 across an interesting inconsistency, possibly a bug when running under
 Windows. Originally noticed in R 2.7.1, it is still there in 2.8.0 beta.
 
 library(lattice)
 zz - function(n,alpha)
 {
   ranges - NULL
   for(ds in 1:n){
 ranges -
 rbind(ranges,data.frame(confidence=c(0,100),dataset=as.character(ds),cor
 rect=c(FALSE,TRUE)))
   }
   panel.colhist = function(x, group.number, col, ...) {
 panel.histogram(x, col=group.number+1, ...)
   }
   x - histogram(~confidence|dataset,data=ranges,alpha=alpha,
  
 panel=panel.superpose,panel.groups=panel.colhist,groups=correct)
   print(x)
 }
 zz(12,1) # works as expected, 12 identical plots
 zz(12,0.5) # top row of plots has no bars at all, lower rows are as
 expected
 zz(1,1) # two bars fine
 zz(1,0.5) # no bars at all
 

The rectangles being drawn extend higher than the top of the panel.  (Your y
axis ranges from 0 to 50, but the bars go up to 100.)

In the top row of plots, depending upon the shape of your device window, the
bars can extend beyond the range of the device window.  For some reason,
(take a look in panel.rect), when you specify alpha less than 1, this
prevents the bar being drawn.  

You need to add type=count to the call to histogram, or rescale the bar
heights.


-
Regards,
Richie.

Mathematical Sciences Unit
HSL
-- 
View this message in context: 
http://www.nabble.com/histogram-loses-top-row-with-alpha-transparency-under-Windows-tp19879687p19897501.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Exporting symnum() result from cor()

2008-10-09 Thread Richard Cotton



Michael Just wrote:
 
 Hello,
 I am trying to export the results from symnum() while maintain their
 readability. I tried using sink to text file and also copying and pasting
 but the results end up looking like this:
 
 symnum(c5.s)
   bC bED bEN bLP bLS bPA bPD bPR p
 bbContag  1
 bbED  +  1
 bbENN_MN  +  B   1
 bbLPI ,  ,   ,   1
 bbLSI +  B   B   ,   1
 bbPAFRAC  ,  *   *   ,   *   1
 bbPD  ,  B   B   ,   B   B   1
 bbPROX_MN +  B   B   ,   B   *   *   1
 pfor 1
 attr(,legend)
 
 
 Is there a way I can get these out of R with their readability intact so I
 can share them with non-R users? Is it possible to export the way things
 look in View() ?
 

write.csv works, e.g.
write.csv(symnum(c5.s), test.csv)

-
Regards,
Richie.

Mathematical Sciences Unit
HSL
-- 
View this message in context: 
http://www.nabble.com/Exporting-symnum%28%29-result-from-cor%28%29-tp19895075p19897840.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] vectorization instead of using loop

2008-10-09 Thread Richard . Cotton
 I've sent this question 2 days ago and got response from Sarah. Thanks 
for
 that. But unfortunately, it did not really solve our problem. The main 
issue
 is that we want to use our own (manipulated) covariance matrix in the
 calculation of the mahalanobis distance. Does anyone know how to 
vectorize
 the below code instead of using a loop (which slows it down)?
 I'd really appreciate any help on this, thank you all in advance!
 Cheers,
 Frank
 
 This is what I posted 2 days ago:
 We have a data frame x with n people as rows and k variables as columns.
 Now, for each person (i.e., each row) we want to calculate a distance
 between  him/her and EACH other person in x. In other words, we want to
 create a n x n matrix with distances (with zeros in the diagonal).
 However, we do not want to calculate Euclidian distances. We want to
 calculate Mahalanobis distances, which take into account the covariance
 among variables.
 Below is the piece of code we wrote (covmat in the function below is 
the
 variance-covariance matrix among variables in Data that has to be fed 
into
 mahalonobis function we are using).
  mahadist = function(x, covmat) {
  dismat = matrix(0,ncol=nrow(x),nrow=nrow(x))
  for (i in 1:nrow(x)) {
dismat[i,] = mahalanobis(as.matrix(x), as.matrix(x[i,]), 
covmat)^.5
  }
  return(dismat)
 }
 
 This piece of code works, but it is very slow. We were wondering if it's 
at
 all possible to somehow vectorize this function. Any help would be 
greatly
 appreciated.

You can save a substantial time by calling as.matrix before the loop, e.g.

x - data.frame(runif(1000), runif(1000), runif(1000))
covmat - cov(x)

mahadist = function(x, covmat) #yours
{
   dismat = matrix(0,ncol=nrow(x),nrow=nrow(x))
   for (i in 1:nrow(x)) 
   {
 dismat[i,] = mahalanobis(as.matrix(x), as.matrix(x[i,]), 
covmat)^.5
   }
   return(dismat)
}

mahadist2 - function(x, covmat) #my modification
{
   n - nrow(x)
   dismat - matrix(0,ncol=n,nrow=n)
   matx - as.matrix(x)
   for (i in 1:n) 
   {
  dismat[i,] - mahalanobis(matx, matx[i,], covmat)^.5
   }
   dismat
}
system.time(mahadist(x, covmat))
#   user  system elapsed 
#   2.820.062.95 
system.time(mahadist2(x, covmat))
#   user  system elapsed 
#   1.390.041.45

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] vectorization instead of using loop

2008-10-09 Thread Richard . Cotton
Frank said:
  This piece of code works, but it is very slow. We were wondering if 
it's 
 at
  all possible to somehow vectorize this function. Any help would be 
 greatly
  appreciated.

Richie said:
 You can save a substantial time by calling as.matrix before the loop

Patrick said:
 One thing that would speed it up is if you
 inverted 'covmat' once and then used
 'inverted=TRUE' in the call to 'mahalanobis'.

The timings before:
 system.time(mahadist(x, covmat))
 #   user  system elapsed 
 #   2.820.062.95 
 system.time(mahadist2(x, covmat))
 #   user  system elapsed 
 #   1.390.041.45

With Patrick's modification, and moving the square root out of the loop:
mahadist3 - function(x, covmat) #patrick's modification
{
   n - nrow(x)
   dismat - matrix(0,ncol=n,nrow=n)
   matx - as.matrix(x)
   icovmat - chol2inv(chol(covmat))
   for (i in 1:n) 
   {
  dismat[i,] - mahalanobis(matx, matx[i,], icovmat, inverted=TRUE)
   }
   dismat^.5
}
system.time(mahadist3(x, covmat))
#   user  system elapsed 
#   0.800.000.85

Not bad - a better than threefold speed up, without worrying about 
vectorization.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] question from Braun/Murdoch book

2008-10-08 Thread Richard . Cotton
 I am looking at the Braun/Murdoch book,  A First Course in
 Statistical Programming in R, and I have a question about a function
 there.  It's on page 52, Example 4.5; the sieve of Erastosthenes.
 
 There is a line:
 primes - c()
 
 Is there a difference between using that and
 primes - NULL
 please?
 
 When you put in primes - c(), primes comes back as NULL.

They return the same thing
identical(c(), NULL)#TRUE

 Is one more efficient or is this just a matter of programming style, 
please?
system.time(for(i in 1:100) c())
#   user  system elapsed 
#   0.630.020.64 

system.time(for(i in 1:100) NULL)
#   user  system elapsed 
#   0.280.000.28

Using NULL appears to be quicker on my machine, but given that you can do 
a million of these assignments in a fraction of a second, it makes no 
practical difference.  NULL is perhaps more intuitive than c() for 
demonstrating that the variable is empty.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting weibull, exponential and lognormal distributions to left-truncated data.

2008-10-07 Thread Richard . Cotton
  I have several datasets, all left-truncated at x=1, that I am 
attempting
  to fit distributions to (lognormal, weibull and exponential).  I had
  been using fitdistr in the MASS package as follows:

 A possible solution is to use the survreg() in the survival package 
 without specifying the covariates, i.e.
 
 library(survival)
 survreg(Surv(..)~1, dist=weibull)
 
 where Surv(..) accepts information about times, censoring/truncation 
 variables and dist allows to specify alternative distributions.
 See ?Surv e ?survreg

The survival package is mostly targeted at right-censored data.  The NADA 
package provides wrappers for many of the survival routines so they work 
with left-censored data.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] horizontal boxplot + xlim

2008-10-06 Thread Richard . Cotton
 I get a strange behaviour of a boxplot with the following code. 
 There seems to 
 be a problem with the xlim-parameter. Did I do anything wrong? What 
 else can I 
 do to force the boxplot to have a defined x-range?
 
 x - rnorm(100)
 boxplot(x, notch=TRUE, xlab=parameter, xlim - c(-4,4), horizontal = 
TRUE)

Did you mean xlim = c(-4,4) in that statement, rather than '-'?

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] horizontal boxplot + xlim

2008-10-06 Thread Richard . Cotton
 True, I made a mistake here. Still, I have problems to visualize my data 
(not 
 the example code I used). I just see a flat line instead a proper 
plot...
 
 Another example code with creating a strange plot:
 
 x - rnorm(100) + 100
 maxval - max(x)
 boxplot(x, notch=TRUE, xlim = c(0,maxval), horizontal = TRUE)

When you use horizontal=TRUE, the meanings of the xlim and ylim arguments 
to boxplot are swapped.  Thus xlim affects the range of the vertical axis. 
 Take a look at 

boxplot(x, notch=TRUE, ylim = c(0,maxval))
and
boxplot(x, notch=TRUE, ylim = c(0,maxval), horizontal = TRUE)

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   3   >