[R] Tracing the top border of a histogram

2013-08-01 Thread Levi Waldron
I want to represent a histogram by the line along its top border, *without*
kernel smoothing (to show several histograms in the same plot). This works,
but is there simpler recommended way?

x - rnorm(1000)
tmp - hist(x, border=white)
for (i in 1:(length(tmp$breaks)-1)){
segments(x0=tmp$breaks[i], x1=tmp$breaks[i+1], y0=tmp$counts[i])
segments(x0=tmp$breaks[i+1], y0=tmp$counts[i], y1=tmp$counts[i+1])
}

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Tracing the top border of a histogram

2013-08-01 Thread Levi Waldron
Very nice - thank you, I didn't know about type='s'.


On Thu, Aug 1, 2013 at 3:43 PM, Greg Snow 538...@gmail.com wrote:

 lines( tmp$breaks, c(tmp$counts,tail(tmp$counts,1)), type='s',
 col='#00ff0077', lwd=5 )



 On Thu, Aug 1, 2013 at 1:30 PM, Levi Waldron 
 lwaldron.resea...@gmail.comwrote:

 I want to represent a histogram by the line along its top border,
 *without*
 kernel smoothing (to show several histograms in the same plot). This
 works,
 but is there simpler recommended way?

 x - rnorm(1000)
 tmp - hist(x, border=white)
 for (i in 1:(length(tmp$breaks)-1)){
 segments(x0=tmp$breaks[i], x1=tmp$breaks[i+1], y0=tmp$counts[i])
 segments(x0=tmp$breaks[i+1], y0=tmp$counts[i], y1=tmp$counts[i+1])
 }

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regularized Regressions

2013-04-17 Thread Levi Waldron
Perhaps I am wrong, but I think there are only a few packages supporting
Elastic Net, and none of them also perform Best Subsets.


On Wed, Apr 17, 2013 at 8:46 AM, Christos Giannoulis cgiann...@gmail.comwrote:

 Merhaba, Hello to you too Mehmet (Yasu ki sena)

 Thank you for your email and especially for sharing this package. I
 appreciate it.

 However, my feeling is that this package does not have the third component
 of Best Subsets (pls correct me if I am wrong). It uses only a combination
 of Ridge and Lasso.

 If you happen to know any other packages that uses all of them I would
 greatly welcome and appreciate if you were so kind and share it. I tried to
 search the cran lists but I am not sure I can found something like that.
 That's why I was asking the R-community

 Thank you again for your prompt response!

 Cheers

 Christos

 On Wed, Apr 17, 2013 at 8:16 AM, Suzen, Mehmet msu...@gmail.com wrote:

  Yasu,
 
  Try Elastic nets:
  http://cran.r-project.org/web/packages/pensim/index.html
 
  There some other packages supporting elastic nets: Just search the CRAN
 
  Cheers,
  Mehmet
 
 
  On 17 April 2013 13:19, Christos Giannoulis cgiann...@gmail.com wrote:
   Hi all,
  
   I would greatly appreciate if someone was so kind and share with us a
   package or method that uses a regularized regression approach that
  balances
   a regression model performance and model complexity.
  
   That said I would be most grateful is there is an R-package that
 combines
   Ridge (sum of squares coefficients), Lasso: Sum of absolute
 coefficients
   and Best Subsets: Number of coefficients as methods of regularized
   regression.
  
   Sincerely,
  
   Christos Giannoulis
  
   [[alternative HTML version deleted]]
  
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.
 

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] text command - how to get a white background to cover grid lines

2012-09-06 Thread Levi Waldron
For posterity, I found the TeachingDemos::shadowtext option most
agreeable for this problem:

* legend puts a large box around the text which did not seem
possible to shrink, and does not accept vector x, y arguments
* plotrix::boxed.labels did not work with pos=4 (this moved the
text, but not the box)
* TeachingDemos::shadowtext puts an opaque shadow around the
text, rather than a box, which obscures a minimum of background while
still making the labels readable. col=black, bg=white options
produced this effect (default options are opposite this).

On Mon, Feb 6, 2012 at 10:56 PM, 538...@gmail.com wrote:

 An alternative is the shadowtext function in the TeachingDemos package.

 On Mon, Feb 6, 2012 at 1:19 AM, Jim Lemon j...@bitwrit.com.au wrote:
  On 02/06/2012 08:23 AM, Henry wrote:
 
  New to R - rookie question.
  I'm a mechanical engineer and enjoying using R to make high quality
  graphs.
 
  I've searched.
 
  I want to put text notation on graph plot areas and have the text
  background
  box white to cover over the grid lines.
 
  my command so far
  text(15,5200,Air Flow,cex=.8,col=blue, background=white) # this
  doesn't work...
 
  I've tried bg=white, background color=white and a number of other
  attempts.
 
  The text is getting placed on the chart where I want it.
 
  Hi Henry,
  have a look at the boxed.labels function in the plotrix package.
 
  Jim
 
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] highlighting clusters in a heatmap

2011-06-30 Thread Levi Waldron
I would like to draw horizontal or vertical lines on a heatmap to
highlight the clusters at some specified cut depth of the dendrogram.
As a hacked example, the following code would work if I could set the
coordinates of the top and bottom of the false color image correctly
(ymin and ymax), but the correct values seem to depend on the output
device and its size.  I realize that heatmaps use a 2x2 layout which
makes the coordinate system non-obvious, but the result seems very
difficult to customize.  I would appreciate any suggestions for manual
or pre-made solutions.

Example:

set.seed(2)
x - matrix(rnorm(1000),ncol=10)  #obviously no real clusters here...

row.hclust - hclust(dist(x))
row.dendro - as.dendrogram(row.hclust)

heatmap(x,
Rowv=row.dendro)
row.cut - cutree(row.hclust,3)[row.hclust$order]
cutpoints - which(row.cut[-1]!=row.cut[-length(row.cut)])
ymin - par(usr)[3]#in general incorrect
ymax - par(usr)[4]   #in general incorrect
for (i in cutpoints){
  thisy - ymin + (ymax-ymin)*(i-1)/nrow(x)
  abline(h=thisy,lw=3)
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] difference between the meaning of MARGIN in sweep() and apply()

2009-12-16 Thread Levi Waldron
For example, subtracting 1:2 from the rows of a two-column matrix:

 t(apply(matrix(1:6,ncol=2),MARGIN=1,function(y) y - 1:2))
 [,1] [,2]
[1,]02
[2,]13
[3,]24
 sweep(matrix(1:6,ncol=2),MARGIN=2,1:2,FUN=-)
 [,1] [,2]
[1,]02
[2,]13
[3,]24

Is there a logic to this difference, or is it just a quirk of the history of
these functions?

I found one discussion on this topic, but without what I thought was a very
conclusive end:

http://finzi.psych.upenn.edu/Rhelp08/2009-March/191930.html


-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7
(416)581-7453

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple question about error on CSV import

2009-09-01 Thread Levi Waldron
 By default for read.table,

comment.char = #

so the first line was being treated as a comment line, and when you
specified row.names=#, read.table couldn't find that column.

On Tue, Sep 1, 2009 at 12:07 PM, esawdust lan...@360vl.com wrote:



 esawdust wrote:
 
  Here's the contents of a simple test2.csv CSV file:
 
  #,Status,Project
  5842,New,Test
 
  snortalerts = read.table( /Users/lcox/Documents/test2.csv,
 header=TRUE,
  sep=,, row.names=#)
  Error in data[[rowvar]] : attempt to select less than one element
 
  Landon
 

 Figured out the answer, though it wasn't obvious (to me anyway).  The
 symbol
 # used as the first column label was the problem.   I changed that to be
 id and changed the read.table to be:

 snortalerts - read.table( /Users/lcox/Documents/test2.csv, header=TRUE,
 sep=,, row.names=id)

 and it worked fine.
 --
 View this message in context:
 http://www.nabble.com/Simple-question-about-error-on-CSV-import-tp25242899p25243159.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7
(416)581-7453

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two way joining vs heatmap

2009-08-31 Thread Levi Waldron
Hi Schalk,
the heatmap function does not implement two-way joining as far as I know.
It clusters rows and columns independently.  However if you find or program
a method that implements two-way joining, you could use the row and column
ordering it returns in your heatmap using the Rowv and Colv arguments to
heatmap.
-levi

On Mon, Aug 31, 2009 at 2:25 AM, Schalk Heunis
schalk.heu...@enerweb.co.zawrote:

 Hi

 STATISTICA has a function called Two-way joining (see
 http://www.statsoft.com/TEXTBOOK/stcluan.html#twotwo) and the
 reference material states that this is based on the method as
 published by Hartigan (found this paper:
 http://www.jstor.org/pss/2284710 through wikipedia).

 What is the relationship (if any) between the heatmap function in R
 and this technique? Is there an alternative function to use?

 Thanks for the help!
 Schalk Heunis

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7
(416)581-7453

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] setting par(srt) according to plot aspect ratio

2009-08-28 Thread Levi Waldron
On Fri, Aug 28, 2009 at 1:48 AM, Prof Brian Ripley rip...@stats.ox.ac.ukwrote:

 Note that the aspect ratio changes when you resize the plot but the angle
 of the plotted text will not.  So the only safe route is to set 'asp' and
 use that setting to select the angle.


That is true with screen output, although I can run the code block again
after re-sizing to correct the text angle along with the aspect ratio.  And
for my purpose which is writing to file, it isn't an issue, for example the
following plots which need different aspect ratios all work correctly:

png(example%d.png)
makeplot()  #as per example code
par(mfrow=c(2,1))
makeplot()
makeplot()
par(mfrow=c(1,2))
makeplot()
makeplot()
dev.off()


-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7
(416)581-7453

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] setting par(srt) according to plot aspect ratio

2009-08-27 Thread Levi Waldron
How can I look up the aspect ratio of a plot, so I can use that to correctly
adjust the angle of text which is supposed to be parallel to a line in the
plot?

The following example code works for a 1:1 aspect ratio, but puts the text
at the wrong angle if the plot region is short and wide or tall and narrow.
 I can't find a par() component containing the plot aspect ratio.  It will
be for png() or postscript() output, if that matters.

f - function(x) x
g - function(x) 2*x
(f_angle - atan(1)*180/pi)
(g_angle - atan(2)*180/pi)
xpos - 0.2
plot(f)
plot(g,add=TRUE)
par(srt=f_angle)
text(xpos,f(xpos),label=y=x,pos=3)
par(srt=g_angle)
text(xpos,g(xpos),label=y=2x,pos=3)


-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7
(416)581-7453

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] setting par(srt) according to plot aspect ratio

2009-08-27 Thread Levi Waldron
I frequently use R's help facilities and I know about the asp argument to
plot, but this doesn't answer my question.  I would like to allow the aspect
to be determined automatically but *query* the aspect ratio for future use.

I suppose one work-around would be to use the data ranges and plot region
dimensions to estimate an appropriate value for asp, but this more
complicated than I was hoping for.

Thanks,
Levi

On Thu, Aug 27, 2009 at 6:19 PM, Bert Gunter gunter.ber...@gene.com wrote:

 Use R's help facilities, please.

 help.search(aspect ratio)

 gets you to ?plot.window

 which then gets you to ?plot (actually plot.default() )

 Bert Gunter
 Genentech Nonclinical Biostatisics

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On
 Behalf Of Levi Waldron
 Sent: Thursday, August 27, 2009 1:54 PM
 To: r-help@r-project.org
 Subject: [R] setting par(srt) according to plot aspect ratio

 How can I look up the aspect ratio of a plot, so I can use that to
 correctly
 adjust the angle of text which is supposed to be parallel to a line in the
 plot?

 The following example code works for a 1:1 aspect ratio, but puts the text
 at the wrong angle if the plot region is short and wide or tall and narrow.
  I can't find a par() component containing the plot aspect ratio.  It will
 be for png() or postscript() output, if that matters.

 f - function(x) x
 g - function(x) 2*x
 (f_angle - atan(1)*180/pi)
 (g_angle - atan(2)*180/pi)
 xpos - 0.2
 plot(f)
 plot(g,add=TRUE)
 par(srt=f_angle)
 text(xpos,f(xpos),label=y=x,pos=3)
 par(srt=g_angle)
 text(xpos,g(xpos),label=y=2x,pos=3)


 --
 Levi Waldron
 post-doctoral fellow
 Jurisica Lab, Ontario Cancer Institute
 Division of Signaling Biology
 TMDT 9-304D
 101 College Street
 Toronto, Ontario M5G 1L7
 (416)581-7453

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7
(416)581-7453

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] setting par(srt) according to plot aspect ratio

2009-08-27 Thread Levi Waldron
For posterity's sake, here is the solution I figured out.  Putting the
following lines after the plot(f) command seems to set the angle correctly:

myasp -
(par(fin)[2]-par(mai)[1]-par(mai)[3])/(par(fin)[1]-par(mai)[2]-par(mai)[4])
(f_angle - atan(myasp)*180/pi)
(g_angle - atan(2*myasp)*180/pi)


-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7
(416)581-7453

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multiple annotations on a heatmap

2009-04-02 Thread Levi Waldron
Can someone recommend a more sophisticated way to annotate heatmaps than the
ColSideColors argument of heatmap and heatmap.2?  In particular, I would
like to be able to annotate columns with more than one piece of information,
like in Figure 1 of the article at
http://www3.interscience.wiley.com/cgi-bin/fulltext/117905619/HTMLSTART /
doi:10.1002/hep.22256.

Some example data and a heatmap:

set.seed(1)
y - matrix(rnorm(100),nrow=20,ncol=5)
colnames(x) - LETTERS[1:5]
rownames(x) - paste(r,1:20,sep=)
set.seed(1)
annotation - matrix(sample(c(+,-),15,replace=TRUE),ncol=5)
colnames(annotation) - colnames(x)
rownames(annotation) - paste(annotation,1:3)

heatmap(x,Rowv=NA,
ColSideColors=sapply(annotation[1,],function(x)
switch(x,+=red,-=blue)))

This heatmap annotates the columns by the first of the three annotations
with a colored bar along the top of the heatmap, but ideally I would like to
put all three annotations on the heatmap by putting three rows of +/- values
between the top of the heatmap and the dendrogram, or even three colored
bars.  Specific or general suggestions would be welcome.

Thank you,
Levi


-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
IBM Life Sciences Discovery Centre
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] memory space

2009-03-18 Thread Levi Waldron
Perhaps:

gc()


On Wed, Mar 18, 2009 at 8:26 AM, Abelian abelian1...@gmail.com wrote:

 Dear all
 when the program is runing, we can realize that the memory size will
 be asked more and more..
 Therefore, we could meet the significant problem, such as  out off
 memory size.
 However, even if i rm() some variables that i will not use it anymore,
 the memory size still not enough.
 By the way, i found that if i turn off the R-gui, the program can work
 again.
 For example, i generate a 800*45000 matrix four times. however, when i
 generate the fifth, the message tell me that the memory size is not
 enough. So, i guess that maybe there are some way that can release the
 memory  such that the memory size is ths same with the situation of
 starting. Does anyone be able to provide the function to clean the
 memory ? Thanks a lot.
 Sincerely

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difficulty Replacing a Row of a Data Frame

2009-03-17 Thread Levi Waldron
Your first example that works would just repeat the atomic vector to the
length of the row.  To replace a row with another data frame, you could use
rbind:

avGain - rbind(avGain[-j,],b)

Or if the positioning is important,
2) avGain - rbind(avGain[1:(j-1),],b,avGain[(j+1):nrow(avGain),])

I didn't pay close attention to how you formed b, but this of course will
only work if it has the same number of columns as avGain.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] breaking ties in order() based on many vectors

2009-03-17 Thread Levi Waldron
The order() function allows you to specify multiple vectors, which are used
successively to break ties.  If I want to use many vectors to break ties
(say, 25 or more), that are columns of a matrix or elements of a list, does
anyone know a shortcut to do this without passing 25 arguments to order()?

-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
IBM Life Sciences Discovery Centre
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] breaking ties in order() based on many vectors

2009-03-17 Thread Levi Waldron
On Tue, Mar 17, 2009 at 12:56 PM, Patrick Burns pbu...@pburns.seanet.comwrote:

 Use 'do.call' as discussed in 'The R Inferno'
 and elsewhere.


Perfect, thanks.


-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
IBM Life Sciences Discovery Centre
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7
(416)581-7453

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difficulty Replacing a Row of a Data Frame

2009-03-17 Thread Levi Waldron
On Tue, Mar 17, 2009 at 12:54 PM, Bill Cunliffe b...@elevationllc.netwrote:

  Thanks for your help.  Part of the problem was that I thought I was
 dealing with a data frame when it was actually an array.  I forced all to be
 matrices for consistency.  However, I see that



 avGain - rbind(avGain[1:(j-1),],b,avGain[(j+1):nrow(avGain),])



 would be a great solution as positioning is important.



This rbind solution will work, and if b has more than one row is possibly
the best way to do it.  But it will be very slow if you are doing it
iteratively, compared to the direct replacement it sounds like you've
figured out.
cheers,
Levi

-- 
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
IBM Life Sciences Discovery Centre
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] getting all pairwise combinations of elements in a character string

2009-02-12 Thread Levi Waldron
I'm able to do this as follows, but am wondering if anyone knows a 
simpler way which still avoids explicit loops?


 (mystring - letters[1:5])
[1] a b c d e
 unlist(sapply(mystring[-length(mystring)],
+   function(x) 
paste(x,mystring[(grep(x,mystring)+1):length(mystring)],sep=)))

 a1   a2   a3   a4   b1   b2   b3   c1   c2d
ab ac ad ae bc bd be cd ce de



--
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
IBM Life Sciences Discovery Centre
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] getting all pairwise combinations of elements in a character string

2009-02-12 Thread Levi Waldron

I like both of these solutions much better - thank you!
-Levi

--
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
IBM Life Sciences Discovery Centre
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looping multiple output values to dataframe

2009-02-12 Thread Levi Waldron

Stropharia wrote:

# START
R-CODE---
filenames - Sys.glob(/Users/Desktop/Test/*.csv)  # get names of files to
process # use * to get all

variables - data.frame(1:length(filenames)) # preallocate assuming multiple
values from each file # creates a dataframe with the same length of rows as
the number of .csv files to process

for (i in seq_along(filenames)){
input - read.csv(filenames[i], header=TRUE, na.strings=NA)  
data.frame(input)

attach(input)

result.A - x[2]*y[1]
result.B - y[2]-x[1]
result.C - x[3]+y[1]

results - c(result.A, result.B, result.C) # concatenate result vectors

variables[i] - results
} 


variables - as.data.frame(t(as.matrix(variables))) # turn result vectors
into a matrix, then transpose it and output as a data frame

# add column and row names
c.names - c(ResultA, ResultB, ResultC) # set names for result vectors
colnames(variables) - c.names
rownames(variables) - filenames

# export to csv file
write.csv(variables, file=/Users/Desktop/Test.csv) 
# END

R-CODE---
  

I think something like this should work better:

docalc - function(thisfile){
   input - read.csv(filenames[i], header=TRUE, na.strings=NA)
   attach(input)
   result.A - x[2]*y[1]
   result.B - y[2]-x[1]
   result.C - x[3]+y[1]
   results - c(result.A, result.B, result.C) # concatenate result vectors
   names(results) - c(ResultA, ResultB, ResultC) return(results)
}

variables - sapply(filenames,docalc)

--
Levi Waldron
post-doctoral fellow
Jurisica Lab, Ontario Cancer Institute
Division of Signaling Biology
IBM Life Sciences Discovery Centre
TMDT 9-304D
101 College Street
Toronto, Ontario M5G 1L7
(416)581-7453

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] cumulative sum of within levels of a dataframe

2008-06-27 Thread Levi Waldron
This one should be easy but it's giving me a hard time mostly because tapply
puts the results in a list.  I want to calculate the cumulative sum of a
variable in a dataframe, but with the accumulation only within each level of
a factor.  For a very simple example, take:

 df -
data.frame(x=c(rep(1,5),rep(2,5),rep(3,5)),fac=gl(3,5,labels=letters[1:3]))
 df
   x fac
1  1   a
2  1   a
3  1   a
4  1   a
5  1   a
6  2   b
7  2   b
8  2   b
9  2   b
10 2   b
11 3   c
12 3   c
13 3   c
14 3   c
15 3   c

I'd like to create another column in the dataframe so it looks like this,
and make sure that the cumulative sums still match the right levels of the
factor.  I've included a willdo column that's just a cumulative sum, and
an ideal column that's the cumulative sum minus the current value - the
column headings are self explanatory.

 answer
   x fac willdo ideal
1  1   a  1 0
2  1   a  2 1
3  1   a  3 2
4  1   a  4 3
5  1   a  5 4
6  2   b  2 0
7  2   b  4 2
8  2   b  6 4
9  2   b  8 6
10 2   b 10 8
11 3   c  3 0
12 3   c  6 3
13 3   c  9 6
14 3   c 12 9
15 3   c 1512

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cumulative sum of within levels of a dataframe

2008-06-27 Thread Levi Waldron
Perfect, thanks all.  Unlist and ave will both be very handy functions to
know.  -levi



-- 
Levi Waldron, PhD
Treated Wood Research Group
Faculty of Forestry, University of Toronto
33 Willcocks Street, Toronto ON, M5S 3B3 CANADA
Tel 647-866-5384
Fax 416-978-3834

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] choosing an appropriate linear model

2008-06-06 Thread Levi Waldron
Perhaps this was too big a question, so I'll ask something shorter:

I have fit a linear model, and want to use its prediction intervals to
calculate the sum of many individual predictions.

1) Some of the lower prediction intervals are negative, which is
non-sensical.  Should I just set all negative predictions to zero, or is
there another way to require non-negative predictions only?
2) I am interested in the sum of many predictions based on the lm.  How can
I calculate the 95% prediction interval for the sum?  Should I calculate a
root mean square of the individual errors, or use a bootstrap method, or
something else?

ps. the data is attached to the end of this email.

On Thu, Jun 5, 2008 at 6:25 PM, Levi Waldron [EMAIL PROTECTED] wrote:

 I am trying to model the observed leaching of wood preservative chemicals
 from treated wood during an outdoor experiment where leaching is caused by
 rainfall events.  For each rainfall event, the amount of rainfall was
 recorded as well as the amount of preservative chemical leached.  A number
 of climatic variables were measured, but the most important is the amount of
 rainfall.

 I have tried a simple linear model, with zero intercept because zero
 rainfall cannot cause any leaching (leachdata dataframe is attached to this
 email).  The diagnostics show clearly non-normally distributed residuals
 with a simple linear regression, and I am trying to figure out what to do
 about it (see attached diagnostics.png).  This dataset contains measurements
 from 57 rainfall events on three replicate samples, for a total of 171
 measurements.

 Part of the problem is that physically, the leaching values can only be
 positive, so for the smaller rainfall amounts the residuals are all
 positive.  If I allow an intercept then it is significantly positive,
 possibly since the researcher wouldn't have collected measurements for very
 small rain events, but in terms of the model it doesn't make sense
 physically to have a positive intercept, particularly since lab experiments
 have shown that a certain amount of rain exposure is required to wet the
 wood before leaching begins.

 I can get more normally distributed residuals by log-transforming the
 response, or using the optimal box-cox transformation of lambda = 0.21,
 which produces nicer-looking residuals but unsatisfactory prediction which
 is the main goal of the model (also attached).

 Any advice on how to create a better predictive model?  I presume it has
 something to do with glm, especially since I have repeated rainfalls on
 replicate samples, but any advice on the approach to take would be much
 appreciated.  The code I used to produce the attached plots is included
 below.


 leach.lm - lm(leachate~rainmm-1,data=leachdata)

 png(dianostics.png,height=1200,width=700)
 par(mfrow=c(3,2))
 plot(leachate~rainmm,data=leachdata,main=Data and fitted line)
 abline(leach.lm)
 plot(predict(leach.lm)~leachdata$leachate,main=predicted vs. observed
 leaching amount,xlim=c(0,12),ylim=c(0,12),xlab=observed
 leaching,ylab=predicted leaching)
 abline(a=0,b=1)
 plot(leach.lm)
 dev.off()

 library(MASS)
 boxcox(leach.lm,plotit=T,lambda=seq(0,0.4,by=0.01))

 boxtran - function(y,lambda,inverse=F){
   if(inverse)
 return((lambda*y+1)^(1/lambda))
   else
 return((y^lambda-1)/lambda)
 }

 png(boxcox-dianostics.png,height=1200,width=700)
 par(mfrow=c(3,2))
 logleach.lm - lm(boxtran(leachate,0.21)~rainmm-1,data=leachdata)
 plot(leachate~rainmm,data=leachdata,main=Data and fitted line)
 x - leachdata$rainmm
 y - boxtran(predict(logleach.lm),0.21,T)
 xy - cbind(x,y)[order(x),]
 lines(xy)
 plot(y~leachdata$leachate,xlim=c(0,12),ylim=c(0,12),main=predicted vs.
 observed leaching amount,xlab=observed leaching,ylab=predicted
 leaching)
 abline(a=0,b=1)
 plot(logleach.lm)
 dev.off()


`leachdata` -
structure(list(rainmm = c(19.68, 36.168, 18.632, 2.74, 0.822,
9.864, 7.124, 29.592, 4.384, 11.508, 1.37, 3.288, 9.042, 2.74,
18.906, 4.932, 0.274, 3.836, 1.918, 4.384, 16.714, 5.754, 12.604,
2.466, 13.014, 2.74, 14.796, 5.754, 4.93, 5.21, 0.548, 1.644,
3.014, 6.028, 18.358, 1.918, 3.014, 18.358, 0.274, 1.918, 54.2,
43.4, 11.2, 1.6, 3.8, 70.2, 0.2, 24.4, 25.8, 13, 7.124, 10.96,
7.672, 3.562, 3.288, 6.02, 17.54, 19.68, 36.168, 18.632, 2.74,
0.822, 9.864, 7.124, 29.592, 4.384, 11.508, 1.37, 3.288, 9.042,
2.74, 18.906, 4.932, 0.274, 3.836, 1.918, 4.384, 16.714, 5.754,
12.604, 2.466, 13.014, 2.74, 14.796, 5.754, 4.93, 5.21, 0.548,
1.644, 3.014, 6.028, 18.358, 1.918, 3.014, 18.358, 0.274, 1.918,
54.2, 43.4, 11.2, 1.6, 3.8, 70.2, 0.2, 24.4, 25.8, 13, 7.124,
10.96, 7.672, 3.562, 3.288, 6.02, 17.54, 19.68, 36.168, 18.632,
2.74, 0.822, 9.864, 7.124, 29.592, 4.384, 11.508, 1.37, 3.288,
9.042, 2.74, 18.906, 4.932, 0.274, 3.836, 1.918, 4.384, 16.714,
5.754, 12.604, 2.466, 13.014, 2.74, 14.796, 5.754, 4.93, 5.21,
0.548, 1.644, 3.014, 6.028, 18.358, 1.918, 3.014, 18.358, 0.274,
1.918, 54.2, 43.4, 11.2, 1.6, 3.8, 70.2, 0.2, 24.4, 25.8, 13,
7.124, 10.96, 7.672

[R] lw in legend also changes thickness of characters in the legend??

2008-05-14 Thread Levi Waldron
Here's a simple example:

x - 1:5
plot(x,x^2)
lines(x,x^2)
points(x,x,cex=2)
lines(x,x,lw=3)
legend(topleft,legend=c(y=x^2,y=x),pch=1,pt.cex=1:2,lw=c(1,3))

The thickness of the circles in the legend changes with lw.  If you change
the lw argument in legend to lw=c(1,1) then the thickness of the circles
goes back to normal.  How can I make the above legend so that both the lines
and the plotting characters match what is in the graph?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] customization of pairwise comparison plots

2008-05-01 Thread Levi Waldron
I am wondering how to customize a pairwise comparisons plot of a factorial
ANOVA, without doing a lot of manual manipulation of a TukeyHSD object.  The
customizations I'd like are:

1.  The aov used log-transformed response data, but I'd like to plot the
intervals on their original, untransformed scales
2.  Plot all the main and interaction effects together, rather than in a
separate window for each.
3.  Omit certain comparisons, specifically interactions without some common
level.  ie plot A1:B1 - A1:B2, but not A1:B2 - A3:B4.

For example (I know not a valid one since the factor wool isn't significant,
and it doesn't need to be log-transformed):

 data(warpbreaks)
 summary(fm1 - aov(log(breaks) ~ wool*tension, data = warpbreaks))
 Df Sum Sq Mean Sq F value   Pr(F)
wool  1 0.3125  0.3125  2.2344 0.141511
tension   2 2.1762  1.0881  7.7792 0.001185 **
wool:tension  2 0.9131  0.4566  3.2642 0.046863 *
Residuals48 6.7138  0.1399
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The general features of plot(fm1) are good but I'd like to plot all the
comparisons together in one plot, on the original scale, skipping the
comparisons B:M-A:L, B:H-A:L, A:M-B:L, A:H-B:L, B:H-A:M, A:H-B:M while
keeping the rest of the interaction comparisons.  Advice on any of these, if
not all, would be much appreciated.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] restricting pairwise comparisons of interaction effects

2008-04-28 Thread Levi Waldron
I'm interested in restricting the pairwise comparisons of interaction
effects in a multi-way factorial ANOVA, because I find comparisons of
interactions between all different variables different to interpret.
For example (supposing a p0.10 cutoff just to be able to use this
example):

 summary(fm1 - aov(breaks ~ wool*tension, data = warpbreaks))
 Df Sum Sq Mean Sq F valuePr(F)
wool  1  450.7   450.7  3.7653 0.0582130 .
tension   2 2034.3  1017.1  8.4980 0.0006926 ***
wool:tension  2 1002.8   501.4  4.1891 0.0210442 *
Residuals48 5745.1   119.7
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 TukeyHSD(fm1,conf.level=0.90)
  Tukey multiple comparisons of means
90% family-wise confidence level

Fit: aov(formula = breaks ~ wool * tension, data = warpbreaks)

$wool
 diff   lwruprp adj
B-A -5.78 -10.77183 -0.7837284 0.058213

$tension
  diff   lwr   upr p adj
M-L -10.00 -17.66710 -2.332900 0.0228554
H-L -14.72 -22.38932 -7.055122 0.0005595
H-M  -4.72 -12.38932  2.944878 0.4049442

$`wool:tension`
   difflwrupr p adj
B:L-A:L -16.333 -30.112566  -2.554101 0.0302143
A:M-A:L -20.556 -34.334788  -6.776323 0.0029580
B:M-A:L -15.778 -29.557010  -1.998545 0.0398172
A:H-A:L -20.000 -33.779233  -6.220767 0.0040955
B:H-A:L -25.778 -39.557010 -11.998545 0.0001136
A:M-B:L  -4.222 -18.001455   9.557010 0.9626541
B:M-B:L   0.556 -13.223677  14.334788 0.978
A:H-B:L  -3.667 -17.445899  10.112566 0.9797123
B:H-B:L  -9.444 -23.223677   4.334788 0.4560950
B:M-A:M   4.778  -9.001455  18.557010 0.9377205
A:H-A:M   0.556 -13.223677  14.334788 0.978
B:H-A:M  -5.222 -19.001455   8.557010 0.9114780
A:H-B:M  -4.222 -18.001455   9.557010 0.9626541
B:H-B:M -10.000 -23.779233   3.779233 0.3918767
B:H-A:H  -5.778 -19.557010   8.001455 0.8705572

It would seem to make sense (and please correct me if I'm wrong) to
restrict the pairwise comparisons of wool:tension to terms like
B:L-A:L, and A:M-A:L, and not calculate or try to interpret
differences like B:M-A:L.  How can I accomplish this (note that there
are actually 5 factors in the experiment I'm analyzing)?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] adding device size-independent y=0 line to a lattice plot

2008-03-31 Thread Levi Waldron
Using the following lattice plot as an example, I would like to add
horizontal lines where y=0:

library(lattice)
library(grid)
fac - gl(4,12)
x - letters[rep(1:3,16)]
y - runif(48,min=0.0)
dotplot(y~x|fac)

I've tried it with grid.lines using npc and native units, which works
fine unless I change the size of the output device - then the lines
are in the wrong place.  Is there a way to do this that is independent
of the output device size?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] spreading out a numeric vector

2008-03-24 Thread Levi Waldron
*Thank you* Greg, this function works perfectly!  I had imagined that
the ideal solution would iteratively modify the vector to fix new
violations of mindiff created by each subsequent spreading of tight
clusters, but couldn't figure how to do it.  A small note, the vector
x must be sorted before using this function, which is a reasonable
requirement.

Thanks also to Jim Lemon for pointing out the very useful plotrix
package - the spread.labels function didn't work so well for this
application because the timelines looked strange with the labels
spread evenly, but I like a number of the functions provided by the
package.

-levi

On Mon, Mar 24, 2008 at 12:43 PM, Greg Snow [EMAIL PROTECTED] wrote:
 Levi,

  Here is one possible function:

  spread - function(x, mindiff) {
   df - x[-1] - x[-length(x)]
   i - 1
   while (any(df  mindiff)) {
 x[c(df  mindiff, FALSE)] - x[c(df  mindiff, FALSE)] - mindiff/10
 x[c(FALSE, df  mindiff)] - x[c(FALSE, df  mindiff)] + mindiff/10
 df - x[-1] - x[-length(x)]
 i - i + 1
 if (i  100) {
break
 }
   }
   x
  }

  I have tried experimenting with using optim to minimize a function of
  the distances between new points and old points penealized for being to
  close, but it sometimes gave me some weird results, the above has worked
  fairly well for me (the above is based on some of the code inside the
  triplot function in the TeachingDemos package).

  Hope this helps,

  --
  Gregory (Greg) L. Snow Ph.D.
  Statistical Data Center
  Intermountain Healthcare
  [EMAIL PROTECTED]
  (801) 408-8111

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] spreading out a numeric vector

2008-03-24 Thread Levi Waldron
Would this function be worth submitting to some existing cran library?
 I'd be willing to document it and add error-checking if there is
somewhere that it belongs (with credit to Greg of course, unless Greg
wants to submit it himself).

I am using it in conjunction with a function I wrote which draws tics
on the timeline and draws a line from the tic to the label when they
aren't aligned.


I have tried experimenting with using optim to minimize a function of
the distances between new points and old points penealized for being to
close, but it sometimes gave me some weird results, the above has worked
fairly well for me (the above is based on some of the code inside the
triplot function in the TeachingDemos package).
  
Hope this helps,
  
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] spreading out a numeric vector

2008-03-22 Thread Levi Waldron
I am creating a timeline plot, but running into a problem in positions
where the values to plot are too close together to print the text
without overlap.  The simplest way I can think of to solve this
(although there may be other ways?) is to create a new numeric vector
whose values are as close as possible to the original vector, but
spread out to a given minimum difference.  For example, take the
following vector:

 x - c(1,2,seq(2.1,2.3,0.1),3,4)
 x
[1] 1.0 2.0 2.1 2.2 2.3 3.0 4.0


However, suppose I don't want to plot any of these points with less
than 0.5 between them.  The problem could be solved by a function that
behaved something like this:

 x2 - spread(x,mindiff=0.5)
 x2
[1] 0.5 1.0 1.5 2.0 2.5 3.0 4.0

Or for a minimum difference of 0.2,
 x2 - spread(x,mindiff=0.2)
 x2
[1] 1.0 1.8 2.0 2.2 2.4 3.0 4.0

Thus, if there is a cluster of close values, spreading the values may
require changing values which previously weren't too close to their
nearest neighbors.

Then I could use segments() to draw lines from the timeline to where
the shifted text is printed, labeling tics on the timeline accurately
but keeping the text from overlapping.

Any ideas on how to program this or how to do it using built-in
functions would be greatly appreciated.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] spreading out a numeric vector

2008-03-22 Thread Levi Waldron
On Sat, Mar 22, 2008 at 9:58 PM, jim holtman [EMAIL PROTECTED] wrote:
 Is this close to what you want?

   spread - function(x, mindiff=0.5){
  + unique(as.integer(x / mindiff) * mindiff)
  + }

 
   x - c(1,2,seq(2.1,2.3,0.1),3,4)
   x
  [1] 1.0 2.0 2.1 2.2 2.3 3.0 4.0
   spread(x)
  [1] 1 2 3 4
  
   spread(x,.2)
  [1] 1.0 2.0 2.2 3.0 4.0


There are a couple problems with this solution:
   1.  The main problem is that it causes closely spaced numbers to
vanish, rather than spread out
   2.  The vectors I'm actually working with are large integers, for example:
 x
 [1]   342951  1491880  5447637  6335019  7368604  8960148 10635802 16907621
 [9] 18071422 18747461 19379960 21767641 21850851 23143537 26693431 35298190

with mindiff = 50.  I could scale these numbers to make your
solution work, except for problem #1.

Interesting idea though.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] differentiating a numeric vector

2008-03-05 Thread Levi Waldron
What functions exist for differentiating a numeric vector (in my case
spectral data)?  That is, experimental data without an analytical
function.  ie,

 x - seq(1,10,0.1)
 y=x^3+rnorm(length(x),sd=0.01) #although the real function would be 
 nothing simple like x^3...
 derivy - 

I know I could just use diff(y) but it would be nice to estimate
derivatives at the endpoints, calculate higher-order derivatives, and
maybe have some smoothing options ie by differentiating a spline or
something like that.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] summarizing replicates with multiple treatments

2008-03-04 Thread Levi Waldron
I have a dataframe with several different treatment variables, and
would like to calculate the mean and standard deviation of the
replicates for each day and treatment variable.  It seems like it
should be easy, but I've only managed to do it for one treatment at a
time using subset and tapply.  Here is an example dataset:

 `exampledata` -
structure(list(day = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), treat = structure(c(1L, 1L,
1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L
), .Label = c(a, b), class = factor), replicate = c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L), height = c(1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 2.1, 2.2, 2.3,
2.4, 2.5, 2.6, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6), weight = c(11.1,
11.2, 11.3, 11.4, 11.5, 11.6, 12.1, 12.2, 12.3, 12.4, 12.5, 12.6,
13.1, 13.2, 13.3, 13.4, 13.5, 13.6)), .Names = c(day, treat,
replicate, height, weight), class = data.frame, row.names = c(NA,
-18L))

 exampledata
   day treat replicate height weight
11 a 11.1   11.1
21 a 21.2   11.2
31 a 31.3   11.3
41 b 11.4   11.4
51 b 21.5   11.5
61 b 31.6   11.6
72 a 12.1   12.1
82 a 22.2   12.2
92 a 32.3   12.3
10   2 b 12.4   12.4
11   2 b 22.5   12.5
12   2 b 32.6   12.6
13   3 a 13.1   13.1
14   3 a 23.2   13.2
15   3 a 33.3   13.3
16   3 b 13.4   13.4
17   3 b 23.5   13.5
18   3 b 33.6   13.6

I would like to combine the replicates and get a dataframe like:

day treat   height.mean height.sd   weight.mean weight.sd
1   a   1.2 0.1 11.20.1
1   b   1.5 0.1 11.50.1
2   a   2.2 0.1 12.20.1
2   b   2.5 0.1 12.50.1
3   a   3.2 0.1 13.20.1
3   b   3.5 0.1 13.50.1

or two dataframes, one with means and the other with standard deviations.

Thus far I have been doing it a piece at a time, like below (extra
verbose since tapply doesn't accept the data= argument!), but would
like to do it for all the measurement columns and all the treatments
in one go.  Thanks!

 tapply(exampledata[exampledata$treat==a,]$height,exampledata[exampledata$treat==a,]$day,mean)
  1   2   3
1.2 2.2 3.2


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] summarizing replicates with multiple treatments

2008-03-04 Thread Levi Waldron
On Tue, Mar 4, 2008 at 5:32 PM, Gabor Grothendieck
[EMAIL PROTECTED] wrote:
 Try this:

  library(doBy)
  summaryBy(. ~ day + treat, exampledata, FUN = c(mean, sd))

Outstanding, so much better.  Thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.