[R] snowfall

2011-01-12 Thread Santosh Srinivas
Hello,

Just wondering why I am unable to run this in parallel.
A dput of my dataset is attached at the end. Please use it to create my data
object.

I want to run this function in parallel (not sure if this is an efficient
implementation):

# Function to calculate the time to maturity for the option
require(fCalendar, quietly=TRUE)   # Trying to calculate the trading days
require(fractalrock, quietly=TRUE) # Just to calculate the trading days
myFinCenter <- "Asia/Singapore"

getTimeToMaturity <- function(x){
  tryCatch({
    toDt <- as.Date(as.character(x['EXPIRY_DT']), "%Y-%m-%d")   # Expiry date
    fromDt <- as.Date(as.character(x['TIMESTAMP']), "%Y-%m-%d") # Trade timestamp
    NoOfDays <- NROW(getTradingDates(toDt, fromDt))
    return(NoOfDays/252)
  },
  error = function(ex){
    # print(paste("Error in", toDt, fromDt))
    NoOfDays <- 0
    return(NoOfDays/252)
  })
}


Question: The following two lines work but the third and parallel one
doesn't ... why?

1) > apply(dNiftyOpt, 1, getTimeToMaturity) # Works
         1          2          3          4          5          6          7
0.02380952 0.01984127 0.07936508 0.02380952 0.01984127 0.01190476     0.0278
         8          9         10         11         12         13         14
0.02380952 0.01984127 0.01190476 0.02380952 0.01984127 0.02380952 0.02380952
        15         16         17         18         19         20
0.01984127 0.02380952 0.01984127 0.02380952 0.02380952     0.0278


2) > library(snowfall)
> sfInit()
snowfall 1.84 initialized: sequential execution, one CPU.

> sfApply(dNiftyOpt, 1, getTimeToMaturity) # Works
         1          2          3          4          5          6          7
0.02380952 0.01984127 0.07936508 0.02380952 0.01984127 0.01190476     0.0278
         8          9         10         11         12         13         14
0.02380952 0.01984127 0.01190476 0.02380952 0.01984127 0.02380952 0.02380952
        15         16         17         18         19         20
0.01984127 0.02380952 0.01984127 0.02380952 0.02380952     0.0278

> sfStop()


3) DOESN'T WORK:
> sfInit(parallel=TRUE, cpus=4)
> sfApply(dNiftyOpt, 1, getTimeToMaturity) # Added the time to maturity. DOESN'T WORK?
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
> sfStop()
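
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: the error handler in getTimeToMaturity returns 0 for any failure, so if the worker processes cannot find getTradingDates (for example because fractalrock was loaded only in the master session), every row would silently come back as 0. The base-R sketch below reproduces that masking effect. `missing_helper` is a hypothetical, deliberately undefined function, and the snowfall calls shown in the trailing comments (sfLibrary, sfExport) are only a suggestion for preparing the workers.

```r
# A function that, like getTimeToMaturity, hides every error behind a 0.
# 'missing_helper' is hypothetical and deliberately undefined: it stands in
# for getTradingDates() on a worker that has not loaded fractalrock.
quiet_version <- function(x) {
  tryCatch(missing_helper(x) / 252,
           error = function(ex) 0)
}

# Same computation, but the handler reports the message instead of hiding it.
loud_version <- function(x) {
  tryCatch(missing_helper(x) / 252,
           error = function(ex) conditionMessage(ex))
}

zeros    <- vapply(1:3, quiet_version, numeric(1))
messages <- vapply(1:3, loud_version, character(1))

print(zeros)        # all zeros, the same symptom as the parallel run
print(messages[1])  # the underlying error text

# With snowfall, the workers could be prepared before the parallel call:
#   sfInit(parallel = TRUE, cpus = 4)
#   sfLibrary(fractalrock)      # load the package on every worker
#   sfExport("myFinCenter")     # export globals the function relies on
#   sfApply(dNiftyOpt, 1, getTimeToMaturity)
#   sfStop()
```

Temporarily returning the error message from the handler is a cheap way to see what actually fails on the workers before changing anything else.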



My dataset:
dput(dNiftyOpt)

structure(list(INSTRUMENT = c("OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", 
"OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", 
"OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", 
"OPTIDX", "OPTIDX"), SYMBOL = c("NIFTY", "NIFTY", "NIFTY", "NIFTY", 
"NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", 
"NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", 
"NIFTY", "NIFTY"), EXPIRY_DT = c("2004-01-29", "2004-01-29", 
"2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", 
"2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", 
"2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", 
"2004-01-29", "2004-01-29", "2004-01-29"), STRIKE_PR = c(1780, 
1780, 1800, 1800, 1800, 1800, 1800, 1800, 1800, 1800, 1820, 1820, 
1820, 1830, 1830, 1830, 1830, 1840, 1840, 1850), OPTION_TYP = c("PE", 
"PE", "CE", "CE", "CE", "CE", "PE", "PE", "PE", "PE", "CE", "CE", 
"PE", "CE", "CE", "PE", "PE", "CE", "PE", "CE"), SETTLE_PR = c(27.4, 
5.7, 152.95, 28.6, 70.45, 111.35, 14.75, 39.2, 8.6, 2.35, 20.4, 
54.2, 50.15, 18.35, 47.25, 51.75, 15.5, 14.95, 57.95, 26.3), 
TIMESTAMP = c("2004-01-22", "2004-01-23", "2004-01-02", "2004-01-22", 
"2004-01-23", "2004-01-27", "2004-01-21", "2004-01-22", "2004-01-23", 
"2004-01-27", "2004-01-22", "2004-01-23", "2004-01-22", "2004-01-22", 
"2004-01-23", "2004-01-22", "2004-01-23", "2004-01-22", "2004-01-22", 
"2004-01-21"), Underlying = c(1770.5, 1847.55, 1946.05, 1770.5, 
1847.55, 1904.7, 1824.6, 1770.5, 1847.55, 1904.7, 1770.5, 
1847.55, 1770.5, 1770.5, 1847.55, 1770.5, 1847.55, 1770.5, 
1770.5, 1824.6), UnderlyingVol = c(0.293906144944403, 0.331877179605752,
0.129552369208600, 0.293906144944403, 0.331877179605752, 
0.348918971622834, 0.276334860399362, 0.293906144944403, 
0.331877179605752, 0.348918971622834, 0.293906144944403, 
0.331877179605752, 0.293906144944403, 0.293906144944403, 
0.331877179605752, 0.293906144944403, 0.331877179605752, 
0.293906144944403, 0.293906144944403, 0.276334860399362)), .Names =
c("INSTRUMENT", 
"SYMBOL", "EXPIRY_DT", "STRIKE_PR", "OPTION_TYP", "SETTLE_PR", 
"TIMESTAMP", "Underlying", "UnderlyingVol"), row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20"), class = "data.frame")

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] From polynomial to function

2011-01-12 Thread Alaios
Dear all.
I would like to use Legendre polynomials, which is pretty easy in R.

> x <- legendre.polynomials(2)[[3]]
> x
-0.5 + 1.5*x^2 

> str(x)
Class 'polynomial'  num [1:3] -0.5 0 1.5

As you can see from the code above, str(x) reports that x is of class 
polynomial. I want to use that polynomial as a function, because I would 
like to pass it to integrate(f,lower=,upper=).

Could you please tell me whether it is possible to do that in R?

I would like to thank you in advance for your help

Best Regards
Alex



Re: [R] ThinkCell type waterfall charts in R?

2011-01-12 Thread Jim Lemon

On 01/12/2011 05:54 AM, ang wrote:


Hi Jim,

I looked through the plotrix documentation, and the waterfall plot comes
from the stackpoly function right?  I'm not sure if I can modify the
stackpoly to create the plot I want, since stackpoly is a line plot and
fills the area under with color.

I haven't played with all the options yet, but what I was looking for was
more similar to staircase.plot, but instead of horizontal bars, they would
be vertical columns.  Would you happen to know any packages or existing
plots that could be easily modified to do this?


Hi Adrian,
Try using the "dir" argument as "e" or "w".

Jim



Re: [R] From polynomial to function

2011-01-12 Thread Karl Ove Hufthammer
Alaios wrote:

 > x <- legendre.polynomials(2)[[3]]
 > x
 -0.5 + 1.5*x^2
 
 > str(x)
 Class 'polynomial'  num [1:3] -0.5 0 1.5
 
 As you can see from the code above str(x) returns that x is of class
 polynomial. I want to use that polynomial as a function. The reason for
 that is that I would be grateful if I can feed that kind of function
 inside integrate(f,lower=,upper=)

You can use the ‘as.function’ function. But if you’re just going to 
integrate the polynomial anyway, why don’t you just use the ‘integral’ 
function? It’s much more accurate than using numerical integration.

BTW, to see which functions handle ‘polynomial’ objects, use
methods(class=polynomial)

Output:
 [1] as.character.polynomial* as.function.polynomial*  coef.polynomial*
 [4] deriv.polynomial*        GCD.polynomial*          integral.polynomial*
 [7] LCM.polynomial*          lines.polynomial*        Math.polynomial*
[10] Ops.polynomial*          plot.polynomial*         points.polynomial*
[13] predict.polynomial*      print.polynomial*        solve.polynomial*
[16] summary.polynomial*      Summary.polynomial*
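
To make the suggestion concrete, here is a small sketch of turning the coefficients shown by str(x) into an ordinary function that integrate() accepts. The helper `poly_as_function` is hypothetical and written in base R so the example runs without extra packages; the last part uses the functions named above and assumes the polynom package is installed and that `integral()` accepts a `limits` argument.

```r
# Coefficients of -0.5 + 1.5*x^2 in increasing order of power,
# as shown by str(x) in the thread.
coefs <- c(-0.5, 0, 1.5)

# Base-R stand-in for as.function(): evaluate the polynomial by Horner's rule.
# 'poly_as_function' is a hypothetical helper, not part of any package.
poly_as_function <- function(coefs) {
  function(x) {
    result <- 0 * x                      # keeps the shape of x (vectorised)
    for (a in rev(coefs)) result <- result * x + a
    result
  }
}

f <- poly_as_function(coefs)
numeric_result <- integrate(f, lower = -1, upper = 1)
print(numeric_result$value)

# If the polynom package is available, the functions named in the reply
# can be used directly on a 'polynomial' object.
if (requireNamespace("polynom", quietly = TRUE)) {
  p <- polynom::polynomial(coefs)
  g <- as.function(p)                              # polynomial -> R function
  print(integrate(g, lower = -1, upper = 1)$value) # numerical integration
  print(polynom::integral(p, limits = c(-1, 1)))   # symbolic integration
}
```

For this particular polynomial the integral over [-1, 1] is zero, so the numerical result should differ from 0 only by rounding error.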

-- 
Karl Ove Hufthammer



Re: [R] plot: skip a range of axis

2011-01-12 Thread Jim Lemon

On 01/12/2011 03:46 PM, Yuan Jian wrote:

Hi,
I am using plot to show scatter points in 2_D.
in my data, there is no data between -1 and +1 in x-axis.
I want to skip this region, i.e. x axis becomes [-Inf:-1, 1:Inf].
can any one tell me how to do?


Hi Yu,
Try the gap.plot function in the plotrix package.

Jim



[R] vector or list of matrices corresponding to an observation

2011-01-12 Thread SL
Dear all,

I observe for each observation several joint distributions of two
multinomial random variables (5x5 matrices). Right now, data are
arranged so that I have 20 columns for each joint distribution by
observation, which is not practical.

I would like to work with matrices that would be indexed by
observation, just as I work with vectors and matrices for regression
like analysis.

How should I proceed? With a list of list of matrices and how to build
it in a systematic way (n=3000 and I observe up to 10 joint
distributions by observation)?

I'm not familiar with such procedure and I don't know where I should
start reading.
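
As a starting point, here is a sketch of two ways to hold one matrix per observation and per distribution. It assumes a hypothetical wide layout in which each flattened 5x5 matrix occupies 25 consecutive columns (the post mentions 20 columns, so the dimensions would need adjusting to the real layout); all object names are made up for illustration.

```r
set.seed(1)
n_obs  <- 4    # 3000 in the post; kept small here
n_dist <- 2    # up to 10 in the post
cells  <- 25   # one flattened 5x5 matrix

# Hypothetical wide layout: one row per observation,
# 25 columns per joint distribution.
wide <- matrix(runif(n_obs * n_dist * cells), nrow = n_obs)

# Option 1: a list of lists, indexed as nested[[observation]][[distribution]].
nested <- lapply(seq_len(n_obs), function(i) {
  lapply(seq_len(n_dist), function(j) {
    cols <- ((j - 1) * cells + 1):(j * cells)
    matrix(wide[i, cols], nrow = 5, ncol = 5)
  })
})

# Option 2: a 4-d array, indexed as arr[row, col, distribution, observation].
arr <- array(t(wide), dim = c(5, 5, n_dist, n_obs))

print(nested[[3]][[2]])   # second joint distribution of observation 3
print(arr[, , 2, 3])      # the same matrix, taken from the array
```

The array form is convenient when every observation has the same number of distributions; the nested list copes better when that number varies.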

Thanks!

Best,
SL



Re: [R] list concatenation

2011-01-12 Thread Georg Otto
Bert Gunter gunter.ber...@gene.com writes:

 Lists are (isomorphic to) trees with (possibly) labelled nodes. A
 completely general solution in which two trees have possibly different
 topologies and different labels would therefore involve identifying
 the paths to leaves on each tree, e.g. via depth first search using
 recursion, and unioning leaves with the same paths (which could be
 quickly found in R via match() on the paths). This is a standard
 exercise in a data structures course.

 Considerable simplification could be effected if tree topologies
 and/or labels are identical or have other restrictions on them.
 However, you have not made it clear in your post whether this is the
 case (it is in your example).


Thanks so much to all of you for your very helpful suggestions, that
helped me solve my problem.

The tree topologies are indeed identical, so the suggested solutions did
work, but just for me to learn: Can somebody point me to how a general
solution mentioned by Bert would look like?
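
One way such a general solution might look is sketched below. It is a recursive variant of the idea (descending both trees together) rather than the path-matching approach with match() that Bert describes, and it assumes every node in both lists is named. `merge_trees` is a hypothetical helper name.

```r
# Recursively merge two nested lists ("trees") with named nodes.
# Leaves found at the same path in both trees are concatenated with c();
# branches or leaves present in only one tree are copied over unchanged.
merge_trees <- function(a, b) {
  if (is.null(a)) return(b)
  if (is.null(b)) return(a)
  if (!is.list(a) || !is.list(b)) return(c(a, b))   # at least one leaf
  keys <- union(names(a), names(b))
  out <- lapply(keys, function(k) merge_trees(a[[k]], b[[k]]))
  names(out) <- keys
  out
}

tree1 <- list(x = list(p = 1, q = "a"), y = 2)
tree2 <- list(x = list(p = 3, r = TRUE), z = 4)

merged <- merge_trees(tree1, tree2)
str(merged)
```

Here the two trees have different topologies: `x$p` exists in both and is unioned, while `x$q`, `x$r`, `y` and `z` each come from a single tree.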

Cheers,

Georg



Re: [R] Problems creating a PNG file for a dendrogram: Error in plot.window(...) : need finite 'xlim' values

2011-01-12 Thread Richard Vlasimsky

 That was a simple solution.  Turns out you were correct, plot(p) was the 
problem.  Simply removing it and everything worked perfectly.  

Thanks Bill, Peter and David for your help. 

On Jan 11, 2011, at 8:29 PM, David Winsemius wrote:


On Jan 11, 2011, at 9:27 PM, David Winsemius wrote:

 
 On Jan 11, 2011, at 7:01 PM, Richard Vlasimsky wrote:
 
 
 Has anyone successfully created a PNG file for a dendrogram?
 
 I am able to successfully launch and view a dendrogram in Quartz.  However, 
 the dendrogram is quite large (too large to read on a computer screen), so I 
 am trying to save it to a file (1000x4000 pixels) for viewing in other apps. 
  However, whenever I try to initiate a PNG device, I get a "need finite 
 'xlim' values" error.
 
 
 
 Here is some example code to illustrate my point:
 
 cor.matrix <- cor(mydata,method="pearson",use="pairwise.complete.obs");
 distance <- as.dist(1.0-cor.matrix);
 hc <- hclust(distance);
 p <- plot(hc);
 plot(p);
 #This works!  Plot is generated in quartz no problem.
 
 
 #Now, try this:
 png(filename="delme.png",width=4000,height=1000);
 cor.matrix <- cor(mydata,method="pearson",use="pairwise.complete.obs");
 distance <- as.dist(1.0-cor.matrix);
 hc <- hclust(distance);
 p <- plot(hc);
 plot(p);
 #Error in plot.window(...) : need finite 'xlim' values
 #In addition: Warning messages:
 #1: In min(x) : no non-missing arguments to min; returning Inf
 #2: In max(x) : no non-missing arguments to max; returning -Inf
 #3: In min(x) : no non-missing arguments to min; returning Inf
 #4: In max(x) : no non-missing arguments to max; returning -Inf
 
 I'm not sure the other two answers address the problems I found. When I try 
 to set up a png file with the parameters width=4000,height=1000, on a Mac I 
 initially got no plot with what is an otherwise valid command. But after 
 successfully plotting to a png device, the logjam appeared to be broken. Try:
 
 graphics.off()
 dev.list()
 #NULL
 png(filename="delme.png",width=4000,height=1000);
 plot(hc)
 dev.off()
 
 
 (Of course I used dev.off() which you did not, but even adding dev.off() was 
 not enough to get success, at least initially. I don't understand the 
 suggestion to get rid of plot(hc) or the suggestion that hclust() returns 
 NULL. That's certainly not how I read the help page and examples for hclust.)

I guess it's true to say that I misunderstood when Venables and Langfelder 
_didn't_ say either of those things.  They said that plot(plot(hc)) was not 
needed and I have now seen the light. The missing ingredient in the OP's 
frustrations is still dev.off().
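
A minimal sketch of the working sequence discussed in this thread, using the built-in USArrests data as a stand-in for the poster's `mydata` and a temporary file instead of delme.png (both are assumptions made for the example):

```r
# Cluster some built-in data; this stands in for the poster's 'mydata'.
hc <- hclust(dist(USArrests))

out_file <- file.path(tempdir(), "dendrogram.png")

png(filename = out_file, width = 4000, height = 1000)
p <- plot(hc)   # draws the dendrogram on the png device
dev.off()       # closes the device so the file is actually written

# plot() on an hclust object is called for its side effect; the value
# assigned to p is NULL, so a second plot(p) has nothing to draw.
print(is.null(p))
print(file.exists(out_file))
```

This is why dropping the second plot(p) call and adding dev.off() is enough: the first plot(hc) already does all the drawing.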

 
 This is the exact same code, only a prior call to png() causes the seemingly 
 unrelated xlim to fail.  Why is this?
 
 Thanks,
 Richard Vlasimsky
 
 
 David Winsemius, MD
 West Hartford, CT

David Winsemius, MD
West Hartford, CT



Re: [R] how to change strip text of effect plot

2011-01-12 Thread Deepayan Sarkar
On Wed, Jan 12, 2011 at 10:36 AM, Joshua Wiley jwiley.ps...@gmail.com wrote:
 Hmm, I felt like it was the clearest, most direct way.  Then again,
 things of this nature (override defaults using arguments rather than
 changing what you feed the functions) do seem to be a common request
 both for lattice and ggplot2.  In any case, my memory of the lattice
 book and a quick search of the archives suggest that the typical way
 to do this when dealing with lattice functions such as xyplot() would
 be to define a custom strip function.  For example:

 strip = function(..., factor.levels) strip.default(..., factor.levels
 = c("low", "medium", "high"))

 however, if I am not mistaken, you are plotting an object of class
 efflist, which dispatches plot.efflist which eventually dispatches
 plot.eff, in the bowels of plot.eff is the call to xyplot(), with its
 own strip function already defined.  I did not see a way to override
 this, nor did I see an option to pass an argument to it.

 It is trivial to edit the function's code to work with such a change.
 You could probably create a local copy of it, make the necessary
 changes, and then just ensure that your object gets dispatched to your
 revised function rather than the package's original.  Even easier, I
 suppose:

 debug(plot)
 plot(eff.cowles, 'neuroticism:ex2',factor.names=F)
 ## follow along until it has just created the object plot
 ## then overwrite that with your desired changes (factor.levels)
 ## before it is printed to the screen

    plot <- xyplot(eval(parse(text = paste("fit ~", predictors[x.var],
        "|", paste(predictors[-x.var], collapse = "*")))), strip =
 function(..., factor.levels) strip.default(..., factor.levels =
 c("low", "medium", "high"),
        strip.names = c(factor.names, TRUE)), panel = function(x,
        y, subscripts, x.vals, rug, lower, upper, has.se, ...) {
        llines(x, y, lwd = 2, col = colors[1], ...)
        if (rug)
            lrug(x.vals)
        if (has.se) {
            llines(x, lower[subscripts], lty = 2, col = colors[2])
            llines(x, upper[subscripts], lty = 2, col = colors[2])
        }
        if (has.thresholds) {
            panel.abline(h = thresholds, lty = 3)
            panel.text(rep(current.panel.limits()$xlim[1], length(thresholds)),
                thresholds, threshold.labels, adj = c(0, 0),
                cex = 0.75)
            panel.text(rep(current.panel.limits()$xlim[2], length(thresholds)),
                thresholds, threshold.labels, adj = c(1, 0),
                cex = 0.75)
        }
    }, ylim = ylim, ylab = ylab, xlab = if (missing(xlab))
        predictors[x.var]
    else xlab, x.vals = x.vals, rug = rug, main = main, lower = x$lower,
        upper = x$upper, has.se = has.se, data = x, scales = list(y =
 list(at = tickmarks$at,
            labels = tickmarks$labels), alternating = alternating),
        ...)

 alternately still, you could rename the factor levels in the object
 eff.cowles, which is sort of in between changing your data and
 changing the strip label defaults.

 I can understand why it seems like there should be a simpler solution,
 but I honestly do not see one.  The factor.levels argument does not
 even work inside xyplot(), it needs to be in the strip function, and
 that is nested far away from plot(effobject).  I do not see any
 documentation that suggests a way, nor do I see any formal arguments
 to facilitate it.  Then again, I'm not an expert in lattice or the
 effects package.


One generally useful fact is that if you can get hold of the trellis
object being plotted, you may be able to change the strip labels
before plotting it. For example,

 > foo <- trellis.last.object()

or in this case

 > foo <- plot(eff.cowles[['neuroticism:ex2']],factor.names=F)

(this returns the object, but also forces a plot), followed by

 > dimnames(foo)
$ex2
[1] "(1.98,8.99]" "(8.99,16]"   "(16,23]"
 > dimnames(foo)$ex2 <- c("low", "medium", "high")
 > foo

But I would tend to agree with Josh, that specifying labels in the
cut() call is the more natural approach. It is not the job of graphics
functions to modify (characteristics of) your data.
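
Both routes can be tried on a plain lattice plot. The sketch below uses made-up data and hypothetical variable names (`d`, `grp`, `grp2`) rather than the eff.cowles object from the thread, so it only illustrates the mechanism, not the effects package itself.

```r
library(lattice)

set.seed(1)
d <- data.frame(x = runif(60, 2, 23), y = rnorm(60))
d$grp <- cut(d$x, breaks = 3)          # default labels such as "(1.98,8.99]"

# Route 1: build the trellis object, then relabel its strips via dimnames().
p <- xyplot(y ~ x | grp, data = d)
print(dimnames(p))                     # the conditioning levels shown in the strips
dimnames(p)$grp <- c("low", "medium", "high")

# Route 2: give the factor the desired labels when it is created.
d$grp2 <- cut(d$x, breaks = 3, labels = c("low", "medium", "high"))
p2 <- xyplot(y ~ x | grp2, data = d)

# print(p) or print(p2) would draw the plots with the new strip text.
```

Route 2 keeps the labels with the data, which is the approach recommended above; route 1 is useful when the object comes from a function you do not control.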

-Deepayan


 Best regards,

 Josh

 On Tue, Jan 11, 2011 at 8:06 PM, Wincent ronggui.hu...@gmail.com wrote:
 Sure that is one way to go.
 Since it is possible to pass lattice arguments to that high level
 function, and there should be ways to relabel/custom the strip text in
 the lattice plotting system, I would think such an indirect method a
 last resort.

 Thank you.

 Ronggui

 On 12 January 2011 11:51, Joshua Wiley jwiley.ps...@gmail.com wrote:
 Hi,

 I am guessing this is not what you meant by on the fly, but I think
 it will be by far the easiest way.  Plotting an effects object is a
 high level plot with a lot of defaults and automation built in to make
 your life simple.  The cost is that it is less flexible---you work its
 way, not vice versa.  If you want the factor named high, just label it
 that way to begin with.  If you think it makes the graphs more
 

Re: [R] Degrees of freedom

2011-01-12 Thread Dennis Murphy
Hi:

Look at the links in the following blog entry:
http://blog.lib.umn.edu/moor0554/canoemoore/2010/09/lmer_p-values_lrt.html

and this discussion, found on the R wiki:
http://rwiki.sciviews.org/doku.php?id=guides:lmer-tests

Also see Ben Bolker's GLMM wiki page, which discusses many of the unresolved
foundational issues in (generalized) linear mixed models:
http://glmm.wikidot.com/faq

Welcome to the jungle :)

HTH,
Dennis



On Tue, Jan 11, 2011 at 9:09 PM, Umit Tokac u...@fsu.edu wrote:

 Hello,

 I have a little problem about degree of freedom in R.
 if you can help me, I will be happy.
 I used nlme function to analyze my data and run the linear mixed
 effects model in R.
 I did the linear mixed effect analysis in SAS and SPSS as well.
 However, R gave the different degrees of freedom than SAS and SPSS did.
 Can you help me to learn what the reason is to obtain different
 degrees of freedom from R?

 Thanks

 Umit Tokac
 Graduate Student
 Measurement and Statistics
 Florida State University
 Tel:(850)345-7487



[R] R syntax for 95% prediction interval on a left-truncated normal variable

2011-01-12 Thread Fabio Colombo
Hi all,
I am searching for an R procedure to calculate a 95% prediction interval
(i.e. the interval in which future observations of the variable will
occur, not a confidence interval) on a variable that is the natural
logarithm of a ratio that is always greater than or equal to 1, so the
log-transformed variable is left-truncated at 0, and I think that
it is expected to be normal. Can anyone help? Thank you,
Fabio 
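
One possible starting point, sketched under strong assumptions: if the log-ratio follows a normal distribution truncated on the left at 0, and the mean and standard deviation of the underlying (untruncated) normal are treated as known, the interval limits are simply quantiles of the truncated distribution. The helper `q_left_truncnorm` and the parameter values are hypothetical, and this plug-in interval ignores the uncertainty from estimating the parameters.

```r
# Quantile function of a normal distribution left-truncated at 'lower'.
# 'mean' and 'sd' are the parameters of the underlying, untruncated normal.
q_left_truncnorm <- function(p, mean, sd, lower = 0) {
  p_below <- pnorm(lower, mean = mean, sd = sd)   # mass removed by truncation
  qnorm(p_below + p * (1 - p_below), mean = mean, sd = sd)
}

# Hypothetical parameter values, for illustration only.
mu    <- 0.5
sigma <- 0.4

# Central 95% interval of the truncated distribution.
pred_int <- q_left_truncnorm(c(0.025, 0.975), mean = mu, sd = sigma)
print(pred_int)
```

In practice mu and sigma would have to be estimated from the truncated sample (for example by maximum likelihood), since the ordinary sample mean and standard deviation are biased for truncated data.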

Dott. Fabio Colombo
Coordinatore tecnico
Università degli Studi di Milano, Dipartimento di Scienze e Tecnologie
Veterinarie per la Sicurezza Alimentare, Laboratorio di Identificazione di
Specie (LIS)
Via A. Grasselli, 7, 20131 Milano, Italy
Phone: +390250318504
Fax: +390250318501
E.mail fabio.colo...@unimi.it



Re: [R] how to change strip text of effect plot

2011-01-12 Thread John Fox
Dear Deepayan, Josh, and Ronggui,

I've recently changed plot.eff() so that it returns an object, normally 
printed by print.plot.eff(). You can therefore manipulate the lattice object, 
as Deepayan suggests. This change is currently in the development version of 
the effects package on R-Forge, not yet on CRAN. 

I agree with both Josh and Deepayan that it would be simpler for you just to 
name the levels of the factor with the labels you want.

Best,
 John


John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox





[R] graphics: 3D regression plane

2011-01-12 Thread Federico Bonofiglio
Hello Masters,
Wishing you all a great 2011. I would also like to ask whether anyone knows a
quick and efficient way to plot a regression plane (z~x*y).

I have tried the regr2.plot{HH} function but it is only an educational tool
and has poor graphical properties.

I also tried to run the following script on a fictitious longitudinal
problem, with poor results

set.seed(1234)

id<-c(rep(1,3),rep(2,4),rep(3,2)) # subjects
y<-rchisq(9,df=20) #response
k<-rnorm(9,4,2) # x
time<-as.Date(c("03/07/1981","15/11/1981","03/04/1983","08/12/1979",
"30/12/1979","08/03/1980","12/08/1980","12/08/1973","28/03/1975"),
format="%d/%m/%Y")
fac<-c("m","m","m","f","f","f","f","m","m")# sex


d1<-as.vector(by(time,factor(id),min))
t0<-as.Date(d1,origin=as.Date("1970-01-01"));t0

A<-data.frame(id=c(1,2,3),t=t0)
B<-data.frame(id=id,tempo=time)
C<-merge(A,B);C

rd<-as.vector(C$tempo-C$t);rd #time centered on sbj specific first occurrence


mod<-lm(y~rd*k)
newax<- expand.grid(
days = giorni<-seq(min(rd),max(rd), length=100),
expl= esplic<- seq(min(k), max(k), length=100)
)
fit <- predict(mod,data.frame(rd=giorni,k=esplic))
graph <- persp(x=giorni, y=esplic,fit,
 expand=0.5, ticktype="detailed", theta=-45)  #error : z argument not valid
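
A likely reason for the "z argument not valid" error is that predict() is given 100 (rd, k) pairs and returns a vector of length 100, whereas persp() needs a matrix with one value for every combination of x and y. A sketch of one way around this, using simulated data in place of the longitudinal example (the data and the grid size of 30 are assumptions made for brevity):

```r
set.seed(1234)

# Simulated stand-ins for the poster's variables.
rd <- runif(50, 0, 500)                     # days since first occurrence
k  <- rnorm(50, 4, 2)                       # explanatory variable
y  <- 20 + 0.01 * rd + 1.5 * k + rnorm(50)  # response

mod <- lm(y ~ rd * k)

# Axis values for the surface.
giorni <- seq(min(rd), max(rd), length.out = 30)
esplic <- seq(min(k),  max(k),  length.out = 30)

# Predict on every (rd, k) combination, then reshape into a matrix.
# expand.grid() varies its first argument fastest, which matches the
# column-major filling of matrix(), so z[i, j] belongs to
# (giorni[i], esplic[j]).
grid <- expand.grid(rd = giorni, k = esplic)
z <- matrix(predict(mod, newdata = grid),
            nrow = length(giorni), ncol = length(esplic))

persp(x = giorni, y = esplic, z = z,
      expand = 0.5, ticktype = "detailed", theta = -45,
      xlab = "rd", ylab = "k", zlab = "fitted y")
```

The column names in `grid` must match the variable names used in the model formula, otherwise predict() cannot find them.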

I would be grateful if someone could give me some suggestions.
Thank you again and happy new year

Federico Bonofiglio



[R] Issue loading and executing own function.R with JRI, any ideas?

2011-01-12 Thread nikson

Hey,

how did you manage to load and call your own function with JRI?

I managed to execute the built-in R functions with JRI, but when I call
r.eval(load(path-to-file)) or r.eval(source(path-to-file)), my Java
program terminates :(

Thanks for your help!
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Issue-loading-and-executing-own-function-R-with-JRI-any-ideas-tp3213756p3213756.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Multilevel pseudo maximum likelihood

2011-01-12 Thread n4538

Caterina,

Did you get an answer to this question? I'm trying to do something similar.

Jason
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Multilevel-pseudo-maximum-likelihood-tp878413p3213583.html
Sent from the R help mailing list archive at Nabble.com.



[R] Sum by column

2011-01-12 Thread Peter Francis
Dear List,

I have a question of convenience.

I am looking to sum the values of one column based on another column. An 
example may help explain better!

ED          ECOCODE
21.809467   AA0101
36.229566   PA1201
51.861284   PA1201
11.36232    PA1201
27.264634   PA1201
12.261986   PA1201
46.519313   PA1201
7.815376    PA1201
2.810428    PA1201
13.478372   PA1201
35.670182   PA1301
27.128715   AT0801
19.010294   AT1201
15.475368   AT1201
18.597983   AT0101
29.292615   AT0101
6.749846    AT0101
14.981488   AT0101
14.93511    AT0101
14.93511    AT0101
21.040785   AT0101
8.271615    AT0101
12.94232    AT0101
6.749846    AT0101
15.484412   AT0101
29.644494   AT0101
43.211212   AT0101

So for AA0101 it would be = 21.809467
AT1201 it would be = 19.010294+15.475368

etc

I would then like to be able to output a table with ECOCODE in one column and 
the sum of ED in the other.

This is stored in a data frame called ecoregion. I understand people like having 
code to change, but I have none as I am a relative beginner! Sorry in advance!
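
For illustration, one possible approach in base R is sketched below on a few rows of the data above. The choice of aggregate() and tapply() is a suggestion only, not necessarily what any reply in this thread recommends.

```r
# A few rows from the post, enough to show the grouping.
ecoregion <- data.frame(
  ED      = c(21.809467, 36.229566, 51.861284, 35.670182,
              19.010294, 15.475368),
  ECOCODE = c("AA0101", "PA1201", "PA1201", "PA1301",
              "AT1201", "AT1201")
)

# A data frame with one row per ECOCODE and the summed ED next to it.
totals <- aggregate(ED ~ ECOCODE, data = ecoregion, FUN = sum)
print(totals)

# The same sums as a named vector.
by_code <- tapply(ecoregion$ED, ecoregion$ECOCODE, sum)
print(by_code)
```

The aggregate() result is already in the two-column table form asked for and can be written out with write.csv() if needed.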

Thanks 

Peter



Re: [R] plot: skip a range of axis

2011-01-12 Thread Yuan Jian
Thanks Jim,
I found that gap.plot separates the x axis or y axis into two boxes.
Do you know of any plot tool that can skip a range in the x-axis or y-axis 
without lines?
 
regards
YU
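
For what it is worth, a gap can also be imitated in base graphics by shifting the data before plotting and relabelling the axis by hand. The sketch below is only one way to do this; `close_gap` and its `pad` argument are hypothetical names, and the data are simulated.

```r
# Hypothetical helper: remove the empty region (lower, upper) from the axis
# by shifting both halves towards 0, leaving a small visual gap of 2 * pad.
close_gap <- function(x, lower = -1, upper = 1, pad = 0.5) {
  ifelse(x <= lower, x - lower - pad, x - upper + pad)
}

set.seed(7)
x <- c(runif(20, -5, -1), runif(20, 1, 5))   # no data between -1 and 1
y <- rnorm(40)

# Plot the shifted coordinates without the default x axis ...
plot(close_gap(x), y, xaxt = "n", xlab = "x")

# ... then draw an axis whose labels show the original values.
ticks <- c(-5, -3, -1, 1, 3, 5)
axis(1, at = close_gap(ticks), labels = ticks)
```

Because nothing marks the break, it is usually worth mentioning the skipped range in the axis label or caption so readers are not misled.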



[R] Weighted Likelihood Estimation of NIG Dist.

2011-01-12 Thread n4538

Hi All,

I've put together some script which gives me the parameters of a modified
NIG distribution. Now I'd like to include a weighted vector within the
maximization function. The code below gives me parameter estimates which are
implicitly equally weighted.


 nig.par.fit <- try(optim(vega, negllh, hessian = se, pdf = density.nig,
   tmp.data = data, transf = transform, const.pars = vars[!opt.pars],
   silent = silent, par.names = names(vars), ...))

vega are the parameters to be fitted.

This seems to work reasonably for my equally weighted observations, but I
have not been able to find/adjust the function to include a weighted vector
(w_i). I'm trying not to alter the density function.

Is there a way of doing this with optim, or another optimizer I should be
looking at?
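
One common way to weight a likelihood without touching the density is to multiply each observation's log-density by its weight inside the objective passed to optim. The following is only a minimal sketch under stated assumptions: `weighted_negllh` and `normal_pdf` are hypothetical names, and a normal density stands in for density.nig, which is not shown in the post.

```r
# Weighted negative log-likelihood: sum of w_i * log f(x_i; par), negated.
# The density function itself is left unchanged.
weighted_negllh <- function(par, x, w, pdf) {
  -sum(w * log(pdf(x, par)))
}

# Hypothetical stand-in density (NOT the NIG density from the post):
# normal with mean par[1] and standard deviation exp(par[2]).
normal_pdf <- function(x, par) dnorm(x, mean = par[1], sd = exp(par[2]))

set.seed(1)
x <- rnorm(200, mean = 2, sd = 1.5)

# Equal weights reproduce the ordinary (unweighted) fit.
w_equal <- rep(1, length(x))
fit_equal <- optim(c(0, 0), weighted_negllh, x = x, w = w_equal, pdf = normal_pdf)

# Illustrative unequal weights.
w_uneven <- seq(0.5, 1.5, length.out = length(x))
fit_weighted <- optim(c(0, 0), weighted_negllh, x = x, w = w_uneven, pdf = normal_pdf)
```

With the normal stand-in, the fitted mean under unequal weights should land close to weighted.mean(x, w_uneven), which is a quick sanity check before swapping in the real density.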

Many thanks,

J
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Weighted-Likelihood-Estimation-of-NIG-Dist-tp3213608p3213608.html
Sent from the R help mailing list archive at Nabble.com.



[R] non-parametric discriminant analysis

2011-01-12 Thread Walter Durka

Hi,

I used linear discriminant analysis (lda) to classify and to 
cross-validate samples of two plant species based on morphometric data 
and to identify the variables that best discriminate between the two 
species.
Because some of the variables are not, and can not be transformed to be, 
normally distributed, I would like to use a non-parametric method.


Are there any R packages that provide methods for non-parametric 
discriminant analysis? Would randomForest 
(http://cran.r-project.org/web/packages/randomForest) be appropriate and 
recommended?


Thanks for help and best regards

Walter Durka


--

*
Dr. Walter Durka
Department Biozönoseforschung
Department of community ecology

Helmholtz-Zentrum für Umweltforschung GmbH - UFZ
Helmholtz Centre for Environmental Research - UFZ
Theodor-Lieser-Str. 4 / 06120 Halle / Germany

walter.du...@ufz.de / http://www.ufz.de/index.php?en=798
phone +49 345 558 5314 / Fax +49 345 558 5329



Re: [R] Sum by column

2011-01-12 Thread David Winsemius

There are two functions you need to become familiar with:

?tapply
?ave

If you wanted these summed values to be placed in another column of  
the same dataframe, you would use ave. If you wanted a new structure  
(somewhat shorter) you would use tapply with sum as the function. E. g:


tapply(ecoregion$ED, ecoregion$ECOCODE, sum)
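
The difference between the two functions can be sketched on a three-row subset of the posted data (the subset and the column name `ED_sum` are chosen here for illustration only):

```r
# Small subset of the posted data, for illustration only.
ecoregion <- data.frame(
  ED      = c(21.809467, 19.010294, 15.475368),
  ECOCODE = c("AA0101", "AT1201", "AT1201")
)

# tapply: one sum per ECOCODE, returned as a shorter named structure.
sums <- tapply(ecoregion$ED, ecoregion$ECOCODE, sum)

# ave: the group sum repeated on every row, same length as the data frame.
ecoregion$ED_sum <- ave(ecoregion$ED, ecoregion$ECOCODE, FUN = sum)
```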

--
David.

On Jan 12, 2011, at 5:38 AM, Peter Francis wrote:


Dear List,

I have a question of convenience,

I am looking to sum the values of one column based on another column  
- a example may help explain better!


ED          ECOCODE
21.809467   AA0101
36.229566   PA1201
51.861284   PA1201
11.36232    PA1201
27.264634   PA1201
12.261986   PA1201
46.519313   PA1201
7.815376    PA1201
2.810428    PA1201
13.478372   PA1201
35.670182   PA1301
27.128715   AT0801
19.010294   AT1201
15.475368   AT1201
18.597983   AT0101
29.292615   AT0101
6.749846    AT0101
14.981488   AT0101
14.93511    AT0101
14.93511    AT0101
21.040785   AT0101
8.271615    AT0101
12.94232    AT0101
6.749846    AT0101
15.484412   AT0101
29.644494   AT0101
43.211212   AT0101

So for AA0101 it would be = 21.809467
AT1201 it would be = 19.010294+15.475368

etc

I would then like to be able to output a table with ECOCODE in one  
column and the sum of ED in the other.


This is stored in a dataframe called ecoregion, i understand people  
like having code to change but i have none as i am a relative  
beginner! Sorry in advance!


Thanks

Peter



David Winsemius, MD
West Hartford, CT



Re: [R] Sum by column

2011-01-12 Thread Joshua Wiley
Hi Peter,

R has some fairly flexible ways of passing values of some variable (X)
by another (the INDEX) to different FUNctions.  Here is an example
using your data:

## your email data, in convenient form
dat <- structure(list(ED = c(21.809467, 36.229566, 51.861284, 11.36232,
27.264634, 12.261986, 46.519313, 7.815376, 2.810428, 13.478372,
35.670182, 27.128715, 19.010294, 15.475368, 18.597983, 29.292615,
6.749846, 14.981488, 14.93511, 14.93511, 21.040785, 8.271615,
12.94232, 6.749846, 15.484412, 29.644494, 43.211212), ECOCODE = structure(c(1L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 3L, 4L, 4L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("AA0101",
"AT0101", "AT0801", "AT1201", "PA1201", "PA1301"), class = "factor")),
.Names = c("ED", "ECOCODE"), class = "data.frame", row.names = c(NA, -27L))
## look at the structure of the data
str(dat)

## inside of dat (to avoid typing its name repeatedly)
## find the sum of ED at each level of ECOCODE
with(dat, tapply(X = ED, INDEX = ECOCODE, FUN = sum, na.rm = TRUE))

## should give something like
   AA0101    AT0101    AT0801    AT1201    PA1201    PA1301
 21.80947 236.83684  27.12871  34.48566 209.60328  35.67018

For documentation, look at:

?tapply
## similar in many ways though sometimes slightly more/less convenient
?by
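
If the goal is a two-column table (ECOCODE in one column, the sum of ED in the other), aggregate() from base R returns a data frame directly. A minimal sketch on a small subset of the data (the object names `small` and `sum_table` are illustrative):

```r
# Small subset of the posted data, for illustration only.
small <- data.frame(
  ED      = c(21.809467, 19.010294, 15.475368),
  ECOCODE = c("AA0101", "AT1201", "AT1201")
)

# One row per ECOCODE; the ED column holds the group sum.
sum_table <- aggregate(ED ~ ECOCODE, data = small, FUN = sum)
sum_table
```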

Hope that helps,

Josh

On Wed, Jan 12, 2011 at 2:38 AM, Peter Francis peterfran...@me.com wrote:
 Dear List,

 I have a question of convenience,

 I am looking to sum the values of one column based on another column - a 
 example may help explain better!

 ED                      ECOCODE
 21.809467       AA0101
 36.229566       PA1201
 51.861284       PA1201
 11.36232        PA1201
 27.264634       PA1201
 12.261986       PA1201
 46.519313       PA1201
 7.815376        PA1201
 2.810428        PA1201
 13.478372       PA1201
 35.670182       PA1301
 27.128715       AT0801
 19.010294       AT1201
 15.475368       AT1201
 18.597983       AT0101
 29.292615       AT0101
 6.749846        AT0101
 14.981488       AT0101
 14.93511        AT0101
 14.93511        AT0101
 21.040785       AT0101
 8.271615        AT0101
 12.94232        AT0101
 6.749846        AT0101
 15.484412       AT0101
 29.644494       AT0101
 43.211212       AT0101

 So for AA0101 it would be = 21.809467
            AT1201 it would be = 19.010294+15.475368

 etc

 I would then like to be able to output a table with ECOCODE in one column and 
 the sum of ED in the other.

 This is stored in a dataframe called ecoregion, i understand people like 
 having code to change but i have none as i am a relative beginner! Sorry in 
 advance!

 Thanks

 Peter





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



[R] flexmix: predictions on new data from flexmix object

2011-01-12 Thread xmilhaud
 
Dear R Users, R Core Team,

I currently wonder how to predict the probability of an event with new data 
resulting from a finite mixture.
I read the documentation of the flexmix package and the examples of 
applications provided on CRAN but I could not find how to predict (except 
manually but I am looking for a simpler solution) the final probability of 
the mixture (for each individual) with new data (I mean data different from the 
ones I used to build the model).

I should have missed something but basically, I am fitting a 2-components 
mixture model with logistic weights and logistic components (and different 
explanatory variables, no identifiability problem). The flexmix object is then 
used in predict() function with 'newdata' in argument, and the predictions with 
these new data are obviously different depending on the component which is 
assigned to new observations.

My question is: how can I access the information on clustering the new 
observations (if I understood well, predictions are the predictions of the 
event probability for each individual and each component, but these are not 
predictions for cluster assignment...) ?

Thank you very much in advance for your answers,
Sincerely,
Xavier M.




[R] Integrate and subdivisions limit

2011-01-12 Thread Alaios
Dear all,
I have some issues with integrate in R, so I would like to request your help.
I am trying to calculate the integral of f(x)*g(x).

The f(x) is a step function while g(x) is a polynomial.
If f(x) (step function) changes its value only a few times (5 or 6 'steps') 
everything is calculated OK (results verified on scrap paper), but if f(x) takes 
something like 800 different values I receive the error

Error in integrate number of subdivisions reached

I did some checks on the internet and I found that I can increase the number of 
subdivisions (as this is a parameter in integrate()). Thus I raised it from 100 
to 1000 (and then to 10000).

A. Does this make the resulting error larger, or does it only stress the 
computer?

B. When the number was raised to 10000 I started getting the error message   
roundoff error was detected


What do you think I should do to solve that?

I would like to thank you in advance for your help

Best Regards
Alex
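
One way around the subdivision limit, sketched here under stated assumptions, is to integrate piece by piece: on each step interval f is constant, so the integrand is just a constant times the smooth polynomial g, and the pieces can be summed. The breakpoints, step heights and polynomial below are made up for illustration and are not from the post.

```r
# Hypothetical step function with 800 steps on [0, 1].
breaks  <- seq(0, 1, length.out = 801)   # interval endpoints
heights <- sin(seq_len(800))             # value of f on each interval

# Hypothetical polynomial g(x).
g <- function(x) x^2 + 1

# On interval i, f(x) * g(x) = heights[i] * g(x), which is smooth,
# so each call to integrate() is an easy problem.
pieces <- vapply(seq_len(800), function(i) {
  heights[i] * integrate(g, breaks[i], breaks[i + 1])$value
}, numeric(1))

total <- sum(pieces)
```

Because g is a polynomial here, the result can be checked against the antiderivative x^3/3 + x evaluated at the breakpoints.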



[R] Fwd: vector or list of matrices corresponding to an observation

2011-01-12 Thread SL
You're right, Jim. I have now started to work with a list of matrices.
I'm not sure that it is clean and nice code, but it works.

I was wondering if there is a concise but advanced tutorial on
list() and associated functions?

Stephane

-- Forwarded message --
From: jim holtman jholt...@gmail.com
Date: 2011/1/12
Subject: Re: [R] vector or list of matrices corresponding to an observation
To: SL sl...@yahoo.fr


An actual example of your data would be useful along with how you
might want to access it.  You can create a matrix of 'list' objects
and these could contain your matrices, but not knowing what the data
looks like, or how you intend to use it, make it hard to provide a
solution.
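
As a minimal sketch of the two storage options mentioned in this thread (all object names and dimensions are made up for illustration): a plain list of matrices, and a matrix whose cells are list elements that each hold a matrix.

```r
# Option 1: a plain named list, one matrix per observation.
obs <- list(
  first  = matrix(1:4, nrow = 2),
  second = matrix(1:6, nrow = 2)
)
totals <- sapply(obs, sum)     # apply a function to every matrix
shapes <- lapply(obs, dim)     # dimensions of every matrix

# Option 2: a 2 x 2 matrix of list cells; each cell can hold a matrix.
store <- matrix(vector("list", 4), nrow = 2)
store[[1, 2]] <- diag(3)       # put a 3 x 3 identity matrix in cell (1, 2)
cell_dim <- dim(store[[1, 2]])
```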



[R] Outputting csv file from dataframe with columns in a particular order

2011-01-12 Thread analys...@hotmail.com
I have a dataframe with columns "ID", "date", "estimate", "actual" (but
not necessarily in that order - I do a merge somewhere and that
somehow messes up the order of the columns).

How can I output it to a csv file with the columns in the order that I
want?

Thanks.



Re: [R] debug biglm response error on bigglm model

2011-01-12 Thread Mike Harwood
Thank you, Greg.  The issue was in the simulation logic, where one of
the values was not changing correctly for some iterations...

On Jan 10, 3:20 pm, Greg Snow greg.s...@imail.org wrote:
 Not sure, but one possible candidate problem is that in your simulations one 
 iteration ended up with fewer levels of a factor than the overall dataset and 
 that caused the error.
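
The factor-level hypothesis above can be illustrated with base R alone. This is only a sketch with made-up data, not the bigglm model from the thread: a factor that has lost a level yields a design matrix with fewer columns than the fitted coefficients, and re-declaring the full set of levels restores the shape.

```r
# Made-up training data with three factor levels.
train <- data.frame(g = factor(c("a", "b", "c", "a")))

# Made-up new data in which level "c" never occurs.
new <- data.frame(g = factor(c("a", "b")))

cols_train  <- ncol(model.matrix(~ g, train))
cols_before <- ncol(model.matrix(~ g, new))    # one column short

# Re-declare the factor with the training levels.
new$g <- factor(new$g, levels = levels(train$g))
cols_after <- ncol(model.matrix(~ g, new))     # matches the training design
```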

 There is no recode function in the default packages, there are at least 6 
 recode functions in other packages, we cannot tell which you were using from 
 the code below.

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111





  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
  project.org] On Behalf Of Mike Harwood
  Sent: Monday, January 10, 2011 6:29 AM
  To: r-h...@r-project.org
  Subject: [R] debug biglm response error on bigglm model

  G'morning

  What does the error message Error in x %*% coef(object) : non-
  conformable arguments indicate when calculating the response values
  for
  newdata with a model from bigglm (in package biglm), and how can I
  debug it?  I am attempting to do Monte Carlo simulations, which may
  explain the loop in the code that follows.  After the code I
  have included the output, which shows that the simulations are
  changing the response and input values, and that there are not any
  atypical values for the
  factors in the seventh iteration.  At the end of the output is the
  aforementioned error message.  Finally, I have included the model from
  biglm.

  Thanks in advance!

  Code:
  ===
  iter <- nrow(nov.2010)
  predict.nov.2011 <- vector(mode='numeric', length=iter)
  for (i in 1:iter) {
      iter.df <- nov.2010
      ##-- Update values of dynamic variables --
      iter.df$age <- iter.df$age + 12
      iter.df$pct_utilize <-
          iter.df$pct_utilize + mc.util.delta[i]

      iter.df$updated_varname1 <-
          ceiling(iter.df$updated_varname1 + mc.varname1.delta[i])

      if(iter.df$state=="WI")
          iter.df$varname3 <- iter.df$varname3 + mc.wi.varname3.delta[i]
      if(iter.df$state=="MN")
          iter.df$varname3 <- iter.df$varname3 + mc.mn.varname3.delta[i]
      if(iter.df$state=="IL")
          iter.df$varname3 <- iter.df$varname3 + mc.il.varname3.delta[i]
      if(iter.df$state=="US")
          iter.df$varname3 <- iter.df$varname3 + mc.us.varname3.delta[i]

      ##--- Bin Variables --
      iter.df$bin_varname1 <- as.factor(recode(iter.df$updated_varname1,
          "300:499 = '300 - 499';
           500:549 = '500 - 549';
           550:599 = '550 - 599';
           600:649 = '600 - 649';
           650:699 = '650 - 699';
           700:749 = '700 - 749';
           750:799 = '750 - 799'; 800:849 = 'GE 800'; else    = 'missing';
           "))
      iter.df$bin_age <- as.factor(recode(iter.df$age,
          "0:23   = '  24mo.';
           24:72  = '24 - 72mo.';
           72:300 = '72 - 300mo'; else   = 'missing';
           "))
      iter.df$bin_util <- as.factor(recode(iter.df$pct_utilize,
          "0.0:0.2 = '  0 - 20%';
           0.2:0.4 = '  20 - 40%';
           0.4:0.6 = '  40 - 60%';
           0.6:0.8 = '  60 - 80%';
           0.8:1.0 = ' 80 - 100%';
           1.0:1.2 = '100 - 120%'; else    = 'missing';
           "))
      iter.df$bin_varname2 <- as.factor(recode(iter.df$varname2_prop,
          "0:70 = '     70%';
           70:85 = ' 70 - 85%';
           85:95 = ' 85 - 95%';
           95:110 = '95 - 110%'; else  =  'missing';
           "))
      iter.df$bin_varname1 <- relevel(iter.df$bin_varname1, 'missing')
      iter.df$bin_age <- relevel(iter.df$bin_age, 'missing')
      iter.df$bin_util <- relevel(iter.df$bin_util, 'missing')
      iter.df$bin_varname2 <- relevel(iter.df$bin_varname2, 'missing')

  #~     print(head(iter.df))
      if (i>=6 & i<=8){
           print('-')
           browser()
           print(i)
           print(table(iter.df$bin_varname1))
           print(table(iter.df$bin_age))
           print(table(iter.df$bin_util))
           print(table(iter.df$bin_varname2))
  #~         debug(predict.nov.2011[i] <-
  #~              sum(predict(logModel.1, newdata=iter.df, type='response')))
       }

      predict.nov.2011[i] <-
           sum(predict(logModel.1, newdata=iter.df, type='response'))

      print(predict.nov.2011[i])

    }

  Output
  ==
  [1] 36.56073
  [1] 561.4516
  [1] 4.83483
  [1] 5.01398
  [1] 7.984146
  [1] -
  Called from: top level
  Browse[1]
  [1] 6

    missing 300 - 499 500 - 549 550 - 599 600 - 649 650 - 699 700 - 749
  750 - 799    GE 800
        842       283       690      1094      1695      3404
  6659     18374     21562

     missing     24mo. 24 - 72mo. 72 - 300mo
          16       2997      19709      31881

     missing    0 - 20%   20 - 40%   40 - 60%   60 - 80%  80 - 100% 100
  - 120%
       17906     

Re: [R] extracting more information from optim in R?

2011-01-12 Thread Bart Joosen

I have no experience with writing C code, but if I have such problems in R
code, I add a line to my function which prints the values to the console:
eg:
fr <- function(x) {   ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
cat(paste(x1, x2, "\n"))
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
optim(c(-1.2,1), fr)

If the same goes for C, I don't know.
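
A variation on the same idea, sketched here with hypothetical names (`trace_env`, `fr_recorded`), stores every evaluated point instead of printing it, so the search path can be inspected after optim returns. optim also accepts control = list(trace = ...) for built-in progress output, though what it prints depends on the method.

```r
# Environment used as a mutable store for the evaluated points.
trace_env <- new.env()
trace_env$points <- list()

fr_recorded <- function(x) {   ## Rosenbrock Banana function, with recording
  trace_env$points[[length(trace_env$points) + 1]] <- x
  100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
}

res <- optim(c(-1.2, 1), fr_recorded)

# One row per function evaluation, columns are x1 and x2.
evals <- do.call(rbind, trace_env$points)
```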

Bart
-- 
View this message in context: 
http://r.789695.n4.nabble.com/extracting-more-information-from-optim-in-R-tp3213439p3214066.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Sum by column

2011-01-12 Thread Peter Francis
David and Josh,

Thanks very much for your help, it is much appreciated.

Peter


On 12 Jan 2011, at 14:28, David Winsemius wrote:

There are two functions you need to become familiar with:

?tapply
?ave

If you wanted these summed values to be placed in another column of the same 
dataframe, you would use ave. If you wanted a new structure (somewhat shorter) 
you would use tapply with sum as the function. E. g:

tapply(ecoregion$ED, ecoregion$ECOCODE, sum)

-- 
David.

On Jan 12, 2011, at 5:38 AM, Peter Francis wrote:

 Dear List,
 
 I have a question of convenience,
 
 I am looking to sum the values of one column based on another column - a 
 example may help explain better!
 
 ED        ECOCODE
 21.809467 AA0101
 36.229566 PA1201
 51.861284 PA1201
 11.36232  PA1201
 27.264634 PA1201
 12.261986 PA1201
 46.519313 PA1201
 7.815376  PA1201
 2.810428  PA1201
 13.478372 PA1201
 35.670182 PA1301
 27.128715 AT0801
 19.010294 AT1201
 15.475368 AT1201
 18.597983 AT0101
 29.292615 AT0101
 6.749846  AT0101
 14.981488 AT0101
 14.93511  AT0101
 14.93511  AT0101
 21.040785 AT0101
 8.271615  AT0101
 12.94232  AT0101
 6.749846  AT0101
 15.484412 AT0101
 29.644494 AT0101
 43.211212 AT0101
 
 So for AA0101 it would be = 21.809467
   AT1201 it would be = 19.010294+15.475368
 
 etc
 
 I would then like to be able to output a table with ECOCODE in one column and 
 the sum of ED in the other.
 
 This is stored in a dataframe called ecoregion, i understand people like 
 having code to change but i have none as i am a relative beginner! Sorry in 
 advance!
 
 Thanks
 
 Peter
 

David Winsemius, MD
West Hartford, CT



[R] Bootstrapping to Correct Standard Errors in Two-Stage Least Square Estimation

2011-01-12 Thread Thanaset

Dear friends

I want to estimate an equation using two-stage least square but suspect that
the model suffers from autocorrelation.  Can someone please advise how to
implement bootstrapping method in order to calculate the correct standard
errors in R?  Thank you.

Kind regards
Thanaset

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Bootstrapping-to-Correct-Standard-Errors-in-Two-Stage-Least-Square-Estimation-tp3214080p3214080.html
Sent from the R help mailing list archive at Nabble.com.



[R] Basic Stars Plot - help ..

2011-01-12 Thread JP
Hi there Rers

I am trying a very basic stars plot:

x <- matrix(c(1,4,3,1.1,2,3,4,3,1,1,5,2), ncol = 3, byrow = TRUE,
 dimnames=list(c("a","b","c","d"),c("x","y","z")))

 stars(x, draw.segments = TRUE, radius=TRUE)


Can anyone explain what I am seeing there - EACH of my plots should have 3
coloured sectors no ? - for x, y and z.  How come I am seeing only two
sectors?  Also the length of the radii show the ratio compared to the other
values of the columns (e.g. x), correct ? (And there is no direct
relationship between the values of a single row, or is there?)
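
A likely explanation, offered as a sketch rather than a definitive answer: stars() has a scale argument, TRUE by default, under which each column is rescaled independently so that its minimum becomes 0 and its maximum becomes 1. A segment with radius 0 is not visible. Reproducing that rescaling by hand on the matrix above shows what each star is drawn from (the name `scaled` is illustrative):

```r
x <- matrix(c(1,4,3,1.1,2,3,4,3,1,1,5,2), ncol = 3, byrow = TRUE,
            dimnames = list(c("a","b","c","d"), c("x","y","z")))

# Column-wise rescaling to [0, 1], mirroring what scale = TRUE is documented to do.
scaled <- apply(x, 2, function(col) (col - min(col)) / (max(col) - min(col)))

# Smallest scaled value in each row.
row_min <- apply(scaled, 1, min)
```

In this data every row happens to contain the minimum of some column, so each star has one segment of zero length. Calling stars(x, scale = FALSE, draw.segments = TRUE) would use the raw values instead.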

I am using R 2.12.1 for Linux,

Many Thanks
JP



[R] How to define values for color distribution in the package gplots and function heatmap.2

2011-01-12 Thread Fredrik Alsin
Hi,

This question is about the package gplots and the function heatmap.2. I'm
not a programmer and I did not understand the answers I found when I googled
it. Thank you in advance.

A similar question was asked with the title "heatmap color distribution" in 2005
and got the answer to use breaks. When I try to use the example code, I get
the error "Error in image.default(1:nc, 1:nr, x, xlim = 0.5 + c(0, nc), ylim
= 0.5 +  : must have one more break than colour".

Basically, I have several sets of gene expression data and want to be able
to compare these between patients.

An example:

When I use heatmap.2 with one gene set (with the colors red-black-green),
green is 3 and red is 9 and in between the colors fade to black.

When I then use another gene set, green is 5 and red is 11.

I need to be able to define the limits to compare between heatmaps.
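
The error message says that the breaks vector must be exactly one element longer than the colour vector. A minimal sketch of building a matching pair, assuming fixed limits of 3 and 11 (chosen here only because they span both example gene sets) and a hypothetical matrix name `expr_matrix`:

```r
n_col <- 50

# Green-black-red palette with n_col colours.
cols <- colorRampPalette(c("green", "black", "red"))(n_col)

# n_col + 1 equally spaced breaks over the fixed range shared by all heatmaps.
brks <- seq(3, 11, length.out = n_col + 1)

# Assumed usage with gplots (expr_matrix is a hypothetical data matrix):
# heatmap.2(expr_matrix, col = cols, breaks = brks)
```

Reusing the same `cols` and `brks` for every gene set keeps the colour scale identical across heatmaps.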

Thank you again,

Fredrik



Re: [R] Outputting csv file from dataframe with columns in a particular order

2011-01-12 Thread Ivan Calandra

Hi!

Let's say your data.frame is called df and that you want column 1, then 
column 4, then 3 and then 2:
df <- data.frame(ID=LETTERS[1:5], date=rnorm(5), estimate=rnorm(5), 
actual=rnorm(5))

write.csv(df[c(1,4,3,2)], file="df.csv")

HTH,
Ivan

Le 1/12/2011 16:16, analys...@hotmail.com a écrit :

I have a dataframe with columns ID,'date,estimate,actual (but
not necessarily in that order - I do a merge somewhere and that
somehow messes up the order of the columns).

How can I output it to a csv file with the columns in the order that I
want?

Thanks.




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



Re: [R] snowfall

2011-01-12 Thread Uwe Ligges

You forgot to load the required packages on the client nodes by

 sfLibrary(fCalendar)
 sfLibrary(fractalrock)

and you really should not tryCatch without evaluating the errors for 
yourself.
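
The second point can be sketched in base R: an error handler that silently returns 0 hides the real cause, whereas one that reports conditionMessage() makes a missing function on a worker obvious. The function names below are hypothetical and only simulate the situation; they are not part of fCalendar, fractalrock or snowfall.

```r
# Simulates a worker on which a required function is not available.
time_to_maturity_checked <- function(x) {
  tryCatch({
    missingTradingDayHelper(x) / 252   # hypothetical function that does not exist
  }, error = function(e) {
    # Report the real cause instead of quietly returning 0.
    warning("time to maturity failed: ", conditionMessage(e), call. = FALSE)
    NA_real_
  })
}

result <- suppressWarnings(time_to_maturity_checked(1))
```

Returning NA rather than 0 also keeps failed rows distinguishable from genuine zero values.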


Best wishes,
Uwe Ligges

On 12.01.2011 09:47, Santosh Srinivas wrote:

Hello,

Just wondering why I am unable to run this in parallel.
A dput of my dataset is attached at the end. Please use to create my data
object.

I want to run this function in parallel (not sure if this is an efficient
implementation):

#Function to calculate the time to maturity for the option
require(fCalendar,quietly=TRUE) #Trying to calculate the trading days
require(fractalrock,quietly=TRUE) #Just to calculate the trading days
myFinCenter="Asia/Singapore"

getTimeToMaturity <- function(x){
tryCatch({
toDt <- as.Date(as.character(x['EXPIRY_DT']), "%Y-%m-%d")
#Expiry Date
fromDt <- as.Date(as.character(x['TIMESTAMP']), "%Y-%m-%d")
#Trade Timestamp
NoOfDays <- NROW(getTradingDates(toDt,fromDt))
return(NoOfDays/252)
},
error = function (ex){
#print (paste("Error in",toDt,fromDt))
NoOfDays <- 0
return(NoOfDays/252)
}
)
}


Question: The following two lines work but the third and parallel one
doesn't ... why?

1)  apply(dNiftyOpt,1,getTimeToMaturity) #Works
  1  2  3  4  5  6  7
8  9 10 11 12 13 14
15 16 17 18 19 20
0.02380952 0.01984127 0.07936508 0.02380952 0.01984127 0.01190476 0.0278
0.02380952 0.01984127 0.01190476 0.02380952 0.01984127 0.02380952 0.02380952
0.01984127 0.02380952 0.01984127 0.02380952 0.02380952 0.0278


library(snowfall)
2)  sfInit()
snowfall 1.84 initialized: sequential execution, one CPU.


sfApply(dNiftyOpt,1,getTimeToMaturity) #Works

  1  2  3  4  5  6  7
8  9 10 11 12 13 14
15 16 17 18 19 20
0.02380952 0.01984127 0.07936508 0.02380952 0.01984127 0.01190476 0.0278
0.02380952 0.01984127 0.01190476 0.02380952 0.01984127 0.02380952 0.02380952
0.01984127 0.02380952 0.01984127 0.02380952 0.02380952 0.0278


sfStop()



DOESN'T WORK: 3)

sfInit( parallel=TRUE, cpus=4 );
sfApply(dNiftyOpt,1,getTimeToMaturity) #Added the time to maturity.

DOESN'T WORK?
  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sfStop();




My dataset:
dput(dNiftyOpt)

structure(list(INSTRUMENT = c("OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX",
"OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX",
"OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX", "OPTIDX",
"OPTIDX", "OPTIDX"), SYMBOL = c("NIFTY", "NIFTY", "NIFTY", "NIFTY",
"NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY",
"NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY", "NIFTY",
"NIFTY", "NIFTY"), EXPIRY_DT = c("2004-01-29", "2004-01-29",
"2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29",
"2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29",
"2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29", "2004-01-29",
"2004-01-29", "2004-01-29", "2004-01-29"), STRIKE_PR = c(1780,
1780, 1800, 1800, 1800, 1800, 1800, 1800, 1800, 1800, 1820, 1820,
1820, 1830, 1830, 1830, 1830, 1840, 1840, 1850), OPTION_TYP = c("PE",
"PE", "CE", "CE", "CE", "CE", "PE", "PE", "PE", "PE", "CE", "CE",
"PE", "CE", "CE", "PE", "PE", "CE", "PE", "CE"), SETTLE_PR = c(27.4,
5.7, 152.95, 28.6, 70.45, 111.35, 14.75, 39.2, 8.6, 2.35, 20.4,
54.2, 50.15, 18.35, 47.25, 51.75, 15.5, 14.95, 57.95, 26.3),
 TIMESTAMP = c("2004-01-22", "2004-01-23", "2004-01-02", "2004-01-22",
 "2004-01-23", "2004-01-27", "2004-01-21", "2004-01-22", "2004-01-23",
 "2004-01-27", "2004-01-22", "2004-01-23", "2004-01-22", "2004-01-22",
 "2004-01-23", "2004-01-22", "2004-01-23", "2004-01-22", "2004-01-22",
 "2004-01-21"), Underlying = c(1770.5, 1847.55, 1946.05, 1770.5,
 1847.55, 1904.7, 1824.6, 1770.5, 1847.55, 1904.7, 1770.5,
 1847.55, 1770.5, 1770.5, 1847.55, 1770.5, 1847.55, 1770.5,
 1770.5, 1824.6), UnderlyingVol = c(0.293906144944403, 0.331877179605752,
 0.129552369208600, 0.293906144944403, 0.331877179605752,
 0.348918971622834, 0.276334860399362, 0.293906144944403,
 0.331877179605752, 0.348918971622834, 0.293906144944403,
 0.331877179605752, 0.293906144944403, 0.293906144944403,
 0.331877179605752, 0.293906144944403, 0.331877179605752,
 0.293906144944403, 0.293906144944403, 0.276334860399362)), .Names =
c("INSTRUMENT",
"SYMBOL", "EXPIRY_DT", "STRIKE_PR", "OPTION_TYP", "SETTLE_PR",
"TIMESTAMP", "Underlying", "UnderlyingVol"), row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"), class = "data.frame")


Re: [R] Outputting csv file from dataframe with columns in a particular order

2011-01-12 Thread Peter Ehlers

On 2011-01-12 07:16, analys...@hotmail.com wrote:

I have a dataframe with columns ID,'date,estimate,actual (but
not necessarily in that order - I do a merge somewhere and that
somehow messes up the order of the columns).

How can I output it to a csv file with the columns in the order that I
want?


Let's say that your data.frame is DF.
mynames <- c("ID", "date", "estimate", "actual")
write.csv(DF[, mynames], )
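
A self-contained sketch of the name-based approach, using a made-up data frame whose columns start out in the "wrong" order and a temporary file in place of a real output path:

```r
# Made-up data frame with columns in an arbitrary order (e.g. after a merge).
DF <- data.frame(
  actual   = c(1.5, 2.5),
  ID       = c("A", "B"),
  estimate = c(1.4, 2.7),
  date     = c("2011-01-11", "2011-01-12")
)

mynames <- c("ID", "date", "estimate", "actual")

# Temporary file used only for this illustration.
out <- tempfile(fileext = ".csv")
write.csv(DF[, mynames], file = out, row.names = FALSE)

# Read the header back to confirm the column order on disk.
written_names <- names(read.csv(out))
```

Selecting by name rather than by position keeps working even if a later merge changes the column order again.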

Peter Ehlers



[R] Odp: Outputting csv file from dataframe with columns in a particular order

2011-01-12 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 12.01.2011 16:16:16:

 I have a dataframe with columns ID,'date,estimate,actual (but
 not necessarily in that order - I do a merge somewhere and that
 somehow messes up the order of the columns).

If you have a data frame with column order a, b, c, d and you want b, c, d, a, 
just order the columns accordingly

write.table(df[,c("b", "c", "d", "a")], )

Regards
Petr


 
 How can I output it to a csv file with the columns in the order that I
 want?
 
 Thanks.
 



Re: [R] Sum by column

2011-01-12 Thread Felipe Carrillo
Or with ddply :
library(plyr)
dat <- structure(list(ED = c(21.809467, 36.229566, 51.861284, 11.36232,
27.264634, 12.261986, 46.519313, 7.815376, 2.810428, 13.478372,
35.670182, 27.128715, 19.010294, 15.475368, 18.597983, 29.292615,
6.749846, 14.981488, 14.93511, 14.93511, 21.040785, 8.271615,
12.94232, 6.749846, 15.484412, 29.644494, 43.211212), ECOCODE = structure(c(1L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 3L, 4L, 4L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("AA0101",
"AT0101", "AT0801", "AT1201", "PA1201", "PA1301"), class = "factor")),
.Names = c("ED", "ECOCODE"), class = "data.frame", row.names = c(NA, -27L))
dat
ddply(dat,"ECOCODE",summarise,EDsummed=sum(ED))
 
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx




- Original Message 
 From: Peter Francis peterfran...@me.com
 To: r-help@r-project.org
 Sent: Wed, January 12, 2011 2:38:19 AM
 Subject: [R] Sum by column
 
 Dear List,
 
 I have a question of convenience,
 
 I am looking to sum the values of one column based on another column - a 
example may help explain better!
 
 ED            ECOCODE
 21.809467    AA0101
 36.229566    PA1201
 51.861284    PA1201
 11.36232    PA1201
 27.264634    PA1201
 12.261986    PA1201
 46.519313    PA1201
 7.815376    PA1201
 2.810428    PA1201
 13.478372    PA1201
 35.670182    PA1301
 27.128715    AT0801
 19.010294    AT1201
 15.475368    AT1201
 18.597983    AT0101
 29.292615    AT0101
 6.749846    AT0101
 14.981488    AT0101
 14.93511    AT0101
 14.93511    AT0101
 21.040785    AT0101
 8.271615    AT0101
 12.94232    AT0101
 6.749846    AT0101
 15.484412    AT0101
 29.644494    AT0101
 43.211212    AT0101
 
 So for AA0101 it would be = 21.809467
         AT1201 it would be = 19.010294+15.475368
 
 etc
 
 I would then like to be able to output a table with ECOCODE in one column and
 the sum of ED in the other.
 
 This is stored in a dataframe called ecoregion. I understand people like
 having code to change, but I have none as I am a relative beginner! Sorry in advance!
 
 Thanks 
 
 Peter
 


Re: [R] graphics: 3D regression plane

2011-01-12 Thread David Winsemius


On Jan 12, 2011, at 6:10 AM, Federico Bonofiglio wrote:


Hello Masters,
wishing you all a great 2011 I was also going to ask if anyone knows  
a quick

and efficient way to plot a regression plane (z~x*y).


There are many. There are limitations to using the ?? operator in that  
it only brings up functions that are installed on your machine but  
when I enter:


??3D

... on my machine it nominates a variety of functions from these  
packages:


ca
car
emdbook
grDevices
HH
igraph
lattice
locfit
misc3d
plotrix
raster
rgl
rpanel
scatterplot3d
sm
sna
spatstat
splancs
TeachingDemos
vcdExtra

If you installed the sos package you would have search access to all  
of the functions in CRAN packages (and maybe more).


There are also a variety of graphic galleries:

http://research.stowers-institute.org/efg/R/
http://addictedtor.free.fr/graphiques/allgraph.php
http://rgm2.lab.nig.ac.jp/RGM2/images.php?show=all&pageID=1108



I have tried the regr2.plot{HH} function but it is only an  
educational tool

and has poor graphical properties.


Ah, a critic. And a very non-specific one at that.



I also tried to run the following script on a fictitious longitudinal
problem, with poor results


Because of poor programming and failure to read the manuals.



set.seed(1234)

id<-c(rep(1,3),rep(2,4),rep(3,2)) # subjects
y<-rchisq(9,df=20) #response
k<-rnorm(9,4,2) # x
time<-as.Date(c("03/07/1981","15/11/1981","03/04/1983","08/12/1979",
"30/12/1979","08/03/1980","12/08/1980","12/08/1973","28/03/1975"),
format="%d/%m/%Y")
fac<-c("m","m","m","f","f","f","f","m","m")# sex


d1<-as.vector(by(time,factor(id),min))
t0<-as.Date(d1,origin=as.Date("1970-01-01"));t0

A<-data.frame(id=c(1,2,3),t=t0)
B<-data.frame(id=id,tempo=time)
C<-merge(A,B);C

rd<-as.vector(C$tempo-C$t);rd #time centered on sbj specific first occurrence


mod<-lm(y~rd*k)
newax<- expand.grid(
   days = giorni<-seq(min(rd),max(rd), length=100),
   expl= esplic<- seq(min(k), max(k), length=100)
   )
fit <- predict(mod,data.frame(rd=giorni,k=esplic))
graph <- persp(x=giorni, y=esplic,fit,
expand=0.5, ticktype="detailed", theta=-45)  #error : z argument not valid


I would be grateful if someone would give me some suggestions.


First suggestion would be to re-read the predict help page:

You are throwing together symbols in a manner not expected by predict.  
The argument to  newdata is invalid because you did not construct your  
newax dataframe correctly, resulting in only 100 predicted points (at  
the original data).
newax should have had column names that match the variables in the  
model. This is what you got:

str(newax)
'data.frame':   10000 obs. of  2 variables:
 $ days: num  0 6.45 12.91 19.36 25.82 ...
 $ expl: num  0.499 0.499 0.499 0.499 0.499 ...
 - attr(*, "out.attrs")=List of 2
  ..$ dim     : Named int  100 100
  .. ..- attr(*, "names")= chr  "days" "expl"
  ..$ dimnames:List of 2
  .. ..$ days: chr  "days=  0.00" "days=  6.454545" "days= 12.909091" "days= 19.363636" ...
  .. ..$ expl: chr  "expl=0.4985331" "expl=0.5615784" "expl=0.6246238" "expl=0.6876691" ...


Generally it is a bad idea to use <- inside data.frame(). I'm not
sure if it's illegal, but it certainly is confusing.  And the predict
result might have had the correct length had you used the newax
dataframe, but it needed to be passed to persp as a properly
dimensioned matrix:


?persp

At the end of your constructed example try this instead:

 mod<-lm(y~rd*k)
 newax<- expand.grid(
   rd = seq(min(rd), max(rd), length=100),
   k = seq(min(k), max(k), length=100)
   )
fit <- predict(mod,newax)
graph <- persp(x=seq(min(rd), max(rd), length=100),
   y=seq(min(k), max(k), length=100),
   z= matrix(fit, 100, 100),
   expand=0.5, ticktype="detailed", theta=-45)

persp is not a lattice plotting function, so it does its plotting by
side effects. It does return a value, but it is only a viewing
transformation matrix, and I do not see that you intend to use the graph
object, but who knows.
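
The same idea can be tried without the original data. The following is a minimal, self-contained sketch on simulated values (the data frame d, the coefficients and the 50-point grid are all made up for illustration); it shows how the predictions on an expand.grid() of the model's own variable names are reshaped into the matrix that persp() expects.

```r
# Simulated stand-in data; not the poster's data set.
set.seed(1)
d <- data.frame(rd = runif(30, 0, 600), k = rnorm(30, 4, 2))
d$y <- 20 + 0.01 * d$rd + 1.5 * d$k + rnorm(30)

mod <- lm(y ~ rd * k, data = d)

# Grid columns must carry the same names as the model variables.
rd_seq <- seq(min(d$rd), max(d$rd), length.out = 50)
k_seq  <- seq(min(d$k),  max(d$k),  length.out = 50)
grid   <- expand.grid(rd = rd_seq, k = k_seq)  # rd varies fastest

# matrix() fills column by column, so row i is rd_seq[i] and column j is k_seq[j].
z <- matrix(predict(mod, newdata = grid),
            nrow = length(rd_seq), ncol = length(k_seq))

persp(x = rd_seq, y = k_seq, z = z,
      expand = 0.5, ticktype = "detailed", theta = -45)
```

Because expand.grid() varies its first argument fastest, the first argument must be the one that ends up on the x axis; swapping the two would silently transpose the surface.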




Thank u again and happy new year

Federico Bonofiglio



David Winsemius, MD
West Hartford, CT



Re: [R] Basic Stars Plot - help ..

2011-01-12 Thread Uwe Ligges



On 12.01.2011 15:53, JP wrote:

Hi there Rers

I am trying a very basic stars plot:

x<-matrix(c(1,4,3,1.1,2,3,4,3,1,1,5,2), ncol = 3, byrow = TRUE,
dimnames=list(c("a","b","c","d"),c("x","y","z")))

stars(x, draw.segments = TRUE, radius=TRUE)



Can anyone explain what I am seeing there - EACH of my plots should have 3
coloured sectors no ? - for x, y and z.  How come I am seeing only two
sectors?  Also the length of the radii show the ratio compared to the other
values of the columns (e.g. x), correct ? (And there is no direct
relationship between the values of a single row, or is there?)

I am using R 1.12.1 for Linux,


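I guess you forgot that auto-scaling is on by default: each of the three
columns of the matrix is scaled independently to [0,1], so the smallest
value in a column gets radius 0 and its sector is not visible.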
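
To see which sectors collapse, the column-wise scaling can be reproduced by hand. This is a sketch that assumes the default scale = TRUE behaviour described in ?stars (each column mapped linearly onto [0,1]); the object name scaled is just for illustration.

```r
x <- matrix(c(1, 4, 3, 1.1, 2, 3, 4, 3, 1, 1, 5, 2), ncol = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d"), c("x", "y", "z")))

# Map every column onto [0, 1]: its minimum becomes 0, its maximum becomes 1.
scaled <- apply(x, 2, function(col) (col - min(col)) / (max(col) - min(col)))
print(round(scaled, 3))

# Number of zero-length sectors per star (one per row of x).
print(rowSums(scaled == 0))
```

With this matrix every row holds the minimum of exactly one column, which is why each star appears to have only two sectors. Calling stars(x, scale = FALSE, draw.segments = TRUE) would be one way to plot the raw values instead.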


Uwe Ligges




Many Thanks
JP



[R] adonis, amova and haplotype frequency

2011-01-12 Thread Simon Frost
Dear All,

I'd like to perform adonis (from the vegan package) rather than amova
(in ade4) on some haplotype data, as I have crossed factors. Is there a
simple way to tweak the source to allow weights (haplotype frequencies)
in a similar way to amova?

Best
Simon



Re: [R] R not recognized in command line

2011-01-12 Thread Aaditya Nanduri
I'm sorry for the late reply.

The output to echo %PATH% :

C:\GTK\bin; C:\Program Files\MiKTeX 2.8\miktex\bin ;C:\Windows\system32
;C:\Windows
;C:\Windows\System32\Wbem; C:\Windows\System32\WindowsPowerShell\v1.0\;
C:\Program
 Files\MATLAB\R2008a\bin; C:\Program Files\MATLAB\R2008a\bin\win32;
C:\Program Fil
es\QuickTime\QTSystem\; C:\MinGW\bin;c:\Program Files\Microsoft SQL
Server\100\To
ols\Binn\; c:\Program Files\Microsoft SQL Server\100\DTS\Binn\; C:\Program
Files\
R\R-2.12.1\bin\; C:\Program Files\Python27; C:\Program Files\SSH
Communications Security\SSH Secure Shell;C:\Python26;C:\Python26\Scripts

2011/1/8 Uwe Ligges lig...@statistik.tu-dortmund.de

 OK, then let's see if you managed to change the PATH or let us see what is
 incorrect there. Therefore, please do the following:

 1. open a Windows command shell (what you call a DOS window)
 2. type echo %PATH%
 3. Send us the output.

 Uwe Ligges



 On 08.01.2011 19:29, Aaditya Nanduri wrote:

 Mr. Gregory : I may have to resort to a roundabout method like yours. I
 just
 cant seem to make it work. Thank you for your help.

 Mr. Spector : Everytime I change the path, I closed all the DOS windows.
 Yet, R is not recognized as a command. Also, I just want to say, you have
 an
 awesome last name.




-- 
Aaditya Nanduri
aaditya.nand...@gmail.com



Re: [R] Formatted output with alternating format at different rows

2011-01-12 Thread Ray Laymon
thx a lot Jim. sprintf solved my problem.

Ray


On Mon, Jan 3, 2011 at 3:40 PM, jim holtman jholt...@gmail.com wrote:
 'sprintf' is your friend:

 dummy3 = c(1.1, 2.2, 3.3)
 dummy4 = c(4.4, 5.5, 6.6, 7.7)
 dummy2 = c(8.8, 9.9)

 cat(sprintf("%5.1f%6.2f%7.3f\n", dummy3[1], dummy3[2], dummy3[3]))
  1.1  2.20  3.300
 cat(sprintf("%5.2f%6.3f%7.4f%8.5f\n", dummy4[1], dummy4[2], dummy4[3],
 dummy4[4]))
  4.40 5.500 6.6000 7.70000
 cat(sprintf("%5.3f%6.4f\n", dummy2[1], dummy2[2]))
 8.8009.9000






Re: [R] R not recognized in command line

2011-01-12 Thread Daniel Nordlund
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Aaditya Nanduri
 Sent: Wednesday, January 12, 2011 8:44 AM
 To: Uwe Ligges
 Cc: r-help@r-project.org; spec...@stat.berkeley.edu
 Subject: Re: [R] R not recognized in command line
 
 I'm sorry for the late reply.
 
 The output to echo %PATH% :
 
 C:\GTK\bin; C:\Program Files\MiKTeX 2.8\miktex\bin ;C:\Windows\system32
 ;C:\Windows
 ;C:\Windows\System32\Wbem; C:\Windows\System32\WindowsPowerShell\v1.0\;
 C:\Program
  Files\MATLAB\R2008a\bin; C:\Program Files\MATLAB\R2008a\bin\win32;
 C:\Program Fil
 es\QuickTime\QTSystem\; C:\MinGW\bin;c:\Program Files\Microsoft SQL
 Server\100\To
 ols\Binn\; c:\Program Files\Microsoft SQL Server\100\DTS\Binn\; C:\Program
 Files\
 R\R-2.12.1\bin\; C:\Program Files\Python27; C:\Program Files\SSH
 Communications Security\SSH Secure Shell;C:\Python26;C:\Python26\Scripts
 

For your system, the path to R should probably be

   C:\Program Files\R\R-2.12.1\bin\i386

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA



Re: [R] R not recognized in command line

2011-01-12 Thread Uwe Ligges



On 12.01.2011 18:13, Daniel Nordlund wrote:

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Aaditya Nanduri
Sent: Wednesday, January 12, 2011 8:44 AM
To: Uwe Ligges
Cc: r-help@r-project.org; spec...@stat.berkeley.edu
Subject: Re: [R] R not recognized in command line

I'm sorry for the late reply.

The output to echo %PATH% :

C:\GTK\bin; C:\Program Files\MiKTeX 2.8\miktex\bin ;C:\Windows\system32
;C:\Windows
;C:\Windows\System32\Wbem; C:\Windows\System32\WindowsPowerShell\v1.0\;
C:\Program
  Files\MATLAB\R2008a\bin; C:\Program Files\MATLAB\R2008a\bin\win32;
C:\Program Fil
es\QuickTime\QTSystem\; C:\MinGW\bin;c:\Program Files\Microsoft SQL
Server\100\To
ols\Binn\; c:\Program Files\Microsoft SQL Server\100\DTS\Binn\; C:\Program
Files\
R\R-2.12.1\bin\; C:\Program Files\Python27; C:\Program Files\SSH
Communications Security\SSH Secure Shell;C:\Python26;C:\Python26\Scripts



For your system, the path to R should probably be

C:\Program Files\R\R-2.12.1\bin\i386



Right, additionally, the output from above is wrong anyway:
Entries must not end on \, and blanks before or after ; are not allowed.
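
The two rules above (no trailing backslash, no blanks around the semicolons) can be applied mechanically. A minimal R sketch, using a shortened, made-up PATH string rather than the full output quoted above; the variable names are illustrative only:

```r
# Shortened example string with the same two problems as the posted PATH.
raw_path <- "C:\\GTK\\bin; C:\\Program Files\\R\\R-2.12.1\\bin\\; C:\\Python26"

entries <- strsplit(raw_path, ";", fixed = TRUE)[[1]]
entries <- trimws(entries)              # drop blanks before/after each entry
entries <- sub("\\\\+$", "", entries)   # drop trailing backslashes
clean_path <- paste(entries[nzchar(entries)], collapse = ";")

cat(clean_path, "\n")
```

The cleaned string could then be pasted into the Windows environment-variable dialog; this sketch only rewrites the text and does not change the system PATH.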

Uwe






Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA



[R] Require

2011-01-12 Thread Gene Leynes
I think that the quietly argument in require isn't working

 require('JumboShrimp', quietly=TRUE)
Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
logical.return = TRUE,  :
  there is no package called 'JumboShrimp'


By the way, the behavior is the same with options(warn=0) or options(warn=1)
I'm using R 2.12 (2010-10-15) on a windows 7 machine



Re: [R] Require

2011-01-12 Thread Uwe Ligges



On 12.01.2011 18:53, Gene Leynes wrote:

I think that the quietly argument in require isn't working


require('JumboShrimp', quietly=TRUE)

Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
logical.return = TRUE,  :
   there is no package called 'JumboShrimp'



?require says:
If TRUE, no message confirming package loading is printed, and most 
often, no errors/warnings are printed if package loading fails.


It does not say that it keeps quiet if the package does not even exist
on your machine. If you really want to suppress such important warnings, use

 suppressWarnings(require('JumboShrimp', quietly=TRUE))



By the way, the behavior is the same with options(warn=0) or options(warn=1)


... which I would expect from reading ?options.
You have to use options(warn=-1) to suppress.

Best,
Uwe Ligges





I'm using R 2.12 (2010-10-15) on a windows 7 machine



Re: [R] Require

2011-01-12 Thread Matthew Vernon
Gene Leynes gleyne...@gmail.com writes:

 I think that the quietly argument in require isn't working
 
  require('JumboShrimp', quietly=TRUE)
 Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
 logical.return = TRUE,  :
   there is no package called 'JumboShrimp'
 

Isn't quietly meant to suppress a message if loading is successful? 

Matthew



[R] Don´t know what test i have to use

2011-01-12 Thread gaiarrido

Hello,
I'm starting my PhD and I am stuck because I have little knowledge of R and
statistics.
I've got a model of this kind:
binary response variable: prevalence of infection (0/1)
3 categorical independent variables: sex, month and name of the area

I was trying a full model like this, before the simplification

model<-aov(prevalencia~sex*month*area)

but the Fligner test told me that I haven't got homoscedasticity, so I suppose I
should try glm, with a model

model2<-glm(prevalencia~edad*sexo*mes*zona,binomial)

Is that correct? Where must I put the link (logit)?

Thanks very much
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Don-t-know-what-test-i-have-to-use-tp3214491p3214491.html
Sent from the R help mailing list archive at Nabble.com.



[R] Metafor vs Meta vs Spreadsheet: wrong numbers

2011-01-12 Thread Serge-Étienne Parent
Hello,

I experimented with the Metafor and Meta packages with a view to replacing
Excel for meta-analysis. I performed the first worked example provided in
Michael Borenstein's book "Introduction to Meta-Analysis" with Excel, Metafor
and Meta.
The numbers given by my spreadsheet, which I validated against Borenstein's
book, correspond quite closely to those given by Meta, but are different from
those obtained using Metafor. For the fixed effect, I infer that the
differences are related to numerical issues, but for the random effect, the
numbers are considerably different. Unfortunately, I could not find where I
went wrong. I would be grateful if someone would have a look at my
calculations.

Here are the meta-analysis commands:

### USING METAFOR
library(metafor)
( dat<-escalc(m1i=m1i, sd1i=sd1i, n1i=n1i, m2i=m2i, sd2i=sd2i, n2i=n2i,
measure="SMD", data=metaData, append=TRUE) ) # COMPUTE EFFECT SIZE
( res<-rma.uni(yi,vi,data=dat,method="HE", level=95) ) ### RANDOM EFFECT
( res<-rma.uni(yi,vi,data=dat,method="FE", level=95) )  ### FIXED EFFECT

### USING META
library(meta)
( res<-metacont(metaData[,3], metaData[,1], metaData[,2], metaData[,6],
metaData[,4], metaData[,5],
studlab=rownames(metaData),sm="SMD",
level = 0.95, level.comb = 0.95,
comb.fixed=TRUE, comb.random=TRUE,
label.e="Experimental", label.c="Control",
bylab=rownames(metaData)) )

The whole R script is temporarily available at http://bit.ly/eYesbZ
The spreadsheet is temporarily available at http://bit.ly/fAYWPo

Kind regards, 

S.-É. Parent, Eng., Ph.D.
Department of Soils and Agrifood Engineering, Université Laval
Canada





Re: [R] R not recognized in command line

2011-01-12 Thread Gene Leynes
Although it could easily be user error, I never got Rpy or Rpy2 working in
any sort of reliable way.

However I did learn a couple of things about the Windows PATH

First, (as others have mentioned) it's easiest to modify the PATH through
the Windows GUI that comes up when you right click My Computer, and then
you can modify the Environment Variables somewhere.

Second, here are examples of the syntax for adding things to the PATH from a
DOS command line
path = %PATH%;C:\Program Files\R\R-2.12.1\bin
path = %PATH%;C:\Rtools\bin
path = %PATH%;C:\Rtools\MinGW\bin
path = %PATH%;C:\Rtools\perl\bin

set R_HOME=C:\Program Files\R\R-2.12.1\

Third, I found this R command useful for checking the path from within R:
Sys.getenv()[['PATH']]



On Wed, Jan 12, 2011 at 10:43 AM, Aaditya Nanduri aaditya.nand...@gmail.com
 wrote:

 I'm sorry for the late reply.

 The output to echo %PATH% :

 C:\GTK\bin; C:\Program Files\MiKTeX 2.8\miktex\bin ;C:\Windows\system32
 ;C:\Windows
 ;C:\Windows\System32\Wbem; C:\Windows\System32\WindowsPowerShell\v1.0\;
 C:\Program
  Files\MATLAB\R2008a\bin; C:\Program Files\MATLAB\R2008a\bin\win32;
 C:\Program Fil
 es\QuickTime\QTSystem\; C:\MinGW\bin;c:\Program Files\Microsoft SQL
 Server\100\To
 ols\Binn\; c:\Program Files\Microsoft SQL Server\100\DTS\Binn\; C:\Program
 Files\
 R\R-2.12.1\bin\; C:\Program Files\Python27; C:\Program Files\SSH
 Communications Security\SSH Secure Shell;C:\Python26;C:\Python26\Scripts

 2011/1/8 Uwe Ligges lig...@statistik.tu-dortmund.de

  OK, then let's see if you managed to change the PATH or let us see what is
  incorrect there. Therefore, please do the following:
 
  1. open a Windows command shell (what you call a DOS window)
  2. type echo %PATH%
  3. Send us the output.
 
  Uwe Ligges
 
 
 
  On 08.01.2011 19:29, Aaditya Nanduri wrote:
 
  Mr. Gregory : I may have to resort to a roundabout method like yours. I
  just
  cant seem to make it work. Thank you for your help.
 
  Mr. Spector : Everytime I change the path, I closed all the DOS windows.
  Yet, R is not recognized as a command. Also, I just want to say, you
 have
  an
  awesome last name.
 
 


 --
 Aaditya Nanduri
 aaditya.nand...@gmail.com



Re: [R] A question on dummy variable

2011-01-12 Thread Bogaso Christofer
Thanks Gabor and others for your input. I admit that I should have posted some
reproducible code for what I wanted. It was in fact on my mind, but I held
back because this was not an R-related query but a general statistics
question.

Here I am using dummy variables in ***Time series context***. Please assume
following artificial TS along with the quarterly dummies:

library(zoo)
# my time series
MyTimeSeries <- zooreg(101:126, start=as.yearqtr(as.Date("2005-01-01")),
frequency=4)
# creation of quarterly dummy
### dummy1
dummy1 <- zooreg(Reduce(rbind, rep(list(diag(4)), 7)),
start=as.yearqtr(as.Date("2005-01-01")), frequency=4)
dummy1 <- merge(dummy1, MyTimeSeries, all=FALSE)[,1:4]
colnames(dummy1) <- paste("dummy", 1:4, sep="")
### dummy2
dummy2 <- dummy1 - 1/4
### dummy3
dummy3 <- dummy1
dummy3[dummy3 ==0] = -1/(4-1)
# Time series with quarterly dummy
TS_with_dummy1 <- cbind(MyTimeSeries, dummy1[,-4])
TS_with_dummy2 <- cbind(MyTimeSeries, dummy2[,-4])
TS_with_dummy3 <- cbind(MyTimeSeries, dummy3[,-4])
TS_with_dummy1
TS_with_dummy2
TS_with_dummy3

Here you see, as in my previous post, there are 3 types of dummies: dummy1,
dummy2, and dummy3 (quarterly dummies). I used to use the dummy1 definition for
all my time series analysis. However, later, in the vars package, I noticed
the 2nd type of definition. The 3rd definition I came across somewhere
on the net (I can't recall where at this time). My question was: which is
the centred dummy variable (according to the help page of the vars package,
the 2nd one is the centred dummy)?

However, I am searching for the definition of centred dummy variables in the
context of time series analysis. Therefore I would like to know why the 2nd
one is called a centred dummy. Why do people prefer it to the standard dummy
definition (i.e. dummy1)?

Can you please explain?

Thanks and regards,

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: 12 January 2011 05:47
To: Christofer Bogaso
Cc: r-help@r-project.org
Subject: Re: [R] A question on dummy variable

On Tue, Jan 11, 2011 at 3:18 PM, Christofer Bogaso
bogaso.christo...@gmail.com wrote:
 Dear all, I would like to ask one question related to statistics,
 specifically on defining dummy variables. So far I have come
 across 3 different kinds of dummy variables (assuming I am working with
 seasonal dummies, and the number of seasons is 4):

 dummy1 <- diag(4)
 for(i in 1:3) dummy1 <- rbind(dummy1, diag(4))
 dummy1 <- dummy1[,-4]

 dummy2 <- dummy1
 dummy2[dummy2 == 0] = -1/(4-1)

 dummy3 <- dummy1 - 1/4

 head(dummy1)
     [,1] [,2] [,3]
 [1,]    1    0    0
 [2,]    0    1    0
 [3,]    0    0    1
 [4,]    0    0    0
 [5,]    1    0    0
 [6,]    0    1    0
 head(dummy2)
            [,1]       [,2]       [,3]
 [1,]  1.0000000 -0.3333333 -0.3333333
 [2,] -0.3333333  1.0000000 -0.3333333
 [3,] -0.3333333 -0.3333333  1.0000000
 [4,] -0.3333333 -0.3333333 -0.3333333
 [5,]  1.0000000 -0.3333333 -0.3333333
 [6,] -0.3333333  1.0000000 -0.3333333
 head(dummy3)
      [,1]  [,2]  [,3]
 [1,]  0.75 -0.25 -0.25
 [2,] -0.25  0.75 -0.25
 [3,] -0.25 -0.25  0.75
 [4,] -0.25 -0.25 -0.25
 [5,]  0.75 -0.25 -0.25
 [6,] -0.25  0.75 -0.25
 Now I want to know which type of dummy definition is called a "centered
 dummy" and why it is called so. Is it equivalent to use any of the
 above definitions (at least the 2nd and 3rd)? It would really be very
 helpful if somebody could offer any suggestion and clarification.


The contrasts of your dummy1 matrix are contr.SAS contrasts in R.
(The default contrasts in R are contr.treatment which are the same as
contr.SAS except contr.SAS uses the last level as the base whereas treatment
contrasts use the first level as the base.)

   options(contrasts = c("contr.SAS", "contr.poly"))
   f <- gl(4, 1, 16)
   M <- model.matrix( ~ f )
   all( M[, -1] == dummy1) # TRUE

Centered contrasts are ones which have been centered -- i.e. the mean of
each column has been subtracted from that column.  This is equivalent to
saying that the column sums are zero.

The means of the three columns of dummy1 are c(1/4, 1/4, 1/4) so if we
subtract 1/4 from dummy1 we get a centered contrasts matrix. That is
precisely what you did to get dummy3.  We can check that dummy3 is
centered:

   colSums(dummy3) # 0 0 0

dummy2 is just a scaled version of dummy3.  In fact dummy2 equals
dummy3 / .75 so it's not fundamentally different.  Its columns still sum to
zero so it's still centered.

   all( dummy2 == dummy3 / .75) # TRUE
   colSums(dummy2) # 0 0 0 except for floating point error
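
The centering step described above can also be written generically, without hard-coding the 1/4. A small sketch, assuming the same 16 x 3 seasonal dummy matrix as in the original question; the names centred and scaled are illustrative and do not come from the thread:

```r
# Four years of quarterly 0/1 dummies, with the fourth quarter as the base level.
dummy1 <- do.call(rbind, rep(list(diag(4)), 4))[, -4]

# Centering: subtract each column's mean from that column.
centred <- sweep(dummy1, 2, colMeans(dummy1))
print(colSums(centred))   # zero up to floating point error

# Rescaling a centred column keeps it centred; only the units change.
scaled <- centred / 0.75
print(head(scaled, 4))
```

Since the column means here are all 1/4, centred equals dummy1 - 1/4, and dividing by 0.75 gives the variant with entries 1 and -1/3.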

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Require

2011-01-12 Thread Gene Leynes
I read the help first, and read it again now, but I still don't see why the
warning was generated.

The help seems to state clearly that the user can suppress warnings:
"most often, no errors/warnings are printed if package loading fails."

If the package doesn't exist, then it would fail to load, right?  When that
failure happens, I thought I could suppress the warning message with the
quietly argument.

Thank you for pointing out the suppressWarnings function, that one is new to
me, and I suppose it will work here.  However it seems that it shouldn't be
necessary with quietly=TRUE.

By the way, I wanted to suppress the confusing message generated by R, and
put in a simple recommendation that the user should try installing the
package.
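
One way to get that behaviour, combining the suppressWarnings() suggestion from earlier in the thread with a custom message, might look like the sketch below. The helper name load_or_advise is hypothetical, and the wording of the message is only an example.

```r
# Hypothetical helper: try to attach a package quietly and, if that fails,
# print a short hint instead of R's own warning.
load_or_advise <- function(pkg) {
  ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
  if (!ok) {
    message("Package '", pkg, "' is not installed; try install.packages(\"",
            pkg, "\")")
  }
  invisible(ok)
}

load_or_advise("stats")        # already available, so nothing is printed
load_or_advise("JumboShrimp")  # prints the hint unless such a package exists
```

The function returns TRUE or FALSE invisibly, so calling code can still branch on whether the package was loaded.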


2011/1/12 Uwe Ligges lig...@statistik.tu-dortmund.de



 On 12.01.2011 18:53, Gene Leynes wrote:

 I think that the quietly argument in require isn't working

  require('JumboShrimp', quietly=TRUE)

 Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
 logical.return = TRUE,  :
   there is no package called 'JumboShrimp'



 ?require says:
 If TRUE, no message confirming package loading is printed, and most often,
 no errors/warnings are printed if package loading fails.

 It does not say that it keeps quiet if the package does not even exist on
 your machine. If you really want to suppress such important warnings, use
  suppressWarnings(require('JumboShrimp', quietly=TRUE))



 By the way, the behavior is the same with options(warn=0) or
 options(warn=1)


 ... which I would expect from reading ?options.
 You have to use options(warn=-1) to suppress.

 Best,
 Uwe Ligges




  I'm using R 2.12 (2010-10-15) on a windows 7 machine



Re: [R] Don´t know what test i have to use

2011-01-12 Thread David Winsemius


On Jan 12, 2011, at 12:51 PM, gaiarrido wrote:



Hello,
I'm starting my PhD and I am stuck because I have little knowledge of R and
statistics.
I've got a model of this kind:
binary response variable: prevalence of infection (0/1)
3 categorical independent variables: sex, month and name of the area

I was trying a full model like this, before the simplification

model<-aov(prevalencia~sex*month*area)

but the Fligner test told me that I haven't got homoscedasticity, so I suppose I
should try glm, with a model

model2<-glm(prevalencia~edad*sexo*mes*zona,binomial)

Is that correct? Where must I put the link (logit)?


Why not read the help page regarding binomial that is on the help page  
for glm. There you will learn what the default link is for binomial.


--
David



Thnks very much
--
View this message in context: 
http://r.789695.n4.nabble.com/Don-t-know-what-test-i-have-to-use-tp3214491p3214491.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



Re: [R] Don´t know what test i have to use

2011-01-12 Thread Joshua Wiley
Hi,

That is basically correct.  You can specify the link as logit (see my
example), but that is the default so you do not strictly need to in
this case.  I would encourage you to keep your variables
(prevalencia, edad, sexo, mes) stored in a data frame, in which case
you would add the data = argument to glm().

model2 <- glm(prevalencia ~ edad * sexo * mes * zona,
  family = binomial(link = "logit"),
  data = your_dataframe)

Also, you might take a look at ?predict.glm; it has some examples with
binomial data based on the wonderful book by Drs. Venables and
Ripley.  Oh, and finally, if you have 12 levels of months, ? levels of
zones, and 2 levels of sex, you might not want the 4-way interactions
that you will get by default from using the '*' operator inside a
formula.  Unless you have a theory that there is an additional effect
of being a middle aged female in the month of July for zone 8, but
not
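
To make the point about interactions concrete, here is a small sketch on simulated data. The data frame d, the number of levels and the 0.3 infection rate are invented for illustration, and only three factors are used; it compares the full '*' model with one restricted to two-way interactions via the (a + b + c)^2 formula shorthand.

```r
# Simulated stand-in data; variable names follow the thread, values are made up.
set.seed(42)
n <- 400
d <- data.frame(
  sexo = factor(sample(c("f", "m"), n, replace = TRUE)),
  mes  = factor(sample(month.abb[1:4], n, replace = TRUE)),
  zona = factor(sample(c("A", "B", "C"), n, replace = TRUE))
)
d$prevalencia <- rbinom(n, 1, 0.3)

# '*' expands to all main effects and all interactions up to the highest order.
full <- glm(prevalencia ~ sexo * mes * zona, family = binomial, data = d)

# (a + b + c)^2 keeps main effects and two-way interactions only.
two_way <- glm(prevalencia ~ (sexo + mes + zona)^2,
               family = binomial(link = "logit"), data = d)

print(family(full)$link)   # the link used when none is given
print(c(full = length(coef(full)), two_way = length(coef(two_way))))
print(anova(two_way, full, test = "Chisq"))
```

The anova() call is a likelihood-ratio comparison of the two nested models, which is one common way to decide whether the highest-order interaction is worth keeping.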

Cheers,

Josh

On Wed, Jan 12, 2011 at 9:51 AM, gaiarrido gaiarr...@usal.es wrote:

 Hello,
 I'm starting my PhD and I am stuck because I have little knowledge of R and
 statistics.
 I've got a model of this kind:
 binary response variable: prevalence of infection (0/1)
 3 categorical independent variables: sex, month and name of the area

 I was trying a full model like this, before the simplification

 model<-aov(prevalencia~sex*month*area)

 but the Fligner test told me that I haven't got homoscedasticity, so I suppose I
 should try glm, with a model

 model2<-glm(prevalencia~edad*sexo*mes*zona,binomial)

 Is that correct? Where must I put the link (logit)?

 Thanks very much




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] Integrate and subdivisions limit

2011-01-12 Thread Hans W Borchers
 Dear all,
 I have some issues with integrate() in R, so I would like to request
 your help. I am trying to calculate the integral of f(x)*g(x).
 f(x) is a step function, while g(x) is a polynomial.
 If f(x) changes its value only a few times (5 or 6 'steps'),
 everything is calculated correctly (results verified on scrap paper), but if f(x)
 takes around 800 different values I receive the error
 
 Error in integrate number of subdivisions reached
 
 I did some checks on the internet and found that I can increase the
 number of subdivisions (this is a parameter of integrate()).
 Thus I raised it from 100 to 1000 (and then to 10000).
 
 A. Does this make the error of the result larger, or does it only stress
 the computer?
 B. When the number was raised to 10000 I started getting the error message
 "roundoff error was detected"
 
 What do you think I should do to solve that?
 I would like to thank you in advance for your help
 
 Best Regards
 Alex

There's obviously a more numerically stable approach. If g(x) is a polynomial,
you know its polynomial antiderivative. Take that and sum over all intervals
where the step function is constant.

Example:  g(x) = 1 constant, integrate x^2 over 0..1 in 10000 subdivisions.
  The antiderivative is 1/3 * x^3, thus

g <- function(x) 1
x <- seq(0, 1, len=10001)
sum((x[2:10001]^3 - x[1:10000]^3)*g(2:10001))/3  #= 0.333

The antiderivative of a polynomial a_1 x^n + ... + a_{n+1}, given as a vector
c(a1,...), can also be determined automatically; no manual work is necessary.
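
The idea above could be sketched roughly as follows. The function names (poly_antiderivative, poly_eval, integrate_step_poly) and the representation of the step function as breakpoints plus one height per interval are assumptions made for this illustration; they do not come from the original post.

```r
# Coefficients are given highest power first: a_1 x^n + ... + a_{n+1}
poly_antiderivative <- function(coefs) {
  n <- length(coefs) - 1
  c(coefs / ((n + 1):1), 0)   # divide by the new exponents, constant term 0
}

# Evaluate a polynomial at x using Horner's scheme
poly_eval <- function(coefs, x) {
  result <- rep(0, length(x))
  for (a in coefs) result <- result * x + a
  result
}

# Exact integral of step(x) * poly(x): the step function takes the value
# heights[i] on the interval [breaks[i], breaks[i + 1]]
integrate_step_poly <- function(breaks, heights, coefs) {
  stopifnot(length(breaks) == length(heights) + 1)
  G <- poly_eval(poly_antiderivative(coefs), breaks)
  sum(heights * diff(G))
}

# Step function: 2 on [0, 1], 3 on [1, 2]; polynomial: x
exact <- integrate_step_poly(c(0, 1, 2), c(2, 3), c(1, 0))
exact   # 2 * 0.5 + 3 * 1.5
```

Because each interval is integrated in closed form, the number of steps (800 or more) only affects the length of the vectors, not the accuracy.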

Hans Werner



[R] 2d plot with modification of plotting symbol to indicate third dimension.

2011-01-12 Thread John Sorkin
I would like to plot 3-dimensional data on a two-dimensional scatter-plot.
Is there a way I can automatically modify the plot symbol (e.g. changing size 
or color) to indicate the value of a third variable? E.g. How can I plot weight 
vs. age and indicate the value of muscle mass for each value weight-age pair by 
making the plot point proportional to the subject's muscle mass?
Thanks,
John


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}



Re: [R] 2d plot with modification of plotting symbol to indicate third dimension.

2011-01-12 Thread Sarah Goslee
You don't give an example, but in general you can use a vector for cex with
the values proportional to the third variable.

Same goes for color: col can be a vector, not just a single value.
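
A minimal sketch of the cex and col suggestion, with simulated data. The variable names, the rescale() helper, and the choice of three colours are assumptions for illustration only.

```r
set.seed(42)  # simulated data, purely illustrative
age    <- runif(30, 20, 80)
weight <- rnorm(30, 75, 10)
muscle <- runif(30, 20, 40)

# Rescale the third variable to a sensible range of symbol sizes
rescale <- function(z, lo = 0.5, hi = 3) {
  lo + (hi - lo) * (z - min(z)) / (max(z) - min(z))
}
pt_cex <- rescale(muscle)

# One colour per third of the muscle-mass range
pt_col <- c("blue", "darkgreen", "red")[cut(muscle, 3, labels = FALSE)]

plot(age, weight, cex = pt_cex, col = pt_col, pch = 19,
     xlab = "Age", ylab = "Weight")
```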

This has been discussed before on-list, and fairly recently.

Sarah


-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] metafor/ meta-regression

2011-01-12 Thread Fernanda Melo Carneiro
Hi, I have tried to do a meta-regression with the metafor package, and I
would like to get the standardized coefficients for each variable. However, the
command

res <- rma.uni(yi, vi, method="REML",
               mods = ~ cota+DL+uso+gadiente+idade, data=turbidez)

gives me only the unstandardized coefficients (estimate) of the multiple
regression.

What do I need to do?

Thanks

Fernanda Melo Carneiro contact: (62) 3521-1480 and 8121-7374 www.ecoevol.ufg.br
Laboratório de Ecologia Teórica e Síntese (UFG)



Re: [R] Don´t know what test i have to use

2011-01-12 Thread gaiarrido

Thanks very much to both of you.
I'm starting to play with it. I was a little afraid because it was part of my
job, but now I find it a lot of fun.
Josh, I've got data for just 3 representative months, and it can't be rejected
a priori that the rate of change across the months could differ between the
2 sexes.

Thanks again 


Re: [R] data frame subset too slow

2011-01-12 Thread Duke
Sorry for the late response. I was away for vacation and was unable to 
keep on working on the codes.


Anyway, I was unable to provide *str* of that specific data since they 
are all in a big package with lots of inputs/outputs. Quickly glancing 
through the code, I narrowed the problem down (and made a bad guess) to the 
data frame. But it turned out that the data frame was not the reason. After 
carefully checking through the package, I found out that there is a double 
for loop. I replaced that double for loop, and now instead of running ~ 
13hrs, the package runs ~ 13min for a similar dataset.


Thanks for all your help,

D.

On 12/30/10 11:40 AM, jim holtman wrote:

If you want the data in the first column of the dataframe, then you
should be using '[['.  Notice what comes back in each of these cases:


str(dat)

'data.frame':   80000 obs. of  5 variables:
  $ sample.1.200..n..TRUE.: int  25 199 70 124 93 157 49 137 192 57 ...
  $ runif.n.  : num  0.7725 0.0263 0.0728 0.7594 0.2792 ...
  $ runif.n..1: num  0.4304 0.8608 0.0882 0.5666 0.1721 ...
  $ runif.n..2: num  0.3797 0.1191 0.0481 0.3297 0.0649 ...
  $ runif.n..3: num  0.0895 0.0441 0.0403 0.9679 0.3986 ...

str(dat[1])

'data.frame':   80000 obs. of  1 variable:
  $ sample.1.200..n..TRUE.: int  25 199 70 124 93 157 49 137 192 57 ...

str(dat[[1]])

  int [1:80000] 25 199 70 124 93 157 49 137 192 57 ...

str(dat$sample.1.200..n..TRUE)

  int [1:80000] 25 199 70 124 93 157 49 137 192 57 ...

str(dat[,1])

  int [1:80000] 25 199 70 124 93 157 49 137 192 57 ...

You will get different classes of values.  We would really need to see
the output of 'str' on your data structures to see what might be
happening.  Your data is not that big and most subsetting/extractions
should be in less than a second unless there is something funny in
your data.  So provide the 'str' so we can see.
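
To make the difference between '[' and '[[' concrete for the %in% case asked about below, here is a small sketch. The two toy data frames are invented for illustration and are not the poster's data.

```r
dat <- data.frame(id = c(5L, 7L, 9L), x = c(0.1, 0.2, 0.3))
lst <- data.frame(id = c(7L, 9L, 11L))

class(dat[1])      # still a data frame (a list with one column)
class(dat[[1]])    # the column itself, an integer vector

# %in% on the extracted vectors gives one logical value per row
keep <- dat[[1]] %in% lst[[1]]
dat[keep, ]

# On the one-column data frame, %in% treats the whole column as a single
# list element, so the result has length 1 and is useless for subsetting
length(dat[1] %in% lst[[1]])
```

Using dat$id %in% lst$id is equivalent to the '[[' form, since '$' also returns the column vector.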


On Thu, Dec 30, 2010 at 11:28 AM, Dukeduke.li...@gmx.com  wrote:

Hi Jim,

Is this really a problem for me to use [1] instead of [[1]]? Will this make
it run slower? Also, if I use dat$V1 %in% list$V1, will it be fine?

Anyway, my data and list are basically gene lists (tab delimited):

$ head test.txt
Xkr4      chr1  -  3204562  3661579  3206102  3661429  3
  3204562,3411782,3660632,3207049,3411982,3661579,
Rp1       chr1  -  4280926  4399322  4283061  4399268  4
  4280926,4341990,4342282,4399250,4283093,4342162,4342918,4399322,
Rp1_2     chr1  -  4333587  4350395  4334680  4342906  4
  4333587,4341990,4342282,4350280,4340172,4342162,4342918,4350395,
Sox17     chr1  -  4481008  4486494  4481796  4483487  5
  4481008,4483180,4483852,4485216,4486371,
  4482749,4483547,4483944,4486023,4486494,
Mrpl15    chr1  -  4763278  4775807  4764532  4775758  5
  4763278,4767605,4772648,4774031,4775653,
  4764597,4767729,4772814,4774186,4775807,
Mrpl15_2  chr1  -  4763278  4775807  4775807  4775807  4
  4763278,4767605,4772648,4775653,4764597,4767729,4772814,4775807,
$ head list.txt
GeneNames        Chr    Start      End
0610007C21Rik    chr5   31351012   31356996
0610007L01Rik    chr5   130695613  130719635
0610007L01Rik_2  chr5   130698204  130719635
0610007P08Rik    chr13  63916627   64001609
0610007P08Rik_2  chr13  63916641   63970963
0610007P14Rik    chr12  87156404   87165495

Thanks,

D.

On 12/30/10 11:13 AM, jim holtman wrote:

You should be using dat[[1]].  Here is an example with 80000 rows that
takes about 0.02 seconds to get the subset.

Provide an 'str' of what your data looks like


n <- 80000  # rows to create
dat <- data.frame(sample(1:200, n, TRUE), runif(n), runif(n), runif(n),
                  runif(n))
lst <- data.frame(sample(1:100, n, TRUE), runif(n), runif(n), runif(n),
                  runif(n))
str(dat)

'data.frame':   80000 obs. of  5 variables:
  $ sample.1.200..n..TRUE.: int  39 116 69 163 51 125 144 32 28 4 ...
  $ runif.n.  : num  0.519 0.793 0.549 0.77 0.272 ...
  $ runif.n..1: num  0.691 0.89 0.783 0.467 0.357 ...
  $ runif.n..2: num  0.705 0.254 0.584 0.998 0.279 ...
  $ runif.n..3: num  0.873 1 0.678 0.702 0.455 ...

str(lst)

'data.frame':   80000 obs. of  5 variables:
  $ sample.1.100..n..TRUE.: int  38 83 38 70 77 44 81 55 32 1 ...
  $ runif.n.  : num  0.0621 0.7374 0.074 0.4281 0.0516 ...
  $ runif.n..1: num  0.879 0.294 0.146 0.884 0.58 ...
  $ runif.n..2: num  0.648 0.745 0.825 0.507 0.799 ...
  $ runif.n..3: num  0.2523 0.1679 0.9728 0.0478 0.0967 ...

system.time({
  dat.sub <- dat[dat[[1]] %in% lst[[1]], ]
})
   user  system elapsed
   0.02    0.00    0.01

str(dat.sub)

'data.frame':   39803 obs. of  5 variables:
  $ sample.1.200..n..TRUE.: int  39 69 51 32 28 4 69 3 48 69 ...
  $ runif.n.  : num  0.5188 0.5494 0.2718 0.5566 0.0893 ...
  $ runif.n..1: num  0.691 0.783 0.357 0.619 0.717 ...
  $ runif.n..2: num  0.705 

Re: [R] 2d plot with modification of plotting symbol to indicate third dimension.

2011-01-12 Thread Greg Snow
Look at the symbols function for some options of doing what you suggest (you 
can also do a search for bubble plot for a couple of other implementations).  
If you want to go a bit further than what symbols does for you then look at the 
my.symbols function in the TeachingDemos package.
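
A short sketch of the symbols() route (a simple bubble plot), again with simulated data. The variable names and the decision to make circle area rather than radius proportional to the third variable are assumptions for illustration.

```r
set.seed(1)  # simulated data, purely illustrative
age    <- runif(25, 20, 80)
weight <- rnorm(25, 75, 10)
muscle <- runif(25, 20, 40)

# Use sqrt() so that circle area, not radius, is proportional to muscle mass
radius <- sqrt(muscle / pi)

symbols(age, weight, circles = radius, inches = 0.2,
        bg = "lightblue", fg = "grey30",
        xlab = "Age", ylab = "Weight")
```

Here inches = 0.2 sets the size of the largest circle; the others are scaled relative to it.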

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




[R] aggredating date data

2011-01-12 Thread analys...@hotmail.com
I tried a date by date forecast of a time series and it seems to be
too wild.  How can I aggregate the date into weeks or months as
required?

Thanks.

The input looks like

ID  datadate (YYYY-MM-DD)  value_for_day
--  ---------------------  -------------

and I want to be able to change it to

ID  dataweek  value_for_week

or

ID  datamonth  value_for_month
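
One possible way to do this in base R is sketched below. The data are invented, the column names follow the layout above, and summing the daily values is an assumption (a mean or another summary may be more appropriate for the series in question).

```r
# Invented daily data: 14 consecutive days starting on a Monday
daily <- data.frame(
  ID       = 1,
  datadate = seq(as.Date("2011-01-24"), by = "day", length.out = 14),
  value    = 1:14
)

# Label each day with the Monday that starts its week, and with its month
daily$dataweek  <- cut(daily$datadate, "week")
daily$datamonth <- format(daily$datadate, "%Y-%m")

# Aggregate the daily values within each ID and period
weekly  <- aggregate(value ~ ID + dataweek,  data = daily, FUN = sum)
monthly <- aggregate(value ~ ID + datamonth, data = daily, FUN = sum)

weekly
monthly
```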



Re: [R] Outputting csv file from dataframe with columns in a particular order

2011-01-12 Thread analys...@hotmail.com
Thanks to all who responded.

On Jan 12, 10:34 am, Peter Ehlers ehl...@ucalgary.ca wrote:
 On 2011-01-12 07:16, analys...@hotmail.com wrote:

  I have a dataframe with columns "ID", "date", "estimate", "actual" (but
  not necessarily in that order - I do a merge somewhere and that
  somehow messes up the order of the columns).

  How can I output it to a csv file with the columns in the order that I
  want?

 Let's say that your data.frame is DF.
 mynames <- c("ID", "date", "estimate", "actual")
 write.csv(DF[, mynames], )

 Peter Ehlers
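
A self-contained version of the suggestion above, as a sketch. The toy data frame, the temporary output file, and row.names = FALSE are assumptions added for illustration.

```r
# Toy data frame with the columns in the "wrong" order
DF <- data.frame(actual   = c(10, 12),
                 ID       = 1:2,
                 estimate = c(9.5, 12.4),
                 date     = as.Date(c("2011-01-11", "2011-01-12")))

mynames  <- c("ID", "date", "estimate", "actual")
out_file <- tempfile(fileext = ".csv")   # hypothetical output location

# Index by name to fix the column order, then write without row names
write.csv(DF[, mynames], out_file, row.names = FALSE)

names(read.csv(out_file))
```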



[R] help in calculating ar on ranked vector

2011-01-12 Thread Raymond Wong
I was using ar() from the stats package to calculate autoregressive coefficients.
It works on vector z, but it will not work on vector
rz <- rank(z, ties.method="average").  What did I miss?
Any info will be greatly appreciated.  TIA




Re: [R] Help with Data Transformation

2011-01-12 Thread Guy Jett
Hi John,
Thank you for your patience.  I was away for a State certification exam 
yesterday, so am just getting back to this.  

Reading through your response I believe I wasn't clear enough about what I'm
trying to do.  Your description seems to rearrange the matrix without grouping
the analytical results for a single sample onto a single line, as I had hoped.
I may have confused things by attempting to send a truncated/simplified
dataset.

Restatement of needs:
* I have 863 individual samples.  The following columns contain invariant
  results for each sample:
    - Transect, Offset, Location, fldsampid, CLP_ID, sacode, matrix, LTCCODE,
      Northing, Easting, CRDUNITS, Event, LOGDATE, sbd, sed.
    - Sorting can make use of fldsampid, as these values are entirely unique
      to each sample.
* Each sample is associated with one or more of the following 48 analytical
  parameters:
    - AG, AL, ALK, ALKB, ALKC, AS, B, BA, BE, BR, CA, CD, CL, CO, CR, CU,
      DOC, FE, Hg, HG, HGACIDLAB, HGEXTINO, HGEXTORG, HGNONMOB, HGSEMIMOB, K,
      MEHG, MG, MN, MO, NH3N, NI, NO2N, NO3, NO3N, PB, PO4, S, SB, SE, SO4,
      SOLID, SSC, TL, TOC, V, Zn, ZN
    - These are currently stored in the PARLABEL column.
* For each sample ID I would like to create a single line:
    - Extract each PARLABEL to use as a column name; and
    - Place the Result in the appropriate column.
* I can subset the data so that prccode, Lab, EXMCODE, Analysis, PARVQ, RL,
  EPA_FLAGS, and units are irrelevant to the issue.

The following snippet should illustrate the absolute minimum needs:
INPUT
fldsampid   |  PARLABEL    |  Result
------------+--------------+---------
fldsampid1  |  PARLABEL-a  |  value-8
fldsampid1  |  PARLABEL-b  |  value-5
fldsampid1  |  PARLABEL-x  |  value-2
fldsampid1  |  PARLABEL-y  |  value-0
fldsampid2  |  PARLABEL-a  |  value-9
fldsampid2  |  PARLABEL-c  |  value-8
fldsampid3  |  PARLABEL-a  |  value-2
fldsampid3  |  PARLABEL-d  |  value-8
fldsampid3  |  PARLABEL-w  |  value-3
fldsampid3  |  PARLABEL-x  |  value-9
fldsampid3  |  PARLABEL-y  |  value-6

OUTPUT
fldsampid   |  PARLABEL-a  |  PARLABEL-b  |  PARLABEL-w  |  PARLABEL-x  |  PARLABEL-y
------------+--------------+--------------+--------------+--------------+------------
fldsampid1  |  value-8     |  value-5     |  NA          |  value-2     |  value-0
fldsampid2  |  value-9     |  value-2     |  NA          |  NA          |  NA
fldsampid3  |  value-2     |  NA          |  value-3     |  value-9     |  value-6

If it would help I could attach a 31kb file written with
write.table(Units_NG.L, file="Units_NG.L", quote=FALSE, sep="\t")
This subset has 97 individual samples and 3 PARLABELs distributed across 249
individual lines.


Added Responses:
1. The structure of my actual input file appears to be correct per the
following:
   (I had sent you a separate extraction from an Excel file)
   (Strings as Factors, numbers as num or int; a date changed via as.Date())
'data.frame':   19694 obs. of  25 variables:
 $ Transect : Factor w/ 78 levels "FLR01","FLR02",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Offset   : Factor w/ 16 levels "0","A","B","C",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Location : Factor w/ 246 levels "FLR010","FLR01A",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ fldsampid: Factor w/ 863 levels "FLR010-ANE1",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ CLP_ID   : Factor w/ 586 levels "","MY6591","MY6593",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ sacode   : Factor w/ 2 levels "FD","N": 2 2 2 2 2 2 2 2 2 2 ...
 $ matrix   : Factor w/ 6 levels "SE","SO","TA",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ LTCCODE  : Factor w/ 4 levels "BH","LK","RE",..: 4 4 4 4 4 4 4 4 4 4 ...
 $ Northing : num  2444733 2444733 2444733 2444733 2444733 ...
 $ Easting  : num  5684613 5684613 5684613 5684613 5684613 ...
 $ CRDUNITS : Factor w/ 1 level "FT": 1 1 1 1 1 1 1 1 1 1 ...
 $ Event    : int  1 1 1 1 1 1 1 1 1 1 ...
 $ LOGDATE  :Class 'Date'  num [1:19694] -717743 -717743 -717743 -717743 -717743 ...
 $ sbd      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sed      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ prccode  : Factor w/ 5 levels "INO","MET","MI",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Lab      : Factor w/ 5 levels "A4SW","BRLS",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ EXMCODE  : Factor w/ 5 levels "FLDFLT","METHOD",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Analysis : Factor w/ 23 levels "A2320","A2540G",..: 10 11 12 12 12 12 12 12 12 12 ...
 $ PARLABEL : Factor w/ 48 levels "AG","AL","ALK",..: 27 20 1 2 6 7 8 9 12 14 ...
 $ PARVQ    : Factor w/ 3 levels "=","ND","TR": 1 1 2 1 1 1 1 2 1 1 ...
 $ Result   : num  20.6 24.7 5 14900 60 100 4930 4 182 80 ...
 $ RL       : num  3.1 0.77 10 5750 160 790 160 8 10 80 ...
 $ EPA_FLAGS: Factor w/ 10 levels "","J","J-","J+",..: 4 1 7 3 2 2 1 7 1 2 ...
 $ units    : Factor w/ 3 levels "ug/kg","ug/L",..: 1 1 1 1 1 1 1 1 1 1 ...

2. etc... Sorry to confuse you, this was to indicate additional columns.

Guy Jett
ITSI,  A Gilbane Company
(925) 946-3340 Direct
(925) 457-4168 ITSI Cell
gj...@itsi.com


[R] navigating in lists

2011-01-12 Thread Jannis
Dear list members,


I am stuck with navigating in a rather complicated list object. 

In general I would need a solution to access all first (or other) elements of 
the different sublists in one list:

test=list(a=list(1,2),b=list(3,4),c=list(5,6))

like:

test[[1:3]][[1]]

which should result in

c(1,3,5)


Is there any way to access lists in such a way? Using unlist would create quite 
complicated objects

Cheers
Jannis





Re: [R] plot: skip a range of axis

2011-01-12 Thread Carl Witthoft

You could always create a new vector, something like

Xprime <- if x<0 x else x-2   #not valid R code

Thus mapping +1 to -1, and shifting everything else down.  Fixing the 
x-tick labels is left as a homework problem :-)


I'm assuming from your description that there are no y-values 
corresponding to -1<x<1, so plotting y vs Xprime won't lose any of your 
data.
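
A valid-R version of the pseudocode above might look like the following sketch. The sample data, the hand-written tick positions, and the "-1 | 1" label at the join are assumptions for illustration, and they are only one possible answer to the homework problem.

```r
# Invented data with nothing between -1 and +1
x <- c(-4, -2.5, -1, 1, 2, 3.5)
y <- c(2, 5, 3, 4, 1, 6)

# Vectorised form of "if x < 0 then x else x - 2": +1 lands on -1
xprime <- ifelse(x < 0, x, x - 2)

# Plot without the default x axis, then label ticks with the original values
plot(xprime, y, xaxt = "n", xlab = "x")
tick_at  <- c(-4, -3, -2, -1, 0, 1, 2)
tick_lab <- c("-4", "-3", "-2", "-1 | 1", "2", "3", "4")
axis(1, at = tick_at, labels = tick_lab)
abline(v = -1, lty = 2)   # mark where the skipped range was removed
```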


Carl




From: Yuan Jian jayuan2008_at_yahoo.com
Date: Wed, 12 Jan 2011 04:15:38 -0800 (PST)


thanks Jim,
I found gap.plot seperates x axis or y axis into two boxes. do you know 
any plot tool that can skip a range in x-axis or y-axis without lines?

regards
YU

* On Wed, 12/1/11, Jim Lemon jim_at_bitwrit.com.au wrote:

From: Jim Lemon jim_at_bitwrit.com.au
Subject: Re: [R] plot: skip a range of axis To: Yuan Jian 
jayuan2008_at_yahoo.com

Cc: r-help_at_r-project.org
Received: Wednesday, 12 January, 2011, 10:26 AM

On 01/12/2011 03:46 PM, Yuan Jian wrote:
 Hi,
 I am using plot to show scatter points in 2_D.
 in my data, there is no data between -1 and +1 in x-axis.
 I want to skip this region, i.e. x axis becomes [-Inf:-1, 1:Inf].
 can any one tell me how to do?

Hi Yu,
Try the gap.plot function in the plotrix package.

Jim



Re: [R] 2d plot with modification of plotting symbol to indicate third dimension.

2011-01-12 Thread Bert Gunter
Google on "R Graph Gallery" to find examples with code.

You can also almost certainly use RSiteSearch() on appropriate keys to
find a pre-existing function (which someone may provide you on the
list).
Also:
?symbols

-- Bert





-- 
Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml



Re: [R] navigating in lists

2011-01-12 Thread Greg Snow
 sapply(test, '[[', 1)
a b c 
1 3 5
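
The one-liner above works because '[[' is an ordinary function that sapply() can call on each sublist. A slightly expanded sketch (the vapply() variant is an addition for illustration, not part of the original reply):

```r
test <- list(a = list(1, 2), b = list(3, 4), c = list(5, 6))

# First element of every sublist, as a named numeric vector
firsts <- sapply(test, `[[`, 1)

# Second element, with vapply() checking that each result is one number
seconds <- vapply(test, function(s) s[[2]], numeric(1))

unname(firsts)
seconds
```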

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




Re: [R] Help with Data Transformation

2011-01-12 Thread Dennis Murphy
Hi:

This seems like a problem that is well suited for the cast() function in
package reshape (in reshape2, the equivalent is dcast(), which takes
value.var = 'y' instead of value = 'y'). Here's a toy example:

library(reshape)
df <- data.frame(idnum = rep(101:104, c(3, 2, 4, 3)),
                 lab = unlist(sapply(c(3, 2, 4, 3), function(x)
                                     sample(LETTERS[1:6], x))),
                 y = rpois(12, 7))
df
   idnum lab  y
1    101   C  7
2    101   F 11
3    101   E  4
4    102   B 10
5    102   A  7
6    103   E  6
7    103   D  9
8    103   B  3
9    103   F 12
10   104   C 10
11   104   D  9
12   104   A  5
cast(df, idnum ~ lab, value = 'y')
  idnum  A  B  C  D  E  F
1   101 NA NA  7 NA  4 11
2   102  7 10 NA NA NA NA
3   103 NA  3 NA  9  6 12
4   104  5 NA 10  9 NA NA

Since you have multiple 'invariant' variables, you could use something like

cast(df, ... ~ lab, value = 'y')

This would make all variables other than lab the 'rows', the values of lab
as separate 'columns', with the value of the variable y inserted in the
appropriate locations, NA fill otherwise.

cast() is an alternative to the reshape() function in base R.
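
For comparison, the same long-to-wide step with base R's reshape() could be sketched as below. The small data frame is invented; only the column names fldsampid, PARLABEL and Result are taken from the thread.

```r
# Invented long-format data: one row per sample and parameter
long <- data.frame(
  fldsampid = c("s1", "s1", "s2", "s3", "s3"),
  PARLABEL  = c("AG", "AL", "AG", "AL", "ZN"),
  Result    = c(8, 5, 9, 2, 3),
  stringsAsFactors = FALSE
)

# One row per sample, one Result.<PARLABEL> column per parameter,
# NA where a sample has no value for that parameter
wide <- reshape(long, idvar = "fldsampid", timevar = "PARLABEL",
                direction = "wide")
wide
```

Any additional invariant columns (Transect, Location and so on) can be listed in idvar alongside fldsampid so that they are carried through to the wide table.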

HTH,
Dennis


Re: [R] Don´t know what test i have to use

2011-01-12 Thread Bert Gunter
... But I would think that month should be treated as a cyclical
quantity, not as a factor with 12 independent levels, e.g. by
transforming month to  sin( 2*pi*monthNumber/12) .  This assumes 1
year periodicity, which might not be right, of course. Time series
methods could obviously be relevant here. Given the possible
importance of such periodicity and the relative complexity of the
methodology necessary to deal with it properly, you might benefit by
consulting your local statistician for help.
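
As a rough sketch of the cyclical coding, with simulated data: the cosine term is an addition here (a sine term alone fixes the phase of the cycle), and the variable names, sample size and effect sizes are all assumptions for illustration.

```r
set.seed(1)  # simulated data, purely illustrative
n <- 600
d <- data.frame(
  monthNumber = sample(1:12, n, replace = TRUE),
  sexo        = factor(sample(c("F", "M"), n, replace = TRUE))
)

# Harmonic terms with a 12-month period
d$sin_month <- sin(2 * pi * d$monthNumber / 12)
d$cos_month <- cos(2 * pi * d$monthNumber / 12)

# Simulate a binary outcome whose log-odds follow the yearly cycle
d$prevalencia <- rbinom(n, 1, plogis(-0.5 + 0.8 * d$sin_month))

fit <- glm(prevalencia ~ sexo + sin_month + cos_month,
           family = binomial(link = "logit"), data = d)
coef(fit)
```

This uses 2 coefficients for month instead of the 11 needed for a 12-level factor, at the price of assuming a smooth one-year cycle.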

-- Bert

On Wed, Jan 12, 2011 at 10:43 AM, Joshua Wiley jwiley.ps...@gmail.com wrote:
 Hi,

 That is basically correct.  You can specify the link as logit (see my
 example), but that is the default so you do not strictly need to in
 this case.  II would encourage you to keep your variables
 (prevalencia, edad, sexo, mes) stored in a data frame, in which case
 you would add the data = argument to glm().

 model2 - glm(prevalencia ~ edad * sexo * mes * zona,
  family = binomial(link = logit),
  data = your_dataframe)

 Also, you might take a look at ?predict.glm  it has some examples with
 binomial data based off the wonderful book by Drs. Venables and
 Ripley.  Oh, and finally, if you have 12 levels of months, ? levels of
 zones, and 2 levels of sex, you might not want the 4way interactions
 that you will get by default from using the '*' operator inside a
 formula.  Unless you have a theory that there is an additional effect
 of being a middle aged female in the month of July for zone 8, but
 not

 Cheers,

 Josh

 On Wed, Jan 12, 2011 at 9:51 AM, gaiarrido gaiarr...@usal.es wrote:

 Hello,
 I´m starting with my PhD and I have to stop because i got a little knowledge
 in R and statistics.
 I´ve got a model of this kind:
 binary response variable: prevalence of infection (0/1)
 3 categorical independent variables: sex, month and name of the area

 I was trying with a full model like this, before the simplification

 model <- aov(prevalencia~sex*month*area)

 but the Fligner test told me that I haven't got homoscedasticity, so I suppose I
 should try a glm, with a model

 model2 <- glm(prevalencia~edad*sexo*mes*zona, binomial)

 is that correct? where I must put the link (logit) ?

 Thnks very much
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Don-t-know-what-test-i-have-to-use-tp3214491p3214491.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Joshua Wiley
 Ph.D. Student, Health Psychology
 University of California, Los Angeles
 http://www.joshuawiley.com/





-- 
Bert Gunter
Genentech Nonclinical Biostatistics



[R] How to disable using enter key to exit the browser in debugging mode

2011-01-12 Thread Feng Li
Dear R,

How can I disable using enter key to exit the browser() in debug mode? I
would love to have this option because it is so annoying to jump out of the
debugging mode unexpectedly when I don't want to. I guess some of us have
encountered at least one of these situations:

1, Accidentally pressed the enter key within the browser.

2, Copy and paste a piece of debugging code containing empty lines to the
prompt within the debugging mode.

3, If I paste a piece of code to the prompt to debug as follows, it will
eventually jump out before I can do anything.

### copy starting from this line ##
test <- function()
{
  x <- 5
  browser()
  y <- 4
}

test()

### end of copy at this line ##


Any suggestions are most welcome!
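
One setting that may help, assuming a reasonably recent version of R: ?options documents a browserNLdisabled option that stops an empty line from acting as a browser command. A minimal sketch (the function body is the test case from above, with a return value added for illustration):

```r
# Assumption: the running R version supports this option (see ?options).
old <- options(browserNLdisabled = TRUE)

test <- function() {
  x <- 5
  browser()   # with the option set, a blank line is no longer treated as "n"
  y <- 4
  x + y
}

# test()          # run interactively; type c (or cont) to leave the browser
# options(old)    # restore the previous setting
```

Putting the options() call in a startup file such as .Rprofile would make it apply to every session.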

Feng

-- 
Feng Li
Department of Statistics
Stockholm University
106 91 Stockholm, Sweden
http://feng.li/



[R] RNetCDF: retrieving variable names and units

2011-01-12 Thread Jannis
Dear List,


does anybody have experience with the RNetCDF package? I manage to open a 
connection and copy data from a ncdf file but would need a way to automatically 
retrieve variable names (ideally all of them from one file) and units from the 
file.


Any ideas?
Jannis





Re: [R] RNetCDF: retrieving variable names and units

2011-01-12 Thread David Pierce
Hi Jannis,

although I don't know how you'd do that with RNetCDF, with the ncdf
package it's pretty easy:

ncid = open.ncdf( 'file.nc' )
nvars = ncid$nvars
for( ivar in 1:nvars )
    print(paste("var number", ivar, "is named", ncid$var[[ivar]]$name,
                "and has units", ncid$var[[ivar]]$units ))

Regards,

--Dave





---
David W. Pierce
Division of Climate, Atmospheric Science, and Physical Oceanography
Scripps Institution of Oceanography
(858) 534-8276 (voice)  /  (858) 534-8561 (fax)dpie...@ucsd.edu



Re: [R] RNetCDF: retrieving variable names and units

2011-01-12 Thread Michael Sumner
There are a number of functions in the package to inquire about the
file contents. See library(help = RNetCDF).

For example:

library(RNetCDF)

nc <- open.nc("file.nc")

var.inq.nc(nc, 0)
$id
[1] 0

$name
[1] "longitude_U"

$type
[1] "NC_DOUBLE"

$ndims
[1] 1

$dimids
[1] 1

$natts
[1] 0

You can then read from the file with something like this:

obj0 <- var.inq.nc(nc, 0)
dat <- var.get.nc(nc, obj0$name, start = ...)
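
Building on the inquiry functions above, a sketch that collects every variable name and its units might look like the following. The helper name list_nc_vars is made up for illustration, and it assumes units are stored in an attribute called "units" (a common convention, but not guaranteed for every file):

```r
# Hypothetical helper: names and units of all variables in a NetCDF file.
# Requires the RNetCDF package; its functions are called with the RNetCDF:: prefix.
list_nc_vars <- function(path) {
  nc <- RNetCDF::open.nc(path)
  on.exit(RNetCDF::close.nc(nc))

  nvars <- RNetCDF::file.inq.nc(nc)$nvars
  ids <- seq_len(nvars) - 1          # variable ids start at 0

  var_names <- vapply(ids, function(id) RNetCDF::var.inq.nc(nc, id)$name,
                      character(1))
  var_units <- vapply(ids, function(id) {
    # att.get.nc() fails if the attribute is missing, so fall back to NA
    tryCatch(as.character(RNetCDF::att.get.nc(nc, id, "units"))[1],
             error = function(e) NA_character_)
  }, character(1))

  data.frame(name = var_names, units = var_units, stringsAsFactors = FALSE)
}

# Usage (assuming a file called "file.nc" exists):
# list_nc_vars("file.nc")
```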

Going with ncdf or ncdf4 package is probably better ( I just happen to
be more familiar with RNetCDF.)

Cheers, Mike.





-- 
Michael Sumner
Institute for Marine and Antarctic Studies, University of Tasmania
Hobart, Australia
e-mail: mdsum...@gmail.com



[R] syntax for extending a line in a script??

2011-01-12 Thread Mike Williamson
Hello,

A hopefully simple question.  I use 'R' through emacs, but I suspect the
following would occur with any manner of text editor:

   - my editor has a normally quite handy feature where it will
   automatically indent to the appropriate level when I start a new line.
   However, this occasionally creates cases where there is no friendly way to
   break a long line of code into two lines which still function as one
   command.  Therefore, I need a nice way to be able to flag 'R' to know that
   the code is continuing on the next line.  Let me explain via example:

numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI])
[sapply(listOfDataFrames[[myDF]][,columnsOI],
is.numeric) ]

As you can see in this case, I would *like* for these 2 lines of code to
be read as 1 line, but since the names(blah) command is sufficiently a
command on its own, 'R' sees this as a completed line of code.  I could try
to break it up at different points, but emacs (and other text editors) takes
a guess as to the most intelligent way to indent, so that if I were to write
something like:

numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI])
[sapply(
  listOfDataFrames[[myDF]][,columnsOI], is.numeric) ]


it would actually indent something more like this:

numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI])
[sapply(

listOfDataFrames[[myDF]][,columnsOI], is.numeric) ]


and as you can see, that doesn't help the issue of preventing the code from
wrapping around (and therefore doesn't help readability).  Is there some
simple way to flag that the next line is continuing?  Something like
python's \ at the end of a line?  I tried wrapping the whole thing around
curly braces { } but that didn't work, either.

   Thanks!
   Mike


Telescopes and bathyscaphes and sonar probes of Scottish lakes,
Tacoma Narrows bridge collapse explained with abstract phase-space maps,
Some x-ray slides, a music score, Minard's Napoleanic war:
The most exciting frontier is charting what's already here.
  -- xkcd

--
Help protect Wikipedia. Donate now:
http://wikimediafoundation.org/wiki/Support_Wikipedia/en



[R] speed up subsetting with certain conditions

2011-01-12 Thread Duke

Hi folks,

I am working on a project that requires subsetting of a found file based 
on some known file. The known file contains several lines like below:


chr1    3237546    3237547    rs52310428    0    +
chr1    3237549    3237550    rs52097582    0    +
chr2    4513326    4513327    rs29769280    0    +
chr2    4513337    4513338    rs33286009    0    +

where the first column can be chr2, chr1, chr12 etc... The second and 
third are numbers (coordinates). The found file contains lines like:


chr1    3213435    G    C
chr1    3237547    T    C
chr1    3237549    G    T
chr2    4513326    A    G
chr2    4513337    C    G

where the first column, again, can be chr1, chr2, chr12 etc... and the 
second is a number. What I have to do is to separate the found file to 
two files: one (foundY) contains lines that have the same first column 
and the second column in range of the two columns 2 and 3 of any line of 
known file, and one (foundN) contains lines that do not meet the 
previous condition. For the two examples above, foundN will be the first 
line, and foundY will be the next 4 lines.


What I came up with is this algorithm:

* get the uniq item in the first column of found file (chr1, chr2, 
chr12, chr13 etc...)
* for each of the uniq item, set subset of the known file and the found 
file that have same first column, then scanning each item in the known 
subset to see if any line meets any condition


The code is like below:

## CODE START###
# import known and found files to data frames
known <- read.table( "known.txt", sep="\t", header=FALSE )
found <- read.table( "found.txt", sep="\t", header=FALSE, fill=TRUE )

# get the uniq item in first column of found file
found.Chr <- as.character(found[!duplicated(found[[1]]),1])

# create two empty result data frames
foundN <- found[0,]
foundY <- found[0,]

# scan for each of the uniq items
for ( iChr in found.Chr ) {
  # subset of known and found with specific item
  found.iChr <- found[found[[1]]==iChr,]
  known.iChr <- known[known[[1]]==iChr,]

  # scan through all found subset items
  if ( nrow(known.iChr)>0 ) {
    for ( i in 1:nrow(found.iChr) ) {
      # no known range on this chromosome contains the found position
      if ( nrow(known.iChr[known.iChr[[3]]>=found.iChr[i,2] &
                           known.iChr[[2]]<=found.iChr[i,2],])==0 ) {
        foundN <- rbind( foundN, found.iChr[i,] )
      } else {
        # at least one known range contains the found position
        foundY <- rbind( foundY, found.iChr[i,] )
      }
    }
  }
}

## CODE END###

The code works well, but I tested it only on small known and found 
files. When trying with larger files (the known file can contain ~ 15 
million lines, the found ~ 15k lines), it takes hours to run.


I want to speed up the process, and I believe there must be a better 
algorithm to do this with R. My questions are:


* Does anybody have a better algorithm, comments or suggestions?
* I read (google) that matrices work faster than data frames. Can I use 
matrices for this case? (Are matrices for numbers only?)
* I read (google) that I should avoid rbind, and preallocate the data frame 
for faster speed. How would I do that in this case?
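
On the rbind question, one possible base-R direction is to preallocate a logical vector with one entry per found row, fill it in, and subset once at the end. This is only a sketch: the helper name split_found is invented, the columns are assumed to be in the order shown above (chromosome, start, end), and the example data frames simply retype the sample lines from this message.

```r
# Hypothetical helper: flag each found row that falls inside any known range
# on the same chromosome, then split found once at the end (no rbind in a loop).
split_found <- function(known, found) {
  in_known <- logical(nrow(found))            # preallocated result, all FALSE
  known_by_chr <- split(known[, 2:3], as.character(known[[1]]))

  for (chr in unique(as.character(found[[1]]))) {
    k <- known_by_chr[[chr]]
    if (is.null(k)) next                      # chromosome absent from known
    rows <- which(found[[1]] == chr)
    in_known[rows] <- vapply(found[[2]][rows], function(pos)
      any(k[[1]] <= pos & pos <= k[[2]], na.rm = TRUE), logical(1))
  }
  list(foundY = found[in_known, ], foundN = found[!in_known, ])
}

# Sample data retyped from the lines above
known <- data.frame(V1 = c("chr1", "chr1", "chr2", "chr2"),
                    V2 = c(3237546, 3237549, 4513326, 4513337),
                    V3 = c(3237547, 3237550, 4513327, 4513338),
                    stringsAsFactors = FALSE)
found <- data.frame(V1 = c("chr1", "chr1", "chr1", "chr2", "chr2"),
                    V2 = c(3213435, 3237547, 3237549, 4513326, 4513337),
                    V3 = c("G", "T", "G", "A", "C"),
                    V4 = c("C", "C", "T", "G", "G"),
                    stringsAsFactors = FALSE)

res <- split_found(known, found)
nrow(res$foundY)
nrow(res$foundN)
```

Note one difference from the loop above: found rows on a chromosome that never appears in known end up in foundN here, whereas the original loop skips them entirely.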


Thank you very much in advance,

Bests,

D.



Re: [R] syntax for extending a line in a script??

2011-01-12 Thread Brian Diggs

On 1/12/2011 2:46 PM, Mike Williamson wrote:

Hello,

 A hopefully simple question.  I use 'R' through emacs, but I suspect the
following would occur with any manner of text editor:

- my editor has a normally quite handy feature where it will
automatically indent to the appropriate level when I start a new line.
However, this occasionally creates cases where there is no friendly way to
break a long line of code into two lines which still function as one
command.  Therefore, I need a nice way to be able to flag 'R' to know that
the code is continuing on the next line.  Let me explain via example:

 numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI])
 [sapply(listOfDataFrames[[myDF]][,columnsOI],
is.numeric) ]


You can put the right hand side of the assignment in parentheses. Then 
even with the same breaks, the first line is not complete, so R will 
continue parsing.  And emacs still indents reasonably. (I added a second 
line break to try and avoid email wrapping affecting things).


numericColumns <- (names(listOfDataFrames[[myDF]][,columnsOI])
   [sapply(listOfDataFrames[[myDF]][,columnsOI],
   is.numeric)])
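
A small self-contained version of the same idea, for anyone who wants to try it without the original data. The data frame df is a made-up stand-in for listOfDataFrames[[myDF]][,columnsOI]:

```r
# Hypothetical stand-in for the real data
df <- data.frame(a = 1:3, b = letters[1:3], c = c(0.5, 1.5, 2.5),
                 stringsAsFactors = FALSE)

# The opening parenthesis keeps the expression incomplete at the end of the
# first line, so the parser carries on to the subsetting on the next line
numericColumns <- (names(df)
                   [sapply(df, is.numeric)])

numericColumns
```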


 As you can see in this case, I would *like* for these 2 lines of code to
be read as 1 line, but since the names(blah) command is sufficiently a
command on its own, 'R' see this as a completed line of code.  I could try
to break it up at different points, but emacs (and other text editors) takes
a guess as to the most intelligent way to indent, so that if I were to write
something like:

 numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI])
[sapply(
   listOfDataFrames[[myDF]][,columnsOI], is.numeric) ]


it would actually indent something more like this:

 numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI])
[sapply(

listOfDataFrames[[myDF]][,columnsOI], is.numeric) ]


and as you can see, that doesn't help the issue of preventing the code from
wrapping around (and therefore doesn't help readability).  Is there some
simple way to flag that the next line is continuing?  Something like
python's \ at the end of a line?  I tried wrapping the whole thing around
curly braces { } but that didn't work, either.


Putting the right hand side in curly braces might work too.  That would 
turn it into a code block, which should evaluate to whatever the last 
statement in the code block is (which in this case is the only 
statement).  I wouldn't be surprised if there is some case where curly 
braces might lead to a different result; parentheses shouldn't (but I 
may be wrong).
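
One way to check how the parser treats the two kinds of delimiter, without evaluating anything, is to hand the text to parse(). In this sketch (the helper parses_ok is invented for illustration) a line break inside parentheses is ignored, while inside curly braces it still ends the statement, so a following line that starts with [ is rejected:

```r
# Hypothetical helper: TRUE if the text is syntactically valid R
parses_ok <- function(txt) {
  !inherits(try(parse(text = txt), silent = TRUE), "try-error")
}

with_parens <- "x <- (names(df)\n[1])"   # break inside ( )
with_braces <- "x <- {names(df)\n[1]}"   # break inside { }
no_wrapper  <- "x <- names(df)\n[1]"     # break with nothing left open

parses_ok(with_parens)
parses_ok(with_braces)
parses_ok(no_wrapper)
```

So curly braces only help if the line break falls somewhere the statement is visibly unfinished, for example straight after an opening bracket or an operator.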



Thanks!
Mike


--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University



Re: [R] speed up subsetting with certain conditions

2011-01-12 Thread Martin Morgan

On 1/12/2011 2:52 PM, Duke wrote:

* any body has a better algorithm or comments or suggestion?


The Bioconductor project has many tools for dealing with 
sequence-related data. With the data


k <- read.table(textConnection(
"chr1 3237546 3237547 rs52310428 0 +
chr1 3237549 3237550 rs52097582 0 +
chr2 4513326 4513327 rs29769280 0 +
chr2 4513337 4513338 rs33286009 0 +"))

f <- read.table(textConnection(
"chr1 3213435 G C
chr1 3237547 T C
chr1 3237549 G T
chr2 4513326 A G
chr2 4513337 C G"))

One might use the GenomicRanges package as

library(GenomicRanges)
kgr <- with(k, GRanges(V1, IRanges(V2, V3, names=V4), V6, score=V5))
fgr <- with(f, GRanges(V1, IRanges(V2, width=1), V3=V3, V4=V4))
olaps <- findOverlaps(fgr, kgr)
idx <- countOverlaps(fgr, kgr) != 0

resulting in

 idx
[1] FALSE  TRUE  TRUE  TRUE  TRUE

This will be fast.

One could write foundY with as.data.frame(fgr[idx]) (maybe a little 
editing) but likely one would want to stay in R / Bioc and do something 
more interesting...


See

http://bioconductor.org/install/index.html

Martin





--
Dr. Martin Morgan, PhD
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109



Re: [R] syntax for extending a line in a script??

2011-01-12 Thread David Winsemius


On Jan 12, 2011, at 5:46 PM, Mike Williamson wrote:


Hello,

   A hopefully simple question.  I use 'R' through emacs, but I suspect the
following would occur with any manner of text editor:

  - my editor has a normally quite handy feature where it will
  automatically indent to the appropriate level when I start a new line.
  However, this occasionally creates cases where there is no friendly way to
  break a long line of code into two lines which still function as one
  command.  Therefore, I need a nice way to be able to flag 'R' to know that
  the code is continuing on the next line.  Let me explain via example:


My practice is to use the opening of a paired code delimiter like [  
or ( at the end of a line as I have modified your code to show:


   numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI])[
       sapply(listOfDataFrames[[myDF]][,columnsOI],
              is.numeric) ]


David Winsemius, MD
West Hartford, CT



Re: [R] aggredating date data

2011-01-12 Thread Jannis
?aggregate

I would convert your date character string into a date object (as.POSIXct) and 
then extract month or week numbers (?format) from this vector and use them as 
indices for the aggregate function. There may be more elegant ways though.
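
A minimal sketch of that approach on made-up data. The column names follow the layout in the question below; the values, the use of as.Date rather than as.POSIXct, and the choice of sum as the aggregation are assumptions for illustration:

```r
# Hypothetical daily data in the layout described in the question
daily <- data.frame(
  ID = "A",
  datadate = c("2011-01-03", "2011-01-04", "2011-01-10", "2011-02-01"),
  value_for_day = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

dates <- as.Date(daily$datadate, format = "%Y-%m-%d")
daily$datamonth <- format(dates, "%Y-%m")   # e.g. "2011-01"
daily$dataweek  <- format(dates, "%Y-%W")   # year plus week number (Monday start)

# Assumption: the daily values should be summed; swap in mean etc. if not
monthly <- aggregate(value_for_day ~ ID + datamonth, data = daily, FUN = sum)
weekly  <- aggregate(value_for_day ~ ID + dataweek,  data = daily, FUN = sum)

monthly
weekly
```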


HTH
Jannis

--- analys...@hotmail.com analys...@hotmail.com wrote on Wed, 12.1.2011:

 From: analys...@hotmail.com analys...@hotmail.com
 Subject: [R] aggredating date data
 To: r-help@r-project.org
 Date: Wednesday, 12 January 2011, 20:20
 I tried a date by date forecast of a
 time series and it seems to be
 too wild.  How can I aggregate the date into weeks or
 months as
 required?
 
 Thanks.
 
 The input looks like
 
 ID datadate(-MM-DD)  value_for_day
 -- ----------------  -------------
 
 and I want to be able to change it to
 
 ID dataweek value_for_week
 
 or
 
 ID datamonth value_for_month
 


[R] Grouped bars in barplot

2011-01-12 Thread Steve Murray

Dear all,
 
I am trying to make a barplot with clustered pairs of bars, using class=numeric 
data and the following command:
 
barplot(c(bline_precip[10,9], bline_runoff[10,9], cccma_precip[10,9], 
cccma_runoff[10,9], csiro_precip[10,9], csiro_runoff[10,9], ipsl_precip[10,9], 
ipsl_runoff[10,9], mpi_precip[10,9], mpi_runoff[10,9], ncar_precip[10,9], 
ncar_runoff[10,9], ukmo_precip[10,9], ukmo_runoff[10,9]), beside=TRUE, 
space=c(0,2))
 
This results in all bars being packed tightly together, but with no gap between 
each pair. I suspect the problem is something to do with the data not being a 
matrix, but I've tried using as.matrix for each data element and this doesn't 
seem to work. If any one has any suggestions I'd be very grateful to hear them.
 
Also, I'm hoping to put a label beneath each pair of bars on the x-axis, in the 
centre. At present I can only get labels to appear directly underneath a single 
bar, as opposed to the centre of the pair of bars. Does anyone have any 
suggestions for solving this?
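
For reference, a sketch of the matrix layout that barplot() expects for clustered bars. The numbers are invented stand-ins for bline_precip[10,9], bline_runoff[10,9] and so on, and the helper name plot_pairs is hypothetical:

```r
# Hypothetical values: one column per model, one row per variable
models <- c("bline", "cccma", "csiro", "ipsl", "mpi", "ncar", "ukmo")
vals <- rbind(precip = c(10, 12, 9, 11, 13, 8, 10),
              runoff = c(4, 5, 3, 6, 5, 2, 4))
colnames(vals) <- models

plot_pairs <- function(m) {
  # With a matrix and beside = TRUE, each column becomes one cluster of bars;
  # space = c(0, 2) means no gap within a pair and two bar widths between pairs
  mids <- barplot(m, beside = TRUE, space = c(0, 2),
                  axisnames = FALSE, legend.text = rownames(m))
  centres <- colMeans(mids)            # x position of the middle of each pair
  axis(1, at = centres, labels = colnames(m), tick = FALSE)
  invisible(centres)
}

centres <- plot_pairs(vals)
```

barplot() returns the bar midpoints, so averaging them by column gives the centre of each pair for the labels. Leaving axisnames at its default and relying on the column names should give a similar result without the explicit axis() call.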
 
Many thanks for any help offered.
 
Steve 


Re: [R] navigating in lists

2011-01-12 Thread Erik Gregory
Or, if for some reason the lists differ in length...

test=list(a=list(1,2),b=list(3,4),c=list(5,6,7))


picker <- function(x, i) {
    if(length(x) >= i)
      x[[i]]
    else
      NA
}

pick <- function(list,i) {
    sapply(list, function(x) picker(x, i))
}

 pick(test, 1)
a b c 
1 3 5 
 pick(test, 2)
a b c 
2 4 6 
 pick(test, 3)
 a  b  c 
NA NA  7 
 pick(test, 4)
 a  b  c 
NA NA NA 

-Erik Gregory
Student Assistant, California EPA
CSU Sacramento, Mathematics

- Original Message 
From: Greg Snow greg.s...@imail.org
To: Jannis bt_jan...@yahoo.de; r-help@r-project.org r-help@r-project.org
Sent: Wed, January 12, 2011 1:17:54 PM
Subject: Re: [R] navigating in lists

 sapply(test, '[[', 1)
a b c 
1 3 5

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Jannis
 Sent: Wednesday, January 12, 2011 1:50 PM
 To: r-help@r-project.org
 Subject: [R] navigating in lists
 
 Dear list members,
 
 
 I am stuck with navigating in a rather complicated list object.
 
 In general I would need a solution to access all first (or other)
 elements of the different sublists in one list:
 
 test=list(a=list(1,2),b=list(3,4),c=list(5,6))
 
 like:
 
 test[[1:3]][[1]]
 
 which should result in
 
 c(1,3,5)
 
 
 Is there any way to access lists in such a way? Using unlist would
 create quite complicated objects
 
 Cheers
 Jannis
 
 
 


[R] Multivariate autoregressive models with lasso penalization

2011-01-12 Thread John M. Drake
I wish to estimate sparse causal networks from simulated time series data.
Although there's some discussion about this problem in the literature (at
least a few authors have used lasso and l(1,2) regularization to enforce
sparsity in multivariate autoregressive models, e.g.,
http://user.cs.tu-berlin.de/~nkraemer/papers/grplasso_causality.pdf), I
can't find any R packages with these capabilities.

Has anyone in the R community experimented with such or put code out for
this problem?

Many thanks.
John

-- 
John M. Drake, Ph.D.
Associate Professor
University of Georgia
Odum School of Ecology
Athens, GA 30602-2202

phone:  706.583.5539
fax:   706.542.4819
email:   jdr...@uga.edu
skype:  john.drake.uga
web: http://dragonfly.ecology.uga.edu/drakelab



Re: [R] speed up subsetting with certain conditions

2011-01-12 Thread Duke

On 1/12/11 6:12 PM, Martin Morgan wrote:
This will be fast.


Thanks so much for your suggestion Martin. I had Bioconductor installed 
but I honestly do not know all its applications. Anyway, I am testing 
GenomicRanges with my data now. I will report back when I get the result.




One could write foundY with as.data.frame(fgr[idx]) (maybe a little 
editing) but likely one would want to stay in R / Bioc and do 
something more interesting...




I suppose foundN <- as.data.frame(fgr[!idx]) and foundY <- 
as.data.frame(fgr[idx]) as you suggested, but I don't really understand 
your last comment :).


Thanks,

D.



Re: [R] syntax for extending a line in a script??

2011-01-12 Thread Mike Williamson
Thanks to Brian & David!  For some reason, I'd thought to use {, but
not (.  I guess I can chalk that up to a slow brain.  Interestingly (and I
didn't bother to figure out why), the { didn't work for me.  But I tried a
few cases, and ( always seems to work, so far.

  Regards,
   Mike


Telescopes and bathyscaphes and sonar probes of Scottish lakes,
Tacoma Narrows bridge collapse explained with abstract phase-space maps,
Some x-ray slides, a music score, Minard's Napoleanic war:
The most exciting frontier is charting what's already here.
  -- xkcd

--
Help protect Wikipedia. Donate now:
http://wikimediafoundation.org/wiki/Support_Wikipedia/en


On Wed, Jan 12, 2011 at 6:08 PM, Brian Diggs dig...@ohsu.edu wrote:

 On 1/12/2011 2:46 PM, Mike Williamson wrote:

 Hello,

 A hopefully simple question.  I use 'R' through emacs, but I suspect
 the
 following would occur with any manner of text editor:

- my editor has a normally quite handy feature where it will
automatically indent to the appropriate level when I start a new line.
However, this occasionally creates cases where there is no friendly way
 to
break a long line of code into two lines which still function as one
command.  Therefore, I need a nice way to be able to flag 'R' to know
 that
the code is continuing on the next line.  Let me explain via example:

 numericColumns- names(listOfDataFrames[[myDF]][,columnsOI])
 [sapply(listOfDataFrames[[myDF]][,columnsOI],
 is.numeric) ]


 You can put the right hand side of the assignment in parentheses. Then even
 with the same breaks, the first line is not complete, so R will continue
 parsing.  And emacs still indents reasonably. (I added a second line break to
 try to avoid email wrapping affecting things).


 numericColumns <- (names(listOfDataFrames[[myDF]][,columnsOI])
   [sapply(listOfDataFrames[[myDF]][,columnsOI],
   is.numeric)])

  As you can see in this case, I would *like* for these 2 lines of code to
 be read as 1 line, but since the names(blah) command is sufficiently a
 command on its own, 'R' sees this as a completed line of code.  I could try
 to break it up at different points, but emacs (and other text editors) takes
 a guess as to the most intelligent way to indent, so that if I were to write
 something like:

 numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI])
 [sapply(
   listOfDataFrames[[myDF]][,columnsOI], is.numeric) ]


 it would actually indent something more like this:

 numericColumns <- names(listOfDataFrames[[myDF]][,columnsOI])
 [sapply(

 listOfDataFrames[[myDF]][,columnsOI], is.numeric) ]


 and as you can see, that doesn't help the issue of preventing the code from
 wrapping around (and therefore doesn't help readability).  Is there some
 simple way to flag that the next line is continuing?  Something like
 python's \ at the end of a line?  I tried wrapping the whole thing in
 curly braces { } but that didn't work, either.


 Putting the right hand side in curly braces might work too.  That would
 turn it into a code block, which should evaluate to whatever the last
 statement in the code block is (which in this case is the only statement).
  I wouldn't be surprised if there is some case where curly braces might lead
 to a different result; parentheses shouldn't (but I may be wrong).
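
A minimal sketch of the parenthesis approach discussed above. The data frame
df and its columns are made up for illustration; they stand in for
listOfDataFrames[[myDF]][,columnsOI], which is not available here.

```r
# Hypothetical stand-in for listOfDataFrames[[myDF]][, columnsOI]
df <- data.frame(a = 1:3,
                 b = c("x", "y", "z"),
                 c = c(0.5, 1.5, 2.5),
                 stringsAsFactors = FALSE)

# The opening parenthesis leaves the first line syntactically incomplete,
# so the parser keeps reading until the matching ")" on the last line.
numericColumns <- (names(df)
                   [sapply(df, is.numeric)])

# Leaving the "[" open at the end of the first line has the same effect.
numericColumns2 <- names(df)[
  sapply(df, is.numeric)]

print(numericColumns)
```

Either form should survive automatic re-indentation by an editor, because the
parser only treats the statement as finished once every bracket is closed.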

 Thanks!
Mike


 --
 Brian S. Diggs, PhD
 Senior Research Associate, Department of Surgery
 Oregon Health & Science University




Re: [R] How to disable using enter key to exit the browser in debugging mode

2011-01-12 Thread Gene Leynes
That also drives me crazy!

I don't have that problem when I use the StatEt plug-in for Eclipse.

Of course, using a new IDE is a big undertaking, but I can assure you: it's
worth it!  This is just one small benefit.

On Wed, Jan 12, 2011 at 4:00 PM, Feng Li m...@feng.li wrote:

 Dear R,

 How can I disable using enter key to exit the browser() in debug mode? I
 would love to have this option because it is so annoying to jump out of the
 debugging mode unexpectedly when I don't want to. I guess some of us have
 encountered at least one of these situations:

 1, Accidentally pressed the enter key within the browser.

 2, Copy and paste a piece of debugging code containing empty lines to the
 prompt within the debugging mode.

 3, If I paste a piece of code to the prompt to debug as follows, it will
 eventually jump out before I can do anything.

 ### copy starting from this line ###
 test <- function()
 {
 x <- 5
 browser()
 y <- 4
 }

 test()

 ### end of copy at this line ###


 Any suggestions are most welcome!

 Feng

 --
 Feng Li
 Department of Statistics
 Stockholm University
 106 91 Stockholm, Sweden
 http://feng.li/
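
For the question above, one setting that may help is the base R option
browserNLdisabled, which is documented in ?options as controlling whether an
empty line acts as a command in the browser. This is only a sketch; whether
the option is available depends on the R version in use.

```r
# Save the previous setting so it can be restored later
old <- options(browserNLdisabled = TRUE)

# From here on, an empty line at the Browse[n]> prompt should be ignored
# instead of being treated as a step/continue command.
getOption("browserNLdisabled")

# Restore the previous behaviour when finished debugging:
# options(old)
```

With the option set, pasting code that contains blank lines into the browser
prompt should no longer drop out of debugging mode.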



[R] 2d plot with modification of plotting symbol to indicate third dimension.

2011-01-12 Thread John Sorkin
I would like to plot 3-dimensional data on a two-dimensional scatter-plot.
Is there a way I can automatically modify the plot symbol (e.g. changing size 
or color) to indicate the value of a third variable? E.g. how can I plot weight 
vs. age and indicate the value of muscle mass for each weight-age pair by 
making the size of the plot point proportional to the subject's muscle mass?
Thanks,
John
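
A possible sketch of what is being asked, using only base graphics. The
variables age, weight and muscle are simulated here purely for illustration;
they are not from the original post.

```r
set.seed(1)
n      <- 30
age    <- runif(n, 20, 80)
weight <- 50 + 0.3 * age + rnorm(n, sd = 5)
muscle <- runif(n, 20, 40)

# Option 1: symbols() draws circles whose radii follow the third variable.
# Taking sqrt() makes the circle *area* proportional to muscle mass.
symbols(age, weight, circles = sqrt(muscle), inches = 0.15,
        fg = "black", bg = "grey80",
        xlab = "Age", ylab = "Weight", main = "Circle area ~ muscle mass")

# Option 2: plot() with the cex argument rescaled to a convenient range
cex_vals <- 0.5 + 2.5 * (muscle - min(muscle)) / diff(range(muscle))
plot(age, weight, pch = 19, cex = cex_vals, xlab = "Age", ylab = "Weight")
```

The col argument of plot() accepts a vector in the same way, so colour could
be mapped to the third variable instead of (or in addition to) size.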


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)



[R] Openbugs and rbugs on mac with wine

2011-01-12 Thread Corey Sparks

Hello list,
I’ve been trying to get OpenBUGS running on my mac using the wine  
emulator.  I can run Openbugs just fine by doing:

wine ~/OpenBUGS312/OpenBUGS.exe

That works in the terminal, so OpenBUGS itself runs.  When I try to run the
schools example using rbugs(), the OpenBUGS process starts in wine, but it
just sits there: no log, no script, no output of any sort.  The rbugs()
call makes the init, data, model and script files, but there seems
to be a problem with R piping the script to OpenBUGS. Here is my example:


library(rbugs)
data(schools)
J <- nrow(schools)
y <- schools$estimate
y <- rnorm(length(y))
sigma.y <- schools$sd
schools.data <- list ("J", "y", "sigma.y")
## schools.data <- list(J=J, y=y, sigma.y=sigma.y)
inits <- function() {list (theta=rnorm(J,0,100),
   mu.theta=rnorm(1,0,100),
   sigma.theta=runif(1,0,100))}
parameters <- c("theta", "mu.theta", "sigma.theta")
schools.bug <- file.path(.path.package("rbugs"), "bugs/model",
                         "schools.bug")

file.show(schools.bug)

#This almost runs, it makes all files, but doesn't run the script
schools.sim <- rbugs(data=schools.data, inits, parameters,
 schools.bug, n.chains=3, n.iter=1,seed=123,
 workingDir="/Users/ozd504/Documents/",
 bugsWorkingDir="/Users/ozd504/Documents/",
 useWine=TRUE,
 wine="/opt/local/bin/wine",
 bugs = "/Users/ozd504/OpenBUGS312/OpenBUGS.exe",OpenBugs=T,
 debug=TRUE)

This returns an error saying that bugs terminated before the coda
could be written.


I can also send a screen shot of what happens if anyone is  
interested.  Any help would be most appreciated.  Here is my  
sessionInfo()

R version 2.12.1 (2010-12-16)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] R2WinBUGS_2.1-16 coda_0.14-2  lattice_0.19-13  rbugs_0.4-9

loaded via a namespace (and not attached):
[1] grid_2.12.1  tools_2.12.1


Thanks,
Corey

Corey Sparks
Assistant Professor
Department of Demography and Organization Studies
University of Texas at San Antonio
501 West Durango Blvd
Monterey Building 2.270C
San Antonio, TX 78207
210-458-3166
corey.sparks 'at' utsa.edu





Re: [R] Grouped bars in barplot

2011-01-12 Thread Peter Alspach
Tena koe Steve

Convert your data into a matrix:

dataMat <- matrix(c(bline_precip[10,9], bline_runoff[10,9],
cccma_precip[10,9], cccma_runoff[10,9],
csiro_precip[10,9], csiro_runoff[10,9],
ipsl_precip[10,9], ipsl_runoff[10,9],
mpi_precip[10,9], mpi_runoff[10,9],
ncar_precip[10,9], ncar_runoff[10,9],
ukmo_precip[10,9], ukmo_runoff[10,9]), nrow=2)

Since I don't know the nature of your data I will use a simple example:

dataMat <- matrix(1:14, nrow=2)
barplot(dataMat, beside=TRUE, space=c(0,2))
tTicks <- barplot(dataMat, beside=TRUE, space=c(0,2))
tTicks <- tapply(tTicks, rep(1:7, each=2), mean)
axis(1, tTicks, letters[1:7])

Is that what you want?
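
A slightly more compact variant of the same idea, offered as a sketch: with
beside=TRUE, barplot() returns a matrix of bar midpoints with one column per
group, so colMeans() gives the centre of each pair directly. The row and
column names below are invented placeholders.

```r
# Toy data: 2 rows (the two bars in each pair) by 7 columns (one per group)
dataMat <- matrix(1:14, nrow = 2,
                  dimnames = list(c("precip", "runoff"), letters[1:7]))

# axisnames = FALSE suppresses the default group labels so that the
# labels can be placed explicitly below.
mids <- barplot(dataMat, beside = TRUE, space = c(0, 2), axisnames = FALSE)

# One centre per pair of bars
groupCentres <- colMeans(mids)
axis(1, at = groupCentres, labels = colnames(dataMat), tick = FALSE)
```

If the matrix has column names and axisnames is left at its default,
barplot() should already centre one label under each group.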

HTH 

Peter Alspach

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Steve Murray
 Sent: Thursday, 13 January 2011 12:14 p.m.
 To: r-help@r-project.org
 Subject: [R] Grouped bars in barplot
 
 
 Dear all,
 
 I am trying to make a barplot with clustered pairs of bars, using
 class=numeric data and the following command:
 
 barplot(c(bline_precip[10,9], bline_runoff[10,9], cccma_precip[10,9],
 cccma_runoff[10,9], csiro_precip[10,9], csiro_runoff[10,9],
 ipsl_precip[10,9], ipsl_runoff[10,9], mpi_precip[10,9],
 mpi_runoff[10,9], ncar_precip[10,9], ncar_runoff[10,9],
 ukmo_precip[10,9], ukmo_runoff[10,9]), beside=TRUE, space=c(0,2))
 
 This results in all bars being packed tightly together, but with no gap
 between each pair. I suspect the problem is something to do with the
 data not being a matrix, but I've tried using as.matrix for each data
 element and this doesn't seem to work. If any one has any suggestions
 I'd be very grateful to hear them.
 
 Also, I'm hoping to put a label beneath each pair of bars on the x-
 axis, in the centre. At present I can only get labels to appear
 directly underneath a single bar, as opposed to the centre of the pair
 of bars. Does anyone have any suggestions for solving this?
 
 Many thanks for any help offered.
 
 Steve


[R] easy loop question

2011-01-12 Thread Sebastián Daza

Hi everyone,

I am new to R and to programming. I have tried to remove the out-of-range
values in some variables using a loop:


1)

var <- names(est8vo[, 77:83])   # I got the variable names

> var
[1] "p16.1" "p16.2" "p16.3" "p16.4" "p16.5" "p16.6" "p16.7"

for (i in 1:7)  {
var.i <- var[i]
est8vo$var.i[ est8vo$var.i==3] <- 99
  }

I got this error:

Error in `$<-.data.frame`(`*tmp*`, "var.i", value = numeric(0)) :
  replacement has 0 rows, data has 215700


2) The second step would be to define the factors, but I got the same error:

for (i in 1:7)  {
var.i <- var[i]
est8vo$var.i <- factor(est8vo$var.i,
  levels=c(0, 1, 2, 99),
  labels=c("vacío", "sí", "no", "doble marca")
  )
  }


I don't know how to do it.
Thank you in advance!
Sebastian

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Rotated, Right-Justified Labels for Shortened Tick Marks

2011-01-12 Thread dms
Hello R-help,

I'm trying to make a fairly simple plot axis that goes something like
this:

plot(-10:10,-10:10, yaxt='n')
axis(side=2, las=1, hadj=1,  tck=-.01, cex.axis=.6)

...but as you can see, the labels are not close enough to the y-axis
(where I want them... to save space for publication).

Can anybody help me figure out how to move these labels over to the
right a bit?
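
One way this is commonly handled, sketched here as a suggestion rather than a
definitive answer: axis() accepts the mgp graphical parameter, whose second
element sets the distance (in margin lines) between the axis and its tick
labels. The value 0.3 below is an arbitrary choice for illustration.

```r
plot(-10:10, -10:10, yaxt = "n")

# mgp = c(title line, label line, axis line); the default is c(3, 1, 0).
# A smaller second element pulls the tick labels closer to the axis.
label_line <- 0.3   # assumed value, tune to taste
axis(side = 2, las = 1, tck = -0.01, cex.axis = 0.6,
     mgp = c(3, label_line, 0))
```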

Thanks,

-D



Re: [R] aggredating date data

2011-01-12 Thread Gene Leynes
I like the zoo package, and there are several helpful examples.
library(zoo)

You can easily convert your data into a zoo object.
I was actually just doing this using this function:
LoadReturnData=function(x){
ret = read.csv(x)
ret = zoo(ret[ , -1], as.Date(ret[ , 1]))
colnames(ret) = toupper(colnames(ret))
return(ret)
}
fnd = LoadReturnData('/Data/SomeSpecialData.csv')

My data is already in weeks, and aggregating to months is easy using
as.yearmon

MonthIndex=as.yearmon(index(fnd))
aggregate(.~MonthIndex, data=fnd, sum)

If you have daily data and you need weeks, then you'll have to create a
vector to indicate the week, like the MonthIndex above.

e.g. for 365 days
WeekIndex = rep(1:53, each=7, length.out=365)
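
For readers without zoo installed, the same week and month aggregation can be
sketched in base R. The data frame below is simulated; its column names follow
the layout described in the original question, and everything else is an
assumption made for illustration.

```r
set.seed(42)
daily <- data.frame(
  ID            = 1,
  datadate      = seq(as.Date("2011-01-01"), by = "day", length.out = 90),
  value_for_day = rpois(90, lambda = 10)
)

# Build grouping keys from the date
daily$datamonth <- format(daily$datadate, "%Y-%m")
daily$dataweek  <- cut(daily$datadate, breaks = "week")  # weeks start on Monday

# Sum the daily values within each ID and period
monthly <- aggregate(value_for_day ~ ID + datamonth, data = daily, FUN = sum)
weekly  <- aggregate(value_for_day ~ ID + dataweek,  data = daily, FUN = sum)
names(monthly)[3] <- "value_for_month"
names(weekly)[3]  <- "value_for_week"

head(monthly)
```

Using real calendar weeks via cut() avoids the drift that a fixed
rep(1:53, each=7) index would introduce across several years.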

On Wed, Jan 12, 2011 at 2:20 PM, analys...@hotmail.com 
analys...@hotmail.com wrote:

 I tried a date by date forecast of a time series and it seems to be
 too wild.  How can I aggregate the date into weeks or months as
 required?

 Thanks.

 The input looks like

 ID datadate(-MM-DD)  value_for_day
 -- ----
 -- --   

 and I want to be able to change it to

 ID dataweek value_for_week

 or

 ID datamonth value_ for_ month



[R] What does the shell() command do?

2011-01-12 Thread l.chhay

Dear R community,

I am trying to understand what the shell() function does.

An example is:

xfile <- shell(paste("dir/b ",
paste(directory.folder,file.name,sep="")),intern=T)

I'm afraid I wasn't able to completely understand the explanation in the
help files.
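
As a rough illustration of what the call above does: shell() is a
Windows-only function that runs a command through the command interpreter,
and intern=TRUE asks for the command's output to be returned as a character
vector rather than printed. The helper run_cmd below is hypothetical; it
falls back to system() so the sketch also runs on other platforms.

```r
# Hypothetical wrapper: shell() on Windows, system() elsewhere
run_cmd <- function(cmd) {
  if (.Platform$OS.type == "windows") {
    shell(cmd, intern = TRUE)
  } else {
    system(cmd, intern = TRUE)
  }
}

# The command's output comes back as a character vector, one element per line
out <- run_cmd("echo hello")
print(out)

# "dir /b <path>" lists bare file names; list.files() is a portable
# alternative that needs no external command.
files <- list.files(tempdir())
```

So in the original example, xfile would hold the file names printed by the
Windows command dir /b for the given path.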

Thanks for your help!

Leanne.


[R] standard errors in johansen test

2011-01-12 Thread Walter Zhang
Dear all,



I have a question: how do I get the standard errors of alpha and beta
when using ca.jo to test for cointegration?

In the paper by Bernhard Pfaff (Kronberg im Taunus), “VAR, SVAR and SVEC
Models: Implementation Within R Package”, pp. 24-25, the standard errors are
listed in Table 5 following the code:

R> vecm.r1 <- cajorls(vecm, r = 1)

I tried this in my Mac R, but failed.

Thanks.


-- 
  Best Regards

  Walter   an ACCA Affiliate (Association of Chartered Certified
Accountants)

   I COME FROM CHINA

I have a house, facing the sea, with spring warmth and flowers in bloom



[R] unicodepdf font problem

2011-01-12 Thread tdenes

Dear List,

I would like to print a plot into pdf. The problem is that the character
\U0171 is replaced by a simple 'u' (i.e. without accents) in the pdf file.

Example:
# this works fine
plot(1,type="n")
text(1,1,"print \U0171")

# this fails
pdf("trial.pdf")
plot(1,type="n")
text(1,1,"print \U0171")
dev.off()

I found an earlier post at
http://www.mail-archive.com/r-help@r-project.org/msg65541.html, but it is
too hard to understand at my R-level. Any help is appreciated.
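
A hedged sketch of one approach that may help: the pdf() device has an
encoding argument, and the character in question (u with double acute) is
part of ISO Latin-2, for which R ships the encoding file ISOLatin2.enc. The
output path below is a temporary file chosen for illustration, and whether
the glyph renders also depends on the font used.

```r
out <- file.path(tempdir(), "trial.pdf")   # illustrative output location

# Ask the pdf device to use the Latin-2 encoding instead of the default
pdf(out, encoding = "ISOLatin2.enc")
plot(1, type = "n")
text(1, 1, "print \u0171")
dev.off()

# Alternative where available (check capabilities("cairo") first):
# cairo_pdf(out); plot(1, type = "n"); text(1, 1, "print \u0171"); dev.off()
```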

Regards,
  Denes



[R] question about svm(e1071)

2011-01-12 Thread mutohrn
Dear all,

I ran an SVM calculation using the e1071 library on a microarray dataset 
(http://www.iu.a.u-tokyo.ac.jp/~kadota/R/data_Singh_RMA_3274.txt).
Then I shuffled the data samples and ran the SVM calculation again.
The results of the two calculations were different (in SV, coefs and weights).

I attached the script below. Could you please tell me why this happens?
If possible, please tell me how to make them equal.

Best regards,

Hiro

### Script start ###

library(e1071)
data <- read.table(
  'http://www.iu.a.u-tokyo.ac.jp/~kadota/R/data_Singh_RMA_3274.txt',
  header=TRUE, row.names=1, sep="\t", quote="")

data.cl <- rep(NA,ncol(data))
data.cl[grep('Normal',colnames(data))] <- 'Normal'
data.cl[grep('Tumour',colnames(data))] <- 'Tumour'

s <- sample(ncol(data))

m   <- svm(x=t(data), y=factor(data.cl   ), scale=T,
           type="C-classification", kernel="linear")
m.s <- svm(x=t(data[,s]), y=factor(data.cl[s]), scale=T,
           type="C-classification", kernel="linear")

w   <- t(m  $coefs) %*% m$SV
w.s <- t(m.s$coefs) %*% m.s$SV

# SV and coefs are slightly different
sum(abs(m$SV[order(rownames(m$SV)),] - m.s$SV[order(rownames(m.s$SV)),]))
sum(abs(m$coefs[order(rownames(m$SV))] -m.s$coefs[order(rownames(m.s$SV))]))

# rank of weight are not identical
all(rank(w)==rank(w.s))

### Script end ###




[R] Repeating value occurence

2011-01-12 Thread Rustamali Manesiya
How can I achieve this in R using the seq or rep function?

c(-1,0,1,0,-1,0,1,0,-1,0)

The values range between -1 and 1, and I want it such that there could be
n points between -1 and 1.
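
A short sketch of how this might be done. The first line reproduces the exact
vector shown; the function triangle_wave is a hypothetical generalisation that
assumes "n points between -1 and 1" means n evenly spaced interior values on
each rising and falling leg.

```r
# The exact vector from the question: recycle one period to length 10
x <- rep(c(-1, 0, 1, 0), length.out = 10)

# Hypothetical generalisation with n interior points per leg
triangle_wave <- function(n, length.out) {
  up   <- seq(-1, 1, length.out = n + 2)   # rising leg, endpoints included
  down <- rev(up)[-c(1, length(up))]       # falling leg, endpoints dropped
  rep(c(up, down), length.out = length.out)
}

y <- triangle_wave(n = 1, length.out = 10)  # same pattern as x
z <- triangle_wave(n = 3, length.out = 8)   # steps of 0.5 instead of 1
```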

Anyone? Please help Thanks
Rusty



Re: [R] Help with Data Transformation - RESOLVED

2011-01-12 Thread Guy Jett
Hi Dennis,
SOLVED!!!

My thanks to both you, John, and others who chimed in.  Took a little more 
digging before finally working, but it is working!

Here's a little more of what I did and the ultimate resolutions:
I installed the reshape2 library from CRAN.
Executed
> (.packages())
 [1] "reshape2"  "stats"     "graphics"  "grDevices" "datasets"  "rcom"
 [7] "rscproxy"  "utils"     "methods"   "base"
confirming presence.
Executed your toy script (stepwise so as not to include the output as input)
Received the following result at the end:
> cast(df, idnum ~ lab, value = 'y')
Error: could not find function "cast"

The package 'reshape2' documentation indicates that either acast() or dcast() 
is used, and gives no example of cast() itself.
I resolved this by trying both acast() and dcast().  Both yielded the same 
screen results.

Turning to my own data, I was able to successfully execute the operation on one 
of my subsets AFTER cleaning up some results with duplicate PARLABELs 
(duplicates turn the resulting matrix into a count of values, but the docs 
give an indication of the resolution).  So it looks like I'm on my way after 
some further data clean-up.

I am using dcast() as I wish to stay 2-dimensional (personal limitations on 
brain()).

Cheers and thanks again to all,
Guy
ITSI,  A Gilbane Company
(925) 946-3340 Direct
(925) 457-4168 ITSI Cell
gj...@itsi.com

From: Dennis Murphy [mailto:djmu...@gmail.com]
Sent: Wednesday, January 12, 2011 1:19 PM
To: Guy Jett
Cc: r-help@r-project.org
Subject: Re: [R] Help with Data Transformation

Hi:

This seems like a problem that is well suited for the cast() function in 
package reshape2. Here's a toy example:

library(reshape2)
df <- data.frame(idnum = rep(101:104, c(3, 2, 4, 3)),
  lab = unlist(sapply(c(3, 2, 4, 3), function(x) sample(LETTERS[1:6], x))),
  y = rpois(12, 7))
df
   idnum lab  y
1    101   C  7
2    101   F 11
3    101   E  4
4    102   B 10
5    102   A  7
6    103   E  6
7    103   D  9
8    103   B  3
9    103   F 12
10   104   C 10
11   104   D  9
12   104   A  5
cast(df, idnum ~ lab, value = 'y')
  idnum  A  B  C  D  E  F
1   101 NA NA  7 NA  4 11
2   102  7 10 NA NA NA NA
3   103 NA  3 NA  9  6 12
4   104  5 NA 10  9 NA NA

Since you have multiple 'invariant' variables, you could use something like

cast(df, . ~ lab, value = 'y')

This would make all variables other than lab the 'rows', the values of lab as 
separate 'columns', with the value of the variable y inserted in the 
appropriate locations, NA fill otherwise.

cast() is an alternative to the reshape() function in base R.
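
For comparison, a minimal sketch of the base R reshape() route mentioned
above. The column names fldsampid, PARLABEL and Result follow the description
in the quoted message; the sample IDs and values are invented.

```r
# Long format: one row per sample/parameter combination (made-up values)
long <- data.frame(
  fldsampid = c("s1", "s1", "s2", "s2", "s3"),
  PARLABEL  = c("AG", "AL", "AG", "CU", "AL"),
  Result    = c(8, 5, 9, 8, 2),
  stringsAsFactors = FALSE
)

# Wide format: one row per sample, one Result.<PARLABEL> column per parameter,
# NA where a sample has no value for that parameter
wide <- reshape(long, idvar = "fldsampid", timevar = "PARLABEL",
                direction = "wide")
print(wide)
```

reshape() expects each idvar/timevar combination to occur at most once, so
duplicated PARLABELs within a sample would need to be resolved beforehand.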

HTH,
Dennis
On Wed, Jan 12, 2011 at 12:19 PM, Guy Jett gj...@itsi.com wrote:
Hi John,
Thank you for your patience.  I was away for a State certification exam 
yesterday, so am just getting back to this.

Reading through your response, I believe I wasn't clear enough about what I'm 
trying to do.  Your description seems to rearrange the matrix without grouping 
the analytical results for a single sample onto a single line, as I had hoped.  
I may have confused things by attempting to send a truncated/simplified dataset.

Restatement of needs:
* I have 863 individual samples.  The following columns contain invariant 
results for each sample:
   - Transect, Offset, Location, fldsampid, CLP_ID, sacode, matrix, LTCCODE,
     Northing, Easting, CRDUNITS, Event, LOGDATE, sbd, sed.
   - Sorting can make use of fldsampid as these values are entirely unique to
     each sample.
* Each sample is associated with one or more of the following 48 analytical 
parameters:
   - AG, AL, ALK, ALKB, ALKC, AS, B, BA, BE, BR, CA, CD, CL, CO, CR, CU,
     DOC, FE, Hg, HG, HGACIDLAB, HGEXTINO, HGEXTORG, HGNONMOB, HGSEMIMOB, K,
     MEHG, MG, MN, MO, NH3N, NI, NO2N, NO3, NO3N, PB, PO4, S, SB, SE, SO4,
     SOLID, SSC, TL, TOC, V, Zn, ZN
   - These are currently stored in the PARLABEL column.
* For each sample ID I would like to create a single line;
   - Extract each PARLABEL to use as a column name; and
   - Place the Result in the appropriate column.
* I can subset the data so that prccode, Lab, EXMCODE, Analysis, PARVQ, RL,
  EPA_FLAGS, and units are irrelevant to the issue.

The following snippet should illustrate the absolute minimum needs:
INPUT
fldsampid   |  PARLABEL    |  Result
------------+--------------+---------
fldsampid1  |  PARLABEL-a  |  value-8
fldsampid1  |  PARLABEL-b  |  value-5
fldsampid1  |  PARLABEL-x  |  value-2
fldsampid1  |  PARLABEL-y  |  value-0
fldsampid2  |  PARLABEL-a  |  value-9
fldsampid2  |  PARLABEL-c  |  value-8
fldsampid3  |  PARLABEL-a  |  value-2
fldsampid3  |  PARLABEL-d  |  value-8
fldsampid3  |  PARLABEL-w  |  value-3
fldsampid3  |  PARLABEL-x  |  value-9
fldsampid3  |  PARLABEL-y  |  value-6

OUTPUT
fldsampid   |  PARLABEL-a  |  PARLABEL-b  |  PARLABEL-w  |  PARLABEL-x  |  PARLABEL-y
------------+--------------+--------------+--------------+--------------+------------
fldsampid1  |   value-8    |   value-5    |   NA         |   value-2    |   

Re: [R] easy loop question

2011-01-12 Thread David Winsemius


On Jan 12, 2011, at 10:54 PM, Sebastián Daza wrote:


Hi everyone,

I am new to R and to programming. I have tried to remove the out-of-range
values in some variables using a loop:


1)

var <- names(est8vo[, 77:83])   # I got the variable names

> var
[1] "p16.1" "p16.2" "p16.3" "p16.4" "p16.5" "p16.6" "p16.7"

for (i in 1:7)  {
var.i <- var[i]
est8vo$var.i[ est8vo$var.i==3] <- 99


You CANNOT use names like that. (It makes no sense to supply a vector
argument to $.) If you want to change every instance of 3 within a
group of columns, see if this works:


est8vo[, 77:83] <- sapply( est8vo[, 77:83], function(x) ifelse(x==3,
99, x) )




 }

I got this error:

Error in `$<-.data.frame`(`*tmp*`, "var.i", value = numeric(0)) :
 replacement has 0 rows, data has 215700


2) The second step would be to define the factors, but I got the  
same error:


est8vo[, 77:83] <- lapply(est8vo[, 77:83], factor, levels=c(0, 1, 2, 99),
labels=c("vacío", "sí", "no", "doble marca"))




for (i in 1:7)  {
var.i <- var[i]
est8vo$var.i <- factor(est8vo$var.i,


Wrong. Wrong. Wrong. And please forget about using $ inside loops on
the left-hand side. It is not designed for that.



 levels=c(0, 1, 2, 99),
 labels=c("vacío", "sí", "no", "doble marca")
 )
 }


I don't know how to do it.
Thank you in advance!
Sebastian
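
If a loop is still preferred, the usual way to index a column by a name held
in a variable is [[ ]] rather than $. The sketch below uses a tiny made-up
data frame in place of est8vo, with two of the column names from the post.

```r
# Small stand-in for est8vo (invented values)
est8vo <- data.frame(p16.1 = c(0, 1, 2, 3),
                     p16.2 = c(3, 3, 1, 0))
vars <- names(est8vo)

for (v in vars) {
  # [[v]] looks up the column whose name is stored in v;
  # est8vo$v would look for a column literally called "v".
  est8vo[[v]][which(est8vo[[v]] == 3)] <- 99
  est8vo[[v]] <- factor(est8vo[[v]],
                        levels = c(0, 1, 2, 99),
                        labels = c("vacío", "sí", "no", "doble marca"))
}
str(est8vo)
```

Wrapping the comparison in which() keeps the replacement from failing or
misbehaving when a column contains NA values.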



David Winsemius, MD
West Hartford, CT


