[R] Plotting time data for various countries in same graph

2013-03-06 Thread Anindya Sankar Dey
Hi,

I've the following kind of data

Time    Country  Values
2010Q1  India     5
2010Q2  India     7
2010Q3  India     5
2010Q4  India     9
2010Q1  China    10
2010Q2  China     6
2010Q3  China     9
2010Q4  China    14


I need to plot a graph with time on the x-axis and the Values on the y-axis,
with two lines: one for India and one for China.

I don't have great knowledge on graphics in R.

I was trying to use, ggplot(data,aes(x=Time,y=Values,colour=Country))

But this does not help.

Can anyone help me with this?

-- 
Anindya Sankar Dey


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] robustbase adjbox segfault - memory not mapped

2013-03-06 Thread Martin Maechler
 B == Baan  baanba...@gmail.com
 on Mon, 4 Mar 2013 22:47:10 +0530 writes:

B Thank you Martin. Look forward to the fix.

Committed to the R-forge version of robustbase.

It was a simple integer overflow, indeed,
necessarily happening when the sample size was >= 2^16.5.
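
For readers unfamiliar with this failure mode, here is a small sketch of what an integer overflow looks like at the R level. It is not the robustbase C code (which is not shown in this thread); the variable names are made up, and the only assumption is that some intermediate quantity grows roughly like n^2. R reports the overflow as NA with a warning, whereas compiled code typically gets no such warning and may go on to use the bad value, for example as an index.

```r
# Illustration only: R-level analogue of a 32-bit integer overflow.
n <- 100000L                      # an integer sample size above 2^16.5
stopifnot(n > 2^16.5)

# Integer arithmetic cannot represent n * n, so R returns NA (with a warning)
int_product <- suppressWarnings(n * n)

# The same product in double precision is unproblematic
dbl_product <- as.numeric(n) * n

is.na(int_product)
dbl_product > .Machine$integer.max
```

Doing such size computations in double precision (or a wider integer type on the C side) is the usual way around the limit.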

I'm planning to submit  robustbase_0.9-7  to CRAN today.
Martin

B Regards
B Baan


B On Monday 04 March 2013 10:19 PM, Martin Maechler wrote:
 B == Baan  baanba...@gmail.com
 on Mon, 4 Mar 2013 15:02:02 +0530 writes:
B Hi, I encountered a segfault, memory not mapped error
B when using adjbox in robustbase. In trying to recreate
B the issue I found that the error occurs only for large
B sample size. Here is the code.
 
  require(robustbase)
B Loading required package: robustbase
  x <- rnorm(10)
  y <- rep(1, 10)
  adjbox(x ~ y) ## gives a plot
  x <- rnorm(1)
  y <- rep(1, 1)
  adjbox(x ~ y) ## gives a plot
  x <- rnorm(10)
  y <- rep(1, 10)
  adjbox(x ~ y)
 
B *** caught segfault ***
B address 0xfffcc47af530, cause 'memory not mapped'
 
 
B Traceback:
B 1: .C(mc_C, x, n, eps = eps, iter = c.iter, medc = double(1))
B 2: mcComp(x, doReflect, eps1 = eps1, eps2 = eps2, maxit = maxit,
B trace.lev = trace.lev)
B 3: mc.default(x, ..., na.rm = TRUE)
B 4: mc(x, ..., na.rm = TRUE)
B 5: adjboxStats(unclass(groups[[i]]), coef = range, doReflect = doReflect)
B 6: adjbox.default(split(mf[[response]], mf[-response]), ...)
B 7: adjbox(split(mf[[response]], mf[-response]), ...)
B 8: adjbox.formula(x ~ y)
B 9: adjbox(x ~ y)
 
 Indeed, I (as maintainer of robustbase) can reproduce the
 segfault *even* though you did not specify the random seed...
 
 So this should be fixed ... hopefully within a week or so,
 but I am not promising anything, given my busy schedule!
 
 Martin Maechler,
 ETH Zurich
 
 []
 
B My setup details:
 
B R --version
B R version 2.15.2 (2012-10-26) -- Trick or Treat
 
B Package:robustbase
B Version:0.9-5
B Date:   2012-03-01
B Packaged:   2013-03-01 16:34:03 UTC; maechler
B NeedsCompilation:   yes
B Repository: CRAN
B Date/Publication:   2013-03-01 18:31:33
B Built:  R 2.15.2; x86_64-pc-linux-gnu; 2013-03-04 05:54:20
B UTC; unix
 
 
B Platform: x86_64-pc-linux-gnu (64-bit)
B uname -a
B Linux R 2.6.32-5-amd64 #1 SMP Mon Feb 25 00:26:11 UTC 2013 x86_64 
GNU/Linux
B Debian squeeze
 
B Could someone pls help.
 
B Regards
B Baan



[R] how to construct bivariate joint cumulative pdf from bivariate joint pdf

2013-03-06 Thread mcwu
Hello,

I am using sm.density() to find the bivariate joint PDF of events.

For example:
x <- cbind(rnorm(30), rnorm(30))
den <- sm.density(x)

Then I take the joint PDF from den$estimate in order to construct the
joint cumulative PDF.
However, summing all the values in den$estimate does not give 1
(even after multiplying by the grid size).

Could anyone help?

Thanks.
mc
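
A minimal sketch of the grid arithmetic involved, in base R. It does not call sm.density(); a known bivariate normal density stands in for den$estimate, and gx and gy stand in for the evaluation grid (I believe the sm.density() result keeps the grid in eval.points, but please check str(den)). The point is that the cell masses only sum to about 1 when the grid covers nearly all of the probability mass, so a final renormalisation is often needed before reading the result as a CDF.

```r
# Stand-in for a density estimate on a regular grid (assumption: with
# sm.density() the grid and values would be taken from the returned object).
gx  <- seq(-5, 5, length.out = 201)
gy  <- seq(-5, 5, length.out = 201)
est <- outer(dnorm(gx), dnorm(gy))        # est[i, j] = f(gx[i], gy[j])

# Probability mass per grid cell = density * cell area
dx <- diff(gx)[1]
dy <- diff(gy)[1]
cell_mass <- est * dx * dy
total <- sum(cell_mass)                   # near 1 only if the grid is wide enough

# Joint CDF on the grid: cumulate down the rows, then across the columns
cdf <- t(apply(apply(cell_mass, 2, cumsum), 1, cumsum))

# Renormalise so that the last grid point is exactly 1
cdf <- cdf / total
```

With a kernel estimate on a grid that stops at the data range, total can be noticeably below 1, which may be what is happening here.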





Re: [R] aov() and anova() making faulty F-tests

2013-03-06 Thread peter dalgaard

On Mar 6, 2013, at 03:56 , Rolf Turner wrote:

 
 
 Your subject line is patent nonsense.  The aov() and anova() functions
 have been around for decades.  If they were doing something wrong
 it would have been noticed long since.
 
 You should realize that the fault is in your understanding, not in these
 functions.
 
 I cannot really follow your convoluted and messy code, but it would
 appear that you want to consider M and I to be random effects.

Only M and M:I, AFAICT.  And, yes, it is messy; in particular, I refuse to 
believe that y~M*I has generated output with lowercase m and i!

 
 Where have you informed aov() as to the presence of these
 random effects?

To be specific, try y~I + Error(M + M:I). Without the random effects, aov() is 
just telling you that there is a highly significant interaction between M and 
I, and beyond that, no sensible comparisons can be made.
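
A sketch of that suggestion using the data from the quoted post below (M is wrapped in factor() here, which is my addition). It computes the F-ratio of I against the M:I mean square by hand from the fixed-effects table, and then fits the error-strata model so the two can be compared.

```r
# Data as posted in the thread
M <- factor(rep(c("M1", "M2"), each = 18))
I <- as.ordered(rep(rep(c(5, 10, 15), each = 6), 2))
y <- c(44, 39, 48, 40, 43, 41, 27, 20, 25, 21, 28, 22, 35, 30, 29, 34, 31, 38,
       12,  7,  6, 11,  7, 12, 15, 10, 12, 17, 11, 13, 22, 15, 27, 22, 21, 19)
dat <- data.frame(M, I, y)

# Fixed-effects table: every term is tested against the residual mean square
tab <- summary(aov(y ~ M * I, data = dat))[[1]]

# Hand-computed test of I against M:I (table rows: M, I, M:I, Residuals)
F_I <- tab[2, "Mean Sq"] / tab[3, "Mean Sq"]
p_I <- pf(F_I, df1 = tab[2, "Df"], df2 = tab[3, "Df"], lower.tail = FALSE)

# The same comparison via error strata
strata_fit <- aov(y ~ I + Error(M + M:I), data = dat)
summary(strata_fit)
```

F_I comes out at about 0.53 on (2, 2) degrees of freedom, matching the ratio the original poster expected.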

 
cheers,
 
Rolf Turner
 
 On 03/06/2013 03:36 PM, PatGauthier wrote:
 Dear useRs,
 
 I've just encountered a serious problem involving the F-test being carried
 out in aov() and anova(). In the provided example, aov() is not making the
 correct F-test for an hypothesis involving the expected mean square (EMS) of
 a factor divided by the EMS of another factor (i.e., instead of the error
 EMS).
 
 Here is the example:
 
 
             Expected Mean Square        df
 M_i         σ^2 + 18σ^2_M                1
 I_j         σ^2 + 6σ^2_MI + 12Ф(I)       2
 MI_ij       σ^2 + 6σ^2_MI                2
 ε_(ijk)l    σ^2                         30
 
 The clear test for I_j is EMS(I) / EMS(MI) -> F(2,2)
 
 However, observe the following example carried out in R,
 
 M <- rep(c("M1", "M2"), each = 18)
 I <- as.ordered(rep(rep(c(5,10,15), each = 6), 2))
 y <-
 c(44,39,48,40,43,41,27,20,25,21,28,22,35,30,29,34,31,38,12,7,6,11,7,12,15,10,12,17,11,13,22,15,27,22,21,19)
 dat <- data.frame(M, I, y)
 summary(aov(y~M*I, data = dat))
             Df  Sum Sq  Mean Sq  F value    Pr(>F)
 m            1  3136.0   3136.0   295.85   < 2e-16 ***
 i            2   513.7    256.9    24.23  5.45e-07 ***
 m:i          2   969.5    484.7    45.73  7.77e-10 ***
 Residuals   30   318.0     10.6
 ---
 
 In this example aov has taken the F-ratio of MS(I) / MS(ε) -> F(2,30) =
 24.23 with F-crit = qf(0.95,2,3) = 9.55 --> significant
 
 However, as stated above, the correct F-ratio is MS(I) / MS(MI) -> F(2,2) =
 0.53 with F-crit = qf(0.95,2,2) = 19 --> non-significant
 
 Why is aov() miscalculating the F-ratio, and is there a way to fix this
 without prior knowledge of the appropriate test (e.g., EMS(I)/EMS(MI)?
 
 Thanks for your help,
 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



[R] lm Regression takes 24+ GB RAM - Error message

2013-03-06 Thread Jonas125
Hello,

I am a rather inexperienced R user (learned the language 1 month ago) and
ran into the following problem using a local computer with 6 cores, 24 GB
RAM and R 2.15 64-bit. I didn't install any additional packages.

1. Via the read.table command I load a data table (with different data
types) which is about 730 MB large
2. I add 2 calculated columns
3. I split the dataset by 5 criteria
4. I run the lm command on the split with the calculated columns as the
variables

The RAM consumption goes rapidly up and stays at 24 GB for a couple of
minutes.
The result:
Error: cannot allocate vector of size 5.0 Mb
In addition: There were 50 or more warnings (use warnings() to see the first
50)
-- Reached total allocation of 24559Mb

My code works perfectly fine for a smaller dataset. I am surprised about the
errors as the CPU should do all the work with the lm calculations and the
output cannot be that large, can it??? (I cannot check the object size of
the lm object due to the error)

Right now I am running only 1 linear model, but actually I wanted to run 6!

Is Windows putting some restrictions on R regarding the RAM usage? Can I
change any settings?
A RAM upgrade is not an option. Do I need to use a different R package
instead (bigmemory?)?


Thanks in advance for your help!!







[R] Understanding lm-based analysis of fractional factorial experiments

2013-03-06 Thread Kjetil Kjernsmo

All,

I have just returned to R after a decade of absence, and it is good to 
see that R has become such a great success! I'm trying to bring Design 
of Experiments into some aspects of software performance evaluation, and 
to teach myself that, I picked up "Experiments: Planning, Analysis and 
Optimization" by Wu and Hamada. I try to reproduce an analysis in the 
book using lm, but have to conclude I don't understand what lm does in 
this context, even though I end up at the desired result. I'm currently 
using R 2.15.2 on a recent Fedora system, but I get the same result on 
Debian Wheezy and Debian Squeeze. I think the discussion below can be 
followed without having the book at hand though.


I'm working with tables 5.2 and 5.5 in the above mentioned book. Table 
5.2 contains data from the Leaf spring experiment. The dataset is also 
in this zip file:


ftp://ftp.wiley.com/public/sci_tech_med/experiments-planning/data%20sets.zip

I've learned from the book that the effects can be found by fitting a linear 
model and doubling the coefficients. So, I do

leaf <- read.table("/ifi/bifrost/a03/kjekje/fag/experimental-planning/book-datasets/LeafSpring table 5.2.dat",
    col.names=c("B", "C", "D", "E", "Q", paste("r", 1:3, sep=""),
                "yavg", "ssq", "lnssq"))

leaf.lm <- lm(yavg ~ B * C * D * E * Q, data=leaf)
leaf.lm

Call:
lm(formula = yavg ~ B * C * D * E * Q, data = leaf)

Coefficients:
   (Intercept)              B+              C+              D+
       7.54000         0.07003         0.32333        -0.09668
            E+              Q+           B+:C+           B+:D+
       0.07668        -0.33670         0.01335         0.11995
         C+:D+           B+:E+           C+:E+           D+:E+
       0.02335              NA              NA              NA
         B+:Q+           C+:Q+           D+:Q+           E+:Q+
       0.22915        -0.25745         0.28255         0.05415
      B+:C+:D+        B+:C+:E+        B+:D+:E+        C+:D+:E+
            NA              NA              NA              NA
      B+:C+:Q+        B+:D+:Q+        C+:D+:Q+        B+:E+:Q+
       0.04160        -0.16160        -0.18840              NA
      C+:E+:Q+        D+:E+:Q+     B+:C+:D+:E+     B+:C+:D+:Q+
            NA              NA              NA              NA
   B+:C+:E+:Q+     B+:D+:E+:Q+     C+:D+:E+:Q+  B+:C+:D+:E+:Q+
            NA              NA              NA              NA

(seems there is little I can do about the line breaks here, sorry)

However, the book (table 5.5), has 0.221 for the main effect of B and 
0.176 for C, and the above is neither this, nor half of it. Now, I can 
reproduce what's in the book with


 lm(yavg ~ B, data=leaf)

Call:
lm(formula = yavg ~ B, data = leaf)

Coefficients:
(Intercept)   B+
 7.5254   0.2213

 lm(yavg ~ C, data=leaf)

Call:
lm(formula = yavg ~ C, data = leaf)

Coefficients:
(Intercept)   C+
 7.5479   0.1763

Assuming lm does in fact double the coefficient in this case, but here 
the intercept varies, which doesn't seem correct, nor can I as trivially 
find the interactions the same way.


Now, I try the effects() function, and get familiar numbers:
 effects(leaf.lm)
(Intercept)          B+          C+          D+          E+          Q+
  -30.54415    -0.44250     0.35250    -0.05750    -0.20750    -0.51920
      B+:C+       B+:D+       C+:D+       B+:Q+       C+:Q+       D+:Q+
   -0.03415    -0.03915     0.07085    -0.16915     0.33085    -0.10755
      E+:Q+    B+:C+:Q+    B+:D+:Q+    C+:D+:Q+
    0.05415    -0.02080     0.08080    -0.09420

and indeed, I have verified that effects(leaf.lm)/2 gives me the 
expected result.


So, I have found the correct answer, but I don't understand why. I have 
read the documentation for effects() as well as looked through the 
relevant chapter in "Statistical Models in S", but from that all I got 
was that I suppose there is a hint in the phrase "the effects are the 
uncorrelated single-degree-of-freedom", and that is somewhat different 
from the coefficients, but I can't make out from the book (Wu & Hamada) 
why the coefficients should be any different than the effects; to the 
contrary, it is quite clear from equation (5.8) in the book that the 
coefficients they use are effects(leaf.lm)/4.


So, there are at least two points of confusion here: one is how coef() 
differs from effects() in the case of fractional factorial experiments, 
and the other is the factor 1/4 between the coefficients used by Wu & 
Hamada and the values returned by effects(), as I would think from theory 
I've read that it should be a factor 2.
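
A toy example may help separate the different quantities. This is a sketch on a made-up 2^2 design (not the leaf spring data; all names are hypothetical). It compares the factorial main effect with the coefficient under -1/+1 coding, the coefficient under R's default 0/1 treatment coding, and the value from effects(). For a balanced two-level design with n runs, the magnitude of the effects() entry works out to sqrt(n)/2 times the factorial effect, which for n = 16 would give the factors 2 and 4 observed above.

```r
# Toy 2^2 design with made-up responses (NOT the leaf spring data)
d    <- expand.grid(A = c(-1, 1), B = c(-1, 1))
d$y  <- c(10, 14, 11, 19)
d$fA <- factor(d$A, labels = c("-", "+"))
d$fB <- factor(d$B, labels = c("-", "+"))

# Factorial main effect of A: mean at "+" minus mean at "-"
effect_A <- mean(d$y[d$A == 1]) - mean(d$y[d$A == -1])

# (1) numeric -1/+1 coding: the coefficient is half the factorial effect
coef_pm <- coef(lm(y ~ A * B, data = d))[["A"]]

# (2) 0/1 treatment coding with interaction: the coefficient is the effect
#     of A at the baseline level of B only, not the averaged main effect
fit_trt  <- lm(y ~ fA * fB, data = d)
coef_trt <- coef(fit_trt)[["fA+"]]

# (3) treatment coding with A alone: difference of the two group means
coef_one <- coef(lm(y ~ fA, data = d))[["fA+"]]

# (4) effects(): the response rotated onto orthogonalised columns; the sign
#     depends on the QR decomposition, so only the magnitude is compared
eff_A <- abs(effects(fit_trt)[[2]])

c(effect = effect_A, pm1 = coef_pm, treatment = coef_trt,
  single = coef_one, rotated = eff_A)
```

Under these assumptions, the single-factor fits reproduce the book's effects directly (no doubling needed), while the B+ coefficient in the full interaction model answers a different question.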


Best regards,

Kjetil
--
Kjetil Kjernsmo
PhD Research Fellow, University of Oslo, Norway
Semantic Web / SPARQL Query Federation
kje...@ifi.uio.no http://www.kjetil.kjernsmo.net/



Re: [R] Error message

2013-03-06 Thread Jim Holtman
Most likely either 'lower' or 'upper' is NA. Put

options(error = recover)

in your script to stop on the error and examine the values. You need to learn 
debugging 101 to help yourself out.
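
To show where this message comes from, here is a small sketch. check_limits() is a hypothetical stand-in for the check quoted in the error message below, not the actual library code. An NA anywhere in the comparison makes any() return NA, and if (NA) then fails with exactly this message.

```r
# Hypothetical stand-in for the limit check quoted in the error message
check_limits <- function(lower, upper) {
  if (any(lower > upper)) stop("lower > upper integration limits")
  invisible(TRUE)
}

# NA > 2 is NA, so any(c(NA, FALSE)) is NA and if (NA) raises an error
msg <- tryCatch(check_limits(c(NA, -1), c(2, 2)),
                error = function(e) conditionMessage(e))
msg

# A defensive variant reports the underlying problem instead
check_limits_safe <- function(lower, upper) {
  if (anyNA(lower) || anyNA(upper)) stop("limits contain NA/NaN")
  if (any(lower > upper)) stop("lower > upper integration limits")
  invisible(TRUE)
}
msg_safe <- tryCatch(check_limits_safe(c(NA, -1), c(2, 2)),
                     error = function(e) conditionMessage(e))
msg_safe
```

Printing the limits just before the failing call (or inspecting them via recover) shows which one went missing.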

Sent from my iPad

On Mar 5, 2013, at 16:00, li li hannah@gmail.com wrote:

 Dear all,
 I got an error message when running the following code.
 Can anyone give any suggestions on fixing  this type of error?
 Thank you very much in advance.
Hanna
 
 
 
 > integrand <- function(x, rho, a, b, z){
 +  x1 <- x[1]
 +  x2 <- x[2]
 +  Sigma <- matrix(c(1, rho, rho, 1), 2,2)
 +  mu <- rep(0,2)
 +  f <- pmnorm(c((z-a*x1)/b, (z-a*x2)/b), mu,
 Sigma)*dmnorm(c(0,0), mu, diag(2))
 +  f
 +}
 
 > adaptIntegrate(integrand, lower=rep(-Inf, 2), upper=c(2,2),
 + rho=0.1, a=0.6, b=0.3, z=3,  maxEval=1)
 Error in if (any(lower > upper)) stop("lower>upper integration limits") :
   missing value where TRUE/FALSE needed
 


Re: [R] Plotting time data for various countries in same graph

2013-03-06 Thread Jim Lemon


Hi Anindya,
This might be a start for you:

asd.df<-read.table(
 text="Time  Country Values
 2010Q1 India   5
 2010Q2 India   7
 2010Q3 India   5
 2010Q4 India   9
 2010Q1 China 10
 2010Q2 China  6
 2010Q3 China  9
 2010Q4 China 14
 ",header=TRUE,stringsAsFactors=TRUE)
# Time is read as a factor (stringsAsFactors=TRUE makes this explicit for
# R >= 4.0.0), so its integer codes can be used directly in plotting
as.numeric(asd.df$Time)
# [1] 1 2 3 4 1 2 3 4
plot(as.numeric(asd.df$Time)[asd.df$Country == "India"],
 asd.df$Values[asd.df$Country == "India"],
 type="l",col=4,lwd=2,xaxt="n",xlab="Financial Quarter",
 ylab="Value",ylim=c(0,14))
lines(as.numeric(asd.df$Time)[asd.df$Country == "China"],
 asd.df$Values[asd.df$Country == "China"],
 col=2,lwd=2)
axis(1,at=1:4,labels=paste("Q",1:4,sep=""))
legend(2,2,c("India","China"),lty=1,lwd=2,col=c(4,2))

Jim



[R] combining column having same values

2013-03-06 Thread eliza botto

Dear useRs,
I have a matrix in the following form

[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
   1    1    3    2    3    1    1    2    3     3     2

and the following is my desired output (combining the column headers that have 
the same values).
a <- 1,2,6,7

b <- 3,5,9,10

c <- 4,8,11
Thanks in advance
Elisa 


Re: [R] faulty F-tests

2013-03-06 Thread PatGauthier
Thanks so much. I see my foolish ways now.





Re: [R] lm and Formula tutorial

2013-03-06 Thread Eva Prieto Castro

Dear Alex,

Here are some URLs:

http://data.princeton.edu/R/linearModels.html
http://www.r-bloggers.com/r-tutorial-series-simple-linear-regression/

Regards,
Eva

--- On Wed, 6/3/13, Alaios <ala...@yahoo.com> wrote:

From: Alaios <ala...@yahoo.com>
Subject: [R] lm and Formula tutorial
To: R help <R-help@r-project.org>
Date: Wednesday, 6 March 2013, 08:08

Dear all,
I was reading the lm and the Formula manual pages last night, and I have to 
admit that I had a tough time understanding their syntax. Is there a simpler 
guide for dummies like me to start with?

I would like to thank you in advance for your help

Regards
Alex


Re: [R] combining column having same values

2013-03-06 Thread ONKELINX, Thierry
Dear Eliza,

Your question is not very clear. I think you are looking for the which() 
function.
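
A minimal sketch of that idea, assuming the matrix has the single row shown in the question (the name v is made up). which() gives the column indices for one value; split() on the indices gives all groups in one go.

```r
# The single matrix row from the question, as a plain vector
v <- c(1, 1, 3, 2, 3, 1, 1, 2, 3, 3, 2)

# Column indices holding the value 1
ones <- which(v == 1)

# All groups at once: split the column indices by the value found there
groups <- split(seq_along(v), v)
groups
```

The list elements are named after the values ("1", "2", "3"); they can be renamed with names(groups) <- ... if letters are preferred.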

Best regards,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey



Re: [R] boxplot with frequencies(counts)

2013-03-06 Thread Jim Lemon

On 03/06/2013 12:45 AM, km wrote:

Dear All,

I have a table as following
position  type  count
1         2     100
1         3      51
1         5      64
1         8      81
1         6      32
2         2      41
2         3      85
and so on


Normally I would have a vector of values 2,3,4,5... for each position and
plot them by position.
But now I have counts of these types.
Is there a way to compute a boxplot for this kind of data?


Hi KM,
We must assume that the type variable is to be used as a value, 
otherwise you would want something like a frequency plot by two factors 
of position and type. (If this is the case, I would suggest a nested bar 
plot). Here is a fairly awful kludge that will get you a boxplot (tdf is 
your table):


reprow<-function(x) return(matrix(rep(x[1:2],x[3]),ncol=2,byrow=TRUE))
replist<-apply(as.matrix(tdf),1,reprow)
repmat<-replist[[1]]
# start at 2: the first block is already in repmat
for(rep in 2:length(replist)) repmat<-rbind(repmat,replist[[rep]])
# one box of type values per position
boxplot(repmat[,2]~repmat[,1])
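
A shorter route to the same expanded data, as a sketch: repeat each row index count times and let boxplot() group by position. The data frame below holds only the rows shown in the question, and the name expanded is made up.

```r
# The rows shown in the question
tdf <- data.frame(position = c(1, 1, 1, 1, 1, 2, 2),
                  type     = c(2, 3, 5, 8, 6, 2, 3),
                  count    = c(100, 51, 64, 81, 32, 41, 85))

# Repeat each row index 'count' times, then drop the count column
expanded <- tdf[rep(seq_len(nrow(tdf)), times = tdf$count),
                c("position", "type")]

# Sanity check: one expanded row per counted observation
nrow(expanded) == sum(tdf$count)

# One box per position, built from the expanded type values
boxplot(type ~ position, data = expanded)
```

This only makes sense if type is a measured value rather than a category label, as assumed above.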

With apologies
Jim



Re: [R] lm Regression takes 24+ GB RAM - Error message

2013-03-06 Thread R. Michael Weylandt
On Wed, Mar 6, 2013 at 9:51 AM, Jonas125 schleeberge...@pg.com wrote:
 Hello,

 I am a rather inexperienced R user (learned the language 1 month ago) and
 ran into the following problem using a local computer with 6 cores, 24 GB
 RAM and R 2.15 64-bit. I didn't install any additional packages.

 1. Via the read.table command I load a data table (with different data
 types) which is about 730 MB large
 2. I add 2 calculated columns
 3. I split the dataset by 5 criteria
 4. I run the lm command on the split with the calculated columns as the
 variables

 The RAM consumption goes rapidly up and stays at 24 GB for a couple of
 minutes.
 The result:
 Error: cannot allocate vector of size 5.0 Mb
 In addition: There were 50 or more warnings (use warnings() to see the first
 50)
 -- Reached total allocation of 24559Mb

So it seems R has access to all your memory.

My guess is that you have so-called factors [Categorical variables]
in your dataset and this makes the linear regression a much larger
calculation (in the intermediate steps) than you might realize because
the design matrix has to deal with all the crossed categories.

Can you provide the output of str(DATA_SET)?
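
To make the point about crossed categories concrete, here is a small sketch with made-up data (all names are hypothetical, and this says nothing about the actual dataset). It compares the number of design-matrix columns for a main-effects formula with that of a fully crossed one.

```r
n <- 1000
d <- data.frame(
  y  = seq_len(n) / n,                      # made-up response
  x  = rep_len(c(0.5, 1.5, 2.5), n),        # made-up numeric predictor
  f1 = factor(rep_len(letters, n)),         # factor with 26 levels
  f2 = factor(rep_len(LETTERS[1:20], n))    # factor with 20 levels
)

mm_add <- model.matrix(y ~ x + f1 + f2, data = d)   # main effects only
mm_int <- model.matrix(y ~ x * f1 * f2, data = d)   # all crossed terms

ncol(mm_add)   # 1 + 1 + 25 + 19 = 46 columns
ncol(mm_int)   # 2 * 26 * 20 = 1040 columns
print(object.size(mm_int), units = "Mb")
```

The design matrix has one row per observation, so with millions of rows and a few high-cardinality factors its size can far exceed that of the raw data.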

MW


 My code works perfectly fine for a smaller dataset. I am surprised about the
 errors as the CPU should do all the work with the lm calculations and the
 output cannot be that large, can it??? (I cannot check the object size of
 the lm object due to the error)

 Right now I am running only 1 linear model, but actually I wanted to run 6!

 Is Windows putting some restrictions on R regarding the RAM usage? Can I
 change any settings?
 A RAM upgrade is not an option. Do I need to use a different R package
 instead (bigmemory?)?

Not a bad idea.



 Thanks in advance for your help!!







Re: [R] Plotting time data for various countries in same graph

2013-03-06 Thread Rui Barradas

Hello,

You forgot to use a geom.
Also, to have Time as the x-axis variable you need to convert it to a number.

library(ggplot2)

dat <- read.table(text = "
Time  Country Values
2010Q1 India   5
2010Q2 India   7
2010Q3 India   5
2010Q4 India   9
2010Q1 China 10
2010Q2 China  6
2010Q3 China  9
2010Q4 China 14
", header = TRUE)

dat$Time <- as.numeric(sub("Q", "\\.", dat$Time))

p <- ggplot(dat,aes(x=Time,y=Values,colour=Country))
p + geom_line()
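
One caveat with the conversion above: it yields 2010.1 to 2010.4, so if the data span several years the step from Q4 to the next Q1 is larger than the steps within a year. A possible alternative is sketched below; quarter_to_num is a made-up helper name, and it assumes labels of the form YYYYQn.

```r
# Hypothetical helper: map "2010Q3" to 2010.50 so quarters are evenly spaced
quarter_to_num <- function(x) {
  x    <- gsub("\\s", "", as.character(x))   # drop stray whitespace
  year <- as.numeric(substr(x, 1, 4))        # first four characters
  qtr  <- as.numeric(substr(x, 6, 6))        # digit after the "Q"
  year + (qtr - 1) / 4
}

time_num <- quarter_to_num(c("2010Q1", "2010Q4", "2011Q1"))
time_num
```

With a single year of data, as in this thread, both conversions give the same picture apart from the axis labels.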


Hope this helps,

Rui Barradas



[R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?

2013-03-06 Thread Anna Zakrisson
Hi,

# For publications, I am not allowed to repeat the axes. I have tried to
# remove the axes using yaxt="n", but it did not work. I have not understood
# how to do this in ggplot2. Can you help me?
# I also do not want loads of space between the graphs (see the script below
# with dummy data).
# If I could make it look like the examples on the (nice) examples page:
# http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
# using facet_grid(), I would be very, very happy.

# I also do not want the geometric points to be filled, and the fill="white"
# command does not seem to work - why? And are there alternatives?

# Furthermore, I would like to add legends inside the plot area instead of
# on the side, like when you use plotrix and brkdn.plot:
legend("topright", c("A", "B"), pch=c(0,1), bg="white",
   lty = 1:2, cex=1, bty="n")
# This did not work in ggplot2. What are my alternatives? I have searched the
# internet extensively, and if I have missed something obvious, it was due to
# tiredness and not to laziness.

# Some dummy data:
mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
  factor2 = factor(rep(c(1:5), each = 16)),
  factor3 = factor(rep(c(1:4), each = 4)),
  var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
  sd = rep(c(1, 2, 3), each = 20)),
  var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40),
  sd = rep(c(1, 2, 3), each = 20)))


# Splitting data into 3 data frames (based on factor1)
# If I could do this using for example facet_wrap() or facet_grid(), I would
# be very happy! I have tried but failed that method.

DataAB <- mydata[(mydata$factor1) %in% c("A", "B"), ]
DataCD <- mydata[(mydata$factor1) %in% c("C", "D"), ]
DataEF <- mydata[(mydata$factor1) %in% c("E", "F"), ]
DataAB
library(plyr)
library(ggplot2)

#Plot: levels A and B:
# Summary (means etc)
SummAB <- ddply(DataAB, .(factor3,factor1), summarize,
   mean = mean(var1, na.rm = FALSE),
   sdv = sd(var1, na.rm = FALSE),
   se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
SummAB
p1 <- ggplot(SummAB, aes(factor3, mean,
   colour = factor1, group = factor1,
   shape = factor1)) +
  geom_point(aes(shape=factor(factor1)), color="black", fill="white",
     position = "dodge", width = 0.3, size=3) +
  geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
  geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
    position = "dodge", color = "black", size=0.3) +
  theme_bw() +
  ylab(expression(paste("my measured stuff"))) +
  xlab("factor3") + ggtitle("") +
  labs(color = "factor1", shape = "factor1", group = "factor1",
   linetype = "factor1")
p1

#Plot: levels C and D:
# Summary (means etc)
SummCD <- ddply(DataCD, .(factor3,factor1), summarize,
   mean = mean(var1, na.rm = FALSE),
   sdv = sd(var1, na.rm = FALSE),
   se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))

p2 <- ggplot(SummCD, aes(factor3, mean,
   colour = factor1, group = factor1,
   shape = factor1)) +
  geom_point(aes(shape=factor(factor1)), color="black", fill="white",
     position = "dodge", width = 0.3, size=3) +
  geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
  geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
    position = "dodge", color = "black", size=0.3) +
  theme_bw() +
  ylab(expression(paste("my measured stuff"))) +
  xlab("factor3") + ggtitle("") +
  labs(color = "factor1", shape = "factor1", group = "factor1",
   linetype = "factor1")
p2

#Plot: levels E and F:
# Summary (means etc)
SummEF <- ddply(DataEF, .(factor3,factor1), summarize,
   mean = mean(var1, na.rm = FALSE),
   sdv = sd(var1, na.rm = FALSE),
   se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))

# Why is the fill command not working in geom_point() below?
p3 <- ggplot(SummEF, aes(factor3, mean,
   colour = factor1, group = factor1,
   shape = factor1)) +
  geom_point(aes(shape=factor(factor1)), color="black", fill="white",
     position = "dodge", width = 0.3, size=3) +
  geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
  geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
    position = "dodge", color = "black", size=0.3) +
  theme_bw() +
  ylab(expression(paste("my measured stuff"))) +
  xlab("factor3") + ggtitle("") +
  labs(color = "factor1", shape = "factor1", group = "factor1",
   linetype = "factor1")
p3

library(gridExtra)
sidebysideplot <- grid.arrange(p1, p2, p3, ncol=2)


Anna Zakrisson Braeunlich
PhD student

Department of Ecology Environment and Plant Sciences
Stockholm University
Svante Arrheniusv. 21A
SE-106 91 Stockholm
Sweden

Lives in 

Re: [R] combining column having same values

2013-03-06 Thread arun
Hi,
Try this:
mat1 <- as.matrix(read.table(text="
1  1  3  2  3  1  1  2  3  3  2
",sep="",header=FALSE))
 res <- lapply(1:3,function(i) which(mat1==i))
 names(res) <- c("a","c","b")
 res
#$a
#[1] 1 2 6 7

#$c
#[1]  4  8 11

#$b
#[1]  3  5  9 10
A.K.




- Original Message -
From: eliza botto eliza_bo...@hotmail.com
To: r-help@r-project.org r-help@r-project.org
Cc: 
Sent: Wednesday, March 6, 2013 6:26 AM
Subject: [R] combining column having same values


Dear useRs,
I have a matrix in the following form

[,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]  [,11]      
1      1       3      2       3       1      1       2      3       3      2

and following is my desired output  (combining the column headers, having same 
values).
a-1,2,6,7

b-3,5,9,10

c-4,8,11
Thanks in advance
Elisa                           
    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?

2013-03-06 Thread Stephen Sefick
Look at the function melt and use the factor columns as id variables.  
You should be able to do what you want with facet_grid. I have found 
inkscape useful to build legends and modify axes labels.  I know this is 
only a partial answer, but I hope this helps.
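
A rough sketch of the facet_grid() route, assuming ggplot2 is installed. The pair column, the aggregate()-based summary and the shape numbers are illustrative choices, not part of the original script. Shapes 21 to 25 are the ones that honour fill, which is probably why fill="white" had no visible effect with the default shapes.

```r
library(ggplot2)

set.seed(1)
# Dummy data in the spirit of the original post (the shorter vector is recycled).
mydata <- data.frame(
  factor1 = factor(rep(LETTERS[1:6], each = 80)),
  factor3 = factor(rep(1:4, each = 4)),
  var1    = rnorm(480, mean = 3, sd = 2)
)

# One summary table for all six levels, using base R instead of plyr.
summ <- aggregate(var1 ~ factor1 + factor3, data = mydata, FUN = mean)
names(summ)[names(summ) == "var1"] <- "mean"
summ$sdv <- aggregate(var1 ~ factor1 + factor3, data = mydata, FUN = sd)$var1

# Hypothetical grouping column: A/B, C/D and E/F each get their own panel.
pair_of <- c(A = "A-B", B = "A-B", C = "C-D", D = "C-D", E = "E-F", F = "E-F")
summ$pair <- factor(pair_of[as.character(summ$factor1)])

dodge <- position_dodge(width = 0.3)

p <- ggplot(summ, aes(factor3, mean, group = factor1,
                      shape = factor1, linetype = factor1)) +
  geom_line(position = dodge) +
  geom_errorbar(aes(ymin = mean - sdv, ymax = mean + sdv),
                width = 0.3, linetype = "solid", position = dodge) +
  # Fillable shapes (21 = circle, 24 = triangle) drawn with a white interior.
  geom_point(position = dodge, size = 3, colour = "black", fill = "white") +
  scale_shape_manual(values = c(A = 21, B = 24, C = 21, D = 24, E = 21, F = 24)) +
  facet_grid(. ~ pair) +   # panels side by side, sharing one y axis
  theme_bw() +
  theme(legend.position = "bottom")
```

print(p) draws the figure. Passing coordinates between 0 and 1 to legend.position has been the usual way to place the legend inside the panel, though the exact spelling may depend on the ggplot2 version in use.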


Stephen

On Wed 06 Mar 2013 06:32:42 AM CST, Anna Zakrisson wrote:


Hi,

# For publications, I am not allowed to repeat the axes. I have tried to
# remove the axes using yaxt="n", but it did not work. I have not understood
# how to do this in ggplot2. Can you help me?
# I also do not want loads of space between the graphs (see the script
# with dummy data below).
# If I could make it look like the examples on the (nice) examples page:
# http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
# using facet_grid(), I would be very, very happy.

# I also do not want the geometric points to be filled, and the fill="white"
# command does not seem to work. Why? Are there alternatives?

# Furthermore, I would like to add legends inside the plot area instead of
# on the side, as when you use plotrix() and brkdn.plot:
legend("topright", c("A", "B"), pch=c(0,1), bg="white",
lty = 1:2, cex=1, bty="n")
# This did not work in ggplot2. What are my alternatives? I have searched
# the internet extensively, and if I have missed something obvious, it was
# due to tiredness and not to laziness.

# Some dummy data:
mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
factor2 = factor(rep(c(1:5), each = 16)),
factor3 = factor(rep(c(1:4), each = 4)),
var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
sd = rep(c(1, 2, 3), each = 20)),
var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40),
sd = rep(c(1, 2, 3), each = 20)))


# Splitting data into 3 data frames (based on factor1)
# If I could do this using for example facet_wrap() or facet_grid(), I would
# be very happy! I have tried but failed with that method.

DataAB <- mydata[(mydata$factor1) %in% c("A", "B"), ]
DataCD <- mydata[(mydata$factor1) %in% c("C", "D"), ]
DataEF <- mydata[(mydata$factor1) %in% c("E", "F"), ]
DataAB
library(plyr)
library(ggplot2)

#Plot: levels A and B:
# Summary (means etc)
SummAB <- ddply(DataAB, .(factor3,factor1), summarize,
mean = mean(var1, na.rm = FALSE),
sdv = sd(var1, na.rm = FALSE),
se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
SummAB
p1 <- ggplot(SummAB, aes(factor3, mean,
colour = factor1, group = factor1,
shape = factor1)) +
geom_point(aes(shape=factor(factor1)), color="black", fill="white",
position = "dodge", width = 0.3, size=3) +
geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
position = "dodge", color = "black", size=0.3) +
theme_bw() +
ylab(expression(paste("my measured stuff"))) +
xlab("factor3") + ggtitle("") +
labs(color = "factor1", shape = "factor1", group = "factor1",
linetype = "factor1")
p1

#Plot: levels C and D:
# Summary (means etc)
SummCD <- ddply(DataCD, .(factor3,factor1), summarize,
mean = mean(var1, na.rm = FALSE),
sdv = sd(var1, na.rm = FALSE),
se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))

p2 <- ggplot(SummCD, aes(factor3, mean,
colour = factor1, group = factor1,
shape = factor1)) +
geom_point(aes(shape=factor(factor1)), color="black", fill="white",
position = "dodge", width = 0.3, size=3) +
geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
position = "dodge", color = "black", size=0.3) +
theme_bw() +
ylab(expression(paste("my measured stuff"))) +
xlab("factor3") + ggtitle("") +
labs(color = "factor1", shape = "factor1", group = "factor1",
linetype = "factor1")
p2

#Plot: levels E and F:
# Summary (means etc)
SummEF <- ddply(DataEF, .(factor3,factor1), summarize,
mean = mean(var1, na.rm = FALSE),
sdv = sd(var1, na.rm = FALSE),
se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))

p3 <- ggplot(SummEF, aes(factor3, mean,
colour = factor1, group = factor1,
shape = factor1)) +
# Why is the fill command not working?
geom_point(aes(shape=factor(factor1)), color="black", fill="white",
position = "dodge", width = 0.3, size=3) +
geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
position = "dodge", color = "black", size=0.3) +
theme_bw() +
ylab(expression(paste("my measured stuff"))) +
xlab("factor3") + ggtitle("") +
labs(color = "factor1", shape = "factor1", group = "factor1",
linetype = "factor1")
p3

library(gridExtra)
sidebysideplot <- grid.arrange(p1, p2, p3, ncol=2)


Anna Zakrisson Braeunlich
PhD student

Department of Ecology Environment and Plant Sciences
Stockholm University
Svante Arrheniusv. 21A
SE-106 91 Stockholm
Sweden

Lives in Berlin.
For paper mail:
Katzbachstr. 21
D-10965, Berlin - Kreuzberg
Germany/Deutschland

E-mail: anna.zakris...@su.se
Tel work: +49-(0)3091541281
Mobile: +49-(0)15777374888
LinkedIn: http://se.linkedin.com/pub/anna-zakrisson-braeunlich/33/5a2/51b




[R] chi square exact test

2013-03-06 Thread Knut Krueger
SPSS is offering a chi square exact test for one dimensional data with 
small sample size (6).


What is the comparable function in R?

Kind Regards Knut



Re: [R] Good practice for data() for R-packages

2013-03-06 Thread Uwe Ligges



On 05.03.2013 11:21, Johannes Radinger wrote:

Hi,

I am compiling a R-package and have two tables (.rda files) that are used
by the functions in my package. In the manual for ?data
(http://stat.ethz.ch/R-manual/R-patched/library/utils/html/data.html),
there is a chapter on good practice for such sysdata..

However what is not clear to me yet:

1) Probably I need to do the second approach:
For objects which are system data, for example lookup tables used in
calculations within the function, use a file ‘R/sysdata.rda’ in the
package sources or create the objects by R code at package
installation time.
But what if I have two rda-files? Is the sysdata.rda a fixed name
(name convention)?



Put both objects into the same rda file.


2) How should these rda-table be used/loaded in the package functions?
is data() still working/okay?


The objects will be available in your NAMESPACE.
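
In practice that could look like the sketch below. The table names and the temporary directory are made up for illustration; only the file name R/sysdata.rda comes from the help page quoted above.

```r
# Two hypothetical lookup tables standing in for the real ones.
lookup_a <- data.frame(key = 1:3, value = c(0.1, 0.2, 0.3))
lookup_b <- data.frame(code = c("x", "y"), multiplier = c(2, 5))

# A throwaway directory that mimics <package>/R; in a real package this
# would be the R/ subdirectory of the package sources.
r_dir <- file.path(tempdir(), "mypkg", "R")
dir.create(r_dir, recursive = TRUE, showWarnings = FALSE)

# Both objects go into the single file named sysdata.rda.
sysdata_file <- file.path(r_dir, "sysdata.rda")
save(lookup_a, lookup_b, file = sysdata_file)

# Check what the file contains by loading it into a scratch environment.
scratch <- new.env()
stored <- load(sysdata_file, envir = scratch)
stored
```

Functions inside the package can then refer to lookup_a and lookup_b by name, without calling data().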

Best,
Uwe ligges



/johannnes



Re: [R] chi square exact test

2013-03-06 Thread Nicole Ford
A quick google search produces multiple results.  Good luck. :)
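
One option in base R, given here as a sketch rather than a confirmed match for the SPSS procedure: chisq.test() can replace the asymptotic p-value with a Monte Carlo one via simulate.p.value = TRUE. This is a simulation-based approximation, not a full enumeration, and the counts below are invented.

```r
set.seed(42)

# Hypothetical counts for one categorical variable, n = 6 observations.
counts <- c(4, 1, 1)

# Goodness-of-fit test against equal probabilities, with the p-value
# estimated from B simulated tables instead of the chi-square approximation.
res <- chisq.test(counts, p = rep(1/3, 3), simulate.p.value = TRUE, B = 10000)
res$statistic
res$p.value
```

For a variable with only two categories, binom.test() gives an exact test directly.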

~Nicole Ford
Ph.D. Student
Graduate Assistant/ Instructor
Department of Government and International Affairs
University of South Florida
office: SOC 012M


Sent from my iPhone

On Mar 6, 2013, at 6:30 AM, Knut Krueger r...@knut-krueger.de wrote:

 SPSS is offering a chi square exact test for one dimensional data with small 
 sample size (6).
 
 What is the comparable function in R?
 
 Kind Regards Knut
 


Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

2013-03-06 Thread arun
Hi HJ,
Tem2 <- as.data.frame(Tem1)
 res <- do.call(rbind,split(Tem2,Tem2$V1))
 row.names(res) <- 1:nrow(res)
head(res,7)
#   V1 V2
#1 111  1
#2 111  2
#3 111  3
#4 111  4
#5 111 13
#6 111 14
#7 111 15
A.K.
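
If the goal is only to bring equal V1 values together while keeping the original row order within each group, sorting on V1 may be enough. A sketch using the Tem1 matrix from HJ's message; order() leaves ties in their original order.

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)

# Reorder the rows by V1; rows with equal V1 keep their relative order.
res <- Tem1[order(Tem1[, "V1"]), ]
head(res, 8)
```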







From: HJ YAN yhj...@googlemail.com
To: arun smartpink...@yahoo.com 
Cc: r-help@r-project.org 
Sent: Wednesday, March 6, 2013 8:24 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R 
to create subset of data...


Hi Arun


Thank you so much for the help, that's really helpful!!

Also I have a quick question about the code below where I can not see why it 
doesn't work...

I know the I shou

V1 <- c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
V2 <- c(1:23)
Tem1 <- cbind(V1,V2)


So Tem 1 looks like...
 Tem1
       V1 V2
 [1,] 111  1
 [2,] 111  2
 [3,] 111  3
 [4,] 111  4
 [5,] 222  5
 [6,] 222  6
 [7,] 222  7
 [8,] 222  8
 [9,] 333  9
[10,] 333 10
[11,] 333 11
[12,] 333 12
[13,] 111 13
[14,] 111 14
[15,] 111 15
[16,] 111 16
[17,] 222 17
[18,] 222 18
[19,] 222 19
[20,] 222 20
[21,] 333 21
[22,] 333 22
[23,] 333 23

I would like the outcome to be...

      V1 V2

     111  1
     111  2
     111  3
     111  4
     111 13
     111 14
     111 15
     111 16
     222  5
     222  6
     222  7
     222  8
     222 17
     222 18
     222 19
     222 20
     333  9
     333 10
     333 11
     333 12
     333 21
     333 22
     333 23


So I tried code as below 
--
Tem3 <- c(NA,NA)
for(i in length(unique(Tem1[,1]))){
Tem2 <- subset(Tem1,Tem1[,1]==unique(Tem1[,1])[i])
Tem3 <- rbind(Tem3,Tem2)
Tem3
}
Tem4 <- Tem3[-1,]
---

And only get this...


 V1 V2
 333  9
 333 10
 333 11
 333 12
 333 21
 333 22
 333 23


I tried to run the code step by step, e.g. letting i=1, then i=2, then i= 3, 
and updating my Tem3, I did get what I wanted, but wondered why in the loop 
above it did not work...??


Many thanks in advance!

HJ















On Wed, Mar 6, 2013 at 4:36 AM, arun smartpink...@yahoo.com wrote:

Hi,

 b[b[,4]>15 & (b[,1]>4|is.na(b[,1])) & (b[,2]>4|is.na(b[,2])),]
 #    [,1] [,2] [,3] [,4] [,5]
#[1,]    6   NA   NA   16   20
#[2,]   NA    5   NA   17   21
A.K.
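
The reason b[,1]==NA does not behave like is.na(b[,1]) is that any comparison with NA yields NA rather than TRUE or FALSE. A small sketch with the matrix from the original question:

```r
b <- matrix(2:21, nrow = 4)
b[, 1:3] <- NA
b[4, 2] <- 5
b[3, 1] <- 6

# Comparing with NA never gives TRUE, so it cannot be used to detect NA.
b[, 1] == NA        # NA NA NA NA
is.na(b[, 1])       # TRUE TRUE FALSE TRUE

# Keep rows where column 4 exceeds 15 and columns 1 and 2 are either > 4 or NA.
keep <- b[, 4] > 15 &
  (b[, 1] > 4 | is.na(b[, 1])) &
  (b[, 2] > 4 | is.na(b[, 2]))
b[keep, ]
```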



- Original Message -
From: HJ YAN yhj...@googlemail.com
To: r-help@r-project.org
Cc:
Sent: Tuesday, March 5, 2013 9:33 PM
Subject: [R] How to combine conditional argument and logical argument in R to 
create subset of data...

Dear R user

I have data created using code below

b <- matrix(2:21,nrow=4)
b[,1:3]=NA
b[4,2]=5
b[3,1]=6

Now the data is

 b
         [,1]  [,2]   [,3]  [,4]  [,5]
[1,]   NA   NA   NA   14   18
[2,]   NA   NA   NA   15   19
[3,]      6   NA   NA   16   20
[4,]   NA    5     NA    17   21


I want to keep data in column 4 greater than 15 and the value in column 1 & 
2 either greater than 4 or is 'NA'. So I would like to have
my outcome as below...

[3,]   6   NA NA 16 20
[4,] NA 5 NA 17 21

I thought something like the code below gonna to work but it only returns
the last row,e.g NA 5 NA 17 21. ...

bb <- b[which( (b[,2]>4 | b[,2]==NA) & (b[,1]>4 | b[,1]==NA) & b[,4]>15) ,]


Please could anyone help?

Many thanks in advance

HJ



Re: [R] Finding the knots in a smoothing spline using nknots

2013-03-06 Thread Mike Nielsen
Thanks, David!  That makes sense.  I shall re-read the manual page again.

Regards,

Mike Nielsen


On Wed, Feb 27, 2013 at 12:19 PM, David Winsemius dwinsem...@comcast.net wrote:


 On Feb 27, 2013, at 6:39 AM, Mike Nielsen wrote:

  Hi r-helpers.
 
  Please forgive my ignorance, but I would like to plot a smoothing spline
  (smooth.spline) from package stats, and show the knots in the plot,
 and I
  can't seem to figure out where smooth.spline has located the knots (when
 I
  use nknots).  Unfortunately, I don't know a lot about splines, but I know
  that they provide me an easy way to estimate the location of local maxima
  and minima on varying time-scales (number of knots) in my original data.
 
  I see there is a fit$knot, but it's not clear to me what those values
 are:
  for some reason I had expected that they would be contained in my
 original
  y values, but they're not.

 It appears they are in the range of [0-1] and the ss$fit$min and
 ss$fit$range provide the scaling data ( for the x-values rather than
 the y-values):

  unique(ss$fit$knot)
  [1] 0.00000000 0.04095904 0.08291708 0.12487512 0.16583417 0.20779221 0.24975025 0.29070929
  [9] 0.33266733 0.37462537 0.41658342 0.45754246 0.49950050 0.54145854 0.58241758 0.62437562
 [17] 0.66633367 0.70829171 0.74925075 0.79120879 0.83316683 0.87412587 0.91608392 0.95804196
 [25] 1.00000000

 I would think that in your case, with x0 being 0, you could just use
 ss$fit$range*unique(ss$fit$knot) as your knot positions. In the more
 general case you would need to add ss$fit$min. I tried confirming this
 hunch by looking at Statistical Models in S, in MASS (4th ed.), and at the
 R code, but the R code calls a FORTRAN routine, so you would need to pull
 the source to confirm.
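
 Following that hunch, the knot locations on the original x scale could be recovered as below. This is a sketch that takes the fit$min / fit$range interpretation above at face value; it has not been checked against the FORTRAN source.

```r
x <- seq(from = 0, to = 4 * pi, length.out = 1002)
y <- sin(x)
ss <- smooth.spline(x, y = y, all.knots = FALSE, nknots = 25)

# Knots are stored on a 0..1 scale; undo the scaling to get x positions.
knots_x <- ss$fit$min + ss$fit$range * unique(ss$fit$knot)

plot(x, y, type = "l")
lines(ss, col = "red")          # fitted spline
abline(v = knots_x, lty = 3)    # mark the knot positions
```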

 --
 David.

   I tried generating nknots equally spaced points
  in my x, but when I plotted the points that corresponded to my original y
  values at those equally-spaced x values, I found that the spline did not
  pass through them, which, perhaps naively, I thought it might.
 
  Also, the manual says that yin comprises the y values used at the
 unique y
  values -- should this read at the unique x values?
 
  Could someone kindly point to a resource where I can get a slightly
 fuller
  explanation?  I looked at the code for smooth.spline, but can't readily
  follow it.
 
  Here's a toy example:
 
  x <- seq(from=0,to=4*pi,length=1002)
  y <- sin(x)
  ss <- smooth.spline(x,y=y,all.knots=F,nknots=25)
  ss
  Call:
  smooth.spline(x = x, y = y, all.knots = F, nknots = 25)
 
  Smoothing Parameter  spar= -0.4573636  lambda= 1.006117e-09 (14
 iterations)
  Equivalent Degrees of Freedom (Df): 26.99935
  Penalized Criterion: 3.027077e-06
  GCV: 3.190666e-09
  str(ss)
  List of 15
  $ x   : num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
  $ y   : num [1:1002] 2.88e-05 1.26e-02 2.51e-02 3.77e-02 5.02e-02 ...
  $ w   : num [1:1002] 1 1 1 1 1 1 1 1 1 1 ...
  $ yin : num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
  $ data:List of 3
   ..$ x: num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
   ..$ y: num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
   ..$ w: num [1:1002] 1 1 1 1 1 1 1 1 1 1 ...
  $ lev : num [1:1002] 0.2238 0.177 0.1399 0. 0.0891 ...
  $ cv.crit : num 3.19e-09
  $ pen.crit: num 3.03e-06
  $ crit: num 3.19e-09
  $ df  : num 27
  $ spar: num -0.457
  $ lambda  : num 1.01e-09
  $ iparms  : Named int [1:3] 1 0 14
   ..- attr(*, names)= chr [1:3] icrit ispar iter
  $ fit :List of 5
   ..$ knot : num [1:31] 0 0 0 0 0.041 ...
   ..$ nk   : num 27
   ..$ min  : num 0
   ..$ range: num 12.6
   ..$ coef : num [1:27] 2.88e-05 1.72e-01 5.19e-01 9.04e-01 1.05 ...
   ..- attr(*, class)= chr smooth.spline.fit
  $ call: language smooth.spline(x = x, y = y, all.knots = F, nknots =
  25)
  - attr(*, class)= chr smooth.spline
 
 
  Many thanks!
 
 
  Regards,
 
  Mike Nielsen
 

 David Winsemius
 Alameda, CA, USA





Re: [R] Understanding lm-based analysis of fractional factorial experiments

2013-03-06 Thread Ben Bolker
Kjetil Kjernsmo kjekje at ifi.uio.no writes:

 
 All,
 
 I have just returned to R after a decade of absence, and it is good to 
 see that R has become such a great success! I'm trying to bring Design 
 of Experiments into some aspects of software performance evaluation, and 
 to teach myself that, I picked up Experiments: Planning, Analysis and 
 Optimization by Wu and Hamada. I try to reproduce an analysis in the 
 book using lm, but have to conclude I don't understand what lm does in 
 this context, even though I end up at the desired result. I'm currently 
 using R 2.15.2 on a recent Fedora system, but I get the same result on 
 Debian Wheezy and Debian Squeeze. I think the discussion below can be 
 followed without having the book at hand though.

   Just a quick thought (sorry for removing context): what happens if
you use sum-to-zero contrasts throughout, i.e.
options(contrasts=c("contr.sum", "contr.poly")) ... ?
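
To make the suggestion concrete, here is a sketch on a hypothetical 2^3 full factorial with made-up responses (not the leaf spring data). With sum-to-zero contrasts each main-effect coefficient is half the factorial effect, up to sign, whereas with the default treatment contrasts and interactions in the model it is only the difference at the reference levels of the other factors.

```r
# Hypothetical 2^3 full factorial with made-up responses.
d <- expand.grid(A = c("-", "+"), B = c("-", "+"), C = c("-", "+"))
d$y <- c(7.2, 7.6, 7.4, 8.0, 7.1, 7.7, 7.5, 8.1)

# Sum-to-zero contrasts for every factor, set per model rather than globally.
fit <- lm(y ~ A * B * C, data = d,
          contrasts = list(A = "contr.sum", B = "contr.sum", C = "contr.sum"))

# Factorial main effect of A: mean at "+" minus mean at "-".
effect_A <- mean(d$y[d$A == "+"]) - mean(d$y[d$A == "-"])

# contr.sum codes the first level ("-") as +1 and the second ("+") as -1,
# so the coefficient A1 is minus half the factorial effect.
c(effect = effect_A, coef_A1 = unname(coef(fit)["A1"]))
```

The intercept under these contrasts is the grand mean, so it no longer changes when terms are added or dropped in a balanced design.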



Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

2013-03-06 Thread arun
Hi,
You can also try this:
 Tem3 <- list()
 for(i in unique(Tem1[,1])) {
 Tem3[[i]] <- subset(Tem1,Tem1[,1]==i)
 Tem4 <- do.call(rbind,Tem3)
 }
head(Tem4)
#  V1 V2
#[1,] 111  1
#[2,] 111  2
#[3,] 111  3
#[4,] 111  4
#[5,] 111 13
#[6,] 111 14


#or
Tem3 <- c(NA,NA)
 for(i in unique(Tem1[,1])) {
 Tem2 <- subset(Tem1, Tem1[,1]==i)
 Tem3 <- rbind(Tem3,Tem2)
 Tem5 <- Tem3[-1,]
 }
head(Tem5)
#  V1 V2
# 111  1
# 111  2
# 111  3
# 111  4
# 111 13
# 111 14

A.K.
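
As for why the original loop only returned the 333 rows: for(i in length(unique(Tem1[,1]))) iterates over the single number 3, not over 1, 2, 3. A sketch of the same loop with seq_along(), using the Tem1 matrix from HJ's message:

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)
u <- unique(Tem1[, "V1"])

length(u)        # a single number, so for (i in length(u)) runs only once
seq_along(u)     # one index per distinct value

# Collect one block of rows per distinct V1 value, then stack the blocks.
pieces <- vector("list", length(u))
for (i in seq_along(u)) {
  pieces[[i]] <- Tem1[Tem1[, "V1"] == u[i], , drop = FALSE]
}
Tem4 <- do.call(rbind, pieces)
head(Tem4, 8)
```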




Re: [R] Function completely locks up my computer if the input is too big

2013-03-06 Thread Milan Bouchet-Valat
Le mardi 05 mars 2013 à 15:19 -0800, Benjamin Caldwell a écrit :
 Hi all,
 
 Thanks for the suggestions. Updating the function as below to break the
 problem into chunks seemed to do the trick - perhaps there is a relatively
 small limit to the size of a vector that R can work with?
On the contrary, that's because the limit supported by R is too high for
your computer's memory that the OS tries to allocate too much memory and
swaps to death. Admittedly, an OS should be smart enough not to
completely freeze in the process, but that's just how it is...

(If R did not support such a long vector, you would just get a nice
error message, that's all.)
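
Looking only at the Pol2Car() code quoted further down, one possible contributor to the memory blow-up is the line that calls rep(rad, length(distance)). When deg already has one entry per point, this repeats the whole vector once per point, so the intermediate results grow with the square of the input length. A tiny illustration, with made-up numbers:

```r
deg      <- c(10, 20, 30)     # one angle per point
distance <- c(1, 2, 3)        # one distance per point
rad <- deg * pi / 180

# rep() with a single 'times' value repeats the whole vector that many times.
length(rep(rad, length(distance)))    # 9, not 3

# The same pattern with 1e5 points would need 1e10 doubles (8 bytes each).
n <- 1e5
n^2 * 8 / 1e9                         # 80, i.e. about 80 gigabytes
```

If deg always has the same length as distance, dropping that rep() call would keep every intermediate vector at length n. This would also be consistent with the chunked version working, since each chunk only squares a much smaller length.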


Regards

 Best
 
 rotate <- function(x,y,tilt,threshold){
 
 df.main <- data.frame(x,y)
 
  if(length(x) > threshold){
 l <- round(length(x)/ threshold, 0)
 dfchunk <- split(df.main, factor(sort(rank(row.names(df.main))%%l)))
  n <- length(summary(dfchunk)[,1])
 xy <- vector("list", n)
 for (i in 1:n){
 wk.df <- dfchunk[[i]]
 x <- wk.df$x
 y <- wk.df$y
 d2 <- x^2+y^2
 rotate.dis <- sqrt(d2)
 or.rad <- atan(x/y)
 or.deg <- Rad2Deg(or.rad)
 
  or.deg[is.na(or.deg)] <- 0
  tilt.in <- tilt + or.deg
 xy[[i]] <- data.frame(Pol2Car(distance=rotate.dis, deg=tilt.in))
 }
  xy <- do.call(rbind, xy[1:n])
   } else {
  d2 <- x^2+y^2
 rotate.dis <- sqrt(d2)
 or.rad <- atan(x/y)
 or.deg <- Rad2Deg(or.rad)
 
 n <- length(or.deg)
 for(i in 1:n){
 if(is.na(or.deg[i])==TRUE) {or.deg[i] <- 0}
 }
  tilt.in <- tilt + or.deg
 
 xy <- data.frame(Pol2Car (distance=rotate.dis, deg=tilt.in))
 }
 
 xy
 }
 
 *Ben Caldwell*
 
 Graduate Fellow
 University of California, Berkeley
 130 Mulford Hall #3114
 Berkeley, CA 94720
 Office 223 Mulford Hall
 (510)859-3358
 
 
 On Tue, Mar 5, 2013 at 1:44 PM, Peter Alspach 
 peter.alsp...@plantandfood.co.nz wrote:
 
  Tena koe Benjamin
 
  I haven't looked at you code in detail, but in general ifelse is slow and
  can generally be avoided.  For example,
 
  ben <- 1:10^7
  system.time(BEN <- ifelse(ben<10, NA, -ben))
     user  system elapsed
     1.31    0.24    1.56
  system.time({BEN1 <- -ben; BEN1[BEN1 > -10] <- NA})
     user  system elapsed
     0.17    0.03    0.20
  all.equal(BEN, BEN1)
  [1] TRUE
 
  HTH ...
 
  Peter Alspach
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
  On Behalf Of Benjamin Caldwell
  Sent: Wednesday, 6 March 2013 10:18 a.m.
  To: r-help
  Subject: [R] Function completely locks up my computer if the input is too
  big
 
  Dear r-help,
 
 
  Somewhere in my innocuous function to rotate an object in Cartesian space
  I've created a monster that completely locks up my computer (requires a
  hard reset every time). I don't know if this is useful description to
  anyone - the mouse still responds, but not the keyboard and not windows
  explorer.
 
  The script only does this when the input matrix is large, and so my
  initial runs didn't catch it as I used a smaller matrix to speed up the
  test runs.
  When I tried an input matrix with a number of dimensions in the same order
  of magnitude as the data I want to read in, R and my computer choked. This
  was a surprise for me, as I've always been able to break execution in the
  past or do other tasks. So i tried it again, and still no dice.
 
  Now I need the function to work as subsequent functions/functionality are
  dependent, and I can't see anything on the face of it that would likely
  cause the issue.
 
  Any insight on why this happens in general or specifically in my case are
  appreciated. Running R 15.2, Platform: x86_64-w64-mingw32/x64 (64-bit) on a
  windows 7 machine with 4 mb RAM. In the meantime I suppose I'll write a
  loop to do this function piece-wise for larger data and see if that helps.
 
  Script is attached and appended below.
 
  Thanks
 
  Ben Caldwell
 
 
 
  #compass to polar coordinates
  compass2polar <- function(x) {-x+90}

  #degrees (polar) to radians
  Deg2Rad <- function(x) {(x*pi)/180}

  # radians to degrees
  Rad2Deg <- function (rad) (rad/pi)*180

  # polar to cartesian coordinates - assumes degrees those from a compass.
  # output is a list, x & y of equal length
  Pol2Car <- function(distance,deg) {
  rad <- Deg2Rad(compass2polar(deg))
  rad <- rep(rad, length(distance))

  x <- ifelse(is.na(distance), NA, distance * cos(rad))
  y <- ifelse(is.na(distance), NA, distance * sin(rad))

  x <- round(x,2)
  y <- round(y,2)

  cartes <- list(x,y)
  name <- c('x','y')
  names(cartes) <- name
  cartes
  }

  #rotate an object, with assumed origin at 0,0, in any number of degrees
  rotate <- function(x,y,tilt){ 8

  d2 <- x^2+y^2
  rotate.dis <- sqrt(d2)
  or.rad <- atan(x/y)
  or.deg <- Rad2Deg(or.rad)

  n <- length(or.deg)
  for(i in 1:n){
  if(is.na(or.deg[i])==TRUE) {or.deg[i] <- 0}
  }
  # browser()
  tilt.in <- tilt + or.deg

  xy <- Pol2Car (distance=rotate.dis, deg=tilt.in)
 
   # if(abs(tilt) = 0) {
 
   # shift.frame - cbind(xy$x, xy$y)
 
  # shift.frame.val - 

Re: [R] Understanding lm-based analysis of fractional factorial experiments

2013-03-06 Thread Ista Zahn
Hi,

On Wed, Mar 6, 2013 at 5:46 AM, Kjetil Kjernsmo kje...@ifi.uio.no wrote:
 All,

 I have just returned to R after a decade of absence, and it is good to see
 that R has become such a great success! I'm trying to bring Design of
 Experiments into some aspects of software performance evaluation, and to
 teach myself that, I picked up Experiments: Planning, Analysis and
 Optimization by Wu and Hamada. I try to reproduce an analysis in the book
 using lm, but have to conclude I don't understand what lm does in this
 context, even though I end up at the desired result. I'm currently using R
 2.15.2 on a recent Fedora system, but I get the same result on Debian Wheezy
 and Debian Squeeze. I think the discussion below can be followed without
 having the book at hand though.

I have my doubts...


 I'm working with tables 5.2 and 5.5 in the above mentioned book. Table 5.2
 contains data from the Leaf spring experiment. The dataset is also in this
 zip file:

 ftp://ftp.wiley.com/public/sci_tech_med/experiments-planning/data%20sets.zip

 I've learned from the book that the effects can be found using a linear
 model and double the coefficients. So, I do
 leaf <-
 read.table("/ifi/bifrost/a03/kjekje/fag/experimental-planning/book-datasets/LeafSpring table 5.2.dat",
 col.names=c("B", "C", "D", "E", "Q", paste("r", 1:3, sep=""),
 "yavg", "ssq", "lnssq"))
 leaf.lm <- lm(yavg ~ B * C * D * E * Q, data=leaf)

That is complete nonsense:

 dim(leaf)
[1] 16 11
 length(coef(leaf.lm))
[1] 32

So you are trying to estimate 32 coefficients from 16 data points.
That is never going to work.
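
To see the same thing on a design that can be typed in directly, here is a sketch with a hypothetical 16-run half fraction in which E is generated as B*C*D (an assumption made for the example, not taken from the book) and a random response:

```r
set.seed(1)
# 16 runs: full factorial in B, C, D, Q with +/-1 coding.
runs <- expand.grid(B = c(-1, 1), C = c(-1, 1), D = c(-1, 1), Q = c(-1, 1))
runs$E <- runs$B * runs$C * runs$D      # assumed generator, for illustration
runs$y <- rnorm(16)

fit <- lm(y ~ B * C * D * E * Q, data = runs)

length(coef(fit))        # 32 terms in the formula
sum(!is.na(coef(fit)))   # only 16 of them can be estimated from 16 runs
```

The remaining coefficients come back as NA because their columns are exact copies of columns that entered the model earlier.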

 leaf.lm

 Call:
 lm(formula = yavg ~ B * C * D * E * Q, data = leaf)

 Coefficients:
    (Intercept)              B+              C+              D+              E+
        7.54000         0.07003         0.32333        -0.09668         0.07668
             Q+           B+:C+           B+:D+           C+:D+           B+:E+
       -0.33670         0.01335         0.11995         0.02335              NA
          C+:E+           D+:E+           B+:Q+           C+:Q+           D+:Q+
             NA              NA         0.22915        -0.25745         0.28255
          E+:Q+        B+:C+:D+        B+:C+:E+        B+:D+:E+        C+:D+:E+
        0.05415              NA              NA              NA              NA
       B+:C+:Q+        B+:D+:Q+        C+:D+:Q+        B+:E+:Q+        C+:E+:Q+
        0.04160        -0.16160        -0.18840              NA              NA
       D+:E+:Q+     B+:C+:D+:E+     B+:C+:D+:Q+     B+:C+:E+:Q+     B+:D+:E+:Q+
             NA              NA              NA              NA              NA
    C+:D+:E+:Q+  B+:C+:D+:E+:Q+
             NA              NA

 (seems there is little I can do about the line breaks here, sorry)

 However, the book (table 5.5) has 0.221 for the main effect of B and 0.176
 for that of C, and the above is neither this, nor half of it. Now, I can
 reproduce what's in the book with

 lm(yavg ~ B, data=leaf)

 Call:
 lm(formula = yavg ~ B, data = leaf)

 Coefficients:
 (Intercept)   B+
  7.5254   0.2213


 lm(yavg ~ C, data=leaf)

 Call:
 lm(formula = yavg ~ C, data = leaf)

 Coefficients:
 (Intercept)   C+
  7.5479   0.1763

 Assuming lm does in fact double the coefficient in this case,

I have no idea what this means.

 but here the
 intercept varies, which doesn't seem correct,

You mean that the intercept for

lm(yavg ~ B, data=leaf)

differs from the intercept for

lm(yavg ~ C, data=leaf)

? If so that is expected. The intercept is the expected value of yavg
when all predictors are zero. The expected value for B = zero does not
have to be the same as the expected value for C = 0.

 nor can I as trivially find
 the interactions the same way.

What way?


 Now, I try the effects() function, and get familiar numbers:
 effects(leaf.lm)
  (Intercept)        B+        C+        D+        E+        Q+
    -30.54415  -0.44250   0.35250  -0.05750  -0.20750  -0.51920
        B+:C+     B+:D+     C+:D+     B+:Q+     C+:Q+     D+:Q+
     -0.03415  -0.03915   0.07085  -0.16915   0.33085  -0.10755
        E+:Q+  B+:C+:Q+  B+:D+:Q+  C+:D+:Q+
      0.05415  -0.02080   0.08080  -0.09420

 and indeed, I have verified that effects(leaf.lm)/2 gives me the expected
 result.

 So, I have found the correct answer, but I don't understand why. I have read
 the documentation for effects() as well as looked through the relevant
 chapter in Statistical Models in S, but from that all I got was that I
 suppose there is a hint in the phrase "the effects are the uncorrelated
 single-degree-of-freedom", and that is somewhat different from the
 coefficients, but I can't make out from the book (Wu & Hamada) why the
 coefficients should be any different than the effects; to the contrary, it
 is quite clear from equation (5.8) in the book that the coefficients they
 use are effects(leaf.lm)/4.

 So, there are at least two points of confusion here, one is how coef()
 differs from effects() in the case of fractional factorial experiments, and
 the other is the 

[R] Generating unique filenames.

2013-03-06 Thread Sahana Srinivasan
Hi,
I am trying to create unique filenames for my output text file. The idea is
that I would like to append a string to .zsc.txt so that all my files are
uniquely named but with a similar format. I have tried adding the string
variable to .zsc.txt while creating the output file name, i.e.
write.table function, is what I have tried using :

write.table(x, file=str,.,.zsc.txt);

but it isn't working.



Would appreciate everyone's input on the matter. Thanks :)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Generating unique filenames.

2013-03-06 Thread Uwe Ligges



On 06.03.2013 15:20, Sahana Srinivasan wrote:

Hi,
I am trying to create unique filenames for my output text file. The idea is
that I would like to append a string to .zsc.txt so that all my files are
uniquely named but with a similar format. I have tried adding the string
variable to .zsc.txt while creating the output file name, i.e.
write.table function, is what I have tried using :

write.table(x, file=str,".",".zsc.txt");

but it isn't working.



?paste

Uwe Ligges





Would appreciate everyone's input on the matter. Thanks :)



Re: [R] Generating unique filenames.

2013-03-06 Thread Ben Tupper
Hello,

On Mar 6, 2013, at 9:20 AM, Sahana Srinivasan wrote:

 Hi,
 I am trying to create unique filenames for my output text file. The idea is
 that I would like to append a string to .zsc.txt so that all my files are
 uniquely named but with a similar format. I have tried adding the string
 variable to .zsc.txt while creating the output file name, i.e.
 write.table function, is what I have tried using :
 
 write.table(x, file=str,".",".zsc.txt");
 
 but it isn't working.
 

I think you are looking for the paste() or the newish paste0() function to 
assemble the parts of your filename.  In the example below I make a unique name 
out of a timestamp and a path.  You might have a different unique name to use 
instead of timestamp.  Note, use file.path() to build up filename that includes 
a path description.

path <- "/my/own/path"
appendage <- ".zsc.txt"
string <- format(Sys.time(), format = "%Y-%j-%H%M%S")
outputFile <- file.path(path, paste0(string, appendage))
write.table(x, file = outputFile)
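
If the unique part is an identifier rather than a timestamp, the same idea can be wrapped in a small helper. This is only a sketch: make_zsc_name and the sample tags are made up for illustration, and tempdir() stands in for whatever output directory you actually use.

```r
# Hypothetical helper: build "<tag>.zsc.txt" inside a directory.
make_zsc_name <- function(tag, dir = tempdir()) {
  file.path(dir, paste0(tag, ".zsc.txt"))
}

tags  <- c("sampleA", "sampleB")          # made-up identifiers
files <- vapply(tags, make_zsc_name, character(1), USE.NAMES = FALSE)

x <- data.frame(a = 1:3, b = c(0.5, 1.5, 2.5))   # placeholder data
for (f in files) write.table(x, file = f, sep = "\t", quote = FALSE)

basename(files)
```

paste0() glues the pieces together without a separator, which is why the dot has to be part of the appendage string.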

Cheers,
Ben






 
 
 Would appreciate everyone's input on the matter. Thanks :)
 

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org



Re: [R] lm and Formula tutorial

2013-03-06 Thread Bert Gunter
I may be wrong, but I believe what makes it difficult is that the Help
file assumes some linear model statistics that you may not have. I
suggest that you look for a tutorial on linear models first and then
re-read the Help. Incidentally, the provenance of the syntax is GLIM
(correction requested if I'm wrong), and the McCullagh and Nelder book on
GLMs has a chapter on linear models and syntax that is relevant,
iirc.
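
As a very small starting point on the formula syntax itself, the sketch below shows the three operators that come up most often. It uses the built-in mtcars data purely for illustration; the choice of variables is arbitrary.

```r
# Formula operators shown on a built-in dataset (mtcars ships with R).
f_add  <- mpg ~ wt + hp            # main effects only
f_int  <- mpg ~ wt + hp + wt:hp    # main effects plus one interaction term
f_star <- mpg ~ wt * hp            # shorthand for wt + hp + wt:hp

# terms() expands a formula into the individual model terms.
attr(terms(f_star), "term.labels")

fit_int  <- lm(f_int,  data = mtcars)
fit_star <- lm(f_star, data = mtcars)
all.equal(coef(fit_int), coef(fit_star))   # same model, written two ways
```

In a formula, "+" adds a term, ":" is an interaction, and "*" means "both main effects and their interaction", so none of them is arithmetic.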

-- Bert

On Tue, Mar 5, 2013 at 11:08 PM, Alaios ala...@yahoo.com wrote:
 Dear all,
 I was reading the lm and Formula manual pages last night, and I have to
 admit that I had a tough time understanding their syntax. Is there a simpler
 guide for dummies like me to start with?

 I would like to thank you in advance for your help

 Regards
 Alex



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Understanding lm-based analysis of fractional factorial experiments

2013-03-06 Thread Kjetil Kjernsmo

On 03/06/2013 02:50 PM, Ben Bolker wrote:

Just a quick thought (sorry for removing context): what happens if
you use sum-to-zero contrasts throughout, i.e. options(contrasts=c("contr.sum",
"contr.poly")) ... ?


That works (except for the sign)! What would this mean?

Kjetil



Re: [R] Understanding lm-based analysis of fractional factorial experiments

2013-03-06 Thread Bert Gunter
As Ista indicates, the basic issue is that the OP does not understand
linear modeling and is therefore just thrashing around with lm. For
example, the statement about effects being double coefficient is only
true with the orthogonal (-1,1) parameterization of the contrasts.
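
To make the point about parameterization concrete, here is a minimal sketch on a made-up 2^2 design (the responses are invented, not the leaf spring data). With the default treatment coding the B coefficient is the effect of B at the baseline level of C only; with sum-to-zero coding it is half the factorial effect, with the sign flipped because contr.sum codes the first level ("-") as +1.

```r
# Made-up 2^2 design with two-level factors, levels "-" and "+".
d <- expand.grid(B = c("-", "+"), C = c("-", "+"))
d$B <- factor(d$B, levels = c("-", "+"))
d$C <- factor(d$C, levels = c("-", "+"))
d$y <- c(7.0, 7.4, 7.2, 8.0)   # hypothetical responses

# Factorial effect of B: mean(y at B = "+") - mean(y at B = "-")
effB <- mean(d$y[d$B == "+"]) - mean(d$y[d$B == "-"])

fit_trt <- lm(y ~ B * C, data = d)   # default treatment (0/1) coding
fit_sum <- lm(y ~ B * C, data = d,
              contrasts = list(B = "contr.sum", C = "contr.sum"))

coef(fit_trt)["B+"]        # effect of B when C is at "-", not the main effect
-2 * coef(fit_sum)["B1"]   # equals effB
```

This would be consistent with the sum-to-zero contrasts "working except for the sign".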

So I suggest the OP either find some local statistical help or start
reading up on linear models, rather than wasting further time and
space here.

-- Bert

On Wed, Mar 6, 2013 at 6:17 AM, Ista Zahn istaz...@gmail.com wrote:
 Hi,

 On Wed, Mar 6, 2013 at 5:46 AM, Kjetil Kjernsmo kje...@ifi.uio.no wrote:
 All,

 I have just returned to R after a decade of absence, and it is good to see
 that R has become such a great success! I'm trying to bring Design of
 Experiments into some aspects of software performance evaluation, and to
 teach myself that, I picked up Experiments: Planning, Analysis and
 Optimization by Wu and Hamada. I try to reproduce an analysis in the book
 using lm, but have to conclude I don't understand what lm does in this
 context, even though I end up at the desired result. I'm currently using R
 2.15.2 on a recent Fedora system, but I get the same result on Debian Wheezy
 and Debian Squeeze. I think the discussion below can be followed without
 having the book at hand though.

 I have my doubts...


 I'm working with tables 5.2 and 5.5 in the above mentioned book. Table 5.2
 contains data from the Leaf spring experiment. The dataset is also in this
 zip file:

 ftp://ftp.wiley.com/public/sci_tech_med/experiments-planning/data%20sets.zip

 I've learned from the book that the effects can be found using a linear
 model and double the coefficients. So, I do
 leaf <-
 read.table("/ifi/bifrost/a03/kjekje/fag/experimental-planning/book-datasets/LeafSpring table 5.2.dat",
 col.names=c("B", "C", "D", "E", "Q", paste("r", 1:3, sep=""),
 "yavg", "ssq", "lnssq"))
 leaf.lm <- lm(yavg ~ B * C * D * E * Q, data=leaf)

 That is complete nonsense:

 dim(leaf)
 [1] 16 11
 length(coef(leaf.lm))
 [1] 32

 So you are trying to estimate 32 coefficients from 16 data points.
 That is never going to work.
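
 The same thing can be seen on a tiny made-up half fraction (4 runs, with C set equal to A*B; the responses are invented). Asking for the full A * B * C model requests 8 coefficients, but only 4 can be estimated, and lm() reports the aliased ones as NA:

```r
# Made-up 2^(3-1) half fraction: 4 runs, factor C aliased with A:B.
d <- data.frame(A = c(-1,  1, -1, 1),
                B = c(-1, -1,  1, 1))
d$C <- d$A * d$B
d$y <- c(5, 7, 6, 9)            # hypothetical responses

fit <- lm(y ~ A * B * C, data = d)
coef(fit)                       # aliased terms show up as NA
length(coef(fit))               # 8 coefficients requested
sum(!is.na(coef(fit)))          # 4 estimable from 4 runs
```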

 leaf.lm

 Call:
 lm(formula = yavg ~ B * C * D * E * Q, data = leaf)

 Coefficients:
(Intercept)  B+  C+  D+  E+
7.54000 0.07003 0.32333-0.09668 0.07668
 Q+   B+:C+   B+:D+   C+:D+   B+:E+
   -0.33670 0.01335 0.11995 0.02335  NA
  C+:E+   D+:E+   B+:Q+   C+:Q+   D+:Q+
 NA  NA 0.22915-0.25745 0.28255
  E+:Q+B+:C+:D+B+:C+:E+B+:D+:E+ C+:D+:E+
0.05415  NA  NA  NA  NA
   B+:C+:Q+B+:D+:Q+C+:D+:Q+B+:E+:Q+ C+:E+:Q+
0.04160-0.16160-0.18840  NA  NA
   D+:E+:Q+ B+:C+:D+:E+ B+:C+:D+:Q+ B+:C+:E+:Q+ B+:D+:E+:Q+
 NA  NA  NA  NA  NA
C+:D+:E+:Q+  B+:C+:D+:E+:Q+
 NA  NA

 (seems there is little I can do about the line breaks here, sorry)

 However, the book (table 5.5), has 0.221 for the main effect of B and 0.176,
 and the above is neither this, nor half of it. Now, I can reproduce what's
 in the book with

 lm(yavg ~ B, data=leaf)

 Call:
 lm(formula = yavg ~ B, data = leaf)

 Coefficients:
 (Intercept)   B+
  7.5254   0.2213


 lm(yavg ~ C, data=leaf)

 Call:
 lm(formula = yavg ~ C, data = leaf)

 Coefficients:
 (Intercept)   C+
  7.5479   0.1763

 Assuming lm does in fact double the coefficient in this case,

 I have no idea what this means.

  but here the
 intercept varies, which doesn't seem correct,

 You mean that the intercept for

 lm(yavg ~ B, data=leaf)

 differs from the intercept for

 lm(yavg ~ C, data=leaf)

 ? If so that is expected. The intercept is the expected value of yavg
 when all predictors are zero. The expected value for B = zero does not
 have to be the same as the expected value for C = 0.

  nor can I as trivially find
 the interactions the same way.

 What way?


 Now, I try the effects() function, and get familiar numbers:
 effects(leaf.lm)
 (Intercept)        B+        C+        D+        E+        Q+
   -30.54415  -0.44250   0.35250  -0.05750  -0.20750  -0.51920
       B+:C+     B+:D+     C+:D+     B+:Q+     C+:Q+     D+:Q+
    -0.03415  -0.03915   0.07085  -0.16915   0.33085  -0.10755
       E+:Q+  B+:C+:Q+  B+:D+:Q+  C+:D+:Q+
     0.05415  -0.02080   0.08080  -0.09420

 and indeed, I have verified that effects(leaf.lm)/2 gives me the expected
 result.

 So, I have found the correct answer, but I don't understand why. I have read
 the documentation for effects() as well as looked through the relevant
 chapter in Statistical Models in S, but from that all I got was that I
 suppose there is a hint in the phrase 

Re: [R] CARET and NNET fail to train a model when the input is high dimensional

2013-03-06 Thread Max Kuhn
James,

I did a fresh install from CRAN to get caret_5.15-61 and ran your code with
method.name = "nnet" and grid.len = 3.

I don't get an error, although there were issues:

   In nominalTrainWorkflow(dat = trainData, info = trainInfo,  ... :
 There were missing values in resampled performance measures.

The results had:

Resampling results across tuning parameters:

  size  decay  ROC    Sens   Spec   ROC SD   Sens SD  Spec SD
  1     0      0.521  0.52   0.521  0.0148   0.0312   0.00901
  1     1e-04  0.513  0.528  0.498  0.00616  0.00386  0.00552
  1     0.1    0.515  0.522  0.514  0.0169   0.0284   0.0426
  3     0      NaN    NaN    NaN    NA       NA       NA
  3     1e-04  NaN    NaN    NaN    NA       NA       NA
  3     0.1    NaN    NaN    NaN    NA       NA       NA
  5     0      NaN    NaN    NaN    NA       NA       NA
  5     1e-04  NaN    NaN    NaN    NA       NA       NA
  5     0.1    NaN    NaN    NaN    NA       NA       NA

To test more, I ran:

test <- nnet(trX, trY, size = 3, decay = 0)
   Error in nnet.default(trX, trY, size = 3, decay = 0) :
 too many (2107) weights

So, you need to pass in MaxNWts to nnet() with a value that lets you fit
the model. Off the top of my head, you could use something like:

   MaxNWts  = length(levels(trY))*(max(my.grid$.size) * (nCol + 1) +
max(my.grid$.size) + 1)
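
As a rough check on where the 2107 in the error comes from: for a single-hidden-layer net without skip connections, the weight count should be (p + 1) * size + (size + 1) * n_out. The sketch below assumes 700 predictors (the number of columns in trX), one output unit and made-up data; n_weights is a hypothetical helper, not something nnet or caret provides.

```r
library(nnet)   # recommended package shipped with R

# Hidden layer: (p + 1) * size weights; output layer: (size + 1) * n_out.
n_weights <- function(p, size, n_out = 1) (p + 1) * size + (size + 1) * n_out
n_weights(700, 3)   # 2107, the count reported in the error above

set.seed(1)
x <- matrix(rnorm(30 * 700), nrow = 30)   # made-up data, 700 predictors
y <- rbinom(30, 1, 0.5)                   # made-up 0/1 response

# The default MaxNWts is 1000, so it has to be raised for this many inputs.
fit <- nnet(x, y, size = 3, decay = 0.1, maxit = 5, trace = FALSE,
            MaxNWts = n_weights(ncol(x), 3))
length(fit$wts)
```

The expression above is deliberately more generous than this exact count, which does no harm since MaxNWts is only an upper limit.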

Also, this is one of the methods for getting help (the other is to just email
me). I try to keep up on Stack Exchange too.

Max



On Tue, Mar 5, 2013 at 9:47 PM, James Jong ribonucle...@gmail.com wrote:

 The following code fails to train a nnet model in a random dataset using
 caret:

 nR <- 700
 nCol <- 2000
   myCtrl <- trainControl(method="cv", number=3, preProcOptions=NULL,
 classProbs = TRUE, summaryFunction = twoClassSummary)
   trX <- data.frame(replicate(nR, rnorm(nCol)))
   trY <- runif(1)*trX[,1]*trX[,2]^2+runif(1)*trX[,3]/trX[,4]
   trY <- as.factor(ifelse(sign(trY)>0,'X1','X0'))
   my.grid <- createGrid(method.name, grid.len, data=trX)
   my.model <- train(trX,trY,method=method.name
 ,trace=FALSE,trControl=myCtrl,tuneGrid=my.grid,
 metric="ROC")
   print("Done")

 The error I get is:
 task 2 failed - arguments imply differing number of rows: 1334, 666

 However, everything works if I reduce nR to, say 20.

 Any thoughts on what may be causing this? Is there a place where I could
 report this bug other than this mailing list?

 Here is my session info:
  sessionInfo()
 R version 2.15.2 (2012-10-26)
 Platform: x86_64-unknown-linux-gnu (64-bit)

 locale:
 [1] C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] nnet_7.3-5  pROC_1.5.4  caret_5.15-052  foreach_1.4.0
 [5] cluster_1.14.3  plyr_1.8reshape2_1.2.2  lattice_0.20-13

 loaded via a namespace (and not attached):
 [1] codetools_0.2-8 compiler_2.15.2 grid_2.15.2 iterators_1.0.6
 [5] stringr_0.6.2   tools_2.15.2

 Thanks,

 James





-- 

Max



Re: [R] combining column having same values

2013-03-06 Thread Rui Barradas

Hello,

Try

x <- scan(text = "1 1 3 2 3 1 1 2 3 3 2")

sapply(unique(x), function(.x) which(x == .x))
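
A possible alternative, sketched on the same values, is split(), which returns a named list of column positions for each distinct value. The list is ordered by value (1, 2, 3) rather than by first appearance, and the names a, b, c from the question are not assigned automatically.

```r
x <- c(1, 1, 3, 2, 3, 1, 1, 2, 3, 3, 2)

# Positions 1..11 grouped by the value found at each position.
groups <- split(seq_along(x), x)
groups
```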


Hope this helps,

Rui Barradas

Em 06-03-2013 11:26, eliza botto escreveu:


Dear useRs,
I have a matrix in the following form

  [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]  [,11]
  1  1   3  2   3   1  1   2  3   3  2

and following is my desired output  (combining the column headers, having same 
values).
a-1,2,6,7

b-3,5,9,10

c-4,8,11
Thanks in advance
Elisa   


Re: [R] Understanding lm-based analysis of fractional factorial experiments

2013-03-06 Thread Peter Claussen

On Mar 6, 2013, at 4:46 AM, Kjetil Kjernsmo kje...@ifi.uio.no wrote:

 All,
 
 I have just returned to R after a decade of absence, and it is good to see 
 that R has become such a great success! I'm trying to bring Design of 
 Experiments into some aspects of software performance evaluation, and to 
 teach myself that, I picked up Experiments: Planning, Analysis and 
 Optimization by Wu and Hamada. I try to reproduce an analysis in the book 
 using lm, but have to conclude I don't understand what lm does in this 
 context, even though I end up at the desired result. I'm currently using R 
 2.15.2 on a recent Fedora system, but I get the same result on Debian Wheezy 
 and Debian Squeeze. I think the discussion below can be followed without 
 having the book at hand though.
 
 I'm working with tables 5.2 and 5.5 in the above mentioned book. Table 5.2 
 contains data from the Leaf spring experiment. The dataset is also in this 
 zip file:
 
 ftp://ftp.wiley.com/public/sci_tech_med/experiments-planning/data%20sets.zip
 
 I've learned from the book that the effects can be found using a linear model 
 and double the coefficients. So, I do
  leaf <-
  read.table("/ifi/bifrost/a03/kjekje/fag/experimental-planning/book-datasets/LeafSpring table 5.2.dat",
  col.names=c("B", "C", "D", "E", "Q", paste("r", 1:3, sep=""),
  "yavg", "ssq", "lnssq"))
  leaf.lm <- lm(yavg ~ B * C * D * E * Q, data=leaf)
  leaf.lm
 
 

I'll ignore the rest of your question, in the hope that this will answer them 
sufficiently.

You probably want a simple linear model, specified in R using "+" instead of
"*".

 leaf.lm <- lm(yavg ~ B + C + D + E + Q, data=leaf)
 leaf.lm

Call:
lm(formula = yavg ~ B + C + D + E + Q, data = leaf)

Coefficients:
(Intercept)   B+   C+   D+   E+   Q+  
   7.50084  0.22125  0.17625  0.02875  0.10375 -0.25960  

Does this give you the numbers you expect?

Peter
 
 Best regards,
 
 Kjetil
 -- 
 Kjetil Kjernsmo
 PhD Research Fellow, University of Oslo, Norway
 Semantic Web / SPARQL Query Federation
 kje...@ifi.uio.no http://www.kjetil.kjernsmo.net/
 


Re: [R] Understanding lm-based analysis of fractional factorial experiments

2013-03-06 Thread Kjetil Kjernsmo

On 03/06/2013 04:18 PM, Peter Claussen wrote:

I'll ignore the rest of your question, in the hope that this will answer them 
sufficiently.


OK!


You probably want a simple linear model, specified in R using "+" instead of
"*".


leaf.lm <- lm(yavg ~ B + C + D + E + Q, data=leaf)
leaf.lm

Call:
lm(formula = yavg ~ B + C + D + E + Q, data = leaf)

Coefficients:
(Intercept)   B+   C+   D+   E+   Q+
 7.50084  0.22125  0.17625  0.02875  0.10375 -0.25960

Does this give you the numbers you expect?


Well, it partly gives the numbers I expect, but I want the interactions 
as well, so it is only a partial answer.


Best,

Kjetil



Re: [R] Understanding lm-based analysis of fractional factorial experiments

2013-03-06 Thread Peter Claussen

On Mar 6, 2013, at 9:23 AM, Kjetil Kjernsmo kje...@ifi.uio.no wrote:

 On 03/06/2013 04:18 PM, Peter Claussen wrote:
 I'll ignore the rest of your question, in the hope that this will answer 
 them sufficiently.
 
 OK!
 
 You probably want a simple linear model, specified in R using "+" instead of
 "*".
 
 leaf.lm <- lm(yavg ~ B + C + D + E + Q, data=leaf)
 leaf.lm
 Call:
 lm(formula = yavg ~ B + C + D + E + Q, data = leaf)
 
 Coefficients:
 (Intercept)   B+   C+   D+   E+   Q+
 7.50084  0.22125  0.17625  0.02875  0.10375 -0.25960
 
 Does this give you the numbers you expect?
 
 Well, it partly gives the numbers I expect, but I want the interactions as 
 well, so it is only a partial answer.


But you don't have enough data points to estimate all of the possible
interactions; that's why you have NA in your original results. You could add
just the first-order interactions manually, i.e., + B:C + B:D …
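
If typing the pairs by hand gets tedious, the ^2 formula shorthand should expand to the same terms. A sketch with three placeholder factors (y, B, C and D are just symbols here; no data are needed to inspect the terms):

```r
f_manual <- y ~ B + C + D + B:C + B:D + C:D
f_power  <- y ~ (B + C + D)^2   # main effects plus all two-way interactions

labels_manual <- attr(terms(f_manual), "term.labels")
labels_power  <- attr(terms(f_power),  "term.labels")
labels_power
identical(labels_manual, labels_power)
```

With a fractional design some of those two-way terms may still be aliased with each other and come back as NA.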

Peter

 
 Best,
 
 Kjetil
 


[R] Troubles with labeling x axis

2013-03-06 Thread iDa
Hi!

I have problems labeling the x axis when plotting time series data. I have
40 monthly measurements. One period lasts 4 months. I'd like to have 40 ticks
on the x axis (10 larger, the rest smaller) and labels just at the beginning of
each period, just like in the image
http://r.789695.n4.nabble.com/file/n4660465/2221.jpg 

My code leaves x axis empty:

 data <- read.csv(file="CSV files/Komen.csv", head=TRUE, sep=";")
 dataTimeSeries <- ts(data, frequency=12, start=c(2000,4))
 dataTimeSeries
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2000               7  45  47   3  24 132  35  32  28
2001 161  48  31  33 161 154 420  19 149  44  54  16
2002 152  94  43  64 193  85  98  77 236  87  72  47
2003 196 120  51  27 143  99  56
 require(graphics)
 plot.ts(dataTimeSeries, xaxt="n", xlab="Perioda", ylab="Opazovane
 vrednosti", type='l', col='red')
 axis(side=1, at=seq(1,40,4), labels=seq(1,10,1))
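
One likely reason the axis stays empty: a ts object is plotted against its time values (decimal years, here roughly 2000.25 to 2003.5), so ticks placed at 1 to 40 fall outside the plotting region. A minimal sketch, using made-up counts in place of Komen.csv, that takes the tick positions from time() instead:

```r
set.seed(42)
y <- ts(rpois(40, 80), frequency = 12, start = c(2000, 4))  # made-up values

tt    <- as.numeric(time(y))        # x positions in decimal years
major <- tt[seq(1, 40, by = 4)]     # first month of each 4-month period

plot(y, xaxt = "n", xlab = "Perioda", ylab = "Opazovane vrednosti", col = "red")
axis(side = 1, at = tt, labels = FALSE, tcl = -0.25)    # 40 small ticks
axis(side = 1, at = major, labels = 1:10, tcl = -0.6)   # 10 larger, labelled ticks
```

tcl sets the tick length, so calling axis() twice gives the small and large ticks separately.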

Thanks in advance for any help!



--
View this message in context: 
http://r.789695.n4.nabble.com/Troubles-with-labeling-x-axis-tp4660465.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

2013-03-06 Thread HJ YAN
Hi Arun


Thank you so much for the help, that's really helpful!!

Also I have a quick question about the code below where I can not see why
it doesn't work...

I know the I shou

V1<-c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
V2<-c(1:23)
Tem1<-cbind(V1,V2)


So Tem 1 looks like...
 Tem1
   V1 V2
 [1,] 111  1
 [2,] 111  2
 [3,] 111  3
 [4,] 111  4
 [5,] 222  5
 [6,] 222  6
 [7,] 222  7
 [8,] 222  8
 [9,] 333  9
[10,] 333 10
[11,] 333 11
[12,] 333 12
[13,] 111 13
[14,] 111 14
[15,] 111 15
[16,] 111 16
[17,] 222 17
[18,] 222 18
[19,] 222 19
[20,] 222 20
[21,] 333 21
[22,] 333 22
[23,] 333 23

I would like the outcome to be...

  V1 V2

 111  1
 111  2
 111  3
 111  4
 111 13
 111 14
 111 15
 111 16
 222  5
 222  6
 222  7
 222  8
 222 17
 222 18
 222 19
 222 20
 333  9
 333 10
 333 11
 333 12
 333 21
 333 22
 333 23


So I tried code as below
--
Tem3<-c(NA,NA)
for(i in length(unique(Tem1[,1]))){
Tem2<-subset(Tem1,Tem1[,1]==unique(Tem1[,1])[i])
Tem3<-rbind(Tem3,Tem2)
Tem3
}
Tem4<-Tem3[-1,]
---

And only get this...


 V1 V2
 333  9
 333 10
 333 11
 333 12
 333 21
 333 22
 333 23


I tried to run the code step by step, e.g. letting i=1, then i=2, then i=
3, and updating my Tem3, I did get what I wanted, but wondered why in the
loop above it did not work...??
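
A note on the loop, as far as can be told from the code as posted: length(unique(Tem1[,1])) is the single number 3, so the for loop runs once with i = 3 and only the 333 rows are collected. Looping over seq_along(unique(Tem1[,1])) would visit 1, 2 and 3. The sketch below shows the difference and a loop-free alternative using order(), which leaves tied rows in their original order.

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)

length(unique(Tem1[, 1]))       # 3, so `for (i in 3)` runs a single time
seq_along(unique(Tem1[, 1]))    # 1 2 3, the values the loop was meant to take

# Sorting the rows on the first column gives the desired grouping directly.
Tem4 <- Tem1[order(Tem1[, "V1"]), ]
Tem4
```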


Many thanks in advance!

HJ














On Wed, Mar 6, 2013 at 4:36 AM, arun smartpink...@yahoo.com wrote:

 Hi,

  b[b[,4]>15 & (b[,1]>4|is.na(b[,1])) & (b[,2]>4|is.na(b[,2])),]
  #[,1] [,2] [,3] [,4] [,5]
 #[1,]6   NA   NA   16   20
 #[2,]   NA5   NA   17   21
 A.K.
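
 The reason is.na() is needed here: any comparison with NA, including NA == NA, evaluates to NA rather than TRUE, so a test written with == NA can never select a row. A short sketch on the same matrix (keep is just an illustrative name):

```r
b <- matrix(2:21, nrow = 4)
b[, 1:3] <- NA
b[4, 2] <- 5
b[3, 1] <- 6

NA == NA    # NA, not TRUE

# TRUE | NA is TRUE and FALSE & NA is FALSE, so no NA reaches the row index.
keep <- b[, 4] > 15 &
  (b[, 1] > 4 | is.na(b[, 1])) &
  (b[, 2] > 4 | is.na(b[, 2]))
keep
b[keep, ]
```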


 - Original Message -
 From: HJ YAN yhj...@googlemail.com
 To: r-help@r-project.org
 Cc:
 Sent: Tuesday, March 5, 2013 9:33 PM
 Subject: [R] How to combine conditional argument and logical argument in R
 to create subset of data...

 Dear R user

 I have data created using code below

 b<-matrix(2:21,nrow=4)
 b[,1:3]=NA
 b[4,2]=5
 b[3,1]=6

 Now the data is

  b
      [,1] [,2] [,3] [,4] [,5]
 [1,]   NA   NA   NA   14   18
 [2,]   NA   NA   NA   15   19
 [3,]    6   NA   NA   16   20
 [4,]   NA    5   NA   17   21


 I want to keep the rows where column 4 is greater than 15 and the values in
 columns 1 & 2 are each either greater than 4 or 'NA'. So I would like to have
 my outcome as below...

 [3,]   6   NA NA 16 20
 [4,] NA 5 NA 17 21

 I thought something like the code below gonna to work but it only returns
 the last row,e.g NA 5 NA 17 21. ...

 bb<-b[which( (b[,2]>4 | b[,2]==NA) & (b[,1]>4 | b[,1]==NA) & b[,4]>15) ,])


 Please could anyone help?

 Many thanks in advance

 HJ



[R] Friedman test in R

2013-03-06 Thread chet83
Dear R users,

I am new to R and looking into using a Friedman test in R with post-hoc
analysis for a time series dataset in which I am looking at changes in
multiple features over three time points in a number of individuals.  As
well as detecting if there is an overall difference in features between the
three time points I want to determine which features change significantly
and between which time points.  I therefore need to perform a post-hoc test
such as the Wilcoxon signed-rank test.

I am having trouble formatting my data and performing the formula:
friedman.test (y~A|B)
I think that y should be the feature measurements, A should be the time
points and B the subject. 

The data look something like this..

subject  timepoint  feature1  feature2  feature3  feature4 ..
1        1          26        32        43        45
1        2          45        63        3         87
1        3          23        22        4         94
2        1          76        44        79        79
2        2          56        56        8         76
2        3          87        23        7         67

etc 

My question is how I could read this table into R in a format that would
allow the above test to be performed?   Also is there any way I can perform
post-hoc wilcoxon signed rank tests to determine which features are
different and between which time points? 
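
One possible shape for this, sketched on made-up numbers rather than the real table (which could probably be read with read.table(file, header = TRUE) as it stands): keep one row per subject and time point, run friedman.test() once per feature column, then follow up with paired Wilcoxon tests between time points. The feature names and the Holm adjustment are assumptions for illustration.

```r
# Made-up data in the same layout: 6 subjects, 3 time points, 2 features.
set.seed(7)
dat <- expand.grid(subject = factor(1:6), timepoint = factor(1:3))
dat$feature1 <- rnorm(18, mean = 50, sd = 10)
dat$feature2 <- rnorm(18, mean = 40, sd = 10)

# Friedman test per feature: time points as groups, subjects as blocks.
features <- c("feature1", "feature2")
p_friedman <- sapply(features, function(f) {
  friedman.test(dat[[f]], groups = dat$timepoint, blocks = dat$subject)$p.value
})
p_friedman

# Post hoc for one feature: paired Wilcoxon signed-rank tests between time
# points. Pairing is by position, so sort to the same subject order per group.
dat <- dat[order(dat$timepoint, dat$subject), ]
posthoc <- pairwise.wilcox.test(dat$feature1, dat$timepoint,
                                paired = TRUE, p.adjust.method = "holm")
posthoc$p.value
```

With many features, the Friedman p-values themselves may also need a multiplicity adjustment, for example via p.adjust().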

Thanks very much in advance for any help you can offer! 
 




--
View this message in context: 
http://r.789695.n4.nabble.com/Friedman-test-in-R-tp4660441.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?

2013-03-06 Thread ONKELINX, Thierry
Dear Anna,

Is this what you would like?

Summ <- ddply(mydata, .(factor3, factor1), summarize,
   mean = mean(var1, na.rm = FALSE),
   sdv = sd(var1, na.rm = FALSE),
   se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
Summ$Grouping <- c("AB", "AB", "CD", "CD", "EF", "EF")[Summ$factor1]
Summ$factor1bis <- c("0", "1", "0", "1", "0", "1")[Summ$factor1]

ggplot(Summ, aes(factor3, mean, group = factor1bis, shape = factor1bis,
linetype = factor1bis, ymin = mean - sdv , ymax = mean + sdv)) +
geom_point(position = position_dodge(width = 0.25), size = 3) +
geom_line(position = position_dodge(width = 0.25)) +
geom_errorbar(width = 0.3, position = position_dodge(width = 0.25), size = 0.3) +
facet_wrap(~Grouping, ncol = 2) +
theme_bw() +
ylab(expression(paste("my measured stuff"))) +
xlab("factor3") +
labs(shape = "factor1", group = "factor1", linetype = "factor1")

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of
Anna Zakrisson
Sent: Wednesday 6 March 2013 13:33
To: r-help@r-project.org
Subject: [R] Ggplot2: Moving legend, change fill and removal of space between
plots when using grid.arrange() possible use of facet_grid?

Hi,

# For publications, I am not allowed to repeat the axes. I have tried to
remove the axes using:
# yaxt="n", but it did not work. I have not understood how to do this in
ggplot2. Can you help me?
# I also do not want loads of space between the graphs (see below script
with Dummy Data).
# If I could make it look like the examples on the (nice) examples page:
# http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
# using the facet_grid(), I would be very very happy.

# I also do not want the geometric points to be filled, and the fill="white"
command
# does not seem to work - why? And are there alternatives?

# Furthermore, I would like to add legends inside the plot area instead of
on the side. Like when you use plotrix() and brkdn.plot:
legend("topright", c("A", "B"), pch=c(0,1), bg="white",
   lty = 1:2, cex=1, bty="n")
# This did not work in ggplot2. What are my alternatives? I have extensively
searched the internet, and if I have missed something obvious, it was due to
   # tiredness and not to laziness.

# Some dummy data:
mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
  factor2 = factor(rep(c(1:5), each = 16)),
  factor3 = factor(rep(c(1:4), each = 4)),
  var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
  sd = rep(c(1, 2, 3), each = 20)),
  var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40),
  sd = rep(c(1, 2, 3), each = 20)))


# Splitting data into 3 data frames (based on factor1)
# If I could do this using for example facet_wrap() or facet_grid(), I would
be very
# happy! I have tried but failed that method.

DataAB <- mydata[(mydata$factor1) %in% c("A", "B"), ]
DataCD <- mydata[(mydata$factor1) %in% c("C", "D"), ]
DataEF <- mydata[(mydata$factor1) %in% c("E", "F"), ]
DataAB
library(plyr)
library(ggplot2)

#Plot: levels A and B:
# Summary (means etc)
SummAB <- ddply(DataAB, .(factor3,factor1), summarize,
   mean = mean(var1, na.rm = FALSE),
   sdv = sd(var1, na.rm = FALSE),
   se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
SummAB
p1 <- ggplot(SummAB, aes(factor3, mean,
   colour = factor1, group = factor1,
   shape = factor1)) +
  geom_point(aes(shape=factor(factor1)), color="black", fill="white",
 position = "dodge", width = 0.3, size=3) +
  geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
  geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
position = "dodge", color = "black", size=0.3) +
  theme_bw() +
  ylab(expression(paste("my measured stuff"))) +
  xlab("factor3") + ggtitle("") +
  labs(color = "factor1", shape = "factor1", group = "factor1",
   linetype = "factor1")
p1

#Plot: levels C and D:
# Summary (means etc)
SummCD <- ddply(DataCD, .(factor3,factor1), summarize,
   mean = mean(var1, na.rm = FALSE),
   sdv = sd(var1, na.rm = FALSE),
   

Re: [R] Understanding lm-based analysis of fractional factorial experiments

2013-03-06 Thread Peter Claussen

On Mar 6, 2013, at 4:46 AM, Kjetil Kjernsmo kje...@ifi.uio.no wrote:

 All,
 
 I have just returned to R after a decade of absence, and it is good to see 
 that R has become such a great success! I'm trying to bring Design of 
 Experiments into some aspects of software performance evaluation, and to 
 teach myself that, I picked up Experiments: Planning, Analysis and 
 Optimization by Wu and Hamada. I try to reproduce an analysis in the book 
 using lm, but have to conclude I don't understand what lm does in this 
 context, even though I end up at the desired result. I'm currently using R 
 2.15.2 on a recent Fedora system, but I get the same result on Debian Wheezy 
 and Debian Squeeze. I think the discussion below can be followed without 
 having the book at hand though.
 
 I'm working with tables 5.2 and 5.5 in the above mentioned book. Table 5.2 
 contains data from the Leaf spring experiment. The dataset is also in this 
 zip file:
 
 ftp://ftp.wiley.com/public/sci_tech_med/experiments-planning/data%20sets.zip
 
 I've learned from the book that the effects can be found using a linear model 
 and double the coefficients. So, I do
  leaf <-
  read.table("/ifi/bifrost/a03/kjekje/fag/experimental-planning/book-datasets/LeafSpring table 5.2.dat",
  col.names=c("B", "C", "D", "E", "Q", paste("r", 1:3, sep=""),
  "yavg", "ssq", "lnssq"))
  leaf.lm <- lm(yavg ~ B * C * D * E * Q, data=leaf)
  leaf.lm
 

I'll ignore the rest of your question, in the hope that this will answer them 
sufficiently.

You probably want a simple linear model, specified in R using + instead of 
*.

 leaf.lm <- lm(yavg ~ B + C + D + E + Q, data=leaf)
 leaf.lm

Call:
lm(formula = yavg ~ B + C + D + E + Q, data = leaf)

Coefficients:
(Intercept)           B+           C+           D+           E+           Q+  
    7.50084      0.22125      0.17625      0.02875      0.10375     -0.25960  

Does this give you the numbers you expect?

Peter
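
The difference between + and * in a model formula, and the "double the coefficients" rule mentioned in the question, can be illustrated on a small made-up design. This is only a sketch: the 2^3 design and the response values below are invented for illustration and are not the leaf spring data, and the factors are assumed to be coded numerically as -1/+1.

```r
# Hypothetical 2^3 full factorial with -1/+1 coding and made-up responses
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))
design$y <- c(7.2, 7.9, 7.1, 8.0, 7.4, 8.1, 7.0, 8.2)

# "+" fits main effects only; "*" adds every interaction term as well
main_fit <- lm(y ~ A + B + C, data = design)
full_fit <- lm(y ~ A * B * C, data = design)

length(coef(main_fit))   # intercept + 3 main effects
length(coef(full_fit))   # intercept + 3 main effects + 4 interactions

# With -1/+1 coding in a balanced design, the factorial effect of A
# (mean at +1 minus mean at -1) is twice the fitted coefficient
effect_A <- 2 * coef(main_fit)[["A"]]
direct_A <- mean(design$y[design$A == 1]) - mean(design$y[design$A == -1])
all.equal(effect_A, direct_A)
```

Note that the doubling applies to numeric -1/+1 coding. When the columns are read as two-level factors (as the coefficient names B+, C+ above suggest), R's default treatment contrasts are used instead, so the coefficients are on a different scale.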


 
 
 Kjetil
 -- 
 Kjetil Kjernsmo
 PhD Research Fellow, University of Oslo, Norway
 Semantic Web / SPARQL Query Federation
 kje...@ifi.uio.no http://www.kjetil.kjernsmo.net/
 


[R] Difficulty in caper: Error in phy$node.label[which(newNb > 0) - Ntip]

2013-03-06 Thread Nicole Thompson
Hello,

I'm doing a comparative analysis of mammal brain and body size data.
I'm following Charlie Nunn and Natalie Cooper's instructions for
Running PGLS in R using caper.

I run into the following error when I create my comparative dataset,
combining my phylogenetic tree (mammaltree) and taxon measures
(mammaldata):

Error in phy$node.label[which(newNb > 0) - Ntip] : only 0's may be
mixed with negative subscripts

My full script is provided at the bottom.

I have looked at the caper manual by David Orme to understand how
comparative.data() constructs the dataset, but still cannot interpret
the error. Many thanks to anyone who could provide me with insight.

Nicole Thompson
E3B Columbia University



 library(caper)

Loading required package: ape

Loading required package: MASS

Loading required package: mvtnorm



 mammaldata <- read.csv("R.Mammal_data.csv", header = TRUE)

 mammaltree <- read.nexus("BEphylotree.nex")

 mammal <- comparative.data(phy = mammaltree, data = mammaldata, names.col = 
 Taxon, vcv = TRUE, na.omit = FALSE, warn.dropped = TRUE) #names.col?

Error in phy$node.label[which(newNb > 0) - Ntip] : only 0's may be
mixed with negative subscripts
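
The wording of this message comes from base R's subsetting rules rather than from anything specific to caper: an index vector may be all positive (select) or all negative (drop), but not a mixture. The sketch below reproduces that rule on a plain character vector; the `labels` vector is hypothetical, and whether this is exactly what happens to the node labels inside comparative.data() for this tree is an assumption.

```r
labels <- c("n1", "n2", "n3", "n4")   # hypothetical node labels

labels[c(3, 4)]     # all positive: selects elements 3 and 4
labels[c(-1, -2)]   # all negative: drops elements 1 and 2

# Mixing positive and negative subscripts is rejected by R
mixed <- tryCatch(labels[c(-1, 3)], error = function(e) e)
inherits(mixed, "error")
```

If the tree carries node labels, looking at mammaltree$node.label before calling comparative.data() may help to narrow the problem down.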





--
Nicole A Thompson
E3B Columbia University, NYCEP
nat2...@columbia.edu
480.522.4212



[R] R GUI front end has stopped working

2013-03-06 Thread lefelit
I use R (2.15.2) on Windows 7 (64-bit) for data mining with the xcms package. This
is a routine process for me and I didn't encounter any problems until a few weeks
ago, when I got an error message saying that the R GUI front end has stopped working.
The following information was given by Windows to describe the error.

 

 \AppData\Local\Temp\WER5936.tmp.WERInternalMetadata.xml

 \AppData\Local\Temp\WER7243.tmp.appcompat.txt

 \AppData\Local\Temp\WER7282.tmp.mdmp

 

I have uninstalled and re-installed R a couple of times, but it did not
help. Any ideas how to resolve this issue? I even tried upgrading to 2.15.3
and downgrading to 2.15.1, but I still get the same error message.

 

Thanks in advance for any help

 




Re: [R] About basic logical operators

2013-03-06 Thread Duncan Murdoch

On 05/03/2013 7:53 PM, Victor hyk wrote:

Hello everyone,
   I have a basic question regarding logical operators.
 x <- seq(-1,1,by=0.02)
 x
   [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78
  [13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54
  [25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30
  [37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06
  [49] -0.04 -0.02  0.00  0.02  0.04  0.06  0.08  0.10  0.12  0.14  0.16  0.18
  [61]  0.20  0.22  0.24  0.26  0.28  0.30  0.32  0.34  0.36  0.38  0.40  0.42
  [73]  0.44  0.46  0.48  0.50  0.52  0.54  0.56  0.58  0.60  0.62  0.64  0.66
  [85]  0.68  0.70  0.72  0.74  0.76  0.78  0.80  0.82  0.84  0.86  0.88  0.90
  [97]  0.92  0.94  0.96  0.98  1.00
 x[x<=0.02]
  [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78
[13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54
[25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30
[37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06
[49] -0.04 -0.02  0.00
 x[x<0.2]
  [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78
[13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54
[25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30
[37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06
[49] -0.04 -0.02  0.00  0.02  0.04  0.06  0.08  0.10  0.12  0.14  0.16  0.18
[61]  0.20

  Why does x[x<=0.02] return no 0.02


You don't have a 0.02 in your dataset.  Evaluate x[52] - 0.02 and you 
won't get zero due to rounding (as Jeff said, see FAQ 7.31).

but x[x<0.2] return a subsample with 0.2?


You don't have 0.2, either.  Evaluate x[61] - 0.2 and you get a negative 
value.
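
The two checks suggested above can be run directly, and a tolerance can be used where exact comparison is too strict. This is a sketch only: the tolerance `tol` is an arbitrary choice made for illustration, not something proposed in the thread.

```r
x <- seq(-1, 1, by = 0.02)

# Differences from the printed decimal values; nonzero results indicate
# rounding error in the binary floating-point representation
x[52] - 0.02
x[61] - 0.2

# all.equal() compares up to a small numerical tolerance
isTRUE(all.equal(x[52], 0.02))

# Tolerance-based versions of the two subsets (tol is an arbitrary choice)
tol <- 1e-8
le_002 <- x[x <= 0.02 + tol]   # includes the element printed as 0.02
lt_02  <- x[x <  0.2  - tol]   # excludes the element printed as 0.20
```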


Duncan Murdoch


  Anyone who can tell me why?
  Thanks!

  Victor



[R] print justify

2013-03-06 Thread Berry Boessenkool



Hi everyone,

I'm trying to print a table justified to the left, but it doesn't work.
Any hints?

KennArt <- data.frame(NR=c(171,172,174,175,176,177,181,411,980), 
TYP=c("Körnermais",
 "Corn Cob Mix", "Zuckermais", "Mischanbau (Silo)Mais/Sonnenblumen",
 "Mais mit Bejagungsschneise in gutem landwirtschaftlichen und ökologischen Zustand",
 "Mais mit Bejagungsschneise (Kulturpflanze)", "Hirse",
 "Silomais (Als Hauptfutter)", "Sudangras"))


print(KennArt, justify="left")

still justifies to the right:

   NR                                                                               TYP
1 171                                                                        Körnermais
2 172                                                                      Corn Cob Mix
3 174                                                                        Zuckermais
4 175                                                Mischanbau (Silo)Mais/Sonnenblumen
5 176 Mais mit Bejagungsschneise in gutem landwirtschaftlichen und ökologischen Zustand
6 177                                        Mais mit Bejagungsschneise (Kulturpflanze)
7 181                                                                             Hirse
8 411                                                        Silomais (Als Hauptfutter)
9 980                                                                         Sudangras



print(KennArt[2:3,], justify="left") 
doesn't leftify either, so it's not the German letters' fault.

format(KennArt, justify="left")
does the job mostly, but the column names are still rightified...
This solution is fine for me now, but I'm still wondering...


sessionInfo() returns:

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    
LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C    LC_TIME=German_Germany.1252    

attached base packages:
[1] graphics  grDevices datasets  utils stats methods   base 

other attached packages:
[1] foreign_0.8-52 fortunes_1.5-0 BerryFunctions_1.0 evd_2.3-0 

loaded via a namespace (and not attached):
[1] tools_2.15.1



Thanks ahead,
Berry


  


Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

2013-03-06 Thread arun
Hi,
No problem.
V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
 length(V1)
#[1] 30

 V2 <- c(1:30) #should be the same length as V1
Tem1 <- cbind(V1,V2)
Tem2 <- Tem1[1:20,]

Tem1[!Tem1[,2]%in%Tem2[,2],]
 #  V1 V2
 #[1,] 222 21
 #[2,] 222 22
 #[3,] 222 23
 #[4,] 222 24
 #[5,] 222 25
 #[6,] 333 26
 #[7,] 333 27
 #[8,] 333 28
 #[9,] 333 29
#[10,] 333 30

#or
subset(Tem1,!V2%in% Tem2[,2])
#or
 Tem1[is.na(match(Tem1[,2],Tem2[,2])),]
 #  V1 V2
 #[1,] 222 21
 #[2,] 222 22
 #[3,] 222 23
 #[4,] 222 24
 #[5,] 222 25
 #[6,] 333 26
 #[7,] 333 27
 #[8,] 333 28
 #[9,] 333 29
#[10,] 333 30
A.K.
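
For data frames rather than matrices, the same idea (keep the rows of one object that do not occur in the other) can be sketched in base R by pasting each row into a single key string. This is an illustration under assumptions: the names `full`, `part` and the helper `row_key` are made up here, and rows are assumed to be comparable through their printed values.

```r
V1 <- rep(c(rep(111, 5), rep(222, 5), rep(333, 5)), 2)
V2 <- 1:30
full <- data.frame(V1, V2)
part <- full[1:20, ]

# Hypothetical helper: collapse every row into one string so that whole
# rows can be compared with %in%
row_key <- function(df) do.call(paste, c(df, sep = "\r"))

# Rows of `full` whose key does not occur in `part`
extra <- full[!row_key(full) %in% row_key(part), ]
extra
```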





From: HJ YAN yhj...@googlemail.com
To: arun smartpink...@yahoo.com 
Sent: Wednesday, March 6, 2013 10:33 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R 
to create subset of data...


Thank you SO MUCH Arun!!! 

That's brilliant-- I've learnt some very useful new R command now, e.g. 
'do.call' and 'split'. And I see where my code went wrong now. 

 I do appreciate greatly for your prompt reply.

Also, I wonder if there exist a package can find difference between two data 
frames, e.g. one is a subset of the other? e.g. 

 V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
 V2<-c(1:23)
Tem1<-cbind(V1,V2)

Tem2<-Tem1[1:20,]


How do I get outcome like 

[21,] 333 21
[22,] 333 22
[23,] 333 23


P.S. I used 'setdiff' before, but seems it only works for vectors but not for 
dataframe??


Sorry for so many questions today, as I'm coding for a work deadline tonight.


Many thanks!
Cheers
HJ







On Wed, Mar 6, 2013 at 1:55 PM, arun smartpink...@yahoo.com wrote:

Hi,
You can also try this:
 Tem3 <- list()
 for(i in unique(Tem1[,1])) {
 Tem3[[i]] <- subset(Tem1,Tem1[,1]==i)
 Tem4 <- do.call(rbind,Tem3)
 }
head(Tem4)
#  V1 V2
#[1,] 111  1
#[2,] 111  2
#[3,] 111  3
#[4,] 111  4
#[5,] 111 13
#[6,] 111 14


#or
Tem3 <- c(NA,NA)
 for(i in unique(Tem1[,1])) {
 Tem2 <- subset(Tem1, Tem1[,1]==i)
 Tem3 <- rbind(Tem3,Tem2)
 Tem5 <- Tem3[-1,]
 }
head(Tem5)
#  V1 V2
# 111  1
# 111  2
# 111  3
# 111  4
# 111 13
# 111 14

A.K.
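
The reason the original loop returned only the 333 rows is that `for(i in length(...))` iterates over a single number (here 3), not over 1, 2, 3. The sketch below shows the difference and one loop-free way of grouping the rows; the names `groups`, `pieces` and `sorted` are introduced only for this illustration.

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)

groups <- unique(Tem1[, 1])
length(groups)      # a single number, so for(i in length(groups)) runs once
seq_along(groups)   # one index per group, which is what the loop needs

# Collect the rows of each group, then stack the pieces in group order
pieces <- lapply(groups, function(g) Tem1[Tem1[, 1] == g, , drop = FALSE])
sorted <- do.call(rbind, pieces)

# Sorting on the first column gives the same arrangement here
by_order <- Tem1[order(Tem1[, 1]), ]
```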



From: HJ YAN yhj...@googlemail.com

To: arun smartpink...@yahoo.com
Cc: r-help@r-project.org
Sent: Wednesday, March 6, 2013 8:24 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R 
to create subset of data...



Hi Arun


Thank you so much for the help, that's really helpful!!

Also I have a quick question about the code below where I can not see why it 
doesn't work...

I know the I shou

V1<-c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
V2<-c(1:23)
Tem1<-cbind(V1,V2)


So Tem 1 looks like...
 Tem1
       V1 V2
 [1,] 111  1
 [2,] 111  2
 [3,] 111  3
 [4,] 111  4
 [5,] 222  5
 [6,] 222  6
 [7,] 222  7
 [8,] 222  8
 [9,] 333  9
[10,] 333 10
[11,] 333 11
[12,] 333 12
[13,] 111 13
[14,] 111 14
[15,] 111 15
[16,] 111 16
[17,] 222 17
[18,] 222 18
[19,] 222 19
[20,] 222 20
[21,] 333 21
[22,] 333 22
[23,] 333 23

I would like the outcome to be...

      V1 V2

     111  1
     111  2
     111  3
     111  4
     111 13
     111 14
     111 15
     111 16
     222  5
     222  6
     222  7
     222  8
     222 17
     222 18
     222 19
     222 20
     333  9
     333 10
     333 11
     333 12
     333 21
     333 22
     333 23


So I tried code as below 
--
Tem3<-c(NA,NA)
for(i in length(unique(Tem1[,1]))){
Tem2<-subset(Tem1,Tem1[,1]==unique(Tem1[,1])[i])
Tem3<-rbind(Tem3,Tem2)
Tem3
}
Tem4<-Tem3[-1,]
---

And only get this...


 V1 V2
 333  9
 333 10
 333 11
 333 12
 333 21
 333 22
 333 23


I tried to run the code step by step, e.g. letting i=1, then i=2, then i= 3, 
and updating my Tem3, I did get what I wanted, but wondered why in the loop 
above it did not work...??


Many thanks in advance!

HJ















On Wed, Mar 6, 2013 at 4:36 AM, arun smartpink...@yahoo.com wrote:

Hi,

 b[b[,4]>15 & (b[,1]>4|is.na(b[,1])) & (b[,2]>4|is.na(b[,2])),]
 #    [,1] [,2] [,3] [,4] [,5]
#[1,]    6   NA   NA   16   20
#[2,]   NA    5   NA   17   21
A.K.
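
The key point in this answer is that a comparison with NA (such as `b[,2]==NA`) never yields TRUE, so missing values have to be tested with is.na(). A short sketch on the same matrix, with the intermediate name `keep` introduced only for illustration:

```r
b <- matrix(2:21, nrow = 4)
b[, 1:3] <- NA
b[4, 2] <- 5
b[3, 1] <- 6

NA == NA   # NA, not TRUE: this is why the == NA test cannot work

# Column 4 above 15, and columns 1 and 2 either above 4 or missing
keep <- b[, 4] > 15 &
  (b[, 1] > 4 | is.na(b[, 1])) &
  (b[, 2] > 4 | is.na(b[, 2]))
b[keep, ]
```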



- Original Message -
From: HJ YAN yhj...@googlemail.com
To: r-help@r-project.org
Cc:
Sent: Tuesday, March 5, 2013 9:33 PM
Subject: [R] How to combine conditional argument and logical argument in R to 
create subset of data...

Dear R user

I have data created using code below

b<-matrix(2:21,nrow=4)
b[,1:3]=NA
b[4,2]=5
b[3,1]=6

Now the data is

 b
     [,1] [,2] [,3] [,4] [,5]
[1,]   NA   NA   NA   14   18
[2,]   NA   NA   NA   15   19
[3,]    6   NA   NA   16   20
[4,]   NA    5   NA   17   21


I want to keep data in column 4 greater than 15 and the value in column 1 &
2 either greater than 4 or is 'NA'. So I would like to have
my outcome as below...

[3,]   6   NA NA 16 20
[4,] NA 5 NA 17 21

I thought something like the code below gonna to work but it only returns
the last row,e.g NA 5 NA 17 21. ...

bb<-b[which( (b[,2]>4 | b[,2]==NA) & (b[,1]>4 | b[,1]==NA) & 

Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

2013-03-06 Thread arun


Just to add:

Tem1[Tem1[,2]%in%setdiff(Tem1[,2],Tem2[,2]),]
A.K.


Re: [R] print justify

2013-03-06 Thread MacQueen, Don
I don't know about justify as an arg to print, but the following should
qualify as a hint.

> format(c('a','aa','aaa'), justify='left')
[1] "a  " "aa " "aaa"
> 
> tmp <- data.frame(a=c('a','aa','aaa'))
> print(tmp,justify='left')
    a
1   a
2  aa
3 aaa
> 
> 
> tmp$b <- format(c('a','aa','aaa'),justify='left')
> tmp
    a   b
1   a a  
2  aa aa 
3 aaa aaa


-Don
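
A further option, not mentioned in the thread so far, is the `right` argument of print.data.frame(), which controls whether strings are right-aligned (the default). The sketch below uses the small `tmp` example from above and captures the printed lines so they can be inspected; column headers should follow the same alignment.

```r
tmp <- data.frame(a = c("a", "aa", "aaa"))

# Default printing right-aligns the strings
default_out <- capture.output(print(tmp))

# right = FALSE asks print.data.frame() to left-align them instead
left_out <- capture.output(print(tmp, right = FALSE))

default_out
left_out
```

For the original table this would be print(KennArt, right = FALSE).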

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 3/6/13 8:03 AM, Berry Boessenkool berryboessenk...@hotmail.com
wrote:




Hi everyone,

I'm trying to print a table justified to the left, but it doesn't work.
Any hints?

KennArt <- data.frame(NR=c(171,172,174,175,176,177,181,411,980),
TYP=c("Körnermais",
 "Corn Cob Mix", "Zuckermais", "Mischanbau (Silo)Mais/Sonnenblumen",
 "Mais mit Bejagungsschneise in gutem landwirtschaftlichen und ökologischen Zustand",
 "Mais mit Bejagungsschneise (Kulturpflanze)", "Hirse",
 "Silomais (Als Hauptfutter)", "Sudangras"))


print(KennArt, justify="left")

still justifies to the right:

   NR                                                                               TYP
1 171                                                                        Körnermais
2 172                                                                      Corn Cob Mix
3 174                                                                        Zuckermais
4 175                                                Mischanbau (Silo)Mais/Sonnenblumen
5 176 Mais mit Bejagungsschneise in gutem landwirtschaftlichen und ökologischen Zustand
6 177                                        Mais mit Bejagungsschneise (Kulturpflanze)
7 181                                                                             Hirse
8 411                                                        Silomais (Als Hauptfutter)
9 980                                                                         Sudangras



print(KennArt[2:3,], justify="left")
doesn't leftify either, so it's not the German letters' fault.

format(KennArt, justify="left")
does the job mostly, but the column names are still rightified...
This solution is fine for me now, but I'm still wondering...


sessionInfo() returns:

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252

attached base packages:
[1] graphics  grDevices datasets  utils stats methods   base

other attached packages:
[1] foreign_0.8-52 fortunes_1.5-0 BerryFunctions_1.0 evd_2.3-0


loaded via a namespace (and not attached):
[1] tools_2.15.1



Thanks ahead,
Berry


  


Re: [R] CARET and NNET fail to train a model when the input is high dimensional

2013-03-06 Thread James Jong
Thank you Max. I presume that in order to use caret with nnet and MaxNWts,
I would have to write my custom method for train that supports this new
argument.

From what I read, when writing my custom method, I would need to define
functions parameters, model, prediction, prob and sort and pass
them to trainControl.

However, if all I need is a new parameters function (in order to pass the
MaxNWts argument to nnet), is there a way to reuse the other functions
(model, prediction, prob and sort) that are already defined for the
nnet method?

James


On Wed, Mar 6, 2013 at 9:59 AM, Max Kuhn mxk...@gmail.com wrote:

 James,

 I did a fresh install from CRAN to get caret_5.15-61 and ran your code
 with method.name = "nnet" and grid.len = 3.

 I don't get an error, although there were issues:

In nominalTrainWorkflow(dat = trainData, info = trainInfo,  ... :
  There were missing values in resampled performance measures.

 The results had:

 Resampling results across tuning parameters:

   size  decay  ROC    Sens   Spec   ROC SD   Sens SD  Spec SD
   1     0      0.521  0.52   0.521  0.0148   0.0312   0.00901
   1     1e-04  0.513  0.528  0.498  0.00616  0.00386  0.00552
   1     0.1    0.515  0.522  0.514  0.0169   0.0284   0.0426
   3     0      NaN    NaN    NaN    NA       NA       NA
   3     1e-04  NaN    NaN    NaN    NA       NA       NA
   3     0.1    NaN    NaN    NaN    NA       NA       NA
   5     0      NaN    NaN    NaN    NA       NA       NA
   5     1e-04  NaN    NaN    NaN    NA       NA       NA
   5     0.1    NaN    NaN    NaN    NA       NA       NA

 To test more, I ran:

 test <- nnet(trX, trY, size = 3, decay = 0)
Error in nnet.default(trX, trY, size = 3, decay = 0) :
  too many (2107) weights

 So, you need to pass in MaxNWts to nnet() with a value that lets you fit
 the model. Off the top of my head, you could use something like:

MaxNWts  = length(levels(trY))*(max(my.grid$.size) * (nCol + 1) +
 max(my.grid$.size) + 1)

 Also, this is one of the methods for getting help (the other is to just email
 me). I also try to keep up on stack exchange too.

 Max
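
The 2107 in the error message is consistent with the usual weight count of a single-hidden-layer network with one output unit: (inputs + 1) * size + (size + 1) * outputs. The sketch below recomputes it for the 700 columns of trX. The helper name `nnet_weight_count` is made up for this illustration, and the formula assumes no skip-layer connections.

```r
# Hypothetical helper: number of weights in a single-hidden-layer network
# (one bias per hidden unit and per output unit, no skip-layer connections)
nnet_weight_count <- function(n_inputs, size, n_outputs = 1) {
  (n_inputs + 1) * size + (size + 1) * n_outputs
}

# 700 predictors, as in the example data
sizes <- c(1, 3, 5)
weights <- sapply(sizes, function(s) nnet_weight_count(700, s))
data.frame(size = sizes, weights = weights)

# A MaxNWts value large enough for the biggest network in the grid
max(weights)
```

Under these assumptions only size = 1 stays below 1000 weights (the default limit given in the nnet documentation, as far as I know), which would match the NaN rows for sizes 3 and 5 in the table above.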



 On Tue, Mar 5, 2013 at 9:47 PM, James Jong ribonucle...@gmail.com wrote:

 The following code fails to train a nnet model in a random dataset using
 caret:

 nR <- 700
 nCol <- 2000
   myCtrl <- trainControl(method="cv", number=3, preProcOptions=NULL,
 classProbs = TRUE, summaryFunction = twoClassSummary)
   trX <- data.frame(replicate(nR, rnorm(nCol)))
   trY <- runif(1)*trX[,1]*trX[,2]^2+runif(1)*trX[,3]/trX[,4]
   trY <- as.factor(ifelse(sign(trY)>0,'X1','X0'))
   my.grid <- createGrid(method.name, grid.len, data=trX)
   my.model <- train(trX,trY,method=method.name
 ,trace=FALSE,trControl=myCtrl,tuneGrid=my.grid,
 metric="ROC")
   print("Done")

 The error I get is:
 task 2 failed - arguments imply differing number of rows: 1334, 666

 However, everything works if I reduce nR to, say 20.

 Any thoughts on what may be causing this? Is there a place where I could
 report this bug other than this mailing list?

 Here is my session info:
  sessionInfo()
 R version 2.15.2 (2012-10-26)
 Platform: x86_64-unknown-linux-gnu (64-bit)

 locale:
 [1] C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] nnet_7.3-5  pROC_1.5.4  caret_5.15-052  foreach_1.4.0
 [5] cluster_1.14.3  plyr_1.8reshape2_1.2.2  lattice_0.20-13

 loaded via a namespace (and not attached):
 [1] codetools_0.2-8 compiler_2.15.2 grid_2.15.2 iterators_1.0.6
 [5] stringr_0.6.2   tools_2.15.2

 Thanks,

 James





 --

 Max




Re: [R] lm Regression takes 24+ GB RAM - Error message

2013-03-06 Thread Jonas125
The datatable (and the split obviously) only contain characters and numeric
data.

I found that 4 regressions in a row work if I don't use the calculated
columns as variables but 2 of the original columns. 
RAM usage stays below 3GB!
--> Why does R have such problems with the calculated columns? Their
calculation is already done before the regression starts. 

It's like this:
Create the calculated columns:
Dataset$ExtraColumn1 <- Dataset$ColumnA / Dataset$ColumnB
Dataset$ExtraColumn2 <- Dataset$ColumnC / Dataset$ColumnD

Perform the split of the dataset inc. calculated columns (the criteria for
the split have a hierarchy):
Datasplit <- split(Dataset, paste(Dataset$ColumnE, Dataset$ColumnE))

Perform the regression on the split data:
Regression1 <- lapply(Datasplit, function(d) lm(ExtraColumn1 ~ ExtraColumn2,
d, na.action = na.omit, singular.ok = TRUE))

BTW: There are no NA values in the data source.

What is my mistake?

When I calculate the columns I might divide by zero (=inf). Could that
create the problem in the regression?

Thanks,
Jonas








--
View this message in context: 
http://r.789695.n4.nabble.com/lm-Regression-takes-24-GB-RAM-Error-message-tp4660434p4660496.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] lm Regression takes 24+ GB RAM - Error message

2013-03-06 Thread Milan Bouchet-Valat
Le mercredi 06 mars 2013 à 08:31 -0800, Jonas125 a écrit :
 The datatable (and the split obviously) only contain characters and numeric
 data.
 
 I found that 4 regressions in a row work if I don't use the calculated
 columns as variables but 2 of the original columns. 
 RAM usage stays below 3GB!
 --> Why does R have such problems with the calculated columns? Their
 calculation is already done before the regression starts. 
 
 It's like this:
 Create the calculated columns:
 Dataset$ExtraColumn1 <- Dataset$ColumnA / Dataset$ColumnB
 Dataset$ExtraColumn2 <- Dataset$ColumnC / Dataset$ColumnD
 
 Perform the split of the dataset inc. calculated columns (the criteria for
 the split have a hierarchy):
 Datasplit <- split(Dataset, paste(Dataset$ColumnE, Dataset$ColumnE))
 
 Perform the regression on the split data:
 Regression1 <- lapply(Datasplit, function(d) lm(ExtraColumn1 ~ ExtraColumn2,
 d, na.action = na.omit, singular.ok = TRUE))
 
 BTW: There are no NA values in the data source.
 
 What is my mistake?
What's the value of length(Datasplit)? Have you tried running
regressions manually on Datasplit[[1]] and calling object.size() on the
result to see how large it is?
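
These checks could look roughly as follows. This is a sketch on simulated data: the column names follow the post, but the values, the number of rows and the two-level grouping column are made up, and non-finite ratios are dropped before fitting purely as an illustration.

```r
set.seed(42)
n <- 200
Dataset <- data.frame(
  ColumnA = runif(n),
  ColumnB = sample(0:3, n, replace = TRUE),   # contains zeros on purpose
  ColumnC = runif(n),
  ColumnD = runif(n, 1, 2),
  ColumnE = sample(c("g1", "g2"), n, replace = TRUE)
)
Dataset$ExtraColumn1 <- Dataset$ColumnA / Dataset$ColumnB   # Inf where ColumnB == 0
Dataset$ExtraColumn2 <- Dataset$ColumnC / Dataset$ColumnD

# Count non-finite values created by the division, then drop those rows
n_bad <- sum(!is.finite(Dataset$ExtraColumn1))
clean <- Dataset[is.finite(Dataset$ExtraColumn1) & is.finite(Dataset$ExtraColumn2), ]

# How many pieces does the split produce?
Datasplit <- split(clean, clean$ColumnE)
length(Datasplit)

# Fit a single piece by hand and look at the size of the result
fit1 <- lm(ExtraColumn1 ~ ExtraColumn2, data = Datasplit[[1]])
print(object.size(fit1), units = "Kb")
```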


Regards

 When I calculate the columns I might divide by zero (=inf). Could that
 create the problem in the regression?
 
 Thanks,
 Jonas
 
 
 
 
 
 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/lm-Regression-takes-24-GB-RAM-Error-message-tp4660434p4660496.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] About basic logical operators

2013-03-06 Thread William Dunlap
   x[x<0.2]
[1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78

This is a bit off the original topic, but you should really put spaces around 
the <.
Otherwise you might be surprised when you compare x to -0.2 instead of +0.2:
   > x<-seq(-1,1,by=0.02)
   > x[x<-0.2] 
   numeric(0)
   > x
   [1] 0.2
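
The same pitfall, written out with and without spaces. This is a small sketch; the names `below`, `y` and `oops` are introduced here only to keep the two cases apart.

```r
x <- seq(-1, 1, by = 0.02)
below <- x[x < -0.2]   # comparison: the elements of x that are less than -0.2

y <- seq(-1, 1, by = 0.02)
oops <- y[y<-0.2]      # parsed as y[y <- 0.2]: an assignment, not a comparison

length(oops)   # nothing is selected
y              # y has been overwritten with the single value 0.2
```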


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of Duncan Murdoch
 Sent: Wednesday, March 06, 2013 7:57 AM
 To: Victor hyk
 Cc: r-help@r-project.org
 Subject: Re: [R] About basic logical operators
 
 On 05/03/2013 7:53 PM, Victor hyk wrote:
  Hello everyone,
 I have a basic question regarding logical operators.
    x<-seq(-1,1,by=0.02)
    x
     [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78
    [13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54
    [25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30
    [37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06
    [49] -0.04 -0.02  0.00  0.02  0.04  0.06  0.08  0.10  0.12  0.14  0.16  0.18
    [61]  0.20  0.22  0.24  0.26  0.28  0.30  0.32  0.34  0.36  0.38  0.40  0.42
    [73]  0.44  0.46  0.48  0.50  0.52  0.54  0.56  0.58  0.60  0.62  0.64  0.66
    [85]  0.68  0.70  0.72  0.74  0.76  0.78  0.80  0.82  0.84  0.86  0.88  0.90
    [97]  0.92  0.94  0.96  0.98  1.00
    x[x<=0.02]
     [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78
    [13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54
    [25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30
    [37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06
    [49] -0.04 -0.02  0.00
    x[x<0.2]
     [1] -1.00 -0.98 -0.96 -0.94 -0.92 -0.90 -0.88 -0.86 -0.84 -0.82 -0.80 -0.78
    [13] -0.76 -0.74 -0.72 -0.70 -0.68 -0.66 -0.64 -0.62 -0.60 -0.58 -0.56 -0.54
    [25] -0.52 -0.50 -0.48 -0.46 -0.44 -0.42 -0.40 -0.38 -0.36 -0.34 -0.32 -0.30
    [37] -0.28 -0.26 -0.24 -0.22 -0.20 -0.18 -0.16 -0.14 -0.12 -0.10 -0.08 -0.06
    [49] -0.04 -0.02  0.00  0.02  0.04  0.06  0.08  0.10  0.12  0.14  0.16  0.18
    [61]  0.20
  
   Why does x[x<=0.02] return no 0.02
 
 You don't have a 0.02 in your dataset.  Evaluate x[52] - 0.02 and you
 won't get zero due to rounding (as Jeff said, see FAQ 7.31).
  but x[x<0.2] return a subsample with 0.2?
 
 You don't have 0.2, either.  Evaluate x[61] - 0.2 and you get a negative
 value.
 
 Duncan Murdoch
 
Anyone who can tell me why?
Thanks!
 
Victor
 


Re: [R] chi square exact test

2013-03-06 Thread Knut Krueger

Am 06.03.2013 14:27, schrieb Nicole Ford:
Dear Nicole,
you may be surprised, but I know Google, and I search with Google 
before I ask here.


If you are more familiar with Google, please help me find the search 
term that leads to

the R function for
an exact chi-square test, usable as a one-column test for a sample size of less than 6.

You are welcome to use this search:

http://www.giyf.com/chi%20square%20exact


Thanks in advance, Knut



A quick google search produces multiple results.  Good luck. :)

~Nicole Ford
Ph.D. Student
Graduate Assistant/ Instructor
Department of Government and International Affairs
University of South Florida
office: SOC 012M


Sent from my iPhone

On Mar 6, 2013, at 6:30 AM, Knut Krueger r...@knut-krueger.de wrote:


SPSS offers an exact chi-square test for one-dimensional data with a small
sample size (< 6).

What is the comparable function in R?

Kind Regards Knut



[R] rainbow producing colors that do not differ sufficiently

2013-03-06 Thread Fisher Dennis
R 2.15.2
OS X

Colleagues,

I often use rainbow to select colors.  I encountered a surprise with 
rainbow(11).   It yielded three greens (in positions 4-6).  The first two of 
these are quite similar.  The man page suggests that this might be the case:
"equispaced hues in RGB space tend to cluster at the red, green and blue 
primaries"

The following code illustrates the problem -- the colors labeled 4 and 5 are 
quite similar.
plot(1, type="n", xlim=c(1, 10), ylim=c(0, 1), axes=FALSE, xlab="", ylab="")
for (which in 3:7)  
{
rect(which - 1, 0, which, 1, border=NA, col=rainbow(11)[which])
text(which - 0.5, 0.8, which)
text(which - 0.5, 0.2, rainbow(11)[which], srt=90)
}

In this case, I overcame the problem by replacing one element on the rainbow 
vector with a different green.  

Is there some better approach to this by which I could automate the entire 
process but prevent this similarity of colors?
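
One possible way to automate this, sketched here as an assumption rather than an answer given in the thread, is to space the hues evenly in HCL space with the base function hcl(), keeping chroma and luminance fixed. The helper name distinct_hues and the values 100 and 65 are illustrative choices.

```r
# Evenly spaced hues at fixed chroma and luminance.
# distinct_hues() is a hypothetical helper; 100 and 65 are illustrative values.
distinct_hues <- function(n, chroma = 100, luminance = 65) {
  # n + 1 points on the hue circle, dropping the last so 15 and 375 do not coincide
  hues <- seq(15, 375, length.out = n + 1)[seq_len(n)]
  hcl(h = hues, c = chroma, l = luminance)
}

cols <- distinct_hues(11)
cols
```

The result can be passed wherever rainbow(11) was used, for example col = cols[which] in the rect() call. Whether eleven colors are distinguishable enough still depends on the display and the viewer.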

Dennis

Dennis Fisher MD
P  (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com



Re: [R] lm Regression takes 24+ GB RAM - Error message

2013-03-06 Thread Jonas125
length(Datasplit) = 7100

I ran a regression for Datasplit[[1]] and the calculated columns -- the
object size is 70 MB. Quite large.

Assuming that R cannot handle Inf values in regressions (I didn't have the
time to google it):
How can I avoid the calculation of infinite values? For example, if the denominator
would be zero, choose 0.001 as the denominator instead.
Dataset[is.infinite(Dataset)] <- 0 does not work for me -- "default method
not implemented for type 'list'"
class(Dataset) = data.frame



--
View this message in context: 
http://r.789695.n4.nabble.com/lm-Regression-takes-24-GB-RAM-Error-message-tp4660434p4660501.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] lm Regression takes 24+ GB RAM - Error message

2013-03-06 Thread Milan Bouchet-Valat
On Wednesday 06 March 2013 at 09:18 -0800, Jonas125 wrote:
 Length(Datasplit) = 7100
 
 I did a regression for Datasplit[[1]] and the calculated columns -- the
 object size is 70 MB. Quite large
7100*70/1024 = 485 (GB)

No wonder you run out of memory quite fast.

You probably do not need to store the whole lm objects: usually you need
coefficients, R-squared, things like that. So instead of returning the
objects, return a vector or a list with only the elements you need, you
will save much space.

And if you really need the objects, set these lm() arguments to FALSE to
make the result smaller:
model, x, y, qr: logicals.  If ‘TRUE’ the corresponding components of
  the fit (the model frame, the model matrix, the response, the
  QR decomposition) are returned.
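
A minimal sketch of both suggestions, using a small made-up stand-in for Datasplit (the real data are not shown in the thread). The helper name fit_summary and the formula y ~ x are assumptions for illustration.

```r
set.seed(42)
# Hypothetical stand-in for Datasplit: a list of three small data frames
Datasplit <- split(data.frame(y = rnorm(60), x = rnorm(60)),
                   rep(1:3, each = 20))

# Option 1: keep only the numbers needed from each fit
fit_summary <- function(d) {
  fit <- lm(y ~ x, data = d)
  c(coef(fit), r.squared = summary(fit)$r.squared)
}
results <- t(sapply(Datasplit, fit_summary))   # one row per data frame

# Option 2: keep the fitted object but drop its bulky components
full <- lm(y ~ x, data = Datasplit[[1]])
slim <- lm(y ~ x, data = Datasplit[[1]],
           model = FALSE, x = FALSE, y = FALSE, qr = FALSE)
sizes <- c(full = as.numeric(object.size(full)),
           slim = as.numeric(object.size(slim)))
sizes
```

Note that a fit stored without its qr component may no longer work with methods that rely on it, such as summary(), so the first option is usually the safer one when only a few statistics are needed.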

 Assuming that R cannot handle inf values in regressions (didn't have the
 time to google it)
 How can I avoid the calculation of infinite values? Like If the denominator
 would be zero, choose 0.001 as the denominator instead. 
 Dataset[is.infinite(Dataset)] - 0 does not work for me -- default method
 not implemented for type 'list' 
 class(Dataset) = data.frame
I don't understand why you think infinite values can trigger a memory
problem. Why don't you just try it?

> lm(c(1, Inf) ~ c(1, 2))
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  NA/NaN/Inf in 'y'
> lm(c(1, 2) ~ c(1, Inf))
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  NA/NaN/Inf in 'x'

So, if anything, this would stop your lapply() call sooner or later, and
save your machine from freezing.
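
As for the replacement that failed: is.infinite() has no method for a whole data frame, so it has to be applied column by column. The sketch below uses a made-up data frame and replaces Inf with NA rather than 0 (an assumption on my part; which replacement value makes sense depends on the analysis).

```r
# Hypothetical data frame with a ratio column that can become infinite
Dataset <- data.frame(num = c(1, 2, 3), den = c(2, 0, 4),
                      label = c("a", "b", "c"))
Dataset$ratio <- Dataset$num / Dataset$den   # 2 / 0 gives Inf

# Work column by column, touching numeric columns only
numeric_cols <- sapply(Dataset, is.numeric)
Dataset[numeric_cols] <- lapply(Dataset[numeric_cols], function(col) {
  col[is.infinite(col)] <- NA
  col
})
Dataset
```

With NA in place of Inf, lm() with its default na.action drops the affected rows instead of stopping with an error.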



Regards


 
 


Re: [R] chi square exact test

2013-03-06 Thread Milan Bouchet-Valat
On Wednesday 06 March 2013 at 18:03 +0100, Knut Krueger wrote:
 On 06.03.2013 14:27, Nicole Ford wrote:
 Dear Nicole,
 my be you are wondering about, but I know Google an I am using google 
 before I am asking here.
 
 If you are more familiar with googl,e please help me to find the search 
 term where I can find
 the R function for
 chi square exact usable for one column test for a sample size less than 6
 
 You are welcome to use this search:
 
 http://www.giyf.com/chi%20square%20exact
 
 
 Thanks in advane Knut
See ?fisher.test.


Regards

  A quick google search produces multiple results.  Good luck. :)
 
  ~Nicole Ford
  Ph.D. Student
  Graduate Assistant/ Instructor
  Department of Government and International Affairs
  University of South Florida
  office: SOC 012M
 
 
  Sent from my iPhone
 
  On Mar 6, 2013, at 6:30 AM, Knut Krueger r...@knut-krueger.de wrote:
 
  SPPS is offering a chi square exact test for one dimensional data with 
  small sample size (6).
 
  What is the comparable function in R?
 
  Kind Regards Knut
 


Re: [R] chi square exact test

2013-03-06 Thread Knut Krueger
On 06.03.2013 18:29, Milan Bouchet-Valat wrote:
 On Wednesday 06 March 2013 at 18:03 +0100, Knut Krueger wrote:
 On 06.03.2013 14:27, Nicole Ford wrote:
 Dear Nicole,
 my be you are wondering about, but I know Google an I am using google
 before I am asking here.

 If you are more familiar with googl,e please help me to find the search
 term where I can find
 the R function for
 chi square exact usable for one column test for a sample size less than 6

 You are welcome to use this search:

 http://www.giyf.com/chi%20square%20exact


 Thanks in advane Knut
 See ?fisher.test.
fisher.test needs two columns; I need a one-column exact test. From ?fisher.test:

x: either a two-dimensional contingency table in matrix form, or a factor
object.

y: a factor object; ignored if x is a matrix.



Knut



[R] Random Sampling

2013-03-06 Thread Angelo Scozzarella Tiscali
When the population values are not distributed symmetrically about the mean, 
reporting the mean and standard deviation can give the reader an inaccurate 
impression of the distribution of values in the population. 
I'd like to generate random samples with the same mean and standard deviation, but 
not necessarily the same distribution.

Thanks

Angelo
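
One way this could be sketched (an illustration, not an answer from the thread) is to draw samples from differently shaped distributions and rescale each to the same sample mean and standard deviation. The helper name rescale_to and the targets 50 and 10 are made up for the example.

```r
# Shift and scale a sample so that its sample mean and sd match the targets.
# rescale_to() is a hypothetical helper name.
rescale_to <- function(x, target_mean, target_sd) {
  target_mean + target_sd * (x - mean(x)) / sd(x)
}

set.seed(123)
n <- 200
symmetric <- rescale_to(rnorm(n), 50, 10)                       # bell-shaped
skewed    <- rescale_to(rexp(n), 50, 10)                        # right-skewed
bimodal   <- rescale_to(c(rnorm(n / 2, -2), rnorm(n / 2, 2)), 50, 10)

# All three share mean and sd but differ in shape
stats <- sapply(list(symmetric = symmetric, skewed = skewed, bimodal = bimodal),
                function(s) c(mean = mean(s), sd = sd(s)))
stats
```

Plotting the three samples side by side, for example with hist(), shows how different the distributions look despite identical summary statistics.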


Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?

2013-03-06 Thread John Kane
For the simplest of the issues, use scale_shape(solid = FALSE) to get hollow
points.
Using your data (below), this seems to work:
p1 <- ggplot(SummAB, aes(factor3, mean,
             colour = factor1, group = factor1, shape = factor1)) +
  scale_y_continuous(guide_legend(legend.position = c(4, 6))) +
  scale_shape(solid = FALSE) +
  guides(colour = guide_legend(title.position = "right")) +
  geom_point(aes(shape = factor(factor1)), color = "black", fill = "white",
             position = "dodge", width = 0.3, size = 3) +
  geom_line(aes(linetype = factor1), color = "black", size = 0.5) +
  geom_errorbar(aes(ymin = mean - sdv, ymax = mean + sdv), width = 0.3,
                position = "dodge", color = "black", size = 0.3) +
  theme_bw() +
  ylab(expression(paste("my measured stuff"))) +
  xlab("factor3") + ggtitle("") +
  labs(color = "factor1", shape = "factor1", group = "factor1",
       linetype = "factor1")
p1
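
For the facet_grid() part of the question, here is a rough sketch under several assumptions: the summaries are computed with base aggregate() instead of ddply(), the panel labels are made up, and the ggplot2 part only runs if the package is installed. It is meant as a starting point, not a finished figure.

```r
set.seed(1)
# Dummy data in the spirit of the quoted post (shorter vectors are recycled)
mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
                     factor3 = factor(rep(1:4, each = 4)),
                     var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
                                  sd = rep(c(1, 2, 3), each = 20)))

# Mean and sd per factor1 x factor3 cell, for all six levels at once
summ <- aggregate(var1 ~ factor1 + factor3, data = mydata, FUN = mean)
names(summ)[names(summ) == "var1"] <- "mean"
summ$sdv <- aggregate(var1 ~ factor1 + factor3, data = mydata, FUN = sd)$var1

# Hypothetical panel labels pairing the levels as in DataAB, DataCD, DataEF
panel_of <- c(A = "A and B", B = "A and B", C = "C and D",
              D = "C and D", E = "E and F", F = "E and F")
summ$panel <- unname(panel_of[as.character(summ$factor1)])

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # One panel per pair; facets share the y axis, so it is drawn only once
  p <- ggplot(summ, aes(factor3, mean, group = factor1,
                        shape = factor1, linetype = factor1)) +
    geom_point(size = 3) +
    geom_line() +
    geom_errorbar(aes(ymin = mean - sdv, ymax = mean + sdv),
                  width = 0.3, linetype = "solid") +
    scale_shape(solid = FALSE) +
    facet_grid(. ~ panel) +
    theme_bw()
}
```

Because all three pairs live in one data frame, a single ggplot() call replaces p1, p2 and p3, and grid.arrange() is no longer needed to place them side by side.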

John Kane
Kingston ON Canada


 -Original Message-
 From: a...@ecology.su.se
 Sent: Wed, 06 Mar 2013 13:32:42 +0100
 To: r-help@r-project.org
 Subject: [R] Ggplot2: Moving legend, change fill and removal of space
 between plots when using grid.arrange() possible use of facet_grid?
 
 Hi,
 
 # For publications, I am not allowed to repeat the axes. I have tried to
 remove the axes using:
 # yaxt=n, but it did not work. I have not understood how to do this in
 ggplot2. Can you help me?
 # I also do not want loads of space between the graphs (see below script
 with Dummy Data).
 # If I could make it look like the examples on the (nice) examples page:
 # http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
 # using the facet_grid(), I would be very very happy.
 
 # I also do not want the gemoetric points to be filled and the
 fill=white
 commande
 # does not seem to work - why? and are there alternatives?
 
 #Furthermore, I would like to add legends to inside the plot area instead
 of
 on the side. Like when you use plotrix() and brkdn.plot:
 legend(topright, c(A, B), pch=c(0,1), bg=white,
lty = 1:2, cex=1, bty=n)
 # This did not work in ggplot2. What are my alternatives. I have
 extensively
 searched the internet and have I missed something obvious, it was due to
# tiredness and not to lazyness.
 
 # Some dummy data:
 mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
   factor2 = factor(rep(c(1:5), each = 16)),
   factor3 = factor(rep(c(1:4), each = 4)),
   var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
   sd = rep(c(1, 2, 3), each = 20)),
   var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40),
   sd = rep(c(1, 2, 3), each = 20)))
 
 
 # Splitting data into 3 data frames (based on factor1)
 # If I could do this using for example facet_wrap() or facet_grid(), I
 would
 be very
 # happy! I have tried but failed that method.
 
 DataAB <- mydata[(mydata$factor1) %in% c("A", "B"), ]
 DataCD <- mydata[(mydata$factor1) %in% c("C", "D"), ]
 DataEF <- mydata[(mydata$factor1) %in% c("E", "F"), ]
 DataAB
 library(plyr)
 library(ggplot2)
 
 #Plot: levels A and B:
 # Summary (means etc)
 SummAB <- ddply(DataAB, .(factor3, factor1), summarize,
                 mean = mean(var1, na.rm = FALSE),
                 sdv = sd(var1, na.rm = FALSE),
                 se = 1.96 * (sd(var1, na.rm = FALSE) / sqrt(length(var1))))
 SummAB
 p1  -  ggplot(SummAB, aes(factor3, mean,
colour = factor1, group = factor1,
shape = factor1)) +
   geom_point(aes(shape=factor(factor1)), color=black, fill=white,
  position = dodge, width = 0.3, size=3) +
   geom_line(aes(linetype=factor1), color = black, size = 0.5) +
   geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
 position = dodge, color = black, size=0.3) +
   theme_bw() +
   ylab(expression(paste(my measured stuff))) +
   xlab(factor3) + ggtitle() +
   labs(color = factor1, shape = factor1, group = factor1,
linetype = factor1)
 p1
 
 #Plot: levels C and D:
 # Summary (means etc)
 SummCD  -   ddply(DataCD, .(factor3,factor1), summarize,
mean = mean(var1, na.rm = FALSE),
sdv = sd(var1, na.rm = FALSE),
se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1
 
 p2  -  ggplot(SummCD, aes(factor3, mean,
colour = factor1, group = factor1,
shape = factor1)) +
   geom_point(aes(shape=factor(factor1)), color=black, fill=white,
  position = dodge, width = 0.3, size=3) +
   geom_line(aes(linetype=factor1), color = black, size = 0.5) +
   geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
 position = dodge, color = black, size=0.3) +
   theme_bw() +
   ylab(expression(paste(my measured stuff))) +
   xlab(factor3) + ggtitle() +
   labs(color = factor1, shape = 

Re: [R] need help using read.fortran

2013-03-06 Thread Duncan Murdoch

On 06/03/2013 12:57 PM, jsdroyster wrote:

Hello kind and R-knowledgeable souls!

I am trying to use read.fortran to read in old datasets in 80-column-card format
with no separators between variables (just 80 columns of solid digits).
I comprehend the instructions for specifying the columns for each variable, but
I can't understand how to assign the variable names after reading the help pages
for read.fortran, read.fwf and read.table.

I tried putting a col.names section in the read.fortran statement:

AN35 -data.frame(read.fortran(filename,c(I9,4I2,2I1,3I2,I1,16I2,
I1,I5,I1,I3,I2,  A4,3A1,A2,A1),
   header = FALSE,skip=0,sep=@,
   col.names =
paste(idno,empmo,empyr,birthmo,birthyr,sex,race,teno,testmo,testyr,
  testtyp,L500,L1k,L2k,L3k,L4k,L6k,L8k,RT1k,R500,R1k,R2k,R3k,R4k,R6k,R8k,
  HPD,dept,shift,TWA,envclas,jobcode,hobby.med.STS,audclas,disp),


The paste() call will try to find variables with those names, and 
concatenate their contents.  That's not what you want.  You want 
something like


col.names = c("idno", "empmo", ...)



   row.names = (idno) ))


I also tried a separate dimnames statement like so:

dimnames(AN35)[[2]] -
c(idno,empmo,empyr,birthmo,birthyr,sex,race,
 
teno,testmo,testyr,testtyp,L500,L1k,L2k,L3k,L4k,L6k,L8k,RT1k,


 R500,R1k,R2k,R3k,R4k,R6k,R8k,HPD,dept,shift,TWA,
 envclas,jobcode,hobby,med,STS,audclas,disp)


I copied this from some documentation but I have no clue what the [[2]]  means.


dimnames() is a function that returns a list of row names and column 
names.  The column names are the second component, so


dimnames(AN35)[[2]] <- something

changes the column names.
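
A small self-contained sketch of both approaches. The three records, the format string and the column names below are made up for illustration and are much shorter than the 80-column layout in the question.

```r
# Hypothetical fixed-width records: 9-digit id, 2-digit month,
# 2-digit year, 1-character code, with no separators
records <- c("0000000010175A",
             "0000000021280B",
             "0000000030399C")
tmp <- tempfile()
writeLines(records, tmp)

# col.names takes a character vector built with c(), one name per field;
# it is passed on through read.fwf() to read.table()
dat <- read.fortran(tmp, c("I9", "2I2", "A1"),
                    col.names = c("idno", "empmo", "empyr", "code"))

# Renaming afterwards: names() is the usual shortcut for the
# second component of dimnames() on a data frame
dat2 <- read.fortran(tmp, c("I9", "2I2", "A1"))
names(dat2) <- c("idno", "empmo", "empyr", "code")

unlink(tmp)
dat
```

The number of names has to match the number of fields produced by the format, so "2I2" counts as two columns.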

Duncan Murdoch


If anyone has one good example that would help me a lot!
Thanks in advance!
Julie


[R] need help using read.fortran

2013-03-06 Thread jsdroyster
Hello kind and R-knowledgeable souls!

I am trying to use read.fortran to read in old datasets in 80-column-card 
format 
with no separators between variables (just 80 columns of solid digits).
I comprehend the instructions for specifying the columns for each variable, but 
I can't understand how to assign the variable names after reading the help 
pages 
for read.fortran, read.fwf and read.table.

I tried putting a col.names section in the read.fortran statement:

AN35 -data.frame(read.fortran(filename,c(I9,4I2,2I1,3I2,I1,16I2,
I1,I5,I1,I3,I2,  A4,3A1,A2,A1),
  header = FALSE,skip=0,sep=@,
  col.names = 
paste(idno,empmo,empyr,birthmo,birthyr,sex,race,teno,testmo,testyr,
 testtyp,L500,L1k,L2k,L3k,L4k,L6k,L8k,RT1k,R500,R1k,R2k,R3k,R4k,R6k,R8k,
 HPD,dept,shift,TWA,envclas,jobcode,hobby.med.STS,audclas,disp),
  row.names = (idno) ))


I also tried a separate dimnames statement like so:

dimnames(AN35)[[2]] - 
c(idno,empmo,empyr,birthmo,birthyr,sex,race,

teno,testmo,testyr,testtyp,L500,L1k,L2k,L3k,L4k,L6k,L8k,RT1k,

R500,R1k,R2k,R3k,R4k,R6k,R8k,HPD,dept,shift,TWA,
envclas,jobcode,hobby,med,STS,audclas,disp)


I copied this from some documentation but I have no clue what the [[2]]  means.

If anyone has one good example that would help me a lot!
Thanks in advance!
Julie


Re: [R] robustbase adjbox segfault - memory not mapped

2013-03-06 Thread Baan

Glad to know. Thanks.

Regards
Baan

On Wednesday 06 March 2013 02:15 PM, Martin Maechler wrote:

B == Baan  baanba...@gmail.com
 on Mon, 4 Mar 2013 22:47:10 +0530 writes:

 B Thank you Martin. Look forward to the fix.

Committed to the R-forge version of robustbase.

It was a simple integer overflow, indeed,
necessarily happening when the sample size was >= 2^16.5.

I'm planning to submit  robustbase_0.9-7  to CRAN today.
Martin
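
For readers wondering why the threshold sits near 2^16.5: the thread does not show the offending expression, but as an illustration only, the square of such a sample size no longer fits into a 32-bit signed integer. The sketch assumes a product of roughly n * n is involved, which is a guess.

```r
n <- 92682L                      # roughly 2^16.5, stored as an integer

# Largest value a 32-bit signed integer can hold (2^31 - 1)
int_max <- .Machine$integer.max

# Integer arithmetic: R detects the overflow and returns NA with a warning
n_squared_int <- suppressWarnings(n * n)

# Double arithmetic: the same product is represented without overflow
n_squared_dbl <- as.numeric(n) * n

c(int_max = int_max, n_squared_dbl = n_squared_dbl)
```

R's own integer arithmetic signals the problem with NA, whereas compiled C code called through .C() has no such check, which is consistent with a crash rather than an error message.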

 B Regards
 B Baan


 B On Monday 04 March 2013 10:19 PM, Martin Maechler wrote:
  B == Baan  baanba...@gmail.com
  on Mon, 4 Mar 2013 15:02:02 +0530 writes:
 B Hi, I encountered a segfault, memory not mapped error
 B when using adjbox in robustbase. In trying to recreate
 B the issue I found that the error occurs only for large
 B sample size. Here is the code.
 
   require(robustbase)
 B Loading required package: robustbase
   x - rnorm(10)
   y - rep(1, 10)
   adjbox(x ~ y) ## gives a plot
   x - rnorm(1)
   y - rep(1, 1)
   adjbox(x ~ y) ## gives a plot
   x - rnorm(10)
   y - rep(1, 10)
   adjbox(x ~ y)
 
 B *** caught segfault ***
 B address 0xfffcc47af530, cause 'memory not mapped'
 
 
 B Traceback:
 B 1: .C(mc_C, x, n, eps = eps, iter = c.iter, medc = double(1))
 B 2: mcComp(x, doReflect, eps1 = eps1, eps2 = eps2, maxit = maxit,
 B trace.lev = trace.lev)
 B 3: mc.default(x, ..., na.rm = TRUE)
 B 4: mc(x, ..., na.rm = TRUE)
 B 5: adjboxStats(unclass(groups[[i]]), coef = range, doReflect = 
doReflect)
 B 6: adjbox.default(split(mf[[response]], mf[-response]), ...)
 B 7: adjbox(split(mf[[response]], mf[-response]), ...)
 B 8: adjbox.formula(x ~ y)
 B 9: adjbox(x ~ y)
 
  Indeed, I (as maintainer of robustbase) can reproduce the
  segfault *even* though you did not specify the random seed...
 
  So this should be fixed ... hopefully within a week or so,
  but I am not promising anything, given my busy schedule!
 
  Martin Maechler,
  ETH Zurich
 
  []
 
 B My setup details:
 
 B R --version
 B R version 2.15.2 (2012-10-26) -- Trick or Treat
 
 B Package:robustbase
 B Version:0.9-5
 B Date:   2012-03-01
 B Packaged:   2013-03-01 16:34:03 UTC; maechler
 B NeedsCompilation:   yes
 B Repository: CRAN
 B Date/Publication:   2013-03-01 18:31:33
 B Built:  R 2.15.2; x86_64-pc-linux-gnu; 2013-03-04 05:54:20
 B UTC; unix
 
 
 B Platform: x86_64-pc-linux-gnu (64-bit)
 B uname -a
 B Linux R 2.6.32-5-amd64 #1 SMP Mon Feb 25 00:26:11 UTC 2013 x86_64 
GNU/Linux
 B Debian squeeze
 
 B Could someone pls help.
 
 B Regards
 B Baan




[R] Graph from Glantz

2013-03-06 Thread Angelo Scozzarella Tiscali
Hi,

I'd like to draw a graph like this one from Stanton Glantz book, Primer of 
Biostatistics.





Thanks

Angelo


Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?

2013-03-06 Thread John Kane
Placing a legend.  

 z <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) + geom_point()
  z + theme(legend.position = c(.5, .5))
  
Currently this does not appear to work in RStudio but seems fine if I use gedit 
or if I run R in a terminal session.  

John Kane
Kingston ON Canada


 -Original Message-
 From: a...@ecology.su.se
 Sent: Wed, 06 Mar 2013 13:32:42 +0100
 To: r-help@r-project.org
 Subject: [R] Ggplot2: Moving legend, change fill and removal of space
 between plots when using grid.arrange() possible use of facet_grid?
 
 Hi,
 
 # For publications, I am not allowed to repeat the axes. I have tried to
 remove the axes using:
 # yaxt=n, but it did not work. I have not understood how to do this in
 ggplot2. Can you help me?
 # I also do not want loads of space between the graphs (see below script
 with Dummy Data).
 # If I could make it look like the examples on the (nice) examples page:
 # http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
 # using the facet_grid(), I would be very very happy.
 
 # I also do not want the gemoetric points to be filled and the
 fill=white
 commande
 # does not seem to work - why? and are there alternatives?
 
 #Furthermore, I would like to add legends to inside the plot area instead
 of
 on the side. Like when you use plotrix() and brkdn.plot:
 legend(topright, c(A, B), pch=c(0,1), bg=white,
lty = 1:2, cex=1, bty=n)
 # This did not work in ggplot2. What are my alternatives. I have
 extensively
 searched the internet and have I missed something obvious, it was due to
# tiredness and not to lazyness.
 
 # Some dummy data:
 mydata- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
   factor2 = factor(rep(c(1:5), each = 16)),
   factor3 = factor(rep(c(1:4), each = 4)),
   var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
   sd = rep(c(1, 2, 3), each = 20)),
   var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40),
   sd = rep(c(1, 2, 3), each = 20)))
 
 
 # Splitting data into 3 data frames (based on factor1)
 # If I could do this using for example facet_wrap() or facet_grid(), I
 would
 be very
 # happy! I have tried but failed that method.
 
 DataAB - mydata[(mydata$factor1) %in% c(A, B), ]
 DataCD - mydata[(mydata$factor1) %in% c(C, D), ]
 DataEF - mydata[(mydata$factor1) %in% c(E, F), ]
 DataAB
 library(plyr)
 library(ggplot2)
 
 #Plot: levels A and B:
 # Summary (means etc)
 SummAB  -   ddply(DataAB, .(factor3,factor1), summarize,
mean = mean(var1, na.rm = FALSE),
sdv = sd(var1, na.rm = FALSE),
se = 1.96*(sd(var1,
 na.rm=FALSE)/sqrt(length(var1
 SummAB
 p1  -  ggplot(SummAB, aes(factor3, mean,
colour = factor1, group = factor1,
shape = factor1)) +
   geom_point(aes(shape=factor(factor1)), color=black, fill=white,
  position = dodge, width = 0.3, size=3) +
   geom_line(aes(linetype=factor1), color = black, size = 0.5) +
   geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
 position = dodge, color = black, size=0.3) +
   theme_bw() +
   ylab(expression(paste(my measured stuff))) +
   xlab(factor3) + ggtitle() +
   labs(color = factor1, shape = factor1, group = factor1,
linetype = factor1)
 p1
 
 #Plot: levels C and D:
 # Summary (means etc)
 SummCD  -   ddply(DataCD, .(factor3,factor1), summarize,
mean = mean(var1, na.rm = FALSE),
sdv = sd(var1, na.rm = FALSE),
se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1
 
 p2  -  ggplot(SummCD, aes(factor3, mean,
colour = factor1, group = factor1,
shape = factor1)) +
   geom_point(aes(shape=factor(factor1)), color=black, fill=white,
  position = dodge, width = 0.3, size=3) +
   geom_line(aes(linetype=factor1), color = black, size = 0.5) +
   geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
 position = dodge, color = black, size=0.3) +
   theme_bw() +
   ylab(expression(paste(my measured stuff))) +
   xlab(factor3) + ggtitle() +
   labs(color = factor1, shape = factor1, group = factor1,
linetype = factor1)
 p2
 
 #Plot: levels C and D:
 # Summary (means etc)
 SummEF  -   ddply(DataEF, .(factor3,factor1), summarize,
mean = mean(var1, na.rm = FALSE),
sdv = sd(var1, na.rm = FALSE),
se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1
 
 p3  -  ggplot(SummEF, aes(factor3, mean,
colour = factor1, group = factor1,
shape = factor1)) +
   geom_point(aes(shape=factor(factor1)), color=black, fill=white,
 #Why
 is the fill commando not working?
  position = dodge, width = 0.3, size=3) +
   

Re: [R] chi square exact test

2013-03-06 Thread Milan Bouchet-Valat
On Wednesday 06 March 2013 at 18:38 +0100, Knut Krueger wrote:
 On 06.03.2013 18:29, Milan Bouchet-Valat wrote:
  On Wednesday 06 March 2013 at 18:03 +0100, Knut Krueger wrote:
  On 06.03.2013 14:27, Nicole Ford wrote:
  Dear Nicole,
  my be you are wondering about, but I know Google an I am using google
  before I am asking here.
 
  If you are more familiar with googl,e please help me to find the search
  term where I can find
  the R function for
  chi square exact usable for one column test for a sample size less than 6
 
  You are welcome to use this search:
 
  http://www.giyf.com/chi%20square%20exact
 
 
  Thanks in advane Knut
  See ?fisher.test.
 fisher test needs two columns I need  a one column exact test
 |x|   
 
 either a two-dimensional contingency table in matrix form, or a factor 
 object.
 
 |y|   
 
 a factor object; ignored if |x| is a matrix.
Sorry, I missed that part. Can you tell us more about the test you do in
SPSS? Are you testing the adequacy of a given distribution to the data?
In short: what do you test?

Is that test documented somewhere? I found this document, but there does
not seem to be such a test there:
http://www.sussex.ac.uk/its/pdfs/SPSS_Exact_Tests_20.pdf
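
If the test in question turns out to be an exact goodness-of-fit test of counts against fixed probabilities (an assumption; the thread does not confirm it), it could be sketched in base R as follows. The counts, the null probabilities and the helper name exact_gof are made up for the example.

```r
# Hypothetical observed counts in three categories (n = 5)
obs <- c(4, 1, 0)
p0  <- rep(1 / 3, 3)             # null hypothesis: equal probabilities

# Monte Carlo p-value built into chisq.test()
set.seed(1)
mc <- chisq.test(obs, p = p0, simulate.p.value = TRUE, B = 10000)

# Exact p-value: enumerate every table with the same total and add up the
# multinomial probabilities of those at least as extreme as the observed one
exact_gof <- function(obs, p) {
  n <- sum(obs)
  k <- length(obs)
  grid <- expand.grid(rep(list(0:n), k))
  grid <- grid[rowSums(grid) == n, , drop = FALSE]
  stat <- function(x) sum((x - n * p)^2 / (n * p))
  probs <- apply(grid, 1, function(x) dmultinom(x, prob = p))
  stats <- apply(grid, 1, stat)
  # small tolerance so that tables tied with the observed one are included
  sum(probs[stats >= stat(obs) - 1e-12])
}
pval <- exact_gof(obs, p0)
c(exact = pval, monte_carlo = mc$p.value)
```

The enumeration grows quickly with the sample size and the number of categories, so it is only practical for very small samples like the one described. For two categories, binom.test() gives an exact test directly.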


Regards

 
 Knut
 


Re: [R] Graph from Glantz

2013-03-06 Thread John Kane
No link and/ no attached file.  The list tends to strip most attachments to 
reduce virus attacks.  

John Kane
Kingston ON Canada


 -Original Message-
 From: angeloscozzare...@tiscali.it
 Sent: Wed, 6 Mar 2013 19:53:18 +0100
 To: r-help@r-project.org
 Subject: [R] Graph from Glantz
 
 Hi,
 
 I'd like to draw a graph like this one from Stanton Glantz book, Primer
 of Biostatistics.
 
 
 
 
 
 Thanks
 
 Angelo


Re: [R] Help with a function and text

2013-03-06 Thread Eliano Marques
Hi, can someone help me understand why this message was rejected?
Thanks,
Eliano

Sent from my iPhone

On 6 Mar 2013, at 19:18, Eliano eliano.m.marq...@gmail.com wrote:

 Hi everyone,

 I am writing some code to generate a function. I am passing that code to a
 dataset which i'm importing in R, e.g.
 Test=read.table('C:/test.txt', header=F, sep='\t', na.strings='NA', dec='.',
 strip.white=TRUE)
 Test

 V1
 (if(nclusters>0){OptmizationInputs[3,3]*beta[1]}else{0})+
 (if(nclusters>1){OptmizationInputs[3,3]*beta[1]}else{0})+
 V1 has inside a code for a function.

 I'm having problems with 2 things:

 1 - I need to take out from V1 all  that appears in the text, i tried a
 replace but did not work.
 Test=replace(Test,'  ', ' ')  , did not work.

 2 - Writing a function like this :

 nlog=function(par)
{
beta=par[1:n]
Measure=Test[1]  # would this read the text?
return(Measure)
}

 So i need to use that code inside the function as above.
 Any suggestion on how you would do this?
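
 One possible approach, sketched under several assumptions: the unwanted characters are double quotes, the comparisons in the stored code are nclusters > 0 and nclusters > 1, and OptmizationInputs is a numeric matrix. The text is cleaned with gsub(), turned into an expression with parse(), and evaluated inside the function with eval().

```r
# Hypothetical stand-in for the object named in the post
OptmizationInputs <- matrix(1:9, nrow = 3)

# Code stored as text, with stray double quotes as they might come from a file
code_text <- c('"(if (nclusters > 0) {OptmizationInputs[3,3]*beta[1]} else {0}) +"',
               '"(if (nclusters > 1) {OptmizationInputs[3,3]*beta[2]} else {0})"')

# 1 - remove the quotes; fixed = TRUE treats the pattern as a literal string
clean <- gsub('"', "", code_text, fixed = TRUE)

# 2 - join the lines and parse them once into an unevaluated expression
expr <- parse(text = paste(clean, collapse = " "))[[1]]

# nclusters is passed as an argument here for illustration
nlog <- function(par, nclusters) {
  beta <- par
  # eval() looks up beta and nclusters in this function's environment
  eval(expr)
}

nlog(c(2, 3), nclusters = 1)
nlog(c(2, 3), nclusters = 2)
```

 Parsing once outside the function avoids repeating that work on every call, which matters if nlog() is handed to an optimizer.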

 Kind Regards,
 Eliano



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Help-with-a-function-and-text-tp4660523.html
 Sent from the R help mailing list archive at Nabble.com.




Re: [R] Ggplot2: Moving legend, change fill and removal of space between plots when using grid.arrange() possible use of facet_grid?

2013-03-06 Thread John Kane
Replying to my own post RStudio is doing this fine once I had rebooted R.  I 
must have had some strange stuff loaded that I had not realised was there.

John Kane
Kingston ON Canada


 -Original Message-
 From: jrkrid...@inbox.com
 Sent: Wed, 6 Mar 2013 11:16:28 -0800
 To: a...@ecology.su.se, r-help@r-project.org
 Subject: Re: [R] Ggplot2: Moving legend, change fill and removal of space
 between plots when using grid.arrange() possible use of facet_grid?
 
 Placing a legend.
 
  z - ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) + geom_point()
   z + theme(legend.position = c(.5, .5))
 
 Currently this does not appear to work in RStudio but seems fine if I use
 gedit or if I run R in a terminal session.
 
 John Kane
 Kingston ON Canada
 
 
 -Original Message-
 From: a...@ecology.su.se
 Sent: Wed, 06 Mar 2013 13:32:42 +0100
 To: r-help@r-project.org
 Subject: [R] Ggplot2: Moving legend, change fill and removal of space
 between plots when using grid.arrange() possible use of facet_grid?
 
 Hi,
 
 # For publications, I am not allowed to repeat the axes. I have tried to
 # remove the axes using yaxt="n", but it did not work. I have not understood
 # how to do this in ggplot2. Can you help me?
 # I also do not want loads of space between the graphs (see the script below
 # with dummy data).
 # If I could make it look like the examples on the (nice) examples page:
 # http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
 # using facet_grid(), I would be very, very happy.
 
 # I also do not want the geometric points to be filled, and the fill="white"
 # command does not seem to work. Why? And are there alternatives?
 
 # Furthermore, I would like to add legends inside the plot area instead of
 # on the side, like when you use plotrix and brkdn.plot:
 legend("topright", c("A", "B"), pch=c(0,1), bg="white",
        lty = 1:2, cex=1, bty="n")
 # This did not work in ggplot2. What are my alternatives? I have searched
 # the internet extensively, and if I have missed something obvious, it was
 # due to tiredness and not to laziness.
 
 # Some dummy data:
 mydata <- data.frame(factor1 = factor(rep(LETTERS[1:6], each = 80)),
                      factor2 = factor(rep(c(1:5), each = 16)),
                      factor3 = factor(rep(c(1:4), each = 4)),
                      var1 = rnorm(120, mean = rep(c(0, 3, 5), each = 40),
                                   sd = rep(c(1, 2, 3), each = 20)),
                      var2 = rnorm(120, mean = rep(c(6, 7, 8), each = 40),
                                   sd = rep(c(1, 2, 3), each = 20)))
 
 
 # Splitting data into 3 data frames (based on factor1)
 # If I could do this using for example facet_wrap() or facet_grid(), I
 would
 be very
 # happy! I have tried but failed that method.
 
 DataAB <- mydata[(mydata$factor1) %in% c("A", "B"), ]
 DataCD <- mydata[(mydata$factor1) %in% c("C", "D"), ]
 DataEF <- mydata[(mydata$factor1) %in% c("E", "F"), ]
 DataAB
 library(plyr)
 library(ggplot2)
 
 #Plot: levels A and B:
 # Summary (means etc)
 SummAB <- ddply(DataAB, .(factor3,factor1), summarize,
                 mean = mean(var1, na.rm = FALSE),
                 sdv = sd(var1, na.rm = FALSE),
                 se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
 SummAB
 p1 <- ggplot(SummAB, aes(factor3, mean,
                          colour = factor1, group = factor1,
                          shape = factor1)) +
   geom_point(aes(shape=factor(factor1)), color="black", fill="white",
              position = "dodge", width = 0.3, size=3) +
   geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
   geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
                 position = "dodge", color = "black", size=0.3) +
   theme_bw() +
   ylab(expression(paste("my measured stuff"))) +
   xlab("factor3") + ggtitle("") +
   labs(color = "factor1", shape = "factor1", group = "factor1",
        linetype = "factor1")
 p1
 
 #Plot: levels C and D:
 # Summary (means etc)
 SummCD <- ddply(DataCD, .(factor3,factor1), summarize,
                 mean = mean(var1, na.rm = FALSE),
                 sdv = sd(var1, na.rm = FALSE),
                 se = 1.96*(sd(var1, na.rm=FALSE)/sqrt(length(var1))))
 
 p2 <- ggplot(SummCD, aes(factor3, mean,
                          colour = factor1, group = factor1,
                          shape = factor1)) +
   geom_point(aes(shape=factor(factor1)), color="black", fill="white",
              position = "dodge", width = 0.3, size=3) +
   geom_line(aes(linetype=factor1), color = "black", size = 0.5) +
   geom_errorbar(aes(ymin = mean - sdv , ymax = mean + sdv), width = 0.3,
                 position = "dodge", color = "black", size=0.3) +
   theme_bw() +
   ylab(expression(paste("my measured stuff"))) +
   xlab("factor3") + ggtitle("") +
   labs(color = "factor1", shape = "factor1", group = "factor1",
        linetype = "factor1")
 p2
 
 #Plot: levels E and F:
 # Summary (means etc)
 SummEF <- ddply(DataEF, .(factor3,factor1), summarize,
mean = 

Re: [R] Learning the R way – A Wish

2013-03-06 Thread Patrick Burns

On 06/03/2013 07:20, Andrew Hoerner wrote:

Dear Patrick--
After the official Core Team's R manuals and the individual function
help pages, I have found The R Inferno to be the single most useful
piece of documentation when I have gotten stuck with an R problem. It is
the only introduction that seems to be aware of the ambiguities present
in the official documentation and of some of the ways one can get stuck
in traps of misunderstanding. Plus, it is enjoyably witty.

When I first started using it, I found it ranged from very useful to
pretty frustrating. I did not always understand what the examples you
presented were trying to say. It is still true that I occasionally wish
for a little more discursive explanatory style, but as time goes by I


Actually I find myself sometimes thinking the same thing.

Pat



find that I am increasingly likely to get the point just from the example.

Many thanks, Andrew


On Tue, Mar 5, 2013 at 1:46 AM, Patrick Burns pbu...@pburns.seanet.com wrote:

Andrew,

That sounds like a sensible document you propose.
Perhaps I'll do a few blog posts along that vein -- thanks.

I presume you know of 'The R Inferno', which does
a little of what you want.

Pat



On 04/03/2013 23:42, andrewH wrote:

There is something that I wish I had that I think would help me a lot to be a
better R programmer, and that I think would probably help many others as well.
I put the wish out there in the hope that someone might think it was worth
doing at some point.

I wish I had the code of some substantial, widely used package – lm, say –
heavily annotated and explained at roughly the level of R knowledge of
someone who has completed an intro statistics course using R and picked up
some R along the way.  The idea is that you would say what the various
blocks of code are doing, why the authors chose to do it this way rather
than some other way, point out coding techniques that save time or memory or
prevent errors relative to alternatives, and generally explain what it
does and point out and explain as many of the smarter features as possible.
Ideally, this would include a description, at least at the conceptual level
if not at the code level, of the major C functions that the package calls, so
that you understand at least what is happening at that level, if not the
nitty-gritty details of coding.

I imagine this as a piece of annotated code, but maybe it could be a video
of someone, or some couple of people, scrolling through the code and talking
about it. Or maybe something more like a wiki page, with various people
contributing explanations for different lines, sections, and practices.

I am learning R on my own from books and the internet, and I think I would
learn a lot from a chatty line-by-line description of some substantial block
of code by someone who really knows what he or she is doing – perhaps with a
little feedback from some people who are new about where they get lost in
the description.

There are a couple of particular things that I personally would hope to get
out of this.  First, there are lots of instances of good coding practice
that I think most people pick up from other programmers or by having
individual bits of code explained to them, and that are pretty hard to get
from books and help files.  I think this might be a good way to get at them.

Second, there are a whole bunch of functions in R that I call
meta-programming functions – don’t know if they have a more proper name.
These are things that are intended primarily to act on R language objects or
to control how R objects are evaluated. They include functions like call,
match.call, parse and deparse, deparen, get, envir, substitute, eval, etc.
Although I have read the individual documentation for many of these commands,
and even used most of them, I don’t think I have any fluency with them, or
understand well how and when to code with them.  I think reading a
good-sized hunk of code that uses these functions to do a lot of things that
packages often need to do in the best-practice or standard R way, together
with comments that describe and explain them, would help a lot with that.
(There is a good smaller-scale example of this in Friedrich Leisch’s
tutorial on creating R packages).

These are things I think I probably share with many others. I

Re: [R] Troubles with labeling x axis

2013-03-06 Thread Peter Ehlers

On 2013-03-06 06:07, iDa wrote:

Hi!

I have problems with labeling the x axis while plotting time series data. I have
40 monthly measurements. One period lasts 4 months. I'd like to have 40 ticks
on the x axis (10 larger, the rest smaller) and labels just at the beginning of
each period, just like in the image
http://r.789695.n4.nabble.com/file/n4660465/2221.jpg

My code leaves x axis empty:


data <- read.csv(file="CSV files/Komen.csv", head=TRUE, sep=";")
dataTimeSeries <- ts(data, frequency=12, start=c(2000,4))
dataTimeSeries

     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2000               7  45  47   3  24 132  35  32  28
2001 161  48  31  33 161 154 420  19 149  44  54  16
2002 152  94  43  64 193  85  98  77 236  87  72  47
2003 196 120  51  27 143  99  56

require(graphics)
plot.ts(dataTimeSeries, xaxt="n", xlab="Perioda", ylab="Opazovane vrednosti",
        type='l', col='red')
axis(side=1, at=seq(1,40,4), labels=seq(1,10,1))


Thanks in advance for any help!


Have a look at what

  par("usr")

gives to see that your 'at' setting makes no sense.

Peter Ehlers
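
A minimal sketch of what par("usr") reveals, using simulated counts in place of Komen.csv (the rpois() data and the names x, tt and major are made up for illustration). A monthly ts starting in April 2000 is drawn on a decimal-year scale, so tick positions have to be given in those units, for example via time():

```r
# Simulated stand-in for the 40 monthly measurements (assumption: the real data is not available)
set.seed(1)
x <- ts(rpois(40, lambda = 80), frequency = 12, start = c(2000, 4))

plot(x, xaxt = "n", xlab = "Perioda", ylab = "Opazovane vrednosti", col = "red")

# The x axis runs in decimal years (about 2000.1 to 2003.6), so at = 1, 5, 9, ... is off the plot
usr <- par("usr")
print(usr[1:2])

tt <- as.numeric(time(x))                      # one position per month, in axis units
axis(1, at = tt, labels = FALSE, tcl = -0.25)  # 40 small, unlabelled ticks
major <- tt[seq(1, 40, by = 4)]                # first month of each 4-month period
axis(1, at = major, labels = 1:10)             # 10 larger, labelled ticks
```

The second axis() call draws ticks of default length on top of the short ones, which gives the "10 larger, the rest smaller" appearance asked for.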



[R] Course: Beginner's Guide to MCMC, GLM and GAM with R

2013-03-06 Thread Highland Statistics Ltd


There are a few places left on the following course:   Beginner's Guide 
to MCMC, GLM and GAM with R



When:  10 - 13 June 2013
Where: SAMS, Oban, Scotland


Further information: http://www.highstat.com/statscourse.htm
Flyer: http://www.highstat.com/Courses/Flyer2013June_SAMS.pdf

Kind regards,

Alain Zuur



[R] Issue when reading a table into R

2013-03-06 Thread Paul Bernal
Hello everyone,

I was reading a table into R, and when trying to retrieve it the following
message appeared:

 [ reached getOption("max.print") -- omitted 469376 rows ]

Does this mean that R left out 469376 rows? Or R is taking those 469376
rows as well and the limitation is only for printing purposes?

Thanks in advance for any help,

Best regards,

Paul



Re: [R] Issue when reading a table into R

2013-03-06 Thread Rich Shepard

On Wed, 6 Mar 2013, Paul Bernal wrote:


I was reading a table into R, and when trying to retrieve it the following
message appeared:

[ reached getOption("max.print") -- omitted 469376 rows ]

Does this mean that R left out 469376 rows? Or R is taking those 469376
rows as well and the limitation is only for printing purposes?


Paul,

  I see this message when I look at the contents of a data frame that is
very large. The data are all there but there is a limit to the number of
rows that will be 'printed' to the display.

  If you use the str() function you'll see the number of rows as well as
descriptions of the column contents.

HTH,

Rich
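
A small sketch of that check, with a made-up data frame called big standing in for the table that was read in (the name and its contents are assumptions for illustration only):

```r
# Hypothetical large data frame standing in for the table that was read in
big <- data.frame(id = seq_len(500000), value = rnorm(500000))

nrow(big)      # total number of rows held in memory
str(big)       # compact description: dimensions and column types
head(big, 3)   # print only the first few rows instead of the whole object
```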



Re: [R] Issue when reading a table into R

2013-03-06 Thread Duncan Murdoch

On 06/03/2013 3:58 PM, Paul Bernal wrote:

Hello everyone,

I was reading a table into R, and when trying to retrieve it the following
message appeared:

  [ reached getOption("max.print") -- omitted 469376 rows ]

Does this mean that R left out 469376 rows? Or R is taking those 469376
rows as well and the limitation is only for printing purposes?


Only for printing.  You can find out what the object looks like 
internally by str(x) or similar function.


Duncan Murdoch



Re: [R] Issue when reading a table into R

2013-03-06 Thread jim holtman
just a limitation on the printing of the data to the console.  Change the
'max.print' option if you want more lines output to the console.
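
A short sketch of how the option behaves, using a made-up matrix m and a deliberately tiny limit so the effect is easy to see (the values 50 and 400 are arbitrary choices for illustration):

```r
old <- options(max.print = 50)        # temporarily lower the limit; the old value is returned
m <- matrix(seq_len(400), ncol = 4)   # 100 rows, 400 entries
print(m)                              # printing stops early with an "omitted ... rows" note
nrow(m)                               # still 100: nothing was dropped from the object
options(old)                          # restore the previous setting
getOption("max.print")
```

Raising the limit works the same way, for example options(max.print = 1e6), although printing that many rows to the console is rarely useful.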

On Wed, Mar 6, 2013 at 3:58 PM, Paul Bernal paulberna...@gmail.com wrote:

 Hello everyone,

 I was reading a table into R, and when trying to retrieve it the following
 message appeared:

  [ reached getOption("max.print") -- omitted 469376 rows ]

 Does this mean that R left out 469376 rows? Or R is taking those 469376
 rows as well and the limitation is only for printing purposes?

 Thanks in advance for any help,

 Best regards,

 Paul





-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



[R] how to get xmlToList() to retry if http fails

2013-03-06 Thread Waichler, Scott R
Hi,

I am using xmlToList() in a loop with a call to a webservice, per the code 
below.  

  # Loop thru target locs
  for(i in 1:num.target.locs) {
    url <- paste(sep="/", "http://www.earthtools.org/timezone", lat[i], lon[i])
    tmp <- xmlToList(url)
    df$time.offset[i] <- tmp$offset
    system("sleep 1")  # wait 1 second per requirements of above web service
  }  # end loop thru target locations

Failure struck midway through my loop, with the message below.

failed to load HTTP resource
Error: 1: failed to load HTTP resource

I presume that the webservice failed to respond in this instance.  How can I 
trap the error and have it retry after waiting a second or two, instead of 
exiting?

Thanks.  --Scott Waichler
Pacific Northwest National Laboratory
Richland, WA, USA
scott.waich...@pnnl.gov



Re: [R] chi square exact test

2013-03-06 Thread David L Carlson
Actually, the http://www.sussex.ac.uk/its/pdfs/SPSS_Exact_Tests_20.pdf file 
indicates that for small samples and a one-way chi square test, SPSS uses a 
multinomial distribution to tabulate the distribution of chi square for a given 
N, K, and probability of membership in each group. In package stats, the 
dmultinom() function can be used to accomplish this. The last example on the 
help page shows the steps. 

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
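
A minimal sketch of the dmultinom() idea for a one-way table, assuming equal group probabilities by default. The function name exact_chisq_test and the counts c(4, 1, 0) are made up for illustration, and this is not claimed to reproduce SPSS's implementation. It enumerates every possible table for the given N and K, weights each by its multinomial probability, and sums the probabilities of the tables whose chi-square statistic is at least as large as the observed one:

```r
# Exact one-way chi-square test by full enumeration (hypothetical helper, small N only)
exact_chisq_test <- function(obs, p = rep(1 / length(obs), length(obs))) {
  n <- sum(obs)
  k <- length(obs)
  # All ways to distribute n observations over k groups
  grid <- as.matrix(expand.grid(rep(list(0:n), k - 1)))
  grid <- grid[rowSums(grid) <= n, , drop = FALSE]
  outcomes <- cbind(grid, n - rowSums(grid))
  expected <- n * p
  chisq <- function(x) sum((x - expected)^2 / expected)
  stats <- apply(outcomes, 1, chisq)               # statistic of every possible table
  probs <- apply(outcomes, 1, dmultinom, prob = p) # its probability under the null
  obs_stat <- chisq(obs)
  # Small tolerance so tables tied with the observed statistic are counted
  list(statistic = obs_stat,
       p.value = sum(probs[stats >= obs_stat - 1e-9]),
       total.prob = sum(probs))
}

res <- exact_chisq_test(c(4, 1, 0))
res$statistic
res$p.value
```

Full enumeration is only practical for small N and K, which is the situation asked about in this thread (sample size below 6).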


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Milan Bouchet-Valat
 Sent: Wednesday, March 06, 2013 1:17 PM
 To: Knut Krueger
 Cc: r-h...@stat.math.ethz.ch
 Subject: Re: [R] chi square exact test
 
 On Wednesday 06 March 2013 at 18:38 +0100, Knut Krueger wrote:
  On 06.03.2013 18:29, Milan Bouchet-Valat wrote:
   On Wednesday 06 March 2013 at 18:03 +0100, Knut Krueger wrote:
   On 06.03.2013 14:27, Nicole Ford wrote:
   Dear Nicole,
   you may be wondering about this, but I do know Google, and I use
   Google before I ask here.
  
   If you are more familiar with Google, please help me find the
   search term that leads to the R function for a chi-square exact
   test usable for a one-column test with a sample size of less than 6.
  
   You are welcome to use this search:
  
   http://www.giyf.com/chi%20square%20exact
  
  
   Thanks in advance, Knut
   See ?fisher.test.
  fisher test needs two columns I need  a one column exact test
  |x|
 
  either a two-dimensional contingency table in matrix form, or a
 factor
  object.
 
  |y|
 
  a factor object; ignored if |x| is a matrix.
 Sorry, I missed that part. Can you tell us more about the test you do in
 SPSS? Are you testing the adequacy of a given distribution to the data?
 In short: what do you test?
 
 Is that test documented somewhere? I found this document, but there does
 not seem to be such a test there:
 http://www.sussex.ac.uk/its/pdfs/SPSS_Exact_Tests_20.pdf
 
 
 Regards
 
 
  Knut
 


Re: [R] Issue when reading a table into R

2013-03-06 Thread Sarah Goslee
Since nobody else has mentioned it: if you are seeing that message
when you are reading data in, then you probably failed to assign the
data to an R object.

mydata <- read.table("somefile") # correct
read.table("somefile") # will simply print your data to the console, not save it

I'm not entirely sure what you meant by "retrieve", so maybe you
already knew this.

You can use e.g.
dim(mydata)
to find out whether it's the size you expect.

Sarah
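
A self-contained sketch of the assign-then-check pattern; the temporary file and its five-row content are invented here so that the example runs anywhere:

```r
# Write a small hypothetical table to a temporary file
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(a = 1:5, b = letters[1:5]), tmp, row.names = FALSE)

mydata <- read.table(tmp, header = TRUE)  # assigned: nothing is printed, the data is kept
dim(mydata)                               # rows and columns actually read
unlink(tmp)                               # remove the temporary file
```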

On Wed, Mar 6, 2013 at 3:58 PM, Paul Bernal paulberna...@gmail.com wrote:
 Hello everyone,

 I was reading a table into R, and when trying to retrieve it the following
 message appeared:

  [ reached getOption("max.print") -- omitted 469376 rows ]

 Does this mean that R left out 469376 rows? Or R is taking those 469376
 rows as well and the limitation is only for printing purposes?

 Thanks in advance for any help,

 Best regards,

 Paul

-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

2013-03-06 Thread arun


Hi,
How about this:

indxTem1 <- paste0(Tem1[,1],Tem1[,2])
indxTem2 <- paste0(Tem2[,1],Tem2[,2])
Tem1[!indxTem1%in%indxTem2,]
#   V1 V2
 #[1,] 333 11
 #[2,] 111 16
 #[3,] 111 17
 #[4,] 111 20
 #[5,] 222 21
 #[6,] 222 22
 #[7,] 222 23
 #[8,] 222  1
 #[9,] 222  2
#[10,] 333  3
#[11,] 333  4
#[12,] 333  5
#[13,] 333  6
#[14,] 333  7


A.K.
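
One caveat with pasted keys, sketched below under the assumption that both columns are numeric: without a separator, different pairs can produce the same string (1 and 11 versus 11 and 1), so a separator is a safer choice. The names key1, key2 and res are illustrative only:

```r
V1 <- rep(c(rep(111, 5), rep(222, 5), rep(333, 5)), 2)
V2 <- c(1:23, 1:7)
Tem1 <- cbind(V1, V2)
Tem2 <- Tem1[c(1:10, 12:15, 18:19), ]

# Without a separator, distinct pairs can collide
collision <- paste0(1, 11) == paste0(11, 1)
collision

# Build one key per row, with a separator between the two columns
key1 <- paste(Tem1[, 1], Tem1[, 2], sep = "_")
key2 <- paste(Tem2[, 1], Tem2[, 2], sep = "_")

# Rows of Tem1 whose (V1, V2) pair does not occur in Tem2
res <- Tem1[!(key1 %in% key2), ]
res
```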

From: HJ YAN yhj...@googlemail.com
To: arun smartpink...@yahoo.com 
Cc: r-help@r-project.org 
Sent: Wednesday, March 6, 2013 4:09 PM
Subject: Re: [R] How to combine conditional argument and logical argument in R 
to create subset of data...


Dear Arun


Thanks a million for your prompt reply and I love all four ways in your reply. 

Tried the code and just realised an issue here:   in my real work, my data is 
about 4GB large and I'm sure that there are many duplicated values in V2, so 
that is to say my V1 and V2 should be something like


# V1 here are some data index with lots of repeated numeric values
V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
V2 <- c(1:23, 1:7)  # there are also duplicated values in V2
Tem1 <- cbind(V1,V2)
Tem2 <- Tem1[c(1:10,12:15,18:19),] # I know that Tem2 is a subset of Tem1...


So how do I get outcome of the difference of Tem1 and Tem2 if the values in V2 
having duplicates?

  V1 V2
 333 11
 111 16
 111 17
 111 20
 222 21
 222 22
 222 23
 222  1
 222  2
 333  3
 333  4
 333  5
 333  6
 333  7


Massive thanks
HJ





On Wed, Mar 6, 2013 at 4:12 PM, arun smartpink...@yahoo.com wrote:



Just to add:

Tem1[Tem1[,2]%in%setdiff(Tem1[,2],Tem2[,2]),]

A.K.

- Original Message -

From: arun smartpink...@yahoo.com
To: HJ YAN yhj...@googlemail.com
Cc: R help r-help@r-project.org
Sent: Wednesday, March 6, 2013 11:06 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R 
to create subset of data...

Hi,
No problem.
V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
length(V1)
#[1] 30

V2 <- c(1:30) #should be the same length as V1
Tem1 <- cbind(V1,V2)
Tem2 <- Tem1[1:20,]

Tem1[!Tem1[,2]%in%Tem2[,2],]
 #  V1 V2
 #[1,] 222 21
 #[2,] 222 22
 #[3,] 222 23
 #[4,] 222 24
 #[5,] 222 25
 #[6,] 333 26
 #[7,] 333 27
 #[8,] 333 28
 #[9,] 333 29
#[10,] 333 30

#or
subset(Tem1,!V2%in% Tem2[,2])
#or
 Tem1[is.na(match(Tem1[,2],Tem2[,2])),]
 #  V1 V2
 #[1,] 222 21
 #[2,] 222 22
 #[3,] 222 23
 #[4,] 222 24
 #[5,] 222 25
 #[6,] 333 26
 #[7,] 333 27
 #[8,] 333 28
 #[9,] 333 29
#[10,] 333 30
A.K.





From: HJ YAN yhj...@googlemail.com
To: arun smartpink...@yahoo.com
Sent: Wednesday, March 6, 2013 10:33 AM
Subject: Re: [R] How to combine conditional argument and logical argument in R 
to create subset of data...


Thank you SO MUCH Arun!!! 

That's brilliant-- I've learnt some very useful new R command now, e.g. 
'do.call' and 'split'. And I see where my code went wrong now. 

 I do appreciate greatly for your prompt reply.

Also, I wonder if there exist a package can find difference between two data 
frames, e.g. one is a subset of the other? e.g. 

V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
V2 <- c(1:23)
Tem1 <- cbind(V1,V2)

Tem2 <- Tem1[1:20,]


How do I get outcome like 

[21,] 333 21
[22,] 333 22
[23,] 333 23


P.S. I used 'setdiff' before, but seems it only works for vectors but not for 
dataframe??


Sorry for so many questions today, as I'm coding for a work deadline tonight.


Many thanks!
Cheers
HJ







On Wed, Mar 6, 2013 at 1:55 PM, arun smartpink...@yahoo.com wrote:

Hi,
You can also try this:
Tem3 <- list()
for(i in unique(Tem1[,1])) {
  Tem3[[i]] <- subset(Tem1,Tem1[,1]==i)
  Tem4 <- do.call(rbind,Tem3)
}
head(Tem4)
#  V1 V2
#[1,] 111  1
#[2,] 111  2
#[3,] 111  3
#[4,] 111  4
#[5,] 111 13
#[6,] 111 14


#or
Tem3 <- c(NA,NA)
for(i in unique(Tem1[,1])) {
  Tem2 <- subset(Tem1, Tem1[,1]==i)
  Tem3 <- rbind(Tem3,Tem2)
  Tem5 <- Tem3[-1,]
}
head(Tem5)
#  V1 V2
# 111  1
# 111  2
# 111  3
# 111  4
# 111 13
# 111 14

A.K.
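
For this particular task (rows grouped by V1, with the original order kept inside each group), a loop may not be needed at all. The sketch below uses order(), which leaves ties in their original order. It also shows why a loop written as for (i in length(x)) visits a single value only. Every name other than Tem1, V1 and V2 is made up for illustration:

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)

# Sort rows by V1; rows with equal V1 keep their original relative order
sorted <- Tem1[order(Tem1[, 1]), ]
head(sorted)

# Common pitfall: length() returns one number, so this loop body runs once
u <- unique(Tem1[, 1])
visited_wrong <- integer(0)
for (i in length(u)) visited_wrong <- c(visited_wrong, i)
visited_right <- integer(0)
for (i in seq_along(u)) visited_right <- c(visited_right, i)
visited_wrong   # only the last index
visited_right   # every index
```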



From: HJ YAN yhj...@googlemail.com

To: arun smartpink...@yahoo.com
Cc: r-help@r-project.org
Sent: Wednesday, March 6, 2013 8:24 AM
Subject: Re: [R] How to combine conditional argument and logical argument in 
R to create subset of data...



Hi Arun


Thank you so much for the help, that's really helpful!!

Also I have a quick question about the code below where I can not see why it 
doesn't work...

I know the I shou

V1 <- c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
V2 <- c(1:23)
Tem1 <- cbind(V1,V2)


So Tem 1 looks like...
 Tem1
       V1 V2
 [1,] 111  1
 [2,] 111  2
 [3,] 111  3
 [4,] 111  4
 [5,] 222  5
 [6,] 222  6
 [7,] 222  7
 [8,] 222  8
 [9,] 333  9
[10,] 333 10
[11,] 333 11
[12,] 333 12
[13,] 111 13
[14,] 111 14
[15,] 111 15
[16,] 111 16
[17,] 222 17
[18,] 222 18
[19,] 222 19
[20,] 222 20
[21,] 333 21
[22,] 333 22
[23,] 333 23

I would like the outcome to be...

      V1 V2

     111  1
     111  2
     111  3
     111  4
     111 13
     111 14
     111 15
     111 16
     222  5
     222  6
     222  

[R] Inverse function using FDA

2013-03-06 Thread zoe richards
Hi,

Does anyone know how (or whether or not it's possible) to output an inverse
of a functional object?  I haven't found a way, but since derivatives etc.
can be computed using the fda package it seems like this should be
possible using this package or another designed for functional data
analysis.

Thanks,
Zoe Richards



Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

2013-03-06 Thread HJ YAN
Dear Arun

Thanks a million for your prompt reply and I love all four ways in your
reply.

Tried the code and just realised an issue here:   in my real work, my data
is about 4GB large and I'm sure that there are many duplicated values in
V2, so that is to say my V1 and V2 should be something like


# V1 here are some data index with lots of repeated numeric values
V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
V2 <- c(1:23, 1:7)  # there are also duplicated values in V2
Tem1 <- cbind(V1,V2)
Tem2 <- Tem1[c(1:10,12:15,18:19),] # I know that Tem2 is a subset of Tem1...


So how do I get outcome of the difference of Tem1 and Tem2 if the values in
V2 having duplicates?

  V1 V2
 333 11
 111 16
 111 17
 111 20
 222 21
 222 22
 222 23
 222  1
 222  2
 333  3
 333  4
 333  5
 333  6
 333  7


Massive thanks
HJ




On Wed, Mar 6, 2013 at 4:12 PM, arun smartpink...@yahoo.com wrote:



 Just to add:

 Tem1[Tem1[,2]%in%setdiff(Tem1[,2],Tem2[,2]),]
 A.K.

 - Original Message -
 From: arun smartpink...@yahoo.com
 To: HJ YAN yhj...@googlemail.com
 Cc: R help r-help@r-project.org
 Sent: Wednesday, March 6, 2013 11:06 AM
 Subject: Re: [R] How to combine conditional argument and logical argument
 in R to create subset of data...

 Hi,
 No problem.
 V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
 length(V1)
 #[1] 30

 V2 <- c(1:30) #should be the same length as V1
 Tem1 <- cbind(V1,V2)
 Tem2 <- Tem1[1:20,]

 Tem1[!Tem1[,2]%in%Tem2[,2],]
  #  V1 V2
  #[1,] 222 21
  #[2,] 222 22
  #[3,] 222 23
  #[4,] 222 24
  #[5,] 222 25
  #[6,] 333 26
  #[7,] 333 27
  #[8,] 333 28
  #[9,] 333 29
 #[10,] 333 30

 #or
 subset(Tem1,!V2%in% Tem2[,2])
 #or
  Tem1[is.na(match(Tem1[,2],Tem2[,2])),]
  #  V1 V2
  #[1,] 222 21
  #[2,] 222 22
  #[3,] 222 23
  #[4,] 222 24
  #[5,] 222 25
  #[6,] 333 26
  #[7,] 333 27
  #[8,] 333 28
  #[9,] 333 29
 #[10,] 333 30
 A.K.




 
 From: HJ YAN yhj...@googlemail.com
 To: arun smartpink...@yahoo.com
 Sent: Wednesday, March 6, 2013 10:33 AM
 Subject: Re: [R] How to combine conditional argument and logical argument
 in R to create subset of data...


 Thank you SO MUCH Arun!!!

 That's brilliant-- I've learnt some very useful new R command now, e.g.
 'do.call' and 'split'. And I see where my code went wrong now.

  I do appreciate greatly for your prompt reply.

 Also, I wonder if there exist a package can find difference between two
 data frames, e.g. one is a subset of the other? e.g.

 V1 <- rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
 V2 <- c(1:23)
 Tem1 <- cbind(V1,V2)

 Tem2 <- Tem1[1:20,]


 How do I get outcome like

 [21,] 333 21
 [22,] 333 22
 [23,] 333 23


 P.S. I used 'setdiff' before, but seems it only works for vectors but not
 for dataframe??


 Sorry for so many questions today, as I'm coding for a work deadline
 tonight.


 Many thanks!
 Cheers
 HJ







 On Wed, Mar 6, 2013 at 1:55 PM, arun smartpink...@yahoo.com wrote:

 Hi,
 You can also try this:
 Tem3 <- list()
 for(i in unique(Tem1[,1])) {
   Tem3[[i]] <- subset(Tem1,Tem1[,1]==i)
   Tem4 <- do.call(rbind,Tem3)
 }
 head(Tem4)
 #  V1 V2
 #[1,] 111  1
 #[2,] 111  2
 #[3,] 111  3
 #[4,] 111  4
 #[5,] 111 13
 #[6,] 111 14
 
 
 #or
 Tem3 <- c(NA,NA)
 for(i in unique(Tem1[,1])) {
   Tem2 <- subset(Tem1, Tem1[,1]==i)
   Tem3 <- rbind(Tem3,Tem2)
   Tem5 <- Tem3[-1,]
 }
 head(Tem5)
 #  V1 V2
 # 111  1
 # 111  2
 # 111  3
 # 111  4
 # 111 13
 # 111 14
 
 A.K.
 
 
 
 From: HJ YAN yhj...@googlemail.com
 
 To: arun smartpink...@yahoo.com
 Cc: r-help@r-project.org
 Sent: Wednesday, March 6, 2013 8:24 AM
 Subject: Re: [R] How to combine conditional argument and logical argument
 in R to create subset of data...
 
 
 
 Hi Arun
 
 
 Thank you so much for the help, that's really helpful!!
 
 Also I have a quick question about the code below where I can not see why
 it doesn't work...
 
 I know the I shou
 
 V1 <- c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
 V2 <- c(1:23)
 Tem1 <- cbind(V1,V2)
 
 
 So Tem 1 looks like...
  Tem1
V1 V2
  [1,] 111  1
  [2,] 111  2
  [3,] 111  3
  [4,] 111  4
  [5,] 222  5
  [6,] 222  6
  [7,] 222  7
  [8,] 222  8
  [9,] 333  9
 [10,] 333 10
 [11,] 333 11
 [12,] 333 12
 [13,] 111 13
 [14,] 111 14
 [15,] 111 15
 [16,] 111 16
 [17,] 222 17
 [18,] 222 18
 [19,] 222 19
 [20,] 222 20
 [21,] 333 21
 [22,] 333 22
 [23,] 333 23
 
 I would like the outcome to be...
 
   V1 V2
 
  111  1
  111  2
  111  3
  111  4
  111 13
  111 14
  111 15
  111 16
  222  5
  222  6
  222  7
  222  8
  222 17
  222 18
  222 19
  222 20
  333  9
  333 10
  333 11
  333 12
  333 21
  333 22
  333 23
 
 
 So I tried code as below
 --
 Tem3 <- c(NA,NA)
 for(i in length(unique(Tem1[,1]))){
 Tem2 <- subset(Tem1,Tem1[,1]==unique(Tem1[,1])[i])
 Tem3 <- rbind(Tem3,Tem2)
 Tem3
 }
 Tem4 <- Tem3[-1,]
 ---
 
 And only get this...
 
 
  V1 V2
 

Re: [R] how to get xmlToList() to retry if http fails

2013-03-06 Thread Ben Tupper
Hi,

On Mar 6, 2013, at 4:12 PM, Waichler, Scott R wrote:

 Hi,
 
 I am using xmlToList() in a loop with a call to a webservice, per the code 
 below.  
 
  # Loop thru target locs
  for(i in 1:num.target.locs) {
    url <- paste(sep="/", "http://www.earthtools.org/timezone", lat[i], lon[i])
    tmp <- xmlToList(url)
    df$time.offset[i] <- tmp$offset
    system("sleep 1")  # wait 1 second per requirements of above web service
  }  # end loop thru target locations
 
 Failure struck midway through my loop, with the message below.
 
 failed to load HTTP resource
 Error: 1: failed to load HTTP resource
 

You can wrap it in a try function as in the following (untested).  I have made 
the thing stop if the second try fails, but you may want to do something more 
useful.  Check out tryCatch, too.

 for(i in 1:num.target.locs) {
   url <- paste(sep="/", "http://www.earthtools.org/timezone", lat[i], lon[i])
   tmp <- try(xmlToList(url))
   if (inherits(tmp, "try-error")) {
     Sys.sleep(2)
     tmp <- try(xmlToList(url))
     if (inherits(tmp, "try-error")) stop("Error fetching data")
   }
   df$time.offset[i] <- tmp$offset
   system("sleep 1")  # wait 1 second per requirements of above web service
 }  # end loop thru target locations

Cheers,
Ben
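
The same idea can be generalised to several attempts with tryCatch(). This is a sketch only: retry() and flaky() are hypothetical helpers, and flaky() merely simulates a service that fails twice, so the example runs without network access or the XML package:

```r
# Call f() up to max_tries times, pausing `wait` seconds between failed attempts
retry <- function(f, max_tries = 3, wait = 1) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)   # capture the error instead of stopping
    if (!inherits(result, "error")) return(result)   # success: hand back the value
    if (attempt < max_tries) Sys.sleep(wait)         # failure: wait, then try again
  }
  stop("all ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Simulated web service that fails on the first two calls (stand-in for the real request)
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("failed to load HTTP resource")
  list(offset = "-8")
}

tmp <- retry(flaky, max_tries = 5, wait = 0)
tmp$offset
calls
```

Inside the loop above, the request would then be wrapped as something like tmp <- retry(function() xmlToList(url), wait = 2).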




 I presume that the webservice failed to respond in this instance.  How can I 
 trap the error and have it retry after waiting a second or two, instead of 
 exiting?
 
 Thanks.  --Scott Waichler
 Pacific Northwest National Laboratory
 Richland, WA, USA
 scott.waich...@pnnl.gov
 

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org



Re: [R] Inverse function using FDA

2013-03-06 Thread Peter Ehlers

On 2013-03-06 12:54, zoe richards wrote:

Hi,

Does anyone know how (or whether or not it's possible) to output an inverse
of a functional object?  I haven't found a way, but since derivatives etc.
can be computed using the fda package it seems like this should be
possible using this package or another designed for functional data
analysis.

Thanks,
Zoe Richards



What does your question mean? Possibly, you could 'invert' a mean
function, but I have no idea what that would accomplish. Can you
provide an example of just what you want to do?

Peter Ehlers



Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

2013-03-06 Thread arun
Hi,
I am not sure I understand it correctly.



In the example you gave, there are duplicated rows in Tem1, i.e. (222 6), (222 7)
and (333 11), but these rows are also present in Tem2.
Is there any chance of triplicates, etc.?
Also, you wanted to have the rows that are not common to Tem1 and Tem2; for
example, (111 1) is the first row in both.
indxTem1-paste0(Tem1[,1],Tem1[,2])
 indxTem2-paste0(Tem2[,1],Tem2[,2])


 res-rbind(Tem1[!indxTem1%in%indxTem2,], Tem1[duplicated(Tem1),]) 
res
res
   V1 V2
# [1,] 333 12
 #[2,] 111 16
 #[3,] 111 17
 #[4,] 111 20
 #[5,] 222 21
 #[6,] 222 22
 #[7,] 222 23
 #[8,] 333  4
 #[9,] 333  5
#[10,] 333  6
#[11,] 333  7
#[12,] 222  6
#[13,] 222  7
#[14,] 333 11

With more replicates (triplicates, etc.), how do you want them processed?
Also, here the duplicate rows were found only in Tem1.
A.K.
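
The paste0() keys above can collide (for example, 11 and 1 give the same key as
1 and 11), and combining %in% with duplicated() only covers exact duplicates. A
minimal sketch of a row-wise difference that respects how many times each row
occurs, using HJ's example data. row_key() and multiset_diff() are hypothetical
helper names, not functions from any package, and the "|" separator assumes that
character never appears in the data:

```r
# Example data from the thread
V1 <- rep(c(rep(111, 5), rep(222, 5), rep(333, 5)), 2)
V2 <- c(1:23, 6, 7, 11, 4, 5, 6, 7)
Tem1 <- cbind(V1, V2)
Tem2 <- Tem1[c(1:11, 13:15, 18:19), ]

# Hypothetical helper: one key per row, tagged with its occurrence number,
# so the 2nd copy of (222, 6) gets a different key from the 1st copy.
row_key <- function(m) {
  key <- do.call(paste, c(as.data.frame(m), sep = "|"))
  occ <- ave(seq_along(key), key, FUN = seq_along)
  paste(key, occ, sep = "#")
}

# Hypothetical helper: rows of a that remain after removing one copy
# for every matching row of b.
multiset_diff <- function(a, b) {
  a[!row_key(a) %in% row_key(b), , drop = FALSE]
}

res <- multiset_diff(Tem1, Tem2)
res
```

Because each key carries its occurrence count, this should also cope with
triplicates and higher replicates, whichever of the two matrices they occur in.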



Re: [R] Help with a function and text

2013-03-06 Thread David Winsemius

On Mar 6, 2013, at 11:25 AM, Eliano Marques wrote:

 Hi, can I understand why this message was rejected ?
 Thanks,
 Eliano

First hit on a Markmail search:

http://markmail.org/message/5xog3ayx4amprsdx?q=list:org%2Er-project%2Er-help+nabble+rejected

-- 
David.

 
 Sent from my iPhone
 
 On 6 Mar 2013, at 19:18, Eliano eliano.m.marq...@gmail.com wrote:
 
 Hi everyone,
 
 I am writing some code to generate a function. I am passing that code to a
 dataset which i'm importing in R, e.g.
 Test=read.table('C:/test.txt', header=F, sep='\t', na.strings='NA', dec='.',
 strip.white=TRUE)
 Test
 
 V1
 (if(nclusters0){OptmizationInputs[3,3]*beta[1]}else{0})+
 (if(nclusters1){OptmizationInputs[3,3]*beta[1]}else{0})+
 V1 has inside a code for a function.
 
 I'm having problems with 2 things:
 
 1 - I need to take out from V1 all  that appears in the text, i tried a
 replace but did not work.
 Test=replace(Test,'  ', ' ')  , did not work.
 
 2 - Writing a function like this :
 
 nlog=function(par)
   {
   beta=par[1:n]
   Measure=Test[1]  # would this read the text?
   return(Measure)
   }
 
 So i need to use that code inside the function as above.
 Any suggestion on how you would do this?
 
 Kind Regards,
 Eliano
 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Help-with-a-function-and-text-tp4660523.html
 Sent from the R help mailing list archive at Nabble.com.
 
 

David Winsemius
Alameda, CA, USA



Re: [R] Inverse function using FDA

2013-03-06 Thread zoe richards
I am trying to register the unemployment rate and the inverse of the inflation
rate to investigate the Phillips curve
(http://www.econlib.org/library/Enc/PhillipsCurve.html) by
looking at the resulting warping function.




On Wed, Mar 6, 2013 at 5:17 PM, Peter Ehlers ehl...@ucalgary.ca wrote:

 On 2013-03-06 12:54, zoe richards wrote:

 Hi,

 Does anyone know how (or whether or not it's possible) to output an
 inverse
 of a functional object?  I haven't found a way, but since derivatives etc.
 can be computed using the fda package it seems like this should be
 possible using this package or another designed for functional data
 analysis.

 Thanks,
 Zoe Richards


 What does your question mean? Possibly, you could 'invert' a mean
 function, but I have no idea what that would accomplish. Can you
 provide an example of just what you want to do?

 Peter Ehlers





Re: [R] How to combine conditional argument and logical argument in R to create subset of data...

2013-03-06 Thread HJ YAN
Hi Arun

Massive thanks for the hints of making use of 'paste0'!

But, coincidentally, no pair of rows was exactly the same in indxTem1 and
indxTem2 in the previous example. I changed the data as below, which is very
likely to occur in my real data...


V1 <- rep(c(rep(111,5), rep(222,5), rep(333,5)), 2)  # V1 is a data index with lots of repeated numeric values
V2 <- c(1:23, 6,7,11,4,5,6,7)  # there are also duplicated values in V2
Tem1 <- cbind(V1, V2)
Tem2 <- Tem1[c(1:11,13:15,18:19),]  # I know that Tem2 is a subset of Tem1...


And my target outcome is the difference between Tem1 and Tem2 as below:


  V1 V2

 333 12
 111 16
 111 17
 111 20
 222 21
 222 22
 222 23
 222  6
 222  7
 333 11
 333  4
 333  5
 333  6
 333  7

Many thanks
HJ



On Wed, Mar 6, 2013 at 9:29 PM, arun smartpink...@yahoo.com wrote:



 Hi,
 How about this:

 indxTem1 <- paste0(Tem1[,1], Tem1[,2])
 indxTem2 <- paste0(Tem2[,1], Tem2[,2])
 Tem1[!indxTem1 %in% indxTem2,]
 #   V1 V2
  #[1,] 333 11
  #[2,] 111 16
  #[3,] 111 17
  #[4,] 111 20
  #[5,] 222 21
  #[6,] 222 22
  #[7,] 222 23
  #[8,] 222  1
  #[9,] 222  2
 #[10,] 333  3
 #[11,] 333  4
 #[12,] 333  5
 #[13,] 333  6
 #[14,] 333  7


 A.K.
 
 From: HJ YAN yhj...@googlemail.com
 To: arun smartpink...@yahoo.com
 Cc: r-help@r-project.org
 Sent: Wednesday, March 6, 2013 4:09 PM
 Subject: Re: [R] How to combine conditional argument and logical argument
 in R to create subset of data...


 Dear Arun


 Thanks a million for your prompt reply and I love all four ways in your
 reply.

 Tried the code and just realised an issue here:   in my real work, my data
 is about 4GB large and I'm sure that there are many duplicated values in
 V2, so that is to say my V1 and V2 should be something like


 V1 <- rep(c(rep(111,5), rep(222,5), rep(333,5)), 2)  # V1 is a data index with lots of repeated numeric values
 V2 <- c(1:23, 1:7)  # there are also duplicated values in V2
 Tem1 <- cbind(V1, V2)
 Tem2 <- Tem1[c(1:10,12:15,18:19),]  # I know that Tem2 is a subset of Tem1...


 So how do I get outcome of the difference of Tem1 and Tem2 if the values
 in V2 having duplicates?

   V1 V2
  333 11
  111 16
  111 17
  111 20
  222 21
  222 22
  222 23
  222  1
  222  2
  333  3
  333  4
  333  5
  333  6
  333  7


 Massive thanks
 HJ





 On Wed, Mar 6, 2013 at 4:12 PM, arun smartpink...@yahoo.com wrote:


 
 Just to add:
 
 Tem1[Tem1[,2]%in%setdiff(Tem1[,2],Tem2[,2]),]
 
 A.K.
 
 - Original Message -
 
 From: arun smartpink...@yahoo.com
 To: HJ YAN yhj...@googlemail.com
 Cc: R help r-help@r-project.org
 Sent: Wednesday, March 6, 2013 11:06 AM
 Subject: Re: [R] How to combine conditional argument and logical argument
 in R to create subset of data...
 
 Hi,
 No problem.
 V1 <- rep(c(rep(111,5), rep(222,5), rep(333,5)), 2)
 length(V1)
 #[1] 30
 
 V2 <- c(1:30)  # should be the same length as V1
 Tem1 <- cbind(V1, V2)
 Tem2 <- Tem1[1:20,]
 
 Tem1[!Tem1[,2] %in% Tem2[,2],]
  #  V1 V2
  #[1,] 222 21
  #[2,] 222 22
  #[3,] 222 23
  #[4,] 222 24
  #[5,] 222 25
  #[6,] 333 26
  #[7,] 333 27
  #[8,] 333 28
  #[9,] 333 29
 #[10,] 333 30
 
 #or
 subset(Tem1,!V2%in% Tem2[,2])
 #or
  Tem1[is.na(match(Tem1[,2],Tem2[,2])),]
  #  V1 V2
  #[1,] 222 21
  #[2,] 222 22
  #[3,] 222 23
  #[4,] 222 24
  #[5,] 222 25
  #[6,] 333 26
  #[7,] 333 27
  #[8,] 333 28
  #[9,] 333 29
 #[10,] 333 30
 A.K.
 
 
 
 
 
 From: HJ YAN yhj...@googlemail.com
 To: arun smartpink...@yahoo.com
 Sent: Wednesday, March 6, 2013 10:33 AM
 Subject: Re: [R] How to combine conditional argument and logical argument
 in R to create subset of data...
 
 
 Thank you SO MUCH Arun!!!
 
 That's brilliant-- I've learnt some very useful new R command now, e.g.
 'do.call' and 'split'. And I see where my code went wrong now.
 
  I do appreciate greatly for your prompt reply.
 
 Also, I wonder if there exist a package can find difference between two
 data frames, e.g. one is a subset of the other? e.g.
 
 V1 <- rep(c(rep(111,5), rep(222,5), rep(333,5)), 2)
 V2 <- c(1:23)
 Tem1 <- cbind(V1, V2)
 
 Tem2 <- Tem1[1:20,]
 
 
 How do I get outcome like
 
 [21,] 333 21
 [22,] 333 22
 [23,] 333 23
 
 
 P.S. I used 'setdiff' before, but seems it only works for vectors but not
 for dataframe??
 
 
 Sorry for so many questions today, as I'm coding for a work deadline
 tonight.
 
 
 Many thanks!
 Cheers
 HJ
 
 
 
 
 
 
 
 On Wed, Mar 6, 2013 at 1:55 PM, arun smartpink...@yahoo.com wrote:
 
 Hi,
 You can also try this:
 Tem3 <- list()
 for(i in unique(Tem1[,1])) {
   Tem3[[i]] <- subset(Tem1, Tem1[,1]==i)
   Tem4 <- do.call(rbind, Tem3)
 }
 head(Tem4)
 #  V1 V2
 #[1,] 111  1
 #[2,] 111  2
 #[3,] 111  3
 #[4,] 111  4
 #[5,] 111 13
 #[6,] 111 14
 
 
 #or
 Tem3 <- c(NA,NA)
 for(i in unique(Tem1[,1])) {
   Tem2 <- subset(Tem1, Tem1[,1]==i)
   Tem3 <- rbind(Tem3, Tem2)
   Tem5 <- Tem3[-1,]
 }
 head(Tem5)
 #  V1 V2
 # 111  1
 # 111  2
 # 111  3
 # 111  4
 # 111 13
 # 111 14
 
 A.K.
 
 
 

[R] Fwd: How to conditionally remove dataframe rows?

2013-03-06 Thread Francisco Carvalho Diniz
Hi,

I have a data frame with two columns. I need to remove rows that are
duplicated in the first column, but conditionally on the values of the second
column.

Example:

   Point_counts   Psi_Sp

1       A            0
2       A            1
3       B            1
4       B            2
5       B            0
6       C            1
7       D            1
8       D            2


I need to turn this data frame into one without duplicated rows in
Point_counts (one visit per point), keeping the rows with the maximum value
of Psi_Sp, e.g. remove row 1 and keep row 2, or remove rows 3 and 5 and
keep row 4. At the end I want a data frame like the one below:

   Point_counts   Psi_Sp

1       A            1
2       B            2
3       C            0
4       D            2

How can I do it? I found several ways to edit data frames, but
unfortunately I could not use any of them.

I appreciate any help.

Francisco



Re: [R] Help with a function and text

2013-03-06 Thread Eliano
Thanks. Btw are you able to help with my issue? Thanks, Eliano



Re: [R] How to transpose it in a fast way?

2013-03-06 Thread Peter Langfelder
On Wed, Mar 6, 2013 at 4:18 PM, Yao He yao.h.1...@gmail.com wrote:
 Dear all:

 I have a big data file of 6 columns and 6 rows like that:

 AA AC AA AA ...AT
 CC CC CT CT...TC
 ..
 .

 I want to transpose it and the output is a new like that
 AA CC 
 AC CC
 AA CT.
 AA CT.
 
 
 AT TC.

 The keypoint is  I can't read it into R by read.table() because the
 data is too large,so I try that:
 c <- file("silygenotype.txt", "r")
 geno_t <- list()
 repeat{
   line <- readLines(c, n=1)
   if (length(line)==0) break  # end of file
   line <- unlist(strsplit(line, "\t"))
   geno_t <- cbind(geno_t, line)
 }
 write.table(geno_t, "xxx.txt")

 It works but it is too slow ,how to optimize it???

I hate to be negative, but this will also not work on a 6x 6
matrix. At some point R will complain either about the lack of memory
or about you trying to allocate a vector that is too long.

I think your best bet is to look at file-backed data packages (for
example, package bigmemory). Look at this URL:
http://cran.r-project.org/web/views/HighPerformanceComputing.html and
scroll down to  Large memory and out-of-memory data. Some of the
packages may have the functionality you are looking for and may do it
faster than your code.

If this doesn't help, you _may_ be able to make your code work, albeit
slowly, if you replace the cbind() by data.frame. cbind() will in this
case produce a matrix, and matrices are limited to 2^31 elements,
which is less than 6 times 6. A data.frame is a special type
of list and so _may_ be able to handle that many elements, given
enough system RAM. There are experts on this list who will correct me
if I'm wrong.

If you are on a linux system, you can use split (type man split at the
shell prompt to see help) to split the file into smaller chunks of say
5000 lines or so. Process each file separately, write it into a
separate output file, then use the linux utility paste to paste the
files side-by-side into the final output.

Further, if you want to make it faster, do not grow geno_t by
cbind'ing a new column to it in each iteration. Pre-allocate a matrix
or data frame of an appropriate number of rows and columns and fill it
out as you go. But it will still be slow, which I think is due to the
inherent slowness of readLines and possibly strsplit.
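
The pre-allocation advice above can be sketched as follows. This is only an
illustration under stated assumptions: the tiny temporary file stands in for
the real genotype file, the fields are assumed to be tab-separated, and n_rows
and n_cols (hypothetical names) are assumed to be known before reading. For
the full-size data the whole matrix still has to fit in memory.

```r
# Hypothetical small input standing in for the real genotype file
infile <- tempfile(fileext = ".txt")
writeLines(c("AA\tAC\tAA\tAT", "CC\tCC\tCT\tTC"), infile)

n_rows <- 2L  # number of lines in the input, assumed known in advance
n_cols <- 4L  # number of fields per line, assumed known in advance

# Pre-allocate the transposed result: one column per input line
geno_t <- matrix(NA_character_, nrow = n_cols, ncol = n_rows)

con <- file(infile, "r")
for (i in seq_len(n_rows)) {
  # Read one line, split it on tabs, and store it as column i
  fields <- strsplit(readLines(con, n = 1), "\t", fixed = TRUE)[[1]]
  geno_t[, i] <- fields
}
close(con)

# Write the transposed table without quotes or dimnames
outfile <- tempfile(fileext = ".txt")
write.table(geno_t, outfile, quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = FALSE)
```

Filling a matrix of fixed size avoids the copy that cbind() makes on every
iteration, which is where the growing-object version loses most of its time.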

HTH,

Peter



Re: [R] How to transpose it in a fast way?

2013-03-06 Thread Ista Zahn
On Wed, Mar 6, 2013 at 7:56 PM, Peter Langfelder
peter.langfel...@gmail.com wrote:
 On Wed, Mar 6, 2013 at 4:18 PM, Yao He yao.h.1...@gmail.com wrote:
 Dear all:

 I have a big data file of 6 columns and 6 rows like that:

 AA AC AA AA ...AT
 CC CC CT CT...TC
 ..
 .

 I want to transpose it and the output is a new like that
 AA CC 
 AC CC
 AA CT.
 AA CT.
 
 
 AT TC.

 The keypoint is  I can't read it into R by read.table() because the
 data is too large,so I try that:
 c <- file("silygenotype.txt", "r")
 geno_t <- list()
 repeat{
   line <- readLines(c, n=1)
   if (length(line)==0) break  # end of file
   line <- unlist(strsplit(line, "\t"))
   geno_t <- cbind(geno_t, line)
 }
 write.table(geno_t, "xxx.txt")

 It works but it is too slow ,how to optimize it???

 I hate to be negative, but this will also not work on a 6x 6
 matrix. At some point R will complain either about the lack of memory
 or about you trying to allocate a vector that is too long.

Maybe this depends on the R version. I have not tried it, but the dev
version of R can handle much larger vectors. See
http://stat.ethz.ch/R-manual/R-devel/library/base/html/LongVectors.html

Yao He, if you are feeling adventurous you could give the development
version of R a try.

Best,
Ista




Re: [R] Fwd: How to conditionally remove dataframe rows?

2013-03-06 Thread David Winsemius

On Mar 6, 2013, at 3:21 PM, Francisco Carvalho Diniz wrote:

 Hi,
 
 I have a data frame with two columns. I need to remove duplicated rows in
 first column, but I need to do it conditionally to values of the second
 column.
 
 Example:
 
    Point_counts   Psi_Sp
 
 1       A            0
 2       A            1
 3       B            1
 4       B            2
 5       B            0
 6       C            1
 7       D            1
 8       D            2
 
 
 I need to turn this data frame in one without duplicated rows at
 point-counts (one visit per point) but maintain the ones with maximum value
 at Psi_Sp, e.g. remove row 1 and maintain 2 or remove rows 3 and 5 and
 maintain 4. At the end I want a data frame like the one below:
 
Try this:

dfrm <- dfrm[ order(dfrm[[1]], -dfrm[[2]] ) , ]
# put desired rows at top of each Point_counts category

# then take top item in each category

dfrm[ !duplicated(dfrm[[1]]) , ]
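
For reference, the two steps above as a self-contained sketch on the example
data from the question. The intermediate names ord and best are assumptions
made for illustration only:

```r
# Example data from the question
dfrm <- data.frame(
  Point_counts = c("A", "A", "B", "B", "B", "C", "D", "D"),
  Psi_Sp       = c(0, 1, 1, 2, 0, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Sort so that, within each Point_counts value, the largest Psi_Sp comes first
ord <- dfrm[order(dfrm$Point_counts, -dfrm$Psi_Sp), ]

# Keep only the first (i.e. maximum) row of each Point_counts value
best <- ord[!duplicated(ord$Point_counts), ]
rownames(best) <- NULL
best
```

Since whole rows are kept, any further columns in the data frame would travel
along with the selected maximum.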

 Point_counts   Psi_Sp
 
 1  A   1
 2  B   2
 3  C   0
 4  D   2
 
 How can I do it? I found several ways to edit data frames, but
 unfortunately I cound not use none of them.
 
 I appreciate
 
-- 

David Winsemius
Alameda, CA, USA



Re: [R] Fwd: How to conditionally remove dataframe rows?

2013-03-06 Thread arun
Hi,

dfrm <- read.table(text="
    Point_counts  Psi_Sp

1    A  0
2    A  1
3    B  1
4    B  2
5    B  0
6    C  1
7    D  1
8    D  2
", sep="", header=TRUE, stringsAsFactors=FALSE)
res <- do.call(rbind, lapply(split(dfrm, dfrm$Point_counts), function(x)
  x[which.max(x$Psi_Sp),]))
row.names(res) <- 1:nrow(res)
res
#  Point_counts Psi_Sp
#1            A      1
#2            B      2
#3            C      1  # your input data doesn't have 0
#4            D      2
A.K.
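
If only the group maximum is needed and there are no other columns to carry
along, base R's aggregate() is a shorter alternative to the split/lapply
approach. A small sketch on the same example data (res_max is a hypothetical
name):

```r
# Example data from the question
dfrm <- data.frame(
  Point_counts = c("A", "A", "B", "B", "B", "C", "D", "D"),
  Psi_Sp       = c(0, 1, 1, 2, 0, 1, 1, 2),
  stringsAsFactors = FALSE
)

# One row per Point_counts value, holding the maximum Psi_Sp of that group
res_max <- aggregate(Psi_Sp ~ Point_counts, data = dfrm, FUN = max)
res_max
```

Unlike the which.max() version, this summarises a single column, so it is not
a substitute when the rest of the winning row must be preserved.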





Re: [R] multiple plots and looping assistance requested (revised codes)

2013-03-06 Thread arun
Hi Irucka,

I tried it and was able to plot it without any errors.  Here, your code 
indicates you need two lines. temper[[i]][1]
 temper[[1]][1] # which is the column 1.
  Month
1 1
2 2
3 3


 temper[[1]][2]
#  Data1
#1   1.5
#2  12.3
#3  11.4



Suppose I use names(temper) instead of seq_along(temper):
pdf("irucka.pdf")

lapply(names(temper), function(i) {
  plot(as.matrix(temper[[i]][1]), as.matrix(temper[[i]][2]),
       main="Fluxmaster versus EGRET/WRTDS \n Seasonal Flux Sum", sub=i,
       xlab="Calendar Year Timesteps", ylab="Total Flux (kg/season)")
  lines(temper[[i]][1])
  lines(temper[[i]][2])
})
dev.off()

which may not be the one you wanted.
A.K.
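
The "(list) object cannot be coerced to type 'double'" error quoted below most
likely comes from single-bracket indexing: d[1] on a data frame returns a
one-column data frame (a list), whereas d[[1]] returns the column as a plain
vector that plot() and lines() accept. A minimal sketch, assuming temper is a
named list of two-column data frames; the second element and the output file
are invented for illustration:

```r
# Assumed structure: a named list of two-column data frames.
# The Data1 values come from the thread; Data2 is invented.
temper <- list(
  Data1 = data.frame(Month = 1:3, Data1 = c(1.5, 12.3, 11.4)),
  Data2 = data.frame(Month = 1:3, Data2 = c(2.0, 8.1, 9.7))
)

out <- tempfile(fileext = ".pdf")  # hypothetical output file
pdf(out)
invisible(lapply(names(temper), function(nm) {
  d <- temper[[nm]]
  x <- d[[1]]  # [[ ]] extracts a numeric vector; d[1] would be a data frame
  y <- d[[2]]
  plot(x, y, main = "Fluxmaster versus EGRET/WRTDS \n Seasonal Flux Sum",
       sub = nm, xlab = "Calendar Year Timesteps",
       ylab = "Total Flux (kg/season)")
  lines(x, y)  # a single line through the plotted points
}))
dev.off()
```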


From: Irucka Embry iruc...@mail2world.com
To: smartpink...@yahoo.com 
Sent: Wednesday, March 6, 2013 9:32 PM
Subject: Re: [R] multiple plots and looping assistance requested (revised codes)


Hi Arun, I was only able to plot by changing from names(temper) to 
seq_along(temper) and by providing a numeric column entry for the [i] index. My 
problem has been trying to figure out how to index each column by skipping 
column 1. Do you have any suggestions?

 tempernow <- lapply(seq_along(temper), function(i) {
   plot(as.matrix(temp[[i]][1]), as.matrix(temp[[i]][2]),
        main="Fluxmaster versus EGRET/WRTDS \n Seasonal Flux Sum", sub = i,
        xlab="Calendar Year Timesteps", ylab="Total Flux (kg/season)")
   lines(temp[[i]][1], temp[[i]][2])
 })
Error in xy.coords(x, y) : 
(list) object cannot be coerced to type 'double'

Thank you.

Irucka



Re: [R] Help with a function and text

2013-03-06 Thread David Winsemius

On Mar 6, 2013, at 3:44 PM, Eliano wrote:

 Thanks. Btw are you able to help with my issue? Thanks, Eliano

I'm sorry, I was too busy answering the question from 'Eliano' over on
StackOverflow. I didn't have time to address this one.

(Please do note that cross-posting questions to R-help is contrary to the advice in
the Posting Guide.)

You might also do further searching in the Archives with the search terms:
substitute text eval
and perhaps narrow it down further with contributors named: Grothendieck,
Dunlap, Ligges, Venables
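
As a rough illustration of what those search terms lead to: text holding R
code can be turned into an expression with parse(text = ...) and evaluated
with eval(). The sketch below is not the code from the original post. The
comparison operators, the contents of OptmizationInputs, and the double quotes
being stripped are all assumptions made for the example. Note also that
replace() works on positions, not on characters inside strings, so gsub() is
the usual tool for removing unwanted characters.

```r
# Hypothetical stand-ins for objects that the original post does not show
OptmizationInputs <- matrix(1:9, nrow = 3)  # element [3, 3] is 9
nclusters <- 1

# Code held as text, e.g. as read from a file; the surrounding double
# quotes are a placeholder for whatever characters need stripping
raw_lines <- c(
  '"(if (nclusters > 0) {OptmizationInputs[3, 3] * beta[1]} else {0}) +"',
  '"(if (nclusters > 1) {OptmizationInputs[3, 3] * beta[2]} else {0})"'
)
code_lines <- gsub('"', "", raw_lines, fixed = TRUE)

# Parse once, outside the function, then evaluate on every call
measure_expr <- parse(text = paste(code_lines, collapse = "\n"))

nlog <- function(par) {
  beta <- par          # the parsed code refers to a local object named beta
  eval(measure_expr)   # evaluated in this function's environment
}

nlog(c(2, 5))
```

Evaluating text as code is hard to debug and unsafe with untrusted input, so
writing the function directly is usually preferable when that is an option.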

 

David Winsemius
Alameda, CA, USA



[R] Multivariate Power Test?

2013-03-06 Thread Charles Determan Jr
Generic question: I am familiar with standard power calculations in R;
however, much of the data I work with is multivariate.  Is there
any package or function that you would recommend for conducting such a
power analysis?  Any recommendations would be appreciated.

Thank you for your time,

Charles



Re: [R] (no subject)

2013-03-06 Thread Peter Maclean
Is there an R package that uses sampling weights in multilevel modeling? The
survey package does not handle multilevel modeling, and the weights argument in
the lmer and nlmer functions from lme4 (used for multilevel modeling) is for
weighted least squares estimation.
Suggestions from anyone with experience in this subject (including creating
weights from strata and sampling unit variables) would be helpful. For example,
when analyzing data clustered in schools, should one use the student's sampling
weight, the school's sampling weight, or both?


Peter Maclean
Department of Economics
UDSM

