date:20091116

I'm not convinced it's right. In fact, I'm pretty sure the last step  
taking only the first half of the list is wrong. I also do not know if  
you have considered how you want to count situations like:


3 2 7 4 5 7 ...
7 3 8 6 1 2 9 2 ..

How many pairs of 2-7/7-2 would that represent?

--
David
On Nov 15, 2009, at 11:06 PM, cindy Guo wrote:


Hi, David,

The matrix has 20 columns.
Thank you very much for your help. I think it's right, but it seems  
I need some time to figure it out. I am a green hand. There are so  
many functions here I never used before. :)


Cindy

On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius dwinsem...@comcast.net 
 wrote:

Assuming that the number of columns is 4, then consider this approach:

 prs -scan()
1: 2 5 1 6
5: 1 7 8 2
9: 3 7 6 2
13: 9 8 5 7
17:
Read 16 items
prmtx - matrix(prs, 4,4, byrow=T)

#Now make copus of x.y and y.x

pair.str - sapply(1:nrow(prmtx), function(z)  
c(apply(combn(prmtx[z,], 2), 2,function(x) paste(x[1],x[2],  
sep=.)) , apply(combn(prmtx[z,], 2), 2,function(x)  
paste(x[2],x[1], sep=.))) )

tpair -table(pair.str)

# This then gives you a duplicated list
 tpair[tpair1]
pair.str
1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
 2   2   2   2   2   2   2   2

# So only take the first half of the pairs:
 head(tpair[tpair1], sum(tpair1)/2)

pair.str
1.2 2.1 2.6 2.7
 2   2   2   2

--
David.



On Nov 15, 2009, at 8:06 PM, David Winsemius wrote:

I could of course be wrong but have you yet specified the number of  
columns for this pairing exercise?


On Nov 15, 2009, at 5:26 PM, cindy Guo wrote:

Hi, All,

I have an n by m matrix with each entry between 1 and 15000. I want  
to know
the frequency of each pair in 1:15000 that occur together in rows.  
So for

example, if the matrix is
2 5 1 6
1 7 8 2
3 7 6 2
9 8 5 7
Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to  
return
the value 2 for this pair as well as that for all pairs. Is there a  
fast way

to do this avoiding loops? Loops take too long.

and provide commented, minimal, self-contained, reproducible code.
 ^^

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT




David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Odp: ^ operator

2009-11-16 Thread carol white

but with complex, I get complex numbers for the first and last elements:

 (as.complex(tmp))^(1/3)
[1] 0.01969170+0.03410703i 0.03478442+0.i 0.03285672+0.i
[4] 0.08950802+0.i 0.05848363+0.10129661i

whereas for the first element, we get the followings.

Moreover, 
 -6.108576e-05^(1/3)
  [1] -0.03938341
and
 -(6.108576e-05^(1/3))
[1] -0.03938341
and
 -((6.108576e-05)^(1/3))
[1] -0.03938341

give the same results.

so using () doesn't preserve any thing

--- On Mon, 11/16/09, Petr PIKAL petr.pi...@precheza.cz wrote:

 From: Petr PIKAL petr.pi...@precheza.cz
 Subject: Odp: [R] ^ operator
 To: carol white wht_...@yahoo.com
 Cc: r-h...@stat.math.ethz.ch
 Date: Monday, November 16, 2009, 3:40 AM
 Hi
 
 AFAIK, this is issue of the preference of operators. 
 
 r-help-boun...@r-project.org
 napsal dne 16.11.2009 11:24:59:
 
  Hi,
  I want to apply ^ operator to a vector but it is
 applied to some of the 
  elements correctly and to some others, it generates
 NaN. Why is it not 
 able to
  calculate -6.108576e-05^(1/3) even though it exists?
  
  
   tmp
  [1] -6.108576e-05  4.208762e-05 
 3.547092e-05  7.171101e-04 
 -1.600269e-03
   tmp^(1/3)
  [1]        NaN 0.03478442
 0.03285672 0.08950802        NaN
 
 This computes (-a)^(1/3) which is not possible in real
 numbers. You have 
 to use as.complex(tmp)^(1/3) to get a result.
 
   -6.108576e-05^(1/3)
  [1] -0.03938341
 
 this is actually
 -(6.108576e-05^(1/3))
 
 Regards
 Petr
 
 
  
  __
  R-help@r-project.org
 mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
 reproducible code.
 
 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Labels in horizontal dendrogram not placed correctly?

2009-11-16 Thread joris meys

Hi all,

I tried plotting a horizontal dendrogram, but it seems as if the
labels are not taken into account in the function plot.dendrogram().

A minimal example :
Test - data.frame(
x1x = c(1:10),
x2x = c(2:11),
x3x = c(11:2)
)

TestDist - daisy(data.frame(t(Test)))
TestAgnes - agnes(TestDist)
plot(as.dendrogram(TestAgnes),horiz=T)

If I run this in R 2.10.0, I get a horizontal dendrogram with the
labels to the far right, and partly outside the plot area. This is
highly inconvenient. Am I doing something wrong or is this a bug?

Kind regards
Joris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Simple if else statement problem


thanks petr, this is actually shorter ;)

Petr Pikal wrote:
 
 Hi
 
 r-help-boun...@r-project.org napsal dne 13.11.2009 18:54:05:
 
 
 Ok Jim it worked, thank you! it´s funny because it worked with the first
 syntax in some cases...
 
 you can use another approach in this case
 
 P-max(c(P1,P2))
 
 Regards
 Petr
 
 
 
 
 anna_l wrote:
  
  Hello, I am getting an error with the following code:
  if( P2  P1)
  + {
  + P-P2
  + }
  else
  Erro: unexpected 'else' in else
  {
  + P-P1
  + }
  
  I checked the syntax so I don´t understand, I have other if else
  statements with the same syntax working. Thanks in advance
  
 
 -- 
 View this message in context: 
 http://old.nabble.com/Simple-if-else-statement-
 problem-tp26340336p26340642.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://old.nabble.com/Simple-if-else-statement-problem-tp26340336p26371185.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] lapply() not converting columns to factors (no error message)


Dear List,

I'm having a curious problem with lapply(). I've used it before to convert 
a subset of columns in my dataframe, to factors, and its worked. But now, 
on re-running the identical code as before it just doesn't convert the 
columns into factors at all.


As far as I can see I've done nothing different, and its strange that it 
shouldn't do the action.


Has anybody come across this before? Any input on this strange issue much 
appreciated..


Hope I haven't missed something obvious.

Thanks a lot,

Aditi

(P.s.- I've tried converting columns one by one to factors this time, and 
that works.



P1L55-factor(P1L55)
levels(P1L55)
[1] 0 1


Code:

prm-read.table(P:\\.  .csv, header=T, ...sep=,, ...)

prmdf-data.frame(prm)

prmdf[2:13]-lapply(prmdf[2:13], factor) ## action performed, no error 
message


##I tried to pick random columns and check


levels(P1L55)
NULL



is.factor(P1L96)
FALSE



--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problems by saving Rprofile.site under vista

Hi Charles, I´ve already been running it as an administrator this is why I
don´t understand it.

Charles Annis, P.E. wrote:

You may have to run R as Administrator (right-click, choose run as
administrator) to make these kinds of changes. After you have things the
way you like them, run R in the usual way by clicking on the icon.

Charles Annis, P.E.

charles.an...@statisticalengineering.com
561-352-9699
http://www.StatisticalEngineering.com

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of anna_l
Sent: Friday, November 13, 2009 11:46 AM
To: r-help@r-project.org
Subject: [R] Problems by saving Rprofile.site under vista

Hello, I am trying to save some changes I have done on the Rprofile.site
under vista and it doesn´t let me save the file saying that it can´t
create
the following file (Rprofile.site) and that I should check the pathfile
or
the file name.

--
View this message in context:
http://old.nabble.com/Problems-by-saving-Rprofile.site-under-vista-tp26339605p26339605.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
View this message in context:
http://old.nabble.com/Problems-by-saving-Rprofile.site-under-vista-tp26339605p26371258.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Conditional statement

2009-11-16 Thread Rafael Moral

Dear useRs,

I wrote a function that simulates a stochastic model in discrete time.
The problem is that the stochastic parameters should not be negative and 
sometimes they happen to be.
How can I conditionate it to when it draws a negative number, it transforms 
into zero in that time step?

Here is the function:

stochastic_prost - function(Fmean, Fsd, Smean, Ssd, f, s, n, time, out=FALSE, 
plot=TRUE) {
nt - rep(0, time)
nt[1] - n
for(n in 2:time) {
nt[n] - 0.5*rnorm(1, Fmean, Fsd)*rnorm(1, Smean, 
Ssd)*exp(1)^(-(f+s)*nt[n-1])*nt[n-1]}
if(out==TRUE) {print(data.frame(nt))}
if(plot==TRUE) {plot(1:time, nt, type='l', main='Simulation', 
ylab='Population', xlab='Generations')}
}

The 2 rnorm()'s should not be negative; when negative they should turn into 
zero.

Thanks in advance,
Rafael


  

[[elided Yahoo spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ^ operator

R does not know that 1/3 is 1/3. It is represented internally as  
0.3...3, so certain mathematical facts such as the existence of  
real roots of fractional integer powers are opaque to R  (since it is  
not a symbolic algebra system.)


Try seaarching for cube roots on R-search (for example):

http://finzi.psych.upenn.edu/Rhelp08/2009-July/205006.html  # Further  
comments by Ted Harding


 complex(real=-6.108576e-05)^(1/3)
[1] 0.01969171+0.03410703i'

So the safest way to get the real cube root (as opposed the complex  
roots) is to use:


sign(tmp)*abs(tmp)^1/3

 sign(tmp)*abs(tmp)^(1/3)
[1] -0.03938341  0.03478442  0.03285672  0.08950802 -0.11696726

On Nov 16, 2009, at 5:24 AM, carol white wrote:


Hi,
I want to apply ^ operator to a vector but it is applied to some of  
the elements correctly and to some others, it generates NaN. Why is  
it not able to calculate -6.108576e-05^(1/3) even though it exists?



tmp
[1] -6.108576e-05  4.208762e-05  3.547092e-05  7.171101e-04  
-1.600269e-03

tmp^(1/3)

[1]NaN 0.03478442 0.03285672 0.08950802NaN

-6.108576e-05^(1/3)

[1] -0.03938341


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lapply() not converting columns to factors (no error message)


Sorry, my file is at:


http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html


--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Odp: ^ operator

2009-11-16 Thread Ted Harding


On 16-Nov-09 11:40:29, Petr PIKAL wrote:
 Hi
 AFAIK, this is issue of the preference of operators. 
 
 r-help-boun...@r-project.org napsal dne 16.11.2009 11:24:59:
 

Not in this case (see below), though of course in general - takes
precedence over ^, so, for example, in the expression

  -2^(1/3)

the - is applied first, giving (-2); and then ^ is applied
next, giving (-2)^(1/3). There is a work-round (see below).

 Hi,
 I want to apply ^ operator to a vector but it is applied to
 some of the elements correctly and to some others, it generates
 NaN. Why is it not able to calculate -6.108576e-05^(1/3) even
 though it exists?

It only exists (in the real domain) if ^ takes precedence over -
which (in R) it does not!

  tmp
 [1] -6.108576e-05  4.208762e-05  3.547092e-05  7.171101e-04 
 -1.600269e-03
  tmp^(1/3)
 [1]NaN 0.03478442 0.03285672 0.08950802NaN
 
 This computes (-a)^(1/3) which is not possible in real numbers.

In this example, that is not accurate. tmp has already been
defined, and contains numbers which are already stored as negative
numbers, so - is no longer on the scene as an operator, before
^ is applied; the issue of precedence of - over ^ is no
longer present. The NaN arises from x^(1/3) where x is negative.

 You have to use as.complex(tmp)^(1/3) to get a result.
 
  -6.108576e-05^(1/3)
 [1] -0.03938341

This is not the result I get:

  as.complex(tmp)^(1/3)
# [1] 0.01969171+0.03410703i 0.03478442+0.i
# [3] 0.03285672+0.i 0.08950802+0.i
# [5] 0.05848363+0.10129662i

 this is actually
 -(6.108576e-05^(1/3))
 
 Regards
 Petr

It is possible to work round the problem without using as.complex
which can introduce complications -- see above, and also:

  x - (-1)
  x^(1/3)
  # [1] NaN
  as.complex(x)^(1/3)
  # [1] 0.5+0.8660254i

  as.complex(-1)^(1/2)
  # [1] 0+1i

which you would not want if you are working throughout in real
numbers (you would want the result -1 instead). Although, in the
mathematics of complex numbers, (-1)^(1/3) has three values, one
of which is -1, R only returns a single value.

However, you would have to define a new operator, called say %^%:

  %^%-function(X,x){sign(X)*(abs(X)^x)}

  tmp - c(-6.108576e-05, 4.208762e-05, 3.547092e-05,
7.171101e-04, -1.600269e-03)
  tmp%^%(1/3)
  # [1] -0.03938341  0.03478442  0.03285672  0.08950802 -0.11696726

The definition of %^% forces ^ to take precedence over -,
by in effect removing - from the scene until ^ has done its
work. But, if you hope to rely on this, note that if you apply
to 'tmp' any function in which the ordinary ^ will be used on
a negative number, you will still have the same problem.

Note: Trying to redefine ^ will not work, since invoking the
result initiates an infinite recursion:

  ^- function(X,x){sign(X)*(abs(X)^x)}
  ## (This definition will be accepted by R)
  tmp%^%(1/3)
  # Error: evaluation nested too deeply: infinite recursion /
  #   options(expressions=)?

It's not a clean situatio, but I hope the above helps!
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 16-Nov-09   Time: 12:55:25
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lapply() not converting columns to factors (no error message)

2009-11-16 Thread Sundar Dorai-Raj

Works for me:

x - 
read.csv(url(http://dc170.4shared.com/download/153147281/a5c78386/Testvcomp10.csv?tsid=20091116-075223-c3093ab0;))
names(x)
x[2:13] - lapply(x[2:13], factor)

 levels(x$P1L55)
[1] 0 1
 is.factor(x$P1L96)
[1] TRUE

 sessionInfo()
R version 2.10.0 (2009-10-26)
i386-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lattice_0.17-26

loaded via a namespace (and not attached):
[1] grid_2.10.0  tools_2.10.0

On Mon, Nov 16, 2009 at 4:50 AM, A Singh aditi.si...@bristol.ac.uk wrote:
 Sorry, my file is at:


 http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html


 --
 A Singh
 aditi.si...@bristol.ac.uk
 School of Biological Sciences
 University of Bristol

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pairs

I stuck in another 7 in one of the lines with a 2 and reasoned that  
we could deal with the desire for non-ordered pair counting by  
pasting min(x,y) to max(x,y);


 dput(prmtx)
structure(c(2, 1, 3, 9, 5, 7, 7, 8, 1, 7, 6, 5, 6, 2, 2, 7), .Dim =  
c(4L,

4L))
 prmtx
 [,1] [,2] [,3] [,4]
[1,]2516
[2,]1772
[3,]3762
[4,]9857

 pair.str - sapply(1:nrow(prmtx), function(z)   
apply(combn(prmtx[z,], 2), 2,function(x) paste(min(x[2],x[1]),  
max(x[2],x[1]), sep=.)))


The logic:
sapply(1:nrow(prmtx), ... just loops over the rows of the matrix.
combn(prmtx[z,], 2)  ... returns a two row matrix of combination in a  
single row.
apply(combn(prmtx[z,], 2), 2 ... since combn( , 2)  returns a matrix  
that has two _rows_ I needed to loop over the columns.
paste(min(x[2],x[1]), max(x[2],x[1]), sep=.) ... stick the minimum  
of a pair in front of the max and separates them with a period to  
prevent two+ digits from being non-unique


Then using table() and logical tests in an index for the desired  
multiple pairs:



 tpair -table(pair.str)
 tpair
pair.str
1.2 1.5 1.6 1.7 2.3 2.5 2.6 2.7 3.6 3.7 5.6 5.7 5.8 5.9 6.7 7.7 7.8  
7.9 8.9
  2   1   1   2   1   1   2   3   1   1   1   1   1   1   1   1   1
1   1

 tpair[tpair1]
pair.str
1.2 1.7 2.6 2.7
  2   2   2   3

--
David.

On Nov 16, 2009, at 7:02 AM, David Winsemius wrote:

I'm not convinced it's right. In fact, I'm pretty sure the last step  
taking only the first half of the list is wrong. I also do not know  
if you have considered how you want to count situations like:


3 2 7 4 5 7 ...
7 3 8 6 1 2 9 2 ..

How many pairs of 2-7/7-2 would that represent?

--
David
On Nov 15, 2009, at 11:06 PM, cindy Guo wrote:


Hi, David,

The matrix has 20 columns.
Thank you very much for your help. I think it's right, but it seems  
I need some time to figure it out. I am a green hand. There are so  
many functions here I never used before. :)


Cindy

On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius dwinsem...@comcast.net 
 wrote:
Assuming that the number of columns is 4, then consider this  
approach:


 prs -scan()
1: 2 5 1 6
5: 1 7 8 2
9: 3 7 6 2
13: 9 8 5 7
17:
Read 16 items
prmtx - matrix(prs, 4,4, byrow=T)

#Now make copus of x.y and y.x

pair.str - sapply(1:nrow(prmtx), function(z)  
c(apply(combn(prmtx[z,], 2), 2,function(x) paste(x[1],x[2],  
sep=.)) , apply(combn(prmtx[z,], 2), 2,function(x)  
paste(x[2],x[1], sep=.))) )

tpair -table(pair.str)

# This then gives you a duplicated list
 tpair[tpair1]
pair.str
1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
2   2   2   2   2   2   2   2

# So only take the first half of the pairs:
 head(tpair[tpair1], sum(tpair1)/2)

pair.str
1.2 2.1 2.6 2.7
2   2   2   2

--
David.



On Nov 15, 2009, at 8:06 PM, David Winsemius wrote:

I could of course be wrong but have you yet specified the number of  
columns for this pairing exercise?


On Nov 15, 2009, at 5:26 PM, cindy Guo wrote:

Hi, All,

I have an n by m matrix with each entry between 1 and 15000. I want  
to know
the frequency of each pair in 1:15000 that occur together in rows.  
So for

example, if the matrix is
2 5 1 6
1 7 8 2
3 7 6 2
9 8 5 7
Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to  
return
the value 2 for this pair as well as that for all pairs. Is there a  
fast way

to do this avoiding loops? Loops take too long.

and provide commented, minimal, self-contained, reproducible code.
^^

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT




David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lapply() not converting columns to factors (no error message)


Oh, strange!

I thought it might be a problem with the 'base' package installation, 
because the same thing's worked for me too before but won't do now.


I tried to reinstall it (base), but R says its there already which I 
expected it to be anyway.


I don't quite know where the issue is. Very odd.


--On 16 November 2009 04:59 -0800 Sundar Dorai-Raj sdorai...@gmail.com 
wrote:



Works for me:

x -
read.csv(url(http://dc170.4shared.com/download/153147281/a5c78386/Testvc
omp10.csv?tsid=20091116-075223-c3093ab0)) names(x)
x[2:13] - lapply(x[2:13], factor)


levels(x$P1L55)

[1] 0 1

is.factor(x$P1L96)

[1] TRUE


sessionInfo()

R version 2.10.0 (2009-10-26)
i386-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lattice_0.17-26

loaded via a namespace (and not attached):
[1] grid_2.10.0  tools_2.10.0

On Mon, Nov 16, 2009 at 4:50 AM, A Singh aditi.si...@bristol.ac.uk
wrote:

Sorry, my file is at:


http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html


--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented,
minimal, self-contained, reproducible code.





--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] survreg function in survival package

2009-11-16 Thread carol white

Thank you David.

I don't think that I could pass by rweibull function since I use uniform random 
variable to generate survival times having weibull distribution. Therefore, 
like you, I have not found any other solution to set the shape parameter. If I 
want to calculate hazard, I will need both scale and shape parameters.

Sorry, just forgot to reply to all when replying to your email. Unfortunately, 
nobody else has replied yet so I wonder if anybody else could be helpful.

Cheers,

--- On Sat, 11/14/09, David Winsemius dwinsem...@comcast.net wrote:

 From: David Winsemius dwinsem...@comcast.net
 Subject: Re: [R] survreg function in survival package
 To: carol white wht_...@yahoo.com
 Date: Saturday, November 14, 2009, 8:44 AM
 It appears from a look at the str()
 output from your survreg object that you did set the scale
 parameter, at least the $scale value is set to 1 which is
 not what happens when that procedure is employed without
 that explicit setting. Does that mean that the coefficients
 were the shape parameters? The help page for
 survreg.distributions {survival} says that the scale =
 1/shape and that the intercept is =log(scale).
 
 ?survreg.distributions
 ?survreg.object
 
 And this is found in the survreg example:
 # There are multiple ways to parameterize a Weibull
 distribution. The survreg
 # function imbeds it in a general location-scale familiy,
 which is a
 # different parameterization than the rweibull function,
 and often leads
 # to confusion.
 # survreg's scale = 1/(rweibull shape)
 # survreg's intercept = log(rweibull scale)
 # For the log-likelihood all parameterizations lead to the
 same value. y - rweibull(1000, shape=2, scale=5)
 survreg(Surv(y)~1, dist=weibull)
 I find that a bit confusing because it would seem that the
 scale should not be a (pseudo-)random number. I was
 especting to read that scale would be either pweibull(shape)
 or qweibull(shape). Guess I will need to go back to my
 textbooks when I have time, which sadly I do not have
 today.
 Given that you are asking this offlist, I am sending it
 only to you which is not the optimal method for this
 exchange. It means that neither one of us wiil get our
 confusion and questions addressed by more knowledgeable
 persons reading the r-help list. My suggestion is that you
 copy this to the list.
 --David
 On Nov 13, 2009, at 9:02 AM, carol white wrote:
 
  Thanks for your reply.
  
  which parameter presents the base line scale
 parameter? How is it possible to set the shape parameter for
 weibull in survreg?
  
  Many thanks
  
  --- On Fri, 11/13/09, David Winsemius dwinsem...@comcast.net
 wrote:
  
  From: David Winsemius dwinsem...@comcast.net
  Subject: Re: [R] survreg function in survival
 package
  To: carol white wht_...@yahoo.com
  Cc: r-h...@stat.math.ethz.ch
  Date: Friday, November 13, 2009, 3:56 AM
  
  On Nov 13, 2009, at 3:17 AM, carol white wrote:
  
  Hi,
  Is it normal to get intercept in the list of
  covariates in the output of survreg function with
 standard
  error, z, p.value etc? Does it mean that intercept
 was
  fitted with the covariates? Does Value column
 represent
  coefficients or some thing else?
  
  
  Don't you need a baseline scale parameter for the
 Weibull
  function? You didn't offer the structure of your
 dataframe,
  but if it is the standard ovarian set, then the
 rx coef is
  just the difference between the scale parameter of
 rx=2 from
  that of rx=1, and similarly for ecog.ps. You would
 not have
  an estimate for rx=1 and ecog.ps=1 if you were not
 given the
  Intercept coef.
  
  In the future it would be good manners to indicate
 what
  grad school you are taking classes at.
  
  
  --David
  
  Regards,
  
 
 -
  tmp = survreg(Surv(futime, fustat) ~ ecog.ps +
 rx,
  ovarian, dist='weibull',scale=1)
  summary(tmp)
  
  Call:
  survreg(formula = Surv(futime, fustat) ~
 ecog.ps + rx,
  data = ovarian,
      dist = weibull,
 scale = 1)
  
     Value Std. Error
  z        p
  (Intercept)  6.962
  1.322  5.267 1.39e-07
  ecog.ps     -0.433
    0.587 -0.738 4.61e-01
  rx
     0.582      0.587
  0.991 3.22e-01
  
  Scale fixed at 1
  
  Weibull distribution
  Loglik(model)=
 -97.2   Loglik(intercept
  only)= -98
      Chisq= 1.67 on 2
 degrees of
  freedom, p= 0.43
  Number of Newton-Raphson Iterations: 4
  n= 26
 --
 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT
 
 


  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Conditional statement

2009-11-16 Thread jim holtman

Generate the numbers, test for zero and then set negatives to zero:

 set.seed(1)
 x - rnorm(100,5,3)
 sum(x0)
[1] 3
 x[x0] - 0
 sum(x0)
[1] 0



On Mon, Nov 16, 2009 at 7:43 AM, Rafael Moral
rafa_moral2...@yahoo.com.br wrote:
 Dear useRs,

 I wrote a function that simulates a stochastic model in discrete time.
 The problem is that the stochastic parameters should not be negative and 
 sometimes they happen to be.
 How can I conditionate it to when it draws a negative number, it transforms 
 into zero in that time step?

 Here is the function:

 stochastic_prost - function(Fmean, Fsd, Smean, Ssd, f, s, n, time, 
 out=FALSE, plot=TRUE) {
 nt - rep(0, time)
 nt[1] - n
 for(n in 2:time) {
 nt[n] - 0.5*rnorm(1, Fmean, Fsd)*rnorm(1, Smean, 
 Ssd)*exp(1)^(-(f+s)*nt[n-1])*nt[n-1]}
 if(out==TRUE) {print(data.frame(nt))}
 if(plot==TRUE) {plot(1:time, nt, type='l', main='Simulation', 
 ylab='Population', xlab='Generations')}
 }

 The 2 rnorm()'s should not be negative; when negative they should turn into 
 zero.

 Thanks in advance,
 Rafael


      
 
 [[elided Yahoo spam]]

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Conditional statement



On Nov 16, 2009, at 7:43 AM, Rafael Moral wrote:


Dear useRs,

I wrote a function that simulates a stochastic model in discrete time.
The problem is that the stochastic parameters should not be negative  
and sometimes they happen to be.
How can I conditionate it to when it draws a negative number, it  
transforms into zero in that time step?


Here is the function:

stochastic_prost - function(Fmean, Fsd, Smean, Ssd, f, s, n, time,  
out=FALSE, plot=TRUE) {

nt - rep(0, time)
nt[1] - n
for(n in 2:time) {
nt[n] - 0.5*rnorm(1, Fmean, Fsd)*rnorm(1, Smean, Ssd)*exp(1)^(-(f 
+s)*nt[n-1])*nt[n-1]}

if(out==TRUE) {print(data.frame(nt))}
if(plot==TRUE) {plot(1:time, nt, type='l', main='Simulation',  
ylab='Population', xlab='Generations')}

}

The 2 rnorm()'s should not be negative; when negative they should  
turn into zero.


  ...*max(0, rnorm(1, Fmean, Fsd)*max(0, rnorm(1, Smean, Ssd)*...


Thanks in advance,
Rafael


  


[[elided Yahoo spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Odp: ^ operator

2009-11-16 Thread Liviu Andronic

On 11/16/09, Ted Harding ted.hard...@manchester.ac.uk wrote:
 Not in this case (see below), though of course in general - takes
  precedence over ^, so, for example, in the expression

   -2^(1/3)

  the - is applied first, giving (-2); and then ^ is applied
  next, giving (-2)^(1/3). There is a work-round (see below).

Hmm.. I may be doing something wrong, but from here it looks to be the
opposite.
 -2^(1/3); -(2)^(1/3); -(2^(1/3));
[1] -1.2599
[1] -1.2599
[1] -1.2599
 (-2)^(1/3)
[1] NaN

The results don't change when switching from the unary minus.
 0-2^(1/3); 0-(2)^(1/3); 0-(2^(1/3));
[1] -1.2599
[1] -1.2599
[1] -1.2599

It seems to me that in this example ^ is applied first, and -
second. There is also this fortune entry.
 fortune(unary)

Thomas Lumley: The precedence of ^ is higher than that of unary minus.
It may be surprising,
[...]
Hervé Pagès: No, it's not surprising. At least to me... In the country
where I grew up, I've
been teached that -x^2 means -(x^2) not (-x)^2.
   -- Thomas Lumley and Hervé Pagès (both explaining that operator
precedence is working
  perfectly well)
  R-devel (January 2006)


Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lapply() not converting columns to factors (no error message)

Didn't you notice the difference between Sundar's code and yours?  
Sundar put the data.frame name before the column name while you did  
not do so in your check step.


--
DW


On Nov 16, 2009, at 8:07 AM, A Singh wrote:


Oh, strange!

I thought it might be a problem with the 'base' package  
installation, because the same thing's worked for me too before but  
won't do now.


I tried to reinstall it (base), but R says its there already which I  
expected it to be anyway.


I don't quite know where the issue is. Very odd.


--On 16 November 2009 04:59 -0800 Sundar Dorai-Raj sdorai...@gmail.com 
 wrote:



Works for me:

x -
read.csv(url(http://dc170.4shared.com/download/153147281/a5c78386/Testvc
omp10.csv?tsid=20091116-075223-c3093ab0)) names(x)
x[2:13] - lapply(x[2:13], factor)


levels(x$P1L55)

[1] 0 1

is.factor(x$P1L96)

[1] TRUE


sessionInfo()

R version 2.10.0 (2009-10-26)
i386-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lattice_0.17-26

loaded via a namespace (and not attached):
[1] grid_2.10.0  tools_2.10.0

On Mon, Nov 16, 2009 at 4:50 AM, A Singh aditi.si...@bristol.ac.uk
wrote:

Sorry, my file is at:


http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html


--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented,
minimal, self-contained, reproducible code.





--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lapply() not converting columns to factors (no error message)

2009-11-16 Thread Sundar Dorai-Raj

Could it be you have factor redefined in your workspace? Have you
tried it in a clean directory? I.e. a directory where no .RData
exists?

On Mon, Nov 16, 2009 at 5:07 AM, A Singh aditi.si...@bristol.ac.uk wrote:
 Oh, strange!

 I thought it might be a problem with the 'base' package installation,
 because the same thing's worked for me too before but won't do now.

 I tried to reinstall it (base), but R says its there already which I
 expected it to be anyway.

 I don't quite know where the issue is. Very odd.


 --On 16 November 2009 04:59 -0800 Sundar Dorai-Raj sdorai...@gmail.com
 wrote:

 Works for me:

 x -
 read.csv(url(http://dc170.4shared.com/download/153147281/a5c78386/Testvc
 omp10.csv?tsid=20091116-075223-c3093ab0)) names(x)
 x[2:13] - lapply(x[2:13], factor)

 levels(x$P1L55)

 [1] 0 1

 is.factor(x$P1L96)

 [1] TRUE

 sessionInfo()

 R version 2.10.0 (2009-10-26)
 i386-apple-darwin9.8.0

 locale:
 [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] lattice_0.17-26

 loaded via a namespace (and not attached):
 [1] grid_2.10.0  tools_2.10.0

 On Mon, Nov 16, 2009 at 4:50 AM, A Singh aditi.si...@bristol.ac.uk
 wrote:

 Sorry, my file is at:


 http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html


 --
 A Singh
 aditi.si...@bristol.ac.uk
 School of Biological Sciences
 University of Bristol

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.




 --
 A Singh
 aditi.si...@bristol.ac.uk
 School of Biological Sciences
 University of Bristol






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Error on reading an excel file


Hello everybody, here is the code I use to read an excel file containing two
rows, one of date, the other of prices:
library(RODBC)
z - odbcConnectExcel(SPX_HistoricalData.xls)
datas - sqlFetch(z,Sheet1)
close(z)
It works pretty well but the only thing is that the datas stop at row 7530
and I don´t know why datas is a data frame that contains 7531 rows with the
last two ones = NA...

-
Anna Lippel
new in R so be careful I should be asking a lt of questions!:teeth:
-- 
View this message in context: 
http://old.nabble.com/Error-on-reading-an-excel-file-tp26371750p26371750.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Odp: ^ operator

2009-11-16 Thread Alain Guillet


Hi,

You forgot to put the parenthesis in the way Petr told you : 
(-6.108576e-05)^(1/3)  and the result is NaN. What do you want to preserve?



Alain




carol white wrote:

but with complex, I get complex numbers for the first and last elements:

  

(as.complex(tmp))^(1/3)


[1] 0.01969170+0.03410703i 0.03478442+0.i 0.03285672+0.i
[4] 0.08950802+0.i 0.05848363+0.10129661i

whereas for the first element, we get the followings.

Moreover, 
  

-6.108576e-05^(1/3)
 [1] -0.03938341


and
  

-(6.108576e-05^(1/3))


[1] -0.03938341
and
 -((6.108576e-05)^(1/3))
[1] -0.03938341

give the same results.

so using () doesn't preserve any thing

--- On Mon, 11/16/09, Petr PIKAL petr.pi...@precheza.cz wrote:

  

From: Petr PIKAL petr.pi...@precheza.cz
Subject: Odp: [R] ^ operator
To: carol white wht_...@yahoo.com
Cc: r-h...@stat.math.ethz.ch
Date: Monday, November 16, 2009, 3:40 AM
Hi

AFAIK, this is issue of the preference of operators. 


r-help-boun...@r-project.org
napsal dne 16.11.2009 11:24:59:



Hi,
I want to apply ^ operator to a vector but it is
  
applied to some of the 


elements correctly and to some others, it generates
  
NaN. Why is it not 
able to


calculate -6.108576e-05^(1/3) even though it exists?


  tmp
[1] -6.108576e-05  4.208762e-05 
  
3.547092e-05  7.171101e-04 
-1.600269e-03


tmp^(1/3)


[1]NaN 0.03478442
  

0.03285672 0.08950802NaN

This computes (-a)^(1/3) which is not possible in real
numbers. You have 
to use as.complex(tmp)^(1/3) to get a result.




-6.108576e-05^(1/3)


[1] -0.03938341
  

this is actually
-(6.108576e-05^(1/3))

Regards
Petr




__
R-help@r-project.org
  

mailing list


https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
  

http://www.R-project.org/posting-guide.html


and provide commented, minimal, self-contained,
  

reproducible code.








__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

  


--
Alain Guillet
Statistician and Computer Scientist

SMCS - Institut de statistique - Université catholique de Louvain
Bureau c.316
Voie du Roman Pays, 20
B-1348 Louvain-la-Neuve
Belgium

tel: +32 10 47 30 50

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] on gsub (simple, but not to me!) sintax

2009-11-16 Thread Ottorino-Luca Pantani


Dear R users,
my problem today deals with my ignorance on regular expressions.
a matter I recently discovered.

Consider the following

foo -
c(V_7_101110_V,  V_7_101110_V,  V_9_101110_V,  V_9_101110_V,
V_9_s101110_V,  V_9_101110_V,  V_9_101110_V,  V_11_101110_V,
V_11_101110_V, V_11_101110_V, V_11_101110_V, V_11_101110_V,
V_17_101110_V, V_17_101110_V)

what I'm trying to obtain is to add a zero in front of numbers below 10,
as in

c(V_07_101110_V,  V_07_101110_V,  V_09_101110_V,  V_09_101110_V,
V_09_101110_V,  V_09_101110_V,  V_09_101110_V,  V_11_101110_V,
V_11_101110_V, V_11_101110_V, V_11_101110_V, V_11_101110_V,
V_17_101110_V, V_17_101110_V)


I'm able to do this on the emacs buffer through query-replace-regexp

C-M-%
search for
V_\(.\)_
and substitute with
V_0\1_

but I completely ignore how to do it with gsub within R
and the help is quite complicate to understand
(at least to me, at this moment in time)

I can search the vector through
grep(V_._,  foo)

but I always get errors either on
gsub('V_\(.\)_', 'V_0\1_', foo)


or I get not what I'm looking for on
gsub('V_._', 'V_0._', foo)
gsub('V_._', 'V_0\1_', foo)

Thanks in advance
--
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Ubuntu 8.04.3 LTS -- GNU Emacs 23.0.60.1 (x86_64-pc-linux-gnu, GTK+ 
Version 2.12.9)

ESS version 5.5 -- R 2.10.0

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Odp: ^ operator

2009-11-16 Thread Petr PIKAL

Hi

r-help-boun...@r-project.org napsal dne 16.11.2009 13:27:03:

 but with complex, I get complex numbers for the first and last elements:
 
  (as.complex(tmp))^(1/3)
 [1] 0.01969170+0.03410703i 0.03478442+0.i 0.03285672+0.i
 [4] 0.08950802+0.i 0.05848363+0.10129661i

And that is a right answer

 
 whereas for the first element, we get the followings.
 
 Moreover, 
  -6.108576e-05^(1/3)
   [1] -0.03938341
 and
  -(6.108576e-05^(1/3))
 [1] -0.03938341
 and
  -((6.108576e-05)^(1/3))
 [1] -0.03938341
 

No. With all constructions like above you compute a cube root of 
***positive*** number and then you put *-* sign before the result, hence 
the same result.

Try instead to make a cube root of negative number.

 (-6.108576e-05)^(1/3)
[1] NaN

This is what you exactly do by the first call. Beware also that 1/3 is not 
exactly representable in binary arithmetic and so you actually do not 
compute cube root but some root which is quite near to cube root.

 (1000^(1/3))-10
[1] -1.776357e-15

If you want cube root and have negative numbers you need probably 
something like

sign(tmp) * abs(tmp)^(1/3)


 give the same results.
 
 so using () doesn't preserve any thing

You need to use parentheses on correct places. To see what is the 
precedence of operators see

?Syntax 

Regards
Petr


 
 --- On Mon, 11/16/09, Petr PIKAL petr.pi...@precheza.cz wrote:
 
  From: Petr PIKAL petr.pi...@precheza.cz
  Subject: Odp: [R] ^ operator
  To: carol white wht_...@yahoo.com
  Cc: r-h...@stat.math.ethz.ch
  Date: Monday, November 16, 2009, 3:40 AM
  Hi
  
  AFAIK, this is issue of the preference of operators. 
  
  r-help-boun...@r-project.org
  napsal dne 16.11.2009 11:24:59:
  
   Hi,
   I want to apply ^ operator to a vector but it is
  applied to some of the 
   elements correctly and to some others, it generates
  NaN. Why is it not 
  able to
   calculate -6.108576e-05^(1/3) even though it exists?
   
   
tmp
   [1] -6.108576e-05  4.208762e-05 
  3.547092e-05  7.171101e-04 
  -1.600269e-03
tmp^(1/3)
   [1]NaN 0.03478442
  0.03285672 0.08950802NaN
  
  This computes (-a)^(1/3) which is not possible in real
  numbers. You have 
  to use as.complex(tmp)^(1/3) to get a result.
  
-6.108576e-05^(1/3)
   [1] -0.03938341
  
  this is actually
  -(6.108576e-05^(1/3))
  
  Regards
  Petr
  
  
   
   __
   R-help@r-project.org
  mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained,
  reproducible code.
  
  
 
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Odp: ^ operator

2009-11-16 Thread Petr PIKAL

Hi

r-help-boun...@r-project.org napsal dne 16.11.2009 13:55:30:

 
 On 16-Nov-09 11:40:29, Petr PIKAL wrote:
  Hi
  AFAIK, this is issue of the preference of operators. 
  
  r-help-boun...@r-project.org napsal dne 16.11.2009 11:24:59:
  
 
 Not in this case (see below), though of course in general - takes
 precedence over ^, so, for example, in the expression
 
   -2^(1/3)
 
 the - is applied first, giving (-2); and then ^ is applied
 next, giving (-2)^(1/3). There is a work-round (see below).

Are you sure?

I get

 -2^(1/3)
[1] -1.259921

 (-2)^(1/3)
[1] NaN

 2^(1/3)
[1] 1.259921

So ^ is applied first and then the result is negated.


See ?Syntax.

I agree with what you write below, though.

Regards
Petr




 
  Hi,
  I want to apply ^ operator to a vector but it is applied to
  some of the elements correctly and to some others, it generates
  NaN. Why is it not able to calculate -6.108576e-05^(1/3) even
  though it exists?
 
 It only exists (in the real domain) if ^ takes precedence over -
 which (in R) it does not!
 
   tmp
  [1] -6.108576e-05  4.208762e-05  3.547092e-05  7.171101e-04 
  -1.600269e-03
   tmp^(1/3)
  [1]NaN 0.03478442 0.03285672 0.08950802NaN
  
  This computes (-a)^(1/3) which is not possible in real numbers.
 
 In this example, that is not accurate. tmp has already been
 defined, and contains numbers which are already stored as negative
 numbers, so - is no longer on the scene as an operator, before
 ^ is applied; the issue of precedence of - over ^ is no
 longer present. The NaN arises from x^(1/3) where x is negative.
 
  You have to use as.complex(tmp)^(1/3) to get a result.
  
   -6.108576e-05^(1/3)
  [1] -0.03938341
 
 This is not the result I get:
 
   as.complex(tmp)^(1/3)
 # [1] 0.01969171+0.03410703i 0.03478442+0.i
 # [3] 0.03285672+0.i 0.08950802+0.i
 # [5] 0.05848363+0.10129662i
 
  this is actually
  -(6.108576e-05^(1/3))
  
  Regards
  Petr
 
 It is possible to work round the problem without using as.complex
 which can introduce complications -- see above, and also:
 
   x - (-1)
   x^(1/3)
   # [1] NaN
   as.complex(x)^(1/3)
   # [1] 0.5+0.8660254i
 
   as.complex(-1)^(1/2)
   # [1] 0+1i
 
 which you would not want if you are working throughout in real
 numbers (you would want the result -1 instead). Although, in the
 mathematics of complex numbers, (-1)^(1/3) has three values, one
 of which is -1, R only returns a single value.
 
 However, you would have to define a new operator, called say %^%:
 
   %^%-function(X,x){sign(X)*(abs(X)^x)}
 
   tmp - c(-6.108576e-05, 4.208762e-05, 3.547092e-05,
 7.171101e-04, -1.600269e-03)
   tmp%^%(1/3)
   # [1] -0.03938341  0.03478442  0.03285672  0.08950802 -0.11696726
 
 The definition of %^% forces ^ to take precedence over -,
 by in effect removing - from the scene until ^ has done its
 work. But, if you hope to rely on this, note that if you apply
 to 'tmp' any function in which the ordinary ^ will be used on
 a negative number, you will still have the same problem.
 
 Note: Trying to redefine ^ will not work, since invoking the
 result initiates an infinite recursion:
 
   ^- function(X,x){sign(X)*(abs(X)^x)}
   ## (This definition will be accepted by R)
   tmp%^%(1/3)
   # Error: evaluation nested too deeply: infinite recursion /
   #   options(expressions=)?
 
 It's not a clean situatio, but I hope the above helps!
 Ted.
 
 
 E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
 Fax-to-email: +44 (0)870 094 0861
 Date: 16-Nov-09   Time: 12:55:25
 -- XFMail --
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Odp: ^ operator

2009-11-16 Thread Ted Harding

On 16-Nov-09 13:13:27, Liviu Andronic wrote:
 On 11/16/09, Ted Harding ted.hard...@manchester.ac.uk wrote:
 Not in this case (see below), though of course in general - takes
  precedence over ^, so, for example, in the expression

   -2^(1/3)

  the - is applied first, giving (-2); and then ^ is applied
  next, giving (-2)^(1/3). There is a work-round (see below).

 Hmm.. I may be doing something wrong, but from here it looks to be the
 opposite.
 -2^(1/3); -(2)^(1/3); -(2^(1/3));
 [1] -1.2599
 [1] -1.2599
 [1] -1.2599
 (-2)^(1/3)
 [1] NaN
 
 The results don't change when switching from the unary minus.
 0-2^(1/3); 0-(2)^(1/3); 0-(2^(1/3));
 [1] -1.2599
 [1] -1.2599
 [1] -1.2599

Correct!!! I was inadvertently put on the wrong foot by Pietr
Pikal's comment about precedence, and as a result what I wrote
about precedence of ^ relative to - was on the wrong foot
throughout, and should be ignored. My apologies for any confusion
this may have caused to anybody.

In any case, this is not relevant to Carol White's query about
taking the cube root (or indeed any fractional power) of a
negative number. This can only be done (as Carol intended it)
by using the form sign(x)*(abs(x)^power).

As I tried to point out, there is a distinction between an
expression which the user may enter as x - -1.234, and then
x^(1/3), expecting -(1.234^(1/3)), and the cube root of the
negative number x.

Ted.

 It seems to me that in this example ^ is applied first, and -
 second. There is also this fortune entry.
 fortune(unary)
 
 Thomas Lumley: The precedence of ^ is higher than that of unary minus.
 It may be surprising,
 [...]
 HervÃ© PagÃ¨s: No, it's not surprising. At least to me... In the
 country
 where I grew up, I've
 been teached that -x^2 means -(x^2) not (-x)^2.
-- Thomas Lumley and HervÃ© PagÃ¨s (both explaining that operator
 precedence is working
   perfectly well)
   R-devel (January 2006)
 
 
 Liviu
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 16-Nov-09   Time: 13:40:06
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lapply() not converting columns to factors (no error message)


Oh yes. Did notice that now. Thanks for pointing that out.
I was a bit concerned because that's a crucial step for running further 
lmer models, and that isn't working too, based on this factoring of columns.


Will hopefully be able to weed it out.

Am sorry if this wasted a bit of time.
I realized that summary(prmdf) gives me what I need.

Thanks a lot,

Aditi

--On 16 November 2009 08:17 -0500 David Winsemius dwinsem...@comcast.net 
wrote:



Didn't you notice the difference between Sundar's code and yours? Sundar
put the data.frame name before the column name while you did not do so in
your check step.

--
DW


On Nov 16, 2009, at 8:07 AM, A Singh wrote:


Oh, strange!

I thought it might be a problem with the 'base' package
installation, because the same thing's worked for me too before but
won't do now.

I tried to reinstall it (base), but R says its there already which I
expected it to be anyway.

I don't quite know where the issue is. Very odd.


--On 16 November 2009 04:59 -0800 Sundar Dorai-Raj sdorai...@gmail.com
 wrote:


Works for me:

x -
read.csv(url(http://dc170.4shared.com/download/153147281/a5c78386/Test
vc omp10.csv?tsid=20091116-075223-c3093ab0)) names(x)
x[2:13] - lapply(x[2:13], factor)


levels(x$P1L55)

[1] 0 1

is.factor(x$P1L96)

[1] TRUE


sessionInfo()

R version 2.10.0 (2009-10-26)
i386-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lattice_0.17-26

loaded via a namespace (and not attached):
[1] grid_2.10.0  tools_2.10.0

On Mon, Nov 16, 2009 at 4:50 AM, A Singh aditi.si...@bristol.ac.uk
wrote:

Sorry, my file is at:


http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html


--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented,
minimal, self-contained, reproducible code.





--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented,
minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT





--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] on gsub (simple, but not to me!) sintax

2009-11-16 Thread Duncan Murdoch


On 11/16/2009 8:21 AM, Ottorino-Luca Pantani wrote:

Dear R users,
my problem today deals with my ignorance on regular expressions.
a matter I recently discovered.


You were close.  First, gsub by default doesn't need escapes before the 
parens.  (There are lots of different conventions for regular 
expressions, unfortunately.)  So the Emacs regular expression V_\(.\)_ 
is entered as V_(.)_ in the default version of gsub().  Second, to 
enter a backslash into a string, you need to escape it.  So the 
replacement pattern V_0\1_ is entered as V_0\\1_.  So


gsub(V_(.)_, V_0\\1_, foo)

should give you what you want.

Duncan Murdoch



Consider the following

foo -
c(V_7_101110_V,  V_7_101110_V,  V_9_101110_V,  V_9_101110_V,
V_9_s101110_V,  V_9_101110_V,  V_9_101110_V,  V_11_101110_V,
V_11_101110_V, V_11_101110_V, V_11_101110_V, V_11_101110_V,
V_17_101110_V, V_17_101110_V)

what I'm trying to obtain is to add a zero in front of numbers below 10,
as in

c(V_07_101110_V,  V_07_101110_V,  V_09_101110_V,  V_09_101110_V,
V_09_101110_V,  V_09_101110_V,  V_09_101110_V,  V_11_101110_V,
V_11_101110_V, V_11_101110_V, V_11_101110_V, V_11_101110_V,
V_17_101110_V, V_17_101110_V)


I'm able to do this on the emacs buffer through query-replace-regexp

C-M-%
search for
V_\(.\)_
and substitute with
V_0\1_

but I completely ignore how to do it with gsub within R
and the help is quite complicate to understand
(at least to me, at this moment in time)

I can search the vector through
grep(V_._,  foo)

but I always get errors either on
gsub('V_\(.\)_', 'V_0\1_', foo)


or I get not what I'm looking for on
gsub('V_._', 'V_0._', foo)
gsub('V_._', 'V_0\1_', foo)

Thanks in advance


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] on gsub (simple, but not to me!) sintax



On Nov 16, 2009, at 8:21 AM, Ottorino-Luca Pantani wrote:


Dear R users,
my problem today deals with my ignorance on regular expressions.
a matter I recently discovered.

Consider the following

foo -
c(V_7_101110_V,  V_7_101110_V,  V_9_101110_V,  V_9_101110_V,
V_9_s101110_V,  V_9_101110_V,  V_9_101110_V,  V_11_101110_V,
V_11_101110_V, V_11_101110_V, V_11_101110_V, V_11_101110_V,
V_17_101110_V, V_17_101110_V)

what I'm trying to obtain is to add a zero in front of numbers below  
10,

as in

c(V_07_101110_V,  V_07_101110_V,  V_09_101110_V,   
V_09_101110_V,

V_09_101110_V,  V_09_101110_V,  V_09_101110_V,  V_11_101110_V,
V_11_101110_V, V_11_101110_V, V_11_101110_V, V_11_101110_V,
V_17_101110_V, V_17_101110_V)

Any of these (the need for doubling of the \\ for the back-reference  
seems to be the main issue:

 gsub(_([[:digit:]])_., _0\\1_, foo)
 [1] V_07_01110_V  V_07_01110_V  V_09_01110_V  V_09_01110_V   
V_09_101110_V
 [6] V_09_01110_V  V_09_01110_V  V_11_101110_V V_11_101110_V  
V_11_101110_V

[11] V_11_101110_V V_11_101110_V V_17_101110_V V_17_101110_V

 gsub(_(\\d)_., _0\\1_, foo)
 [1] V_07_01110_V  V_07_01110_V  V_09_01110_V  V_09_01110_V   
V_09_101110_V
 [6] V_09_01110_V  V_09_01110_V  V_11_101110_V V_11_101110_V  
V_11_101110_V

[11] V_11_101110_V V_11_101110_V V_17_101110_V V_17_101110_V

 gsub(V_(.)_, V_0\\1_, foo)
 [1] V_07_101110_V  V_07_101110_V  V_09_101110_V   
V_09_101110_V  V_09_s101110_V
 [6] V_09_101110_V  V_09_101110_V  V_11_101110_V   
V_11_101110_V  V_11_101110_V

[11] V_11_101110_V  V_11_101110_V  V_17_101110_V  V_17_101110_V


I'm able to do this on the emacs buffer through query-replace-regexp

C-M-%
search for
V_\(.\)_
and substitute with
V_0\1_

but I completely ignore how to do it with gsub within R
and the help is quite complicate to understand
(at least to me, at this moment in time)

I can search the vector through
grep(V_._,  foo)

but I always get errors either on
gsub('V_\(.\)_', 'V_0\1_', foo)


or I get not what I'm looking for on
gsub('V_._', 'V_0._', foo)
gsub('V_._', 'V_0\1_', foo)

Thanks in advance
--
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Ubuntu 8.04.3 LTS -- GNU Emacs 23.0.60.1 (x86_64-pc-linux-gnu, GTK+  
Version 2.12.9)

ESS version 5.5 -- R 2.10.0

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Relase positive with log and zero of negative with 0

2009-11-16 Thread Peter Ehlers




David Winsemius wrote:


On Nov 15, 2009, at 10:18 AM, rkevinbur...@charter.net wrote:

This is a very simple question but I couldn't form a site search 
quesry that would return a reasonable result set.


Say I have a vector:

x - c(0,2,3,4,5,-1,-2)

I want to replace all of the values in 'x' with the log of x. 
Naturally this runs into problems since some of the values are 
negative or zero. So how can I replace all of the positive elements of 
x with the log(x) and the rest with zero?


  x - c(0,2,3,4,5,-1,-2)
  x - ifelse(x0, log(x), 0)
Warning message:
In log(x) : NaNs produced
  x
[1] 0.000 0.6931472 1.0986123 1.3862944 1.6094379 0.000 0.000

The warning is harmless as you can see, but if you wanted to avoid it, 
then:


  x[x=0] - 0; x[x0] -log(x[x0])

In the second command, you need to have the logical test on both sides 
to avoid replacement  out of synchrony.



Here is one more way, somewhat less transparent, motivated
by the examples on the ?ifelse page:

 x - log(ifelse(x  0, x, 1))

 -Peter Ehlers


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Relase positive with log and zero of negative with 0



On Nov 16, 2009, at 8:55 AM, Peter Ehlers wrote:




David Winsemius wrote:

On Nov 15, 2009, at 10:18 AM, rkevinbur...@charter.net wrote:
This is a very simple question but I couldn't form a site search  
quesry that would return a reasonable result set.


Say I have a vector:

x - c(0,2,3,4,5,-1,-2)

I want to replace all of the values in 'x' with the log of x.  
Naturally this runs into problems since some of the values are  
negative or zero. So how can I replace all of the positive  
elements of x with the log(x) and the rest with zero?

 x - c(0,2,3,4,5,-1,-2)
 x - ifelse(x0, log(x), 0)
Warning message:
In log(x) : NaNs produced
 x
[1] 0.000 0.6931472 1.0986123 1.3862944 1.6094379 0.000  
0.000
The warning is harmless as you can see, but if you wanted to avoid  
it, then:

 x[x=0] - 0; x[x0] -log(x[x0])
In the second command, you need to have the logical test on both  
sides to avoid replacement  out of synchrony.

Here is one more way, somewhat less transparent, motivated
by the examples on the ?ifelse page:

x - log(ifelse(x  0, x, 1))


Here's yet another motivated by the above:

 log( (x=0) + (x0)*x )
[1] 0.000 0.6931472 1.0986123 1.3862944 1.6094379 0.000  
0.000




-Peter Ehlers

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R: phase determination

2009-11-16 Thread Lisandro Benedetti Cecchi

Hi again:

Any thought on the following??

I'm trying to determine the phase of irregularly sampled data. Is there any
particular reason why both spec.pgram and spec.ls return phase-NULL for
vectors?

 

Thank you.

Lisandro

 

 


x

Lisandro Benedetti-Cecchi

Associate Professor in Ecology

Department of Biology - University of Pisa

Via Derna 1, 56126 Pisa,

Italy

Office: 39 050 2211413

Fax: 39 050 2211410

e-mail: lbenede...@biologia.unipi.it

http://www.discat.unipi.it/BiolMar/people/LBC/LBC.htm

http://www.unipi.it

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] on gsub (simple, but not to me!) sintax

2009-11-16 Thread Wacek Kusnierczyk


Duncan Murdoch wrote:

On 11/16/2009 8:21 AM, Ottorino-Luca Pantani wrote:

Dear R users,
my problem today deals with my ignorance on regular expressions.
a matter I recently discovered.


You were close.  First, gsub by default doesn't need escapes before 
the parens.  (There are lots of different conventions for regular 
expressions, unfortunately.)  So the Emacs regular expression V_\(.\)_ 
is entered as V_(.)_ in the default version of gsub().  Second, to 
enter a backslash into a string, you need to escape it.  So the 
replacement pattern V_0\1_ is entered as V_0\\1_.  So


gsub(V_(.)_, V_0\\1_, foo)

should give you what you want.


actually, guessing from the form of the input, sub is more appropriate, 
though the performance gain seems inessential (~3%).


vQ

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lapply() not converting columns to factors (no error message)

2009-11-16 Thread Peter Dalgaard

A Singh wrote:
 Dear List,
 
 I'm having a curious problem with lapply(). I've used it before to
 convert a subset of columns in my dataframe, to factors, and its worked.
 But now, on re-running the identical code as before it just doesn't
 convert the columns into factors at all.
 
 As far as I can see I've done nothing different, and its strange that it
 shouldn't do the action.
 
 Has anybody come across this before? Any input on this strange issue
 much appreciated..
 
 Hope I haven't missed something obvious.
 
 Thanks a lot,
 
 Aditi
 
 (P.s.- I've tried converting columns one by one to factors this time,
 and that works.
 
 P1L55-factor(P1L55)
 levels(P1L55)
 [1] 0 1
 
 Code:
 
 prm-read.table(P:\\.  .csv, header=T, ...sep=,, ...)
 
 prmdf-data.frame(prm)
 
 prmdf[2:13]-lapply(prmdf[2:13], factor) ## action performed, no error
 message
 
 ##I tried to pick random columns and check
 
 levels(P1L55)
 NULL
 
 is.factor(P1L96)
 FALSE

Make sure that you are looking in the same object that you changed. E.g.

attach(prmdf)
prmdf[2:13]-lapply(prmdf[2:13], factor)
levels(P1L55)

is not going to work

levels(prmdf$P1L55)

should, or attaching _after_ the change. Also, make sure that you don't
have P1L55 et al. sitting in the global enviromnent.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Cluster analysis: hclust manipulation possible?

2009-11-16 Thread Jopi Harri

I am doing cluster analysis [hclust(Dist, method=average)] on
data that potentially contains redundant objects. As expected,
the inclusion of redundant objects affects the clustering result,
i.e., the data a1, = a2, = a3, b, c, d, e1, = e2 is likely to
cluster differently from the same data without the redundancy,
i.e., a1, b, c, d, e1. This is apparent when the outcome is
visualized as a dendrogram.

Now, it seems that the clustering result for which the redundancy
has been eliminated is more robust for the present assignment
than that of the redundant data. Naturally, there is no problem
in the elimination: just exclude the redundant objects from Dist.

However, it would be very convenient to be able to include the
redundant objects in the *dendrogram* by attaching them as
0-level branches to the subtrees, i.e.:

1.0---
0.5___|___|_..
0.0.._|_..|..|..|.._|_
|.|.|.|..|..|.|...|...
...a1a2a3.b..c..d.e1.e2...

instead of

1.0---
0.5___|___|_..
0.0...|...|..|..|...|.
..a1..b..c..d..e1.

The question: Can this be accomplished in the *dendrogram plot*
by manipulating the resulting hclust data structure or by some
other means, and if yes, how?

Jopi Harri

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] violin - like plots for bivariate data

2009-11-16 Thread Eric Nord


I'm attempting to produce something like a violin plot to display how y
changes with x for members of different groups (My specific case is how
floral area changes over time for several species of plants). I've looked at
panel.violin (in lattice), which makes nice violin plots, but is really set
up to work on a single variable - the area trace represents the frequency of
each value of x for each group.

I'm wondering if anyone is aware of a function to do this?

I can imagine how to accomplish this using polygon, but I will admit I'm not
sure what the best way would be to smooth the data. That said, I would
prefer not to reinvent the wheel!

Thanks in advance for any wisdom you can share!

Eric
-- 
View this message in context: 
http://old.nabble.com/violin---like-plots-for-bivariate-data-tp26373071p26373071.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] violin - like plots for bivariate data

2009-11-16 Thread Gabor Grothendieck

Try:

RSiteSearch(violin plot)

On Mon, Nov 16, 2009 at 9:40 AM, Eric Nord ericn...@psu.edu wrote:

 I'm attempting to produce something like a violin plot to display how y
 changes with x for members of different groups (My specific case is how
 floral area changes over time for several species of plants). I've looked at
 panel.violin (in lattice), which makes nice violin plots, but is really set
 up to work on a single variable - the area trace represents the frequency of
 each value of x for each group.

 I'm wondering if anyone is aware of a function to do this?

 I can imagine how to accomplish this using polygon, but I will admit I'm not
 sure what the best way would be to smooth the data. That said, I would
 prefer not to reinvent the wheel!

 Thanks in advance for any wisdom you can share!

 Eric
 --
 View this message in context: 
 http://old.nabble.com/violin---like-plots-for-bivariate-data-tp26373071p26373071.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] on gsub (simple, but not to me!) sintax

2009-11-16 Thread Ottorino-Luca Pantani


Duncan Murdoch ha scritto:

On 11/16/2009 8:21 AM, Ottorino-Luca Pantani wrote:

Dear R users,
my problem today deals with my ignorance on regular expressions.
a matter I recently discovered.


You were close.  First, gsub by default doesn't need escapes before 
the parens.  (There are lots of different conventions for regular 
expressions, unfortunately.)  So the Emacs regular expression V_\(.\)_ 
is entered as V_(.)_ in the default version of gsub().  Second, to 
enter a backslash into a string, you need to escape it.  So the 
replacement pattern V_0\1_ is entered as V_0\\1_.  So


gsub(V_(.)_, V_0\\1_, foo)

should give you what you want.

Duncan Murdoch
Any of these (the need for doubling of the \\ for the 
back-reference seems to be the main issue:

 gsub(_([[:digit:]])_., _0\\1_, foo)
 
 gsub(_(\\d)_., _0\\1_, foo)
 
 gsub(V_(.)_, V_0\\1_, foo)


David Winsemius, MD
Heritage Laboratories
West Hartford, CT




I suspected something on the double escape..

Thanks to you all.
R is a wonderful software and R-help is always a great place to visit !!!
8rino

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Weighted descriptives by levels of another variables

2009-11-16 Thread Andrew Miles

Thanks!  Using the plyr package and the approach you outlined seems to  
work well for relatively simple functions (like wtd.mean), but so far  
I haven't had much success in using it with more complex descriptive  
functions like describe {Hmisc}.  I'll take a look later, though, and  
see if I can figure out why.


At any rate, ddply() looks like it will simplify writing a function  
that will allow for weighting data and subdividing it, but still give  
comprehensive summary statistics (i.e. not just the mean or quantiles,  
but all in one).  I'll post it to the list once I have the time to  
write it up.


I also took a stab at using the svyby funtion in the survey package,  
but received the following error message when I input :


 svyby(cbind(educ, age), female, svynlsy, svymean)
Error in `[.survey.design2`(design, byfactor %in% byfactor[i], ) :
  (subscript) logical subscript too long
__
In addition to using the survey package (and the svyby function), I've  
found
that many of the 'weighted' functions, such as wtd.mean, work well  
with the

plyr package.  For example,

wtdmean=function(df)wtd.mean(df$obese,df$sampwt);
ddply(mydata, ~cut2(age,c(2,6,12,16)),'wtdmean')

hth, david freedman


Andrew Miles-2 wrote:


I've noticed that R has a number of very useful functions for
obtaining descriptive statistics on groups of variables, including
summary {stats}, describe {Hmisc}, and describe {psych}, but none that
I have found is able to provided weighted descriptives of subsets of a
data set (ex. descriptives for both males and females for age, where
accurate results require use of sampling weights).

Does anybody know of a function that does this?

What I've looked at already:

I have looked at describe.by {psych} which will give descriptives by
levels of another variable (eg. mean ages of males and females), but
does not accept sample weights.

I have also looked at describe {Hmisc} which allows for weights, but
has no functionality for subdivision.

I tried using a by() function with describe{Hmisc}:

by(cbind(my, variables, here), division.variable, describe,
weights=weight.variable)

but found that this returns an error message stating that the
variables to be described and the weights variable are not the same
length:

Error in describe.vector(xx, nam[i], exclude.missing =
exclude.missing,  :
  length of weights must equal length of x
In addition: Warning message:
In present  !is.na(weights) :
  longer object length is not a multiple of shorter object length

This comes because the by() function passes down a subset of the
variables to be described to describe(), but not a subset of the
weights variable.  describe() then searches the whatever data set is
attached in order to find the weights variables, but this is in its
original (i.e. not subsetted) form.  Here is an example using the
ChickWeight dataset that comes in the datasets package.

data(ChickWeight)
attach(ChickWeight)
library(Hmisc)
#this gives descriptive data on the variables Time and Chick by
levels of Diet)
by(cbind(Time, Chick), Diet, describe)
#trying to add weights, however, does not work for reasons described
above
wgt=rnorm(length(Chick), 12, 1)
by(cbind(Time, Chick), Diet, describe, weights=wgt)

Again, my question is, does anybody know of a function that combines
both the ability to provided weighted descriptives with the ability to
subdivide by the levels of some other variable?


Andrew Miles
Department of Sociology
Duke University


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] package tm fails to remove the with remove stopwords

2009-11-16 Thread Mark Kimpel

Thanks Ingo.

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 399-1219 Skype No Voicemail please


On Sun, Nov 15, 2009 at 11:05 AM, Ingo Feinerer feine...@logic.at wrote:

 On Thu, Nov 12, 2009 at 11:29:50AM -0500, Mark Kimpel wrote:
  I am using code that previously worked to remove stopwords using package
 tm.

 Thanks for reporting. This is a bug in the removeWords() function in
 tm version 0.5-1 available from CRAN:

  require(tm)
  myDocument - c(the rain in Spain, falls mainly on the plain, jack
 and jill ran up the hill, to fetch a pail of water)
  text.corp - Corpus(VectorSource(myDocument))
  #
  text.corp - tm_map(text.corp, stripWhitespace)
  text.corp - tm_map(text.corp, removeNumbers)
  text.corp - tm_map(text.corp, removePunctuation)
  ## text.corp - tm_map(text.corp, stemDocument)
  text.corp - tm_map(text.corp, removeWords, c(the,
 stopwords(english)))
  dtm - DocumentTermMatrix(text.corp)
  dtm
  dtm.mat - as.matrix(dtm)
  dtm.mat
 
   dtm.mat
  Terms
  Docs falls fetch hill jack jill mainly pail plain rain ran spain the
 water
 1 0 0000  00 01   0 1   1
 0
 2 1 0000  10 10   0 0   0
 0
 3 0 0111  00 00   1 0   0
 0
 4 0 1000  01 00   0 0   0
 1

 The function removeWords() fails to remove patterns at the beginning or at
 the end
 of a line.

 This bug is fixed in the latest development version on R-Forge, and
 the fix will be included in the next CRAN release.

 Please see

 https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/inst/NEWS?root=tmview=markup
 for a list of all bug fixes and changes between each tm version.

 Best regards, Ingo Feinerer

 --
 Ingo Feinerer
 Vienna University of Technology
 http://www.dbai.tuwien.ac.at/staff/feinerer


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Data source name not found and no default driver specified

2009-11-16 Thread helpme

I'm stumped. When trying to connect to Oracle using the RODBC package I get
an error:
*[RODBC] Data source name not found and no default driver specified.
ODBC connect failed.*

I've read over all the posts and documentation manuals.
The system is Windows Server 2003 with R 2.81. and the latest downloadable
RODBC package. The Oracle SID/DSN is mfopdw. I made sure to add it to
Control Panel-Administrative Priviledges-Microsoft ODBC system/user DNS.

I've also tried the following in no particular order:

1.) Turn on all oracle services in control panel-administrative
priviledges.
2.) Checked tsnnames.ora for SID.
3.) Add microsoft ODBC service to Control Panel services for SID
4.) Use Sqldeveler to test connection another way besides R (It was
successful)
5.) channel-odbcDriverConnect(connection=Driver={Microsoft ODBC for
Oracle}; DSN=abc,UID=abc;PWD=abc;case=oracle)

received error drivers SQLAllocHandle on SQL_HANDLE_ENV failed one time;
another time I got the error that Oracle client and networking components
7.3 or greater is not found.

6.) tnsping mfopdw

lsnrctl start mfopdw

tried to add oracle/bin to path

Nothing is working.


Please advise.

Thank you,

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] extracting estimated covariance parameters from lme fit

2009-11-16 Thread Green, Gerwyn (greeng6)

Dear all

Apologies in advance as this seems like a trivial question. Nonetheless,
a question I haven't been able to resolve myself !. Within a single
repetition of a simulation (to be repeated many times) I am fitting the
following linear mixed model using lme...

Y_{gtr} = \mu + U_{g} +  W_{gt} + Z_{gtr}

U_{g} ~ N(0,\gamma^{2}), W_{gt} ~ N(0,\kappa^{2}), Z_{gtr} ~
N(0,\tau^{2})

g = 1,...,G
t = 1,...,T
r= 1,...,R


...by doing

 Model.fit - lme(Y ~ 1, data=data, random= ~1|gene/treatment)

I would like to be able to extract the estimated covariance parameters
contained within the lme object. I know if I type...

 Model.fit$sigma


...then I get the estimated residual variance, i.e. within the context
of the above model, the estimate for \tau. But I would also like to
extract the estimates for \gamma and \kappa by doing
Model.fit$something. I am aware that I can view the output using the
extractor function summary, but within a single repetition of my
simulation routine I want to be able to code something like

gamma - Model.fit$. 
kappa - Model.fit$.


and then plug `gamma' and `kappa' into some formulae. This process of
fitting and extracting will be repeated many times, which is why I wish
to automate everything.

Again, any help would be greatly appreciated

Best

Gerwyn Green
School of Health and Medicine
Lancaster University






Any help would be greatly appreciated

Best


Gerwyn Green
School of Health and Medicine
Lancaster Uinversity

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] violin - like plots for bivariate data

2009-11-16 Thread Liaw, Andy

sounds like bivariate density contours may be what you're looking for.


Andy

From: Eric Nord
 
 I'm attempting to produce something like a violin plot to 
 display how y
 changes with x for members of different groups (My specific 
 case is how
 floral area changes over time for several species of plants). 
 I've looked at
 panel.violin (in lattice), which makes nice violin plots, but 
 is really set
 up to work on a single variable - the area trace represents 
 the frequency of
 each value of x for each group.
 
 I'm wondering if anyone is aware of a function to do this?
 
 I can imagine how to accomplish this using polygon, but I 
 will admit I'm not
 sure what the best way would be to smooth the data. That said, I would
 prefer not to reinvent the wheel!
 
 Thanks in advance for any wisdom you can share!
 
 Eric
 -- 
 View this message in context: 
 http://old.nabble.com/violin---like-plots-for-bivariate-data-t
 p26373071p26373071.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
Notice:  This e-mail message, together with any attachme...{{dropped:10}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] (Parallel) Random number seed question...

2009-11-16 Thread Blair Christian

Hi All,

I have k identical parallel pieces of code running, each using n.rand
random numbers.  I would like to use the same RNG (for now), and set
the seeds so that I can guarantee that there are no overlaps in the
random numbers sampled by the k pieces of code.  Another side goal is
to have reproducibility of my results.  In the past I have used C with
SPRNG for this task, but I'm hoping that there is an easy way to do
this in R with poor man's parallelization (eg running multiple Rs on
multiple processors without the overhead of setting up any mpi or
using snow(fall)).  It is not clear from the documentation if set.seed
arguments are sequential or not for a given RNG, eg if set.seed(1) on
processor 1, set.seed(1+n.rand) on processor 2, set.seed(1+2*n.rand)
on processor 3, etc for the default RNG Mersenne-Twister.  An easy
approach would be to simply write a script to generate n.rand numbers,
records the .Random.seed and proceeds in that manner- inelegant, but
effective.  My question here is Is there a better way?


(mvtnorm part directed to Torsten Hothorn)
To further clarify, it seems there is a different RNG for normal
(rnorm) than for everything else? (eg RNGKind( ..,
normal.kind=Inversion); further, does anybody know if mvtnorm uses
this generator?


Further, some RNGs seem to be based on the archictecture (eg the
Knuth-TAOCP-2002 for example)- is the period really related to 2^32,
or is it dependent the architecture, 2^64 for 64 bit R and 2^32 for 32
bit R?


I noticed there are several packages related to RNG- please direct me
to a vignette/R news article/previous post if this has been covered ad
nauseum.  I have skimmed vignettes/docs for rsprng package, RNG doc in
base, setRNG package, mvtnorm package vignette  (Or am I setting
myself up to write a current RNG doc?)


(directed to Gregory Warnes)
I found a presentation by Gregory Warnes from 1999 addressing these
same questions (and uses a collings generator in some C code).
http://www.r-project.org/conferences/DSC-1999/slides/warnes.ps.gz
Have you turned to the snowfall related parallel implementations, did
your Collings generator work well, or have you discovered another
trick you might like to share?

Thank you all for your time and excellent contributions to the open
source community,
Blair

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] extracting values from correlation matrix

2009-11-16 Thread Lee William

Hi! All,

I have 2 correlation matrices of 4000x4000 both with same row names and
column names say cor1 and cor2. I have extracted some information from 1st
matrix cor1 which is something like this:

rowname  colname  cor1_value
 a  b0.8
 b  a0.8
 c  f 0.62
 d  k0.59
 -  -  --
 -  -  --

Now I wish to extract values from matrix cor2 for the same rowname and
colname as above so that it looks similar to something like this with values
in cor2_value:

rowname  colname  cor1_value  cor2_value
 a  b0.8 ---
 b  a0.8 ---
 c  f 0.62   ---
 d  k0.59   ---
 -  -  --  ---
 -  -  --  ---

I am running out of ideas. So I decided to post this on mailing list. Please
Help!

Best
Lee

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extracting values from correlation matrix

2009-11-16 Thread jim holtman

Assuming that your data is in a dataframe 'cordata' , then following
should work:

cordata$cor2_value - sapply(1:nrow(cordata), function(.row){
cor2[cordata$rowname[.row], cordata$colname[.row]]
}

On Mon, Nov 16, 2009 at 11:44 AM, Lee William leeon...@gmail.com wrote:
 Hi! All,

 I have 2 correlation matrices of 4000x4000 both with same row names and
 column names say cor1 and cor2. I have extracted some information from 1st
 matrix cor1 which is something like this:

 rowname  colname  cor1_value
  a              b            0.8
  b              a            0.8
  c              f             0.62
  d              k            0.59
  -              -              --
  -              -              --

 Now I wish to extract values from matrix cor2 for the same rowname and
 colname as above so that it looks similar to something like this with values
 in cor2_value:

 rowname  colname  cor1_value  cor2_value
  a              b            0.8             ---
  b              a            0.8             ---
  c              f             0.62           ---
  d              k            0.59           ---
  -              -              --              ---
  -              -              --              ---

 I am running out of ideas. So I decided to post this on mailing list. Please
 Help!

 Best
 Lee

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Cluster analysis: hclust manipulation possible?

2009-11-16 Thread Charles C. Berry


On Mon, 16 Nov 2009, Jopi Harri wrote:


I am doing cluster analysis [hclust(Dist, method=average)] on
data that potentially contains redundant objects. As expected,
the inclusion of redundant objects affects the clustering result,
i.e., the data a1, = a2, = a3, b, c, d, e1, = e2 is likely to
cluster differently from the same data without the redundancy,
i.e., a1, b, c, d, e1. This is apparent when the outcome is
visualized as a dendrogram.

Now, it seems that the clustering result for which the redundancy
has been eliminated is more robust for the present assignment
than that of the redundant data. Naturally, there is no problem
in the elimination: just exclude the redundant objects from Dist.

However, it would be very convenient to be able to include the
redundant objects in the *dendrogram* by attaching them as
0-level branches to the subtrees, i.e.:

1.0---
0.5___|___|_..
0.0.._|_..|..|..|.._|_
|.|.|.|..|..|.|...|...
...a1a2a3.b..c..d.e1.e2...

instead of

1.0---
0.5___|___|_..
0.0...|...|..|..|...|.
..a1..b..c..d..e1.

The question: Can this be accomplished in the *dendrogram plot*
by manipulating the resulting hclust data structure or by some
other means, and if yes, how?



Yes, you need to study

?hclust

particularly the part about 'Value' from which you will see what needs 
modification.



Here is a very simple example:


res - hclust(dist(1-diag(3)*rnorm(3)))
plot(res)
res2 - res
res2$merge - rbind(-cbind(1:3,4:6), matrix(ifelse( res2$merge0, -res2$merge, 
res2$merge+sum(res2$merge0)),2))
res2$height - c(rep(0,3), res2$height)
res2$order - as.vector( rbind(res2$order,(4:6)[res2$order]) )
plot(res2)
str( res )
str( res2 )



Alternatively, you could use as.dendrogram( res ) as the point of 
departure and manipulate the value.


HTH,

Chuck





Jopi Harri

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] specifying group plots using panel.groups

2009-11-16 Thread wintere


Hi, 

I am trying to plot two types of data on the same graph: points and
distributions. I am attempting to use the panel.groups function, but cannot
seem to get it to work. I have a melted data set and put in a FLAG column to
separate my data into the two groups that I would like to plot, point data
(FLAG=0) and the distribution(FLAG=1). Here is the code i am using in R: 

stripplot(
variable~value,
conf(RunlogBootCL), 
groups=FLAG,
panel=panel.superpose,
panel.groups=function(x,y, group.number, ...){
if(group.number==1)panel.xyplot(x,y, group.number...)
else if(group.number==0)panel.covplot(x,y, group.number...)},
ref=1,
cex = 0.5,
col=black,
main='Covariate Effects on Clearance',
xlab='relative clearance',
fill='transparent'
)

For some reason I can only get one or the other to plot!! (points or
distributions). Can you please direct me to my error?! thanks!


-- 
View this message in context: 
http://old.nabble.com/specifying-group-plots-using-panel.groups-tp26374674p26374674.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R-help

2009-11-16 Thread Lloyd Barcza



I have been trying to write a function for the following problem:

Suppose I have three vectors a,b,c of different lengths:
e.g. a=c(a1,a2,a3,...) where a[i] form the basis of our function variables:

if we define a table for example:




and define the fn(x) -function{..sum(argument)..}
where x-c(a,b,c) so that we can maximise fn(x) as:

optim(...,fn,). So in other words the problem
if to find optimal values for vectors a,b,c. I'm
not sure how to set this problem up as a function
in R. Any help would be much appreciated.

Regards

Lloyd Barcza
- Forwarded by Lloyd Barcza/UK/RoyalSun on
16/11/2009 15:07 -






Lloyd Barcza
Pricing Analyst
Affinity Pricing
Royal  SunAlliance
Tel: 01403 234784
Email: lloyd.bar...@uk.rsagroup.com


   
__

   **
   MORE THN ® is a trading style of Royal  Sun Alliance
   Insurance plc (No. 93792). Registered in England and Wales at
   St. Mark's Court, Chart Way, Horsham, West Sussex RH12 1XL.
   Authorised  regulated by the Financial Services Authority

   For your protection, telephone calls will be recorded and may be
   monitored.

   The information in this e-mail is confidential and ma...{{dropped:15}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] test for causality

2009-11-16 Thread tobiasfa


Hi useRs..

I cant figure out how to test for causality using causality() in vars
package

I have two datasets (A, B) and i want to test if A (Granger)cause B.
How do I write the script? I dont understand ?causality. How do I get x to
contain A and B. Further using the command VAR() to specify x, I dont
either understand.

Kind regards Tobias

-- 
View this message in context: 
http://old.nabble.com/test-for-causality-tp26373931p26373931.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R-help

2009-11-16 Thread Lloyd Barcza



I have been trying to write a function for the following problem:

Suppose I have three vectors a,b,c of different lengths:
e.g. a=c(a1,a2,a3,...) where a[i] form the basis of our function variables:

if we define a table for example:




and define the fn(x) -function{..sum(argument)..}
where x-c(a,b,c) so that we can maximise fn(x) as:

optim(...,fn,). So in other words the problem
if to find optimal values for vectors a,b,c. I'm
not sure how to set this problem up as a function
in R. Any help would be much appreciated.

Regards

Lloyd Barcza
- Forwarded by Lloyd Barcza/UK/RoyalSun on
16/11/2009 15:07 -






Lloyd Barcza
Pricing Analyst
Affinity Pricing
Royal  SunAlliance
Tel: 01403 234784
Email: lloyd.bar...@uk.rsagroup.com


   
__

   **
   MORE THN ® is a trading style of Royal  Sun Alliance
   Insurance plc (No. 93792). Registered in England and Wales at
   St. Mark's Court, Chart Way, Horsham, West Sussex RH12 1XL.
   Authorised  regulated by the Financial Services Authority

   For your protection, telephone calls will be recorded and may be
   monitored.

   The information in this e-mail is confidential and ma...{{dropped:15}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Step Function Freezing R

2009-11-16 Thread Jgabriel

Can you think of any systemic changes that might interefere with R
besides Symantec EndPoint and LiveUpdate? I have removed those
programs and allocated more memory to R, but it is still way too
slow.

On Nov 13, 10:45 pm, J Dougherty j...@surewest.net wrote:
 On Friday 13 November 2009 07:17:28 am Jgabriel wrote:



  I can't fully answer all these questions, but I'll do my best - There
  have not been any updates of Windows, and I did not update R during
  the period, although I did reinstall it after the problem started.
  There have been no changes to Norton or any other software that uses
  system resources in that way. The one thing I can think of is that I
  installed a program called Digitizer (creates data tables/ csds from
  visually analyzing line charts) around the same time it started
 freezing. I have completely uninstalled it and deleted all related
  files. The problem should not be with the data, which is fine. I am
  running Windows XP Professional and Excel 2007. I have allowed the
  process to run overnight on two separate occasions. The first time it
  was completely frozen and there was no progress. Last night it
  actually made progress, but the fact remains that a process that used
  to take an hour at the most still has not even halfway completed after
  over 8 hours. I even installed extra RAM on the system so that there
  is more than when the process used to work. I agree that it is
  probably a change in software, but I can't figure out what has changed
  or what I can do about it.

 OK, from what you are saying, it seems clear the problem is a system problem
 rather than an R issue.  MS issues patches every month, so there may not have
 been upgrades, but there could still be systemic changes.   As regards R, did
 you update R?  How big is the data table?  Has it grown?  Did you or someone
 else alter the available memory to R by an environment setting?  Have you
 searched Memory in R help and manuals?  Is this computer yours, or is it
 used by others who may have altered settings?  These are all questions that
 may be pertinent.  Good luck.

 JWDougherty

 __
 r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.- Hide 
 quoted text -

 - Show quoted text -

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (Parallel) Random number seed question...

2009-11-16 Thread Torsten Hothorn



On Mon, 16 Nov 2009, Blair Christian wrote:


Hi All,

I have k identical parallel pieces of code running, each using n.rand
random numbers.  I would like to use the same RNG (for now), and set
the seeds so that I can guarantee that there are no overlaps in the
random numbers sampled by the k pieces of code.  Another side goal is
to have reproducibility of my results.  In the past I have used C with
SPRNG for this task, but I'm hoping that there is an easy way to do
this in R with poor man's parallelization (eg running multiple Rs on
multiple processors without the overhead of setting up any mpi or
using snow(fall)).  It is not clear from the documentation if set.seed
arguments are sequential or not for a given RNG, eg if set.seed(1) on
processor 1, set.seed(1+n.rand) on processor 2, set.seed(1+2*n.rand)
on processor 3, etc for the default RNG Mersenne-Twister.  An easy
approach would be to simply write a script to generate n.rand numbers,
records the .Random.seed and proceeds in that manner- inelegant, but
effective.  My question here is Is there a better way?


(mvtnorm part directed to Torsten Hothorn)
To further clarify, it seems there is a different RNG for normal
(rnorm) than for everything else? (eg RNGKind( ..,
normal.kind=Inversion); further, does anybody know if mvtnorm uses
this generator?



mvtnorm is based on FORTRAN code which uses unif_rand() from the C API:

void F77_SUB(rndstart)(void) { GetRNGstate(); }
void F77_SUB(rndend)(void) { PutRNGstate(); }
double F77_SUB(unifrnd)(void) { return unif_rand(); }

Torsten



Further, some RNGs seem to be based on the archictecture (eg the
Knuth-TAOCP-2002 for example)- is the period really related to 2^32,
or is it dependent the architecture, 2^64 for 64 bit R and 2^32 for 32
bit R?


I noticed there are several packages related to RNG- please direct me
to a vignette/R news article/previous post if this has been covered ad
nauseum.  I have skimmed vignettes/docs for rsprng package, RNG doc in
base, setRNG package, mvtnorm package vignette  (Or am I setting
myself up to write a current RNG doc?)


(directed to Gregory Warnes)
I found a presentation by Gregory Warnes from 1999 addressing these
same questions (and uses a collings generator in some C code).
http://www.r-project.org/conferences/DSC-1999/slides/warnes.ps.gz
Have you turned to the snowfall related parallel implementations, did
your Collings generator work well, or have you discovered another
trick you might like to share?

Thank you all for your time and excellent contributions to the open
source community,
Blair

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Discontinuous graph

2009-11-16 Thread Tim Smith

Hi,
I wanted to make a graph with the following table (2 rows, 3 columns): 
a b c
x 1 3 5
y 5 8 6
The first column represents the start cordinate, and the second column contains 
the end cordinate for the x-axis. The third column contains the y-axis 
co-ordinate. For example, the first row in the matrix above represents the 
points (1,5),(2,5), (3,5). How would I go about making a discontinuous graph ?

thanks!


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Discontinuous graph

2009-11-16 Thread Steve Lianoglou


Hi Tim,

On Nov 16, 2009, at 12:40 PM, Tim Smith wrote:


Hi,
I wanted to make a graph with the following table (2 rows, 3 columns):
a b c
x 1 3 5
y 5 8 6
The first column represents the start cordinate, and the second  
column contains the end cordinate for the x-axis. The third column  
contains the y-axis co-ordinate. For example, the first row in the  
matrix above represents the points (1,5),(2,5), (3,5). How would I  
go about making a discontinuous graph ?


What is it that you want to do with this graph? Or, how do you want  
represent it?


Do you just want to generate the sequence of points? I'm guessing not,  
but here's code to do that and stores into the edge.pairs matrix  
(first row is the x-values, 2nd row is the y-value of the same point)


data.matrix - matrix(c(1,3,5,5,8,6), nrow=2, byrow=T)
points - apply(data.matrix, 1, function(row) unlist(t(expand.grid(row 
[1]:row[2], row[3]

edge.pairs - do.call(cbind, points)

It should be pretty straightforward to convert edge.paris into an  
adjacency matrix, if you like. Also, if you're thinking about using R  
to work with graphs, I'd suggest checking out the igraph pacakge.


Hope that helps,

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Discontinuous graph



On Nov 16, 2009, at 12:40 PM, Tim Smith wrote:


Hi,
I wanted to make a graph with the following table (2 rows, 3 columns):
a b c
x 1 3 5
y 5 8 6
The first column represents the start cordinate, and the second  
column contains the end cordinate for the x-axis. The third column  
contains the y-axis co-ordinate. For example, the first row in the  
matrix above represents the points (1,5),(2,5), (3,5). How would I  
go about making a discontinuous graph ?


thanks!


coords - read.table(textConnection(a b c
 x 1 3 5
 y 5 8 6), header=TRUE)

 plot(NULL, NULL, xlim = c(min(coords$a)-.5, max(coords$b)+.5),  
ylim=c(min(coords$c)-.5, max(coords$c)+.5)  )
 apply(coords, 1, function(x) segments(x0=x[1],y0= x[3], x1= x[2],  
y1=x[3]) )


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error on reading an excel file



anna_l wrote:
 
 Hello everybody, here is the code I use to read an excel file containing
 two rows, one of date, the other of prices:
 library(RODBC)
   z - odbcConnectExcel(SPX_HistoricalData.xls)
   datas - sqlFetch(z,Sheet1)
   close(z)
 It works pretty well but the only thing is that the datas stop at row 7530
 and I don´t know why datas is a data frame that contains 7531 rows with
 the last two ones = NA...
 

I find this occurs sometimes when I export an Excel worksheet to CSV.  Excel
will include one or more rows of blank cells after the data stops.  I would
imagine the behavior you are seeing with RODBC is due to the same issue.

I don't know if there is anything that can be done about it other than to
trim your dataset back to the appropriate length once it gets into R.


Good luck!

-Charlie

-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://old.nabble.com/Error-on-reading-an-excel-file-tp26371750p26376554.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error on reading an excel file


Thanks Charlie, well yes it included one row with two NA datas. I guess there
is an explanation, let´s wait and see if someone knows more about it :)


cls59 wrote:
 
 
 anna_l wrote:
 
 Hello everybody, here is the code I use to read an excel file containing
 two rows, one of date, the other of prices:
 library(RODBC)
  z - odbcConnectExcel(SPX_HistoricalData.xls)
  datas - sqlFetch(z,Sheet1)
  close(z)
 It works pretty well but the only thing is that the datas stop at row
 7530 and I don´t know why datas is a data frame that contains 7531 rows
 with the last two ones = NA...
 
 
 I find this occurs sometimes when I export an Excel worksheet to CSV. 
 Excel will include one or more rows of blank cells after the data stops. 
 I would imagine the behavior you are seeing with RODBC is due to the same
 issue.
 
 I don't know if there is anything that can be done about it other than to
 trim your dataset back to the appropriate length once it gets into R.
 
 
 Good luck!
 
 -Charlie
 


-
Anna Lippel
new in R so be careful I should be asking a lt of questions!:teeth:
-- 
View this message in context: 
http://old.nabble.com/Error-on-reading-an-excel-file-tp26371750p26376656.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Discontinuous graph

2009-11-16 Thread Steve Lianoglou



On Nov 16, 2009, at 12:58 PM, David Winsemius wrote:


On Nov 16, 2009, at 12:40 PM, Tim Smith wrote:


Hi,
I wanted to make a graph with the following table (2 rows, 3  
columns):

a b c
x 1 3 5
y 5 8 6
The first column represents the start cordinate, and the second  
column contains the end cordinate for the x-axis. The third column  
contains the y-axis co-ordinate. For example, the first row in the  
matrix above represents the points (1,5),(2,5), (3,5). How would I  
go about making a discontinuous graph ?


thanks!


coords - read.table(textConnection(a b c
x 1 3 5
y 5 8 6), header=TRUE)

plot(NULL, NULL, xlim = c(min(coords$a)-.5, max(coords$b)+.5), ylim=c 
(min(coords$c)-.5, max(coords$c)+.5)  )
apply(coords, 1, function(x) segments(x0=x[1],y0= x[3], x1= x[2],  
y1=x[3]) )


Oh, *that* kind of graph!

... my high-school English teacher once said that all communication  
is miscommunication because we each interpret things according to our  
own experiences, etc ... I guess that goes to show:


  (i) me that he was right (once again);
  (ii) you what I've been working on lately :-)

Sorry for the line-noise,

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error on reading an excel file

2009-11-16 Thread Gabor Grothendieck

You could try one of the other methods of reading Excel files and see
if they are affected:

http://wiki.r-project.org/rwiki/doku.php?id=tips:data-io:ms_windows

On Mon, Nov 16, 2009 at 8:19 AM, anna_l lippelann...@hotmail.com wrote:

 Hello everybody, here is the code I use to read an excel file containing two
 rows, one of date, the other of prices:
 library(RODBC)
        z - odbcConnectExcel(SPX_HistoricalData.xls)
        datas - sqlFetch(z,Sheet1)
        close(z)
 It works pretty well but the only thing is that the datas stop at row 7530
 and I don´t know why datas is a data frame that contains 7531 rows with the
 last two ones = NA...

 -
 Anna Lippel
 new in R so be careful I should be asking a lt of questions!:teeth:
 --
 View this message in context: 
 http://old.nabble.com/Error-on-reading-an-excel-file-tp26371750p26371750.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error on reading an excel file



Gabor Grothendieck wrote:
 
 You could try one of the other methods of reading Excel files and see
 if they are affected:
 

I would guess that since Excel includes the blank rows when exporting to
CSV, then blank cells are being stored by Excel in the data files--
therefore any method of extracting data from those files will also pick up
the empty cells.

I think the crux of this issue lies with Excel and you will probably have to
look for a fix there.


-Charlie

-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://old.nabble.com/Error-on-reading-an-excel-file-tp26371750p26376915.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Sum over indexed value

2009-11-16 Thread Gunadi


I am sure this is easy but I am not finding a function to do this. 

I have two columns in a matrix. The first column contains multiple entries
of numbers from 1 to 100 (i.e. 10 ones, 8 twos etc.). The second column
contains unique numbers. I want to sum the numbers in column two based on
the indexed values in column one (e.g. sum of all values in column two
associated with the value 1 in column one). I would like two columns in
return - the indexed value in column one (i.e. this time no duplicates) and
the sum in column two. 

How do I do this? 
-- 
View this message in context: 
http://old.nabble.com/Sum-over-indexed-value-tp26376359p26376359.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] printing a single row, but dont know which row to print

2009-11-16 Thread frenchcr



I have 20 columns of data, and in column 5 I have a value of 17600 but I
dont know which row this value is in (i have over 300,000 rows).

I'm trying to do 2 things:

1) I want to find out which row in column 5 has this number in it.

2) Then I want to print out that row with all the column headers so i can
look at the other parameters in the row that are associated with this value.


How do i do it?


-- 
View this message in context: 
http://old.nabble.com/printing-a-single-row%2C-but-dont-know-which-row-to-print-tp26376647p26376647.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Discontinuous graph

Hi,

An alternative with ggplot2,

library(ggplot2)

ggplot(data=coords) +
  geom_segment(aes(x=a, xend=b, y=c, yend=c))


HTH,

baptiste

2009/11/16 David Winsemius dwinsem...@comcast.net:

 On Nov 16, 2009, at 12:40 PM, Tim Smith wrote:

 Hi,
 I wanted to make a graph with the following table (2 rows, 3 columns):
 a b c
 x 1 3 5
 y 5 8 6
 The first column represents the start cordinate, and the second column
 contains the end cordinate for the x-axis. The third column contains the
 y-axis co-ordinate. For example, the first row in the matrix above
 represents the points (1,5),(2,5), (3,5). How would I go about making a
 discontinuous graph ?

 thanks!

 coords - read.table(textConnection(a b c
  x 1 3 5
  y 5 8 6), header=TRUE)

  plot(NULL, NULL, xlim = c(min(coords$a)-.5, max(coords$b)+.5),
 ylim=c(min(coords$c)-.5, max(coords$c)+.5)  )
  apply(coords, 1, function(x) segments(x0=x[1],y0= x[3], x1= x[2], y1=x[3])
 )

 --

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sum over indexed value

2009-11-16 Thread smu



P=data.frame(x=c(1,1,2,3,2,1),y=rnorm(6))
tapply(P$y,P$x,sum)

regards,
 stefan


On Mon, Nov 16, 2009 at 09:49:17AM -0800, Gunadi wrote:
 
 I am sure this is easy but I am not finding a function to do this. 
 
 I have two columns in a matrix. The first column contains multiple entries
 of numbers from 1 to 100 (i.e. 10 ones, 8 twos etc.). The second column
 contains unique numbers. I want to sum the numbers in column two based on
 the indexed values in column one (e.g. sum of all values in column two
 associated with the value 1 in column one). I would like two columns in
 return - the indexed value in column one (i.e. this time no duplicates) and
 the sum in column two. 
 
 How do I do this? 
 -- 
 View this message in context: 
 http://old.nabble.com/Sum-over-indexed-value-tp26376359p26376359.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] printing a single row, but dont know which row to print

Hi,

Try this,

set.seed(2) # reproducible
d = matrix(sample(1:20,20), 4, 5)
d

d[ d[ ,2] == 18 , ]

You may need to test with all.equal if your values are subject to
rounding errors.

HTH,

baptiste

2009/11/16 frenchcr frenc...@btinternet.com:


 I have 20 columns of data, and in column 5 I have a value of 17600 but I
 dont know which row this value is in (i have over 300,000 rows).

 I'm trying to do 2 things:

 1) I want to find out which row in column 5 has this number in it.

 2) Then I want to print out that row with all the column headers so i can
 look at the other parameters in the row that are associated with this value.


 How do i do it?


 --
 View this message in context: 
 http://old.nabble.com/printing-a-single-row%2C-but-dont-know-which-row-to-print-tp26376647p26376647.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] No Visible Binding for global variable

2009-11-16 Thread Doran, Harold

While building a package, I see the following:

* checking R code for possible problems ... NOTE
cheat.fit: no visible binding for global variable 'Zobs'
plot.jml: no visible binding for global variable 'Var1'

I see the issue has come up before, but I'm having a hard time discerning how 
solutions applied elsewhere would apply here. The entire code for both 
functions is below, but the only place the variable Zobs appears in the 
function cheat.fit is:

cheaters - cbind(data.frame(cheaters), exactMatch)
names(cheaters)[1] - 'Zobs'
names(cheaters)[2] - 'Nexact'
cheaters$Zcrit - Zcrit
cheaters$Mean - means
cheaters$Var - vars
cheaters - subset(cheaters, Zobs = Zcrit)
result -  list(pairs = c(row.names(cheaters)), Ncheat = 
nrow(cheaters),
TotalCompare = totalCompare, alpha = alpha,
ExactMatch = cheaters$Nexact, Zobs = cheaters$Zobs, Zcrit 
= Zcrit,
Mean = cheaters$Mean, Variance = cheaters$Var, Probs = 
stuProbs)
result

and the only place Var1 appears in the plot function is here

prop.correct - subset(data.frame(prop.table(table(tmp[, i+1], tmp$Estimate), 
margin=2)), Var1 == 1)[, 2:3]

Many thanks,
Harold

 sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

cheat.fit - function(dat, key, wrongChoice, alpha = .01, rfa = c('nr', 'uni', 
'bsct'), bonf = c('yes','no'), con = 1e-12, lower = 0, upper = 50){
bonf - tolower(bonf)
bonf - match.arg(bonf)
rfa  - match.arg(rfa)
rfa  - tolower(rfa)
dat - t(dat)
correctStuMat - numeric(ncol(dat))
for(i in 1:ncol(dat)){
correctStuMat[i] - mean(key==dat[,i], na.rm= 
TRUE)
}

correctClsMat - numeric(length(key))
for(i in 1:length(key)){
correctClsMat[i] - mean(key[i]==dat[i,], 
na.rm= TRUE)
}

### this is here for cases if all students in a class
### did not answer the item
correctClsMat[is.na(correctClsMat)] - 0

pCorr - function(R,c,q){

numer - function(R,a,c,q){
result - sum((1-(1-R)^a)^(1/a), na.rm= 
TRUE)-c*q
result
}

denom - function(R,a,c,q){
result - sum(na.rm= TRUE, -((1 - (1 - 
R)^a)^(1/a) * (log((1 - (1 - R)^a)) * (1/a^2)) +
(1 - (1 - R)^a)^((1/a) - 1) * ((1/a) * ((1 - 
R)^a * log((1 - R))
result
}

aConst - function(R, c, q, con){
a - .5 # starting value for a
change - 1
while(abs(change)  con) {
r1 - numer(R,a,c,q)
r2 - denom(R,a,c,q)
change - r1/r2
a - a - change
}
a
}

bisect - function(R, c, q, lower, upper, con){
f - function(a) sum((1 - (1-R)^a)^(1/a)) - c * 
q
if(f(lower) * f(upper)  0)
stop(endpoints must have 
opposite signs)
while(abs(lower-upper)  con){
x = .5 * (lower+upper)
if(f(x) * f(lower) =0) lower = 
x
else upper = x
}
.5 * (lower+upper)
   }

if(rfa == 'nr'){
if(any(correctClsMat==1)) 
correctClsMat[correctClsMat==1]-. else correctClsMat
if(any(correctClsMat==0)) 
correctClsMat[correctClsMat==0]-.0001 else correctClsMat
a - aConst(R,c,q, con)
} else if(rfa == 'uni'){
f - function(R, a, c, q) sum((1 - 
(1-R)^a)^(1/a)) - c * q
a - uniroot(f, c(lower,upper), R = R, c = c, q 
= q)$root
} else if(rfa == 'bsct'){
a - bisect(R, c, q, lower = lower, upper = 
upper, con)
}

result - (1-(1-R)^a)^(1/a)
result
} # end pCorr function

Re: [R] printing a single row, but dont know which row to print



On Nov 16, 2009, at 1:38 PM, baptiste auguie wrote:


Hi,

Try this,

set.seed(2) # reproducible
d = matrix(sample(1:20,20), 4, 5)
d

d[ d[ ,2] == 18 , ]

You may need to test with all.equal if your values are subject to
rounding errors.

HTH,

baptiste

2009/11/16 frenchcr frenc...@btinternet.com:



I have 20 columns of data, and in column 5 I have a value of 17600  
but I

dont know which row this value is in (i have over 300,000 rows).

I'm trying to do 2 things:

1) I want to find out which row in column 5 has this number in it.


Using baptiste's setup:

 which(d[, 2]==18)
[1] 4



2) Then I want to print out that row with all the column headers so  
i can
look at the other parameters in the row that are associated with  
this value.



How do i do it?


--
View this message in context: 
http://old.nabble.com/printing-a-single-row%2C-but-dont-know-which-row-to-print-tp26376647p26376647.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] No Visible Binding for global variable

2009-11-16 Thread William Dunlap

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Doran, Harold
 Sent: Monday, November 16, 2009 10:45 AM
 To: r-help@r-project.org
 Subject: [R] No Visible Binding for global variable
 
 While building a package, I see the following:
 
 * checking R code for possible problems ... NOTE
 cheat.fit: no visible binding for global variable 'Zobs'
 plot.jml: no visible binding for global variable 'Var1'
 
 I see the issue has come up before, but I'm having a hard 
 time discerning how solutions applied elsewhere would apply 
 here. The entire code for both functions is below, but the 
 only place the variable Zobs appears in the function cheat.fit is:
 
 cheaters - cbind(data.frame(cheaters), exactMatch)
 names(cheaters)[1] - 'Zobs'
 names(cheaters)[2] - 'Nexact'
 cheaters$Zcrit - Zcrit
 cheaters$Mean - means
 cheaters$Var - vars
 cheaters - subset(cheaters, Zobs = Zcrit)

The code in the codetools package does not know
that subset() does not evaluate its second argument
in the standard way.  Hence it gives a false alarm
here.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  


 result -  list(pairs = 
 c(row.names(cheaters)), Ncheat = nrow(cheaters),
 TotalCompare = totalCompare, alpha = alpha,
 ExactMatch = cheaters$Nexact, Zobs = 
 cheaters$Zobs, Zcrit = Zcrit,
 Mean = cheaters$Mean, Variance = 
 cheaters$Var, Probs = stuProbs)
 result
 
 and the only place Var1 appears in the plot function is here
 
 prop.correct - subset(data.frame(prop.table(table(tmp[, 
 i+1], tmp$Estimate), margin=2)), Var1 == 1)[, 2:3]
 
 Many thanks,
 Harold
 
  sessionInfo()
 R version 2.10.0 (2009-10-26)
 i386-pc-mingw32
 
 locale:
 [1] LC_COLLATE=English_United States.1252
 [2] LC_CTYPE=English_United States.1252
 [3] LC_MONETARY=English_United States.1252
 [4] LC_NUMERIC=C
 [5] LC_TIME=English_United States.1252
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base
 
 cheat.fit - function(dat, key, wrongChoice, alpha = .01, rfa 
 = c('nr', 'uni', 'bsct'), bonf = c('yes','no'), con = 1e-12, 
 lower = 0, upper = 50){
 bonf - tolower(bonf)
 bonf - match.arg(bonf)
 rfa  - match.arg(rfa)
 rfa  - tolower(rfa)
 dat - t(dat)
 correctStuMat - numeric(ncol(dat))
 for(i in 1:ncol(dat)){
 correctStuMat[i] - 
 mean(key==dat[,i], na.rm= TRUE)
 }
 
 correctClsMat - numeric(length(key))
 for(i in 1:length(key)){
 correctClsMat[i] - 
 mean(key[i]==dat[i,], na.rm= TRUE)
 }
 
 ### this is here for cases if all students in a class
 ### did not answer the item
 correctClsMat[is.na(correctClsMat)] - 0
 
 pCorr - function(R,c,q){
 
 numer - function(R,a,c,q){
 result - 
 sum((1-(1-R)^a)^(1/a), na.rm= TRUE)-c*q
 result
 }
 
 denom - function(R,a,c,q){
 result - sum(na.rm= TRUE, 
 -((1 - (1 - R)^a)^(1/a) * (log((1 - (1 - R)^a)) * (1/a^2)) +
 (1 - (1 - R)^a)^((1/a) - 1) * 
 ((1/a) * ((1 - R)^a * log((1 - R))
 result
 }
 
 aConst - function(R, c, q, con){
 a - .5 # starting value for a
 change - 1
 while(abs(change)  con) {
 r1 - numer(R,a,c,q)
 r2 - denom(R,a,c,q)
 change - r1/r2
 a - a - change
 }
 a
 }
 
 bisect - function(R, c, q, lower, upper, con){
 f - function(a) sum((1 - 
 (1-R)^a)^(1/a)) - c * q
 if(f(lower) * f(upper)  0)
 
 stop(endpoints must have opposite signs)
 while(abs(lower-upper)  con){
 x = .5 * (lower+upper)
 if(f(x) * 
 f(lower) =0) lower = x
 else upper = x
 }
 .5 * (lower+upper)
}
 
 if(rfa == 'nr'){
 if(any(correctClsMat==1)) 
 correctClsMat[correctClsMat==1]-. else correctClsMat

Re: [R] extracting estimated covariance parameters from lme fit

2009-11-16 Thread Kingsford Jones

The VarCorr function will extract the components of the random effects
covariance matrix, but note the quirk that it returns values as
characters:

library(nlme)
f1 - lme(distance ~ age, data = Orthodont, random = ~1 + age|Subject)
(vc - VarCorr(f1))
# Subject = pdLogChol(1 + age)
# Variance   StdDevCorr
# (Intercept) 5.41508724 2.3270340 (Intr)
# age 0.05126955 0.2264278 -0.609
# Residual1.71620401 1.3100397

str(vc)
#  'VarCorr.lme' chr [1:3, 1:3] 5.41508724 0.05126955 1.71620401 ...
# - attr(*, dimnames)=List of 2
#   ..$ : chr [1:3] (Intercept) age Residual
#   ..$ : chr [1:3] Variance StdDev Corr
#  - attr(*, title)= chr Subject = pdLogChol(1 + age)

(sigma2.age - as.numeric(vc[2, 1]))
# [1] 0.05126955


hth,

Kingsford Jones

On Mon, Nov 16, 2009 at 9:25 AM, Green, Gerwyn (greeng6)
g.gre...@lancaster.ac.uk wrote:
 Dear all

 Apologies in advance as this seems like a trivial question. Nonetheless,
 a question I haven't been able to resolve myself !. Within a single
 repetition of a simulation (to be repeated many times) I am fitting the
 following linear mixed model using lme...

 Y_{gtr} = \mu + U_{g} +  W_{gt} + Z_{gtr}

 U_{g} ~ N(0,\gamma^{2}), W_{gt} ~ N(0,\kappa^{2}), Z_{gtr} ~
 N(0,\tau^{2})

 g = 1,...,G
 t = 1,...,T
 r= 1,...,R


 ...by doing

 Model.fit - lme(Y ~ 1, data=data, random= ~1|gene/treatment)

 I would like to be able to extract the estimated covariance parameters
 contained within the lme object. I know if I type...

 Model.fit$sigma


 ...then I get the estimated residual variance, i.e. within the context
 of the above model, the estimate for \tau. But I would also like to
 extract the estimates for \gamma and \kappa by doing
 Model.fit$something. I am aware that I can view the output using the
 extractor function summary, but within a single repetition of my
 simulation routine I want to be able to code something like

 gamma - Model.fit$.
 kappa - Model.fit$.


 and then plug `gamma' and `kappa' into some formulae. This process of
 fitting and extracting will be repeated many times, which is why I wish
 to automate everything.

 Again, any help would be greatly appreciated

 Best

 Gerwyn Green
 School of Health and Medicine
 Lancaster University






 Any help would be greatly appreciated

 Best


 Gerwyn Green
 School of Health and Medicine
 Lancaster Uinversity

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sum over indexed value

2009-11-16 Thread Henrique Dallazuanna

Try this:

with(DF, rowsum(Col2, Col1))

On Mon, Nov 16, 2009 at 3:49 PM, Gunadi boydkra...@gmail.com wrote:

 I am sure this is easy but I am not finding a function to do this.

 I have two columns in a matrix. The first column contains multiple entries
 of numbers from 1 to 100 (i.e. 10 ones, 8 twos etc.). The second column
 contains unique numbers. I want to sum the numbers in column two based on
 the indexed values in column one (e.g. sum of all values in column two
 associated with the value 1 in column one). I would like two columns in
 return - the indexed value in column one (i.e. this time no duplicates) and
 the sum in column two.

 How do I do this?
 --
 View this message in context: 
 http://old.nabble.com/Sum-over-indexed-value-tp26376359p26376359.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] in excel i can sort my dataset, what do i use in R

2009-11-16 Thread frenchcr




In excel a handy tool is the sort data by column ...i.e. i can highlight the
whole dataset and sort it according to a particular column...like sort the
data in a column in acending or decending order where all the other columns
change aswell.

I need to do this in R now but dont know how.

...heres an example...

Say I have dataset...

  Header 1Header 2Header 3
1 3  Working 12 
2 4  Off1
3 5  Works2
4 2  Works13
5 4  Off5


...and i want to sort the data by putting the values in the third column in
acending order, like this...

 Header 1Header 2Header 3

1 4  Off1
2 5  Works2
3 4  Off5
4 3  Working 12
5 2  Works13

...although im sorting column three in acending order all the rows shuffle
so that the parameters in each row stay aligned.



How do i do this in R? 
-- 
View this message in context: 
http://old.nabble.com/in-excel-i-can-sort-my-dataset%2C-what-do-i-use-in-R-tp26377540p26377540.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] in excel i can sort my dataset, what do i use in R

?order
?sort

2009/11/16 frenchcr frenc...@btinternet.com:

In excel a handy tool is the sort data by column ...i.e. i can highlight the
whole dataset and sort it according to a particular column...like sort the
data in a column in acending or decending order where all the other columns
change aswell.

I need to do this in R now but dont know how.

...heres an example...

Say I have dataset...

Header 1 Header 2 Header 3
1 3 Working 12
2 4 Off 1
3 5 Works 2
4 2 Works 13
5 4 Off 5

...and i want to sort the data by putting the values in the third column in
acending order, like this...

Header 1 Header 2 Header 3

1 4 Off 1
2 5 Works 2
3 4 Off 5
4 3 Working 12
5 2 Works 13

...although im sorting column three in acending order all the rows shuffle
so that the parameters in each row stay aligned.

How do i do this in R?
--
View this message in context:
http://old.nabble.com/in-excel-i-can-sort-my-dataset%2C-what-do-i-use-in-R-tp26377540p26377540.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data source name not found and no default driver specified

2009-11-16 Thread helpme

I forgot to mention that it's running Windows Server 2003 x64 OS version

On Mon, Nov 16, 2009 at 11:22 AM, helpme myrquesti...@gmail.com wrote:

 I'm stumped. When trying to connect to Oracle using the RODBC package I get
 an error:
 *[RODBC] Data source name not found and no default driver specified.
 ODBC connect failed.*

 I've read over all the posts and documentation manuals.
 The system is Windows Server 2003 with R 2.81. and the latest downloadable
 RODBC package. The Oracle SID/DSN is mfopdw. I made sure to add it to
 Control Panel-Administrative Priviledges-Microsoft ODBC system/user DNS.

 I've also tried the following in no particular order:

 1.) Turn on all oracle services in control panel-administrative
 priviledges.
 2.) Checked tsnnames.ora for SID.
 3.) Add microsoft ODBC service to Control Panel services for SID
 4.) Use Sqldeveler to test connection another way besides R (It was
 successful)
 5.) channel-odbcDriverConnect(connection=Driver={Microsoft ODBC for
 Oracle}; DSN=abc,UID=abc;PWD=abc;case=oracle)

 received error drivers SQLAllocHandle on SQL_HANDLE_ENV failed one time;
 another time I got the error that Oracle client and networking components
 7.3 or greater is not found.

 6.) tnsping mfopdw

 lsnrctl start mfopdw

 tried to add oracle/bin to path

 Nothing is working.


 Please advise.

 Thank you,


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] fitting a logistic regression with mixed type of variables

2009-11-16 Thread Jack Luo

Hi,

I am trying to fit a logistic regression using glm, but my explanatory
variables are of mixed type: some are numeric, some are ordinal, some are
categorical, say

If x1 is numeric, x2 is ordinal, x3 is categorical, is the following formula
OK?

*model - glm(y~x1+x2+x3, family=binomial(link=logit), na.action=na.pass)*

*
*

*Thanks,*

*
*

*-Jack*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] object not found inside step() function

2009-11-16 Thread shuai yuan

Hi, there,

My appologize if someone ask the same question before. I searched the
mailing list and found one similar post, but not what i want.

The problem for me is, I use the step( glm()) to do naive forward
selection for logistic regression.  My code is functional
in the open environment. But if I wrap it up as a function, then R keeps
saying object 'a' not found. Actually,  data frame
a is inside the function.

I did some search online. i guess the reason may be R did not keep the data
in glm() output after building the model but not sure.
Can anyone please tell me how to work around this problem?

Thanks a lot in advance.

I am using R 2.9.0. Here is the sample code:

#
naivelr-function(x,y){
:
:
:

a-data.frame(x)

form-paste(y~1+,paste(grep(X.*,names(a),value=T),collapse=+),sep=)

if(is.null(force.in)!=T){
lowmo-paste(y~1+,paste(grep(X.*,names(a)[force.in],value=T),collapse=+),sep=)

} else
{lowmo-y~1}

lower1-glm(lowmo,family=binomial,data=data.frame(a,y))
upper1-glm(form,family=binomial,data=data.frame(a,y))


stepout-step(lower1,scope=list(lower=lower1,upper=upper1),direction=forward,k=0,trace=100)
# here is the error:Start:
#AIC=689.62
#y ~ 1
#Error in data.frame(a, y) : object 'a' not found---but a is there!
:
:
:
}




 Sincerely

samer yuan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fitting a logistic regression with mixed type of variables



On Nov 16, 2009, at 2:22 PM, Jack Luo wrote:


Hi,

I am trying to fit a logistic regression using glm, but my explanatory
variables are of mixed type: some are numeric, some are ordinal,  
some are

categorical, say

If x1 is numeric, x2 is ordinal, x3 is categorical, is the following  
formula

OK?


The formula's certainly OK. What may be non-OK will be your  
understanding of the output. The default handling of ordinal factors  
is a common source of questions to R-help, so read up first.


*model - glm(y~x1+x2+x3, family=binomial(link=logit),  
na.action=na.pass)*


Why have you chosen that na.action option?

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pairs

2009-11-16 Thread cindy Guo

I forgot to say that there are no ties in each row. So any number can occur
only once in each row. Also as I mentioned earlier, actually I only need the
top 50 most frequent pairs, is there a more efficient way to do it? Because
I have 15000 numbers, output of all the pairs would be too long.

Thank you,

Cindy

On Mon, Nov 16, 2009 at 7:02 AM, David Winsemius dwinsem...@comcast.netwrote:

 I stuck in another 7 in one of the lines with a 2 and reasoned that we
 could deal with the desire for non-ordered pair counting by pasting
 min(x,y) to max(x,y);

  dput(prmtx)
 structure(c(2, 1, 3, 9, 5, 7, 7, 8, 1, 7, 6, 5, 6, 2, 2, 7), .Dim = c(4L,
 4L))
  prmtx
 [,1] [,2] [,3] [,4]
 [1,]2516
 [2,]1772
 [3,]3762
 [4,]9857

  pair.str - sapply(1:nrow(prmtx), function(z)  apply(combn(prmtx[z,], 2),
 2,function(x) paste(min(x[2],x[1]), max(x[2],x[1]), sep=.)))

 The logic:
 sapply(1:nrow(prmtx), ... just loops over the rows of the matrix.
 combn(prmtx[z,], 2)  ... returns a two row matrix of combination in a
 single row.
 apply(combn(prmtx[z,], 2), 2 ... since combn( , 2)  returns a matrix that
 has two _rows_ I needed to loop over the columns.
 paste(min(x[2],x[1]), max(x[2],x[1]), sep=.) ... stick the minimum of a
 pair in front of the max and separates them with a period to prevent two+
 digits from being non-unique

 Then using table() and logical tests in an index for the desired multiple
 pairs:


  tpair -table(pair.str)
  tpair
 pair.str
 1.2 1.5 1.6 1.7 2.3 2.5 2.6 2.7 3.6 3.7 5.6 5.7 5.8 5.9 6.7 7.7 7.8 7.9 8.9
  2   1   1   2   1   1   2   3   1   1   1   1   1   1   1   1   1   1   1

  tpair[tpair1]
 pair.str
 1.2 1.7 2.6 2.7
  2   2   2   3

 --
 David.


 On Nov 16, 2009, at 7:02 AM, David Winsemius wrote:

 I'm not convinced it's right. In fact, I'm pretty sure the last step taking
 only the first half of the list is wrong. I also do not know if you have
 considered how you want to count situations like:

 3 2 7 4 5 7 ...
 7 3 8 6 1 2 9 2 ..

 How many pairs of 2-7/7-2 would that represent?

 --
 David
 On Nov 15, 2009, at 11:06 PM, cindy Guo wrote:

 Hi, David,

 The matrix has 20 columns.
 Thank you very much for your help. I think it's right, but it seems I
 need some time to figure it out. I am a green hand. There are so many
 functions here I never used before. :)

 Cindy

 On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius dwinsem...@comcast.net
 wrote:
 Assuming that the number of columns is 4, then consider this approach:

  prs -scan()
 1: 2 5 1 6
 5: 1 7 8 2
 9: 3 7 6 2
 13: 9 8 5 7
 17:
 Read 16 items
 prmtx - matrix(prs, 4,4, byrow=T)

 #Now make copus of x.y and y.x

 pair.str - sapply(1:nrow(prmtx), function(z) c(apply(combn(prmtx[z,],
 2), 2,function(x) paste(x[1],x[2], sep=.)) , apply(combn(prmtx[z,], 2),
 2,function(x) paste(x[2],x[1], sep=.))) )
 tpair -table(pair.str)

 # This then gives you a duplicated list
  tpair[tpair1]
 pair.str
 1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
 2   2   2   2   2   2   2   2

 # So only take the first half of the pairs:
  head(tpair[tpair1], sum(tpair1)/2)

 pair.str
 1.2 2.1 2.6 2.7
 2   2   2   2

 --
 David.



 On Nov 15, 2009, at 8:06 PM, David Winsemius wrote:

 I could of course be wrong but have you yet specified the number of
 columns for this pairing exercise?

 On Nov 15, 2009, at 5:26 PM, cindy Guo wrote:

 Hi, All,

 I have an n by m matrix with each entry between 1 and 15000. I want to
 know
 the frequency of each pair in 1:15000 that occur together in rows. So for
 example, if the matrix is
 2 5 1 6
 1 7 8 2
 3 7 6 2
 9 8 5 7
 Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to return
 the value 2 for this pair as well as that for all pairs. Is there a fast
 way
 to do this avoiding loops? Loops take too long.

 and provide commented, minimal, self-contained, reproducible code.
^^

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list

Re: [R] No Visible Binding for global variable

2009-11-16 Thread Duncan Murdoch


On 11/16/2009 1:54 PM, William Dunlap wrote:

-Original Message-
From: r-help-boun...@r-project.org 
[mailto:r-help-boun...@r-project.org] On Behalf Of Doran, Harold

Sent: Monday, November 16, 2009 10:45 AM
To: r-help@r-project.org
Subject: [R] No Visible Binding for global variable

While building a package, I see the following:

* checking R code for possible problems ... NOTE
cheat.fit: no visible binding for global variable 'Zobs'
plot.jml: no visible binding for global variable 'Var1'

I see the issue has come up before, but I'm having a hard 
time discerning how solutions applied elsewhere would apply 
here. The entire code for both functions is below, but the 
only place the variable Zobs appears in the function cheat.fit is:


cheaters - cbind(data.frame(cheaters), exactMatch)
names(cheaters)[1] - 'Zobs'
names(cheaters)[2] - 'Nexact'
cheaters$Zcrit - Zcrit
cheaters$Mean - means
cheaters$Var - vars
cheaters - subset(cheaters, Zobs = Zcrit)


The code in the codetools package does not know
that subset() does not evaluate its second argument
in the standard way.  Hence it gives a false alarm
here.


Right.  And if you want to keep it quiet, something like

Zobs - NULL # to satisfy codetools

near the start of the function should work.

Duncan Murdoch



Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  



result -  list(pairs = 
c(row.names(cheaters)), Ncheat = nrow(cheaters),

TotalCompare = totalCompare, alpha = alpha,
ExactMatch = cheaters$Nexact, Zobs = 
cheaters$Zobs, Zcrit = Zcrit,
Mean = cheaters$Mean, Variance = 
cheaters$Var, Probs = stuProbs)

result

and the only place Var1 appears in the plot function is here

prop.correct - subset(data.frame(prop.table(table(tmp[, 
i+1], tmp$Estimate), margin=2)), Var1 == 1)[, 2:3]


Many thanks,
Harold

 sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

cheat.fit - function(dat, key, wrongChoice, alpha = .01, rfa 
= c('nr', 'uni', 'bsct'), bonf = c('yes','no'), con = 1e-12, 
lower = 0, upper = 50){

bonf - tolower(bonf)
bonf - match.arg(bonf)
rfa  - match.arg(rfa)
rfa  - tolower(rfa)
dat - t(dat)
correctStuMat - numeric(ncol(dat))
for(i in 1:ncol(dat)){
correctStuMat[i] - 
mean(key==dat[,i], na.rm= TRUE)

}

correctClsMat - numeric(length(key))
for(i in 1:length(key)){
correctClsMat[i] - 
mean(key[i]==dat[i,], na.rm= TRUE)

}

### this is here for cases if all students in a class
### did not answer the item
correctClsMat[is.na(correctClsMat)] - 0

pCorr - function(R,c,q){

numer - function(R,a,c,q){
result - 
sum((1-(1-R)^a)^(1/a), na.rm= TRUE)-c*q

result
}

denom - function(R,a,c,q){
result - sum(na.rm= TRUE, 
-((1 - (1 - R)^a)^(1/a) * (log((1 - (1 - R)^a)) * (1/a^2)) +
(1 - (1 - R)^a)^((1/a) - 1) * 
((1/a) * ((1 - R)^a * log((1 - R))

result
}

aConst - function(R, c, q, con){
a - .5 # starting value for a
change - 1
while(abs(change)  con) {
r1 - numer(R,a,c,q)
r2 - denom(R,a,c,q)
change - r1/r2
a - a - change
}
a
}

bisect - function(R, c, q, lower, upper, con){
f - function(a) sum((1 - 
(1-R)^a)^(1/a)) - c * q

if(f(lower) * f(upper)  0)

stop(endpoints must have opposite signs)

while(abs(lower-upper)  con){
x = .5 * (lower+upper)
if(f(x) * 
f(lower) =0) lower = x

else upper = x
}
.5 * (lower+upper)
   }

if(rfa == 'nr'){
if(any(correctClsMat==1))

Re: [R] pairs


On Nov 16, 2009, at 2:32 PM, cindy Guo wrote:

 I forgot to say that there are no ties in each row. So any number  
 can occur only once in each row. Also as I mentioned earlier,  
 actually I only need the top 50 most frequent pairs, is there a more  
 efficient way to do it? Because I have 15000 numbers, output of all  
 the pairs would be too long.

?order



 Thank you,

 Cindy

 On Mon, Nov 16, 2009 at 7:02 AM, David Winsemius dwinsem...@comcast.net 
  wrote:
 I stuck in another 7 in one of the lines with a 2 and reasoned  
 that we could deal with the desire for non-ordered pair counting  
 by pasting min(x,y) to max(x,y);

  dput(prmtx)
 structure(c(2, 1, 3, 9, 5, 7, 7, 8, 1, 7, 6, 5, 6, 2, 2, 7), .Dim =  
 c(4L,
 4L))
  prmtx
 [,1] [,2] [,3] [,4]
 [1,]2516
 [2,]1772
 [3,]3762
 [4,]9857

  pair.str - sapply(1:nrow(prmtx), function(z)   
 apply(combn(prmtx[z,], 2), 2,function(x) paste(min(x[2],x[1]),  
 max(x[2],x[1]), sep=.)))

 The logic:
 sapply(1:nrow(prmtx), ... just loops over the rows of the matrix.
 combn(prmtx[z,], 2)  ... returns a two row matrix of combination in  
 a single row.
 apply(combn(prmtx[z,], 2), 2 ... since combn( , 2)  returns a matrix  
 that has two _rows_ I needed to loop over the columns.
 paste(min(x[2],x[1]), max(x[2],x[1]), sep=.) ... stick the minimum  
 of a pair in front of the max and separates them with a period to  
 prevent two+ digits from being non-unique

 Then using table() and logical tests in an index for the desired  
 multiple pairs:


  tpair -table(pair.str)
  tpair
 pair.str
 1.2 1.5 1.6 1.7 2.3 2.5 2.6 2.7 3.6 3.7 5.6 5.7 5.8 5.9 6.7 7.7 7.8  
 7.9 8.9
  2   1   1   2   1   1   2   3   1   1   1   1   1   1   1   1   1
 1   1

  tpair[tpair1]
 pair.str
 1.2 1.7 2.6 2.7
  2   2   2   3

 -- 
 David.


 On Nov 16, 2009, at 7:02 AM, David Winsemius wrote:

 I'm not convinced it's right. In fact, I'm pretty sure the last step  
 taking only the first half of the list is wrong. I also do not know  
 if you have considered how you want to count situations like:

 3 2 7 4 5 7 ...
 7 3 8 6 1 2 9 2 ..

 How many pairs of 2-7/7-2 would that represent?

 -- 
 David
 On Nov 15, 2009, at 11:06 PM, cindy Guo wrote:

 Hi, David,

 The matrix has 20 columns.
 Thank you very much for your help. I think it's right, but it seems  
 I need some time to figure it out. I am a green hand. There are so  
 many functions here I never used before. :)

 Cindy

 On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius dwinsem...@comcast.net 
  wrote:
 Assuming that the number of columns is 4, then consider this approach:

  prs -scan()
 1: 2 5 1 6
 5: 1 7 8 2
 9: 3 7 6 2
 13: 9 8 5 7
 17:
 Read 16 items
 prmtx - matrix(prs, 4,4, byrow=T)

 #Now make copus of x.y and y.x

 pair.str - sapply(1:nrow(prmtx), function(z)  
 c(apply(combn(prmtx[z,], 2), 2,function(x) paste(x[1],x[2],  
 sep=.)) , apply(combn(prmtx[z,], 2), 2,function(x)  
 paste(x[2],x[1], sep=.))) )
 tpair -table(pair.str)

 # This then gives you a duplicated list
  tpair[tpair1]
 pair.str
 1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
 2   2   2   2   2   2   2   2

 # So only take the first half of the pairs:
  head(tpair[tpair1], sum(tpair1)/2)

 pair.str
 1.2 2.1 2.6 2.7
 2   2   2   2

 -- 
 David.



 On Nov 15, 2009, at 8:06 PM, David Winsemius wrote:

 I could of course be wrong but have you yet specified the number of  
 columns for this pairing exercise?

 On Nov 15, 2009, at 5:26 PM, cindy Guo wrote:

 Hi, All,

 I have an n by m matrix with each entry between 1 and 15000. I want  
 to know
 the frequency of each pair in 1:15000 that occur together in rows.  
 So for
 example, if the matrix is
 2 5 1 6
 1 7 8 2
 3 7 6 2
 9 8 5 7
 Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to  
 return
 the value 2 for this pair as well as that for all pairs. Is there a  
 fast way
 to do this avoiding loops? Loops take too long.

 and provide commented, minimal, self-contained, reproducible code.
^^

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



David Winsemius, MD
Heritage Laboratories
West Hartford, CT


[[alternative HTML

Re: [R] Sum over indexed value



Gunadi wrote:
 
 I am sure this is easy but I am not finding a function to do this. 
 
 I have two columns in a matrix. The first column contains multiple entries
 of numbers from 1 to 100 (i.e. 10 ones, 8 twos etc.). The second column
 contains unique numbers. I want to sum the numbers in column two based on
 the indexed values in column one (e.g. sum of all values in column two
 associated with the value 1 in column one). I would like two columns in
 return - the indexed value in column one (i.e. this time no duplicates)
 and the sum in column two. 
 
 How do I do this? 
 


Supposing you had the data:

  tstData - data.frame( index = c(1,2,1,1,3,2), 
value = c( 0, 4, 0, 0, 7, 4 ) )

You could use the by() function to divide the data.frame and sum the value
column:

  sums - by( tstData, tstData[['index']], function( slice ){

return( sum( slice[['value']] ) )

  })

However, by() tends to do a poor job of cleanly expressing which values of
'index' generated the sums.  I would recomend the __ply() functions in
Hadley Wickham's plyr package.  Specifically ddply():

  require( plyr )

  sums - ddply( tstData, 'index', function( slice ){

return(
  data.frame( sum = sum( slice[['value']] ) )
)
  })

  sums
   index sum
  1 1   0
  2 2   8
  3 3   7


Hope this helps!

-Charlie

-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://old.nabble.com/Sum-over-indexed-value-tp26376359p26378112.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pairs

2009-11-16 Thread cindy Guo

Do you mean if the numbers in each row are ordered? They are not, but if
it's needed, we can order them. The matrix only has 5000 rows.

On Mon, Nov 16, 2009 at 1:34 PM, David Winsemius dwinsem...@comcast.netwrote:


  On Nov 16, 2009, at 2:32 PM, cindy Guo wrote:

  I forgot to say that there are no ties in each row. So any number can
 occur only once in each row. Also as I mentioned earlier, actually I only
 need the top 50 most frequent pairs, is there a more efficient way to do it?
 Because I have 15000 numbers, output of all the pairs would be too long.


 ?order



 Thank you,

 Cindy

 On Mon, Nov 16, 2009 at 7:02 AM, David Winsemius 
 dwinsem...@comcast.netwrote:

 I stuck in another 7 in one of the lines with a 2 and reasoned that we
 could deal with the desire for non-ordered pair counting by pasting
 min(x,y) to max(x,y);

  dput(prmtx)
 structure(c(2, 1, 3, 9, 5, 7, 7, 8, 1, 7, 6, 5, 6, 2, 2, 7), .Dim = c(4L,
 4L))
  prmtx
 [,1] [,2] [,3] [,4]
 [1,]2516
 [2,]1772
 [3,]3762
 [4,]9857

  pair.str - sapply(1:nrow(prmtx), function(z)  apply(combn(prmtx[z,],
 2), 2,function(x) paste(min(x[2],x[1]), max(x[2],x[1]), sep=.)))

 The logic:
 sapply(1:nrow(prmtx), ... just loops over the rows of the matrix.
 combn(prmtx[z,], 2)  ... returns a two row matrix of combination in a
 single row.
 apply(combn(prmtx[z,], 2), 2 ... since combn( , 2)  returns a matrix that
 has two _rows_ I needed to loop over the columns.
 paste(min(x[2],x[1]), max(x[2],x[1]), sep=.) ... stick the minimum of a
 pair in front of the max and separates them with a period to prevent two+
 digits from being non-unique

 Then using table() and logical tests in an index for the desired multiple
 pairs:


  tpair -table(pair.str)
  tpair
 pair.str
 1.2 1.5 1.6 1.7 2.3 2.5 2.6 2.7 3.6 3.7 5.6 5.7 5.8 5.9 6.7 7.7 7.8 7.9
 8.9
  2   1   1   2   1   1   2   3   1   1   1   1   1   1   1   1   1   1   1


  tpair[tpair1]
 pair.str
 1.2 1.7 2.6 2.7
  2   2   2   3

 --
 David.


 On Nov 16, 2009, at 7:02 AM, David Winsemius wrote:

 I'm not convinced it's right. In fact, I'm pretty sure the last step
 taking only the first half of the list is wrong. I also do not know if you
 have considered how you want to count situations like:

 3 2 7 4 5 7 ...
 7 3 8 6 1 2 9 2 ..

 How many pairs of 2-7/7-2 would that represent?

 --
 David
 On Nov 15, 2009, at 11:06 PM, cindy Guo wrote:

 Hi, David,

 The matrix has 20 columns.
 Thank you very much for your help. I think it's right, but it seems I
 need some time to figure it out. I am a green hand. There are so many
 functions here I never used before. :)

 Cindy

 On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius 
 dwinsem...@comcast.net wrote:
 Assuming that the number of columns is 4, then consider this approach:

  prs -scan()
 1: 2 5 1 6
 5: 1 7 8 2
 9: 3 7 6 2
 13: 9 8 5 7
 17:
 Read 16 items
 prmtx - matrix(prs, 4,4, byrow=T)

 #Now make copus of x.y and y.x

 pair.str - sapply(1:nrow(prmtx), function(z) c(apply(combn(prmtx[z,],
 2), 2,function(x) paste(x[1],x[2], sep=.)) , apply(combn(prmtx[z,], 2),
 2,function(x) paste(x[2],x[1], sep=.))) )
 tpair -table(pair.str)

 # This then gives you a duplicated list
  tpair[tpair1]
 pair.str
 1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
 2   2   2   2   2   2   2   2

 # So only take the first half of the pairs:
  head(tpair[tpair1], sum(tpair1)/2)

 pair.str
 1.2 2.1 2.6 2.7
 2   2   2   2

 --
 David.



 On Nov 15, 2009, at 8:06 PM, David Winsemius wrote:

 I could of course be wrong but have you yet specified the number of
 columns for this pairing exercise?

 On Nov 15, 2009, at 5:26 PM, cindy Guo wrote:

 Hi, All,

 I have an n by m matrix with each entry between 1 and 15000. I want to
 know
 the frequency of each pair in 1:15000 that occur together in rows. So
 for
 example, if the matrix is
 2 5 1 6
 1 7 8 2
 3 7 6 2
 9 8 5 7
 Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to
 return
 the value 2 for this pair as well as that for all pairs. Is there a fast
 way
 to do this avoiding loops? Loops take too long.

 and provide commented, minimal, self-contained, reproducible code.
^^

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide

[R] extracting the last row of each group in a data frame

2009-11-16 Thread Hao Cen

Hi,

I would like to extract the last row of each group in a data frame.

The data frame is as follows

Name Value
A 1
A 2
A 3
B 4
B 8
C 2
D 3

I would like to get a data frame as
Name Value
A 3
B 8
C 2
D 3

Thank you for your suggestions in advance

Jeff

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pairs



David Winsemius wrote:
 
 ?order
 

cindy Guo wrote:
 
 Do you mean if the numbers in each row are ordered? They are not, but if
 it's needed, we can order them. The matrix only has 5000 rows.
 

No, he's suggesting you check out the order() function by calling it's help
page:

  ?order

order() will sort your results into ascending or descending order.  You
could then pick off the top 50 by using head().

Hope that helps!

-Charlie


-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://old.nabble.com/pairs-tp26364801p26378236.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pairs


On Nov 16, 2009, at 2:41 PM, cindy Guo wrote:

 Do you mean if the numbers in each row are ordered? They are not,  
 but if it's needed, we can order them. The matrix only has 5000 rows.


No, I mean type ?order at the R command line and read the help page.

 On Mon, Nov 16, 2009 at 1:34 PM, David Winsemius dwinsem...@comcast.net 
  wrote:

 On Nov 16, 2009, at 2:32 PM, cindy Guo wrote:

 I forgot to say that there are no ties in each row. So any number  
 can occur only once in each row. Also as I mentioned earlier,  
 actually I only need the top 50 most frequent pairs, is there a  
 more efficient way to do it? Because I have 15000 numbers, output  
 of all the pairs would be too long.

 ?order



 Thank you,

 Cindy

 On Mon, Nov 16, 2009 at 7:02 AM, David Winsemius dwinsem...@comcast.net 
  wrote:
 I stuck in another 7 in one of the lines with a 2 and reasoned  
 that we could deal with the desire for non-ordered pair counting  
 by pasting min(x,y) to max(x,y);

  dput(prmtx)
 structure(c(2, 1, 3, 9, 5, 7, 7, 8, 1, 7, 6, 5, 6, 2, 2, 7), .Dim =  
 c(4L,
 4L))
  prmtx
 [,1] [,2] [,3] [,4]
 [1,]2516
 [2,]1772
 [3,]3762
 [4,]9857

  pair.str - sapply(1:nrow(prmtx), function(z)   
 apply(combn(prmtx[z,], 2), 2,function(x) paste(min(x[2],x[1]),  
 max(x[2],x[1]), sep=.)))

 The logic:
 sapply(1:nrow(prmtx), ... just loops over the rows of the matrix.
 combn(prmtx[z,], 2)  ... returns a two row matrix of combination in  
 a single row.
 apply(combn(prmtx[z,], 2), 2 ... since combn( , 2)  returns a  
 matrix that has two _rows_ I needed to loop over the columns.
 paste(min(x[2],x[1]), max(x[2],x[1]), sep=.) ... stick the  
 minimum of a pair in front of the max and separates them with a  
 period to prevent two+ digits from being non-unique

 Then using table() and logical tests in an index for the desired  
 multiple pairs:


  tpair -table(pair.str)
  tpair
 pair.str
 1.2 1.5 1.6 1.7 2.3 2.5 2.6 2.7 3.6 3.7 5.6 5.7 5.8 5.9 6.7 7.7 7.8  
 7.9 8.9
  2   1   1   2   1   1   2   3   1   1   1   1   1   1   1   1
 1   1   1

  tpair[tpair1]
 pair.str
 1.2 1.7 2.6 2.7
  2   2   2   3

 -- 
 David.


 On Nov 16, 2009, at 7:02 AM, David Winsemius wrote:

 I'm not convinced it's right. In fact, I'm pretty sure the last  
 step taking only the first half of the list is wrong. I also do not  
 know if you have considered how you want to count situations like:

 3 2 7 4 5 7 ...
 7 3 8 6 1 2 9 2 ..

 How many pairs of 2-7/7-2 would that represent?

 -- 
 David
 On Nov 15, 2009, at 11:06 PM, cindy Guo wrote:

 Hi, David,

 The matrix has 20 columns.
 Thank you very much for your help. I think it's right, but it seems  
 I need some time to figure it out. I am a green hand. There are so  
 many functions here I never used before. :)

 Cindy

 On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius dwinsem...@comcast.net 
  wrote:
 Assuming that the number of columns is 4, then consider this  
 approach:

  prs -scan()
 1: 2 5 1 6
 5: 1 7 8 2
 9: 3 7 6 2
 13: 9 8 5 7
 17:
 Read 16 items
 prmtx - matrix(prs, 4,4, byrow=T)

 #Now make copus of x.y and y.x

 pair.str - sapply(1:nrow(prmtx), function(z)  
 c(apply(combn(prmtx[z,], 2), 2,function(x) paste(x[1],x[2],  
 sep=.)) , apply(combn(prmtx[z,], 2), 2,function(x)  
 paste(x[2],x[1], sep=.))) )
 tpair -table(pair.str)

 # This then gives you a duplicated list
  tpair[tpair1]
 pair.str
 1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
 2   2   2   2   2   2   2   2

 # So only take the first half of the pairs:
  head(tpair[tpair1], sum(tpair1)/2)

 pair.str
 1.2 2.1 2.6 2.7
 2   2   2   2

 -- 
 David.



 On Nov 15, 2009, at 8:06 PM, David Winsemius wrote:

 I could of course be wrong but have you yet specified the number of  
 columns for this pairing exercise?

 On Nov 15, 2009, at 5:26 PM, cindy Guo wrote:

 Hi, All,

 I have an n by m matrix with each entry between 1 and 15000. I want  
 to know
 the frequency of each pair in 1:15000 that occur together in rows.  
 So for
 example, if the matrix is
 2 5 1 6
 1 7 8 2
 3 7 6 2
 9 8 5 7
 Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to  
 return
 the value 2 for this pair as well as that for all pairs. Is there a  
 fast way
 to do this avoiding loops? Loops take too long.

 and provide commented, minimal, self-contained, reproducible code.
^^

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list

[R] Writing a data frame in an excel file


Hello, I am having trouble by using the write.table function to write a data
frame of 4 columns and 7530 rows. I don´t  know if I should just use a
sep=\n and change the .xls file into a .csv file. Thanks in advance

-
Anna Lippel
new in R so be careful I should be asking a lt of questions!:teeth:
-- 
View this message in context: 
http://old.nabble.com/Writing-a-data-frame-in-an-excel-file-tp26378240p26378240.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extracting the last row of each group in a data frame

Hi,

You could try plyr,

library(plyr)

ddply(d,.(Name), tail,1)

  Name Value
1A 3
2B 8
3C 2
4D 3

HTH,

baptiste

2009/11/16 Hao Cen h...@andrew.cmu.edu:
 Hi,

 I would like to extract the last row of each group in a data frame.

 The data frame is as follows

 Name Value
 A 1
 A 2
 A 3
 B 4
 B 8
 C 2
 D 3

 I would like to get a data frame as
 Name Value
 A 3
 B 8
 C 2
 D 3

 Thank you for your suggestions in advance

 Jeff

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fitting a logistic regression with mixed type of variables

2009-11-16 Thread Jack Luo

David,

Thanks for your reply. Since I am kinda new to this forum, could you please
advise me on where to read those questions in R-help? In addition, I did not
pay much attention to the na.action, probably I should use na.action =
na.omit instead of na.pass.

-Jack

On Mon, Nov 16, 2009 at 2:32 PM, David Winsemius dwinsem...@comcast.netwrote:


 On Nov 16, 2009, at 2:22 PM, Jack Luo wrote:

  Hi,

 I am trying to fit a logistic regression using glm, but my explanatory
 variables are of mixed type: some are numeric, some are ordinal, some are
 categorical, say

 If x1 is numeric, x2 is ordinal, x3 is categorical, is the following
 formula
 OK?


 The formula's certainly OK. What may be non-OK will be your understanding
 of the output. The default handling of ordinal factors is a common source of
 questions to R-help, so read up first.


 *model - glm(y~x1+x2+x3, family=binomial(link=logit),
 na.action=na.pass)*


 Why have you chosen that na.action option?

 --
 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pairs

2009-11-16 Thread cindy Guo

Thank you. I will check that.
Cindy

On Mon, Nov 16, 2009 at 1:45 PM, cls59 ch...@sharpsteen.net wrote:



 David Winsemius wrote:
 
  ?order
 

 cindy Guo wrote:
 
  Do you mean if the numbers in each row are ordered? They are not, but if
  it's needed, we can order them. The matrix only has 5000 rows.
 

 No, he's suggesting you check out the order() function by calling it's help
 page:

  ?order

 order() will sort your results into ascending or descending order.  You
 could then pick off the top 50 by using head().

 Hope that helps!

 -Charlie


 -
 Charlie Sharpsteen
 Undergraduate
 Environmental Resources Engineering
 Humboldt State University
 --
 View this message in context:
 http://old.nabble.com/pairs-tp26364801p26378236.html
 Sent from the R help mailing list archive at Nabble.com.

 __
  R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extracting the last row of each group in a data frame



On Nov 16, 2009, at 2:42 PM, Hao Cen wrote:


Hi,

I would like to extract the last row of each group in a data frame.

The data frame is as follows

Name Value
A 1
A 2
A 3
B 4
B 8
C 2
D 3



by(dfname$Value, dfname$Name, tail, 1) #which gets you a list

Or:

aggregate(dfname$Value, list(dfname$Name), tail, 1)  #which returns a  
data.frame

  Group.1 x
1   A 3
2   B 8
3   C 2
4   D 3


I would like to get a data frame as
Name Value
A 3
B 8
C 2
D 3



--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extracting the last row of each group in a data frame

2009-11-16 Thread Jorge Ivan Velez

Dear Jeff,

Here is a suggestion using tapply:

data.frame(last = with(x, tapply(Value, Name, function(x) x[length(x)])))

See ?tapply for more information.

HTH,
Jorge


On Mon, Nov 16, 2009 at 2:42 PM, Hao Cen  wrote:

 Hi,

 I would like to extract the last row of each group in a data frame.

 The data frame is as follows

 Name Value
 A 1
 A 2
 A 3
 B 4
 B 8
 C 2
 D 3

 I would like to get a data frame as
 Name Value
 A 3
 B 8
 C 2
 D 3

 Thank you for your suggestions in advance

 Jeff

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extracting the last row of each group in a data frame



jeffc wrote:
 
 Hi,
 
 I would like to extract the last row of each group in a data frame.
 
 The data frame is as follows
 
 Name Value
 A 1
 A 2
 A 3
 B 4
 B 8
 C 2
 D 3
 
 I would like to get a data frame as
 Name Value
 A 3
 B 8
 C 2
 D 3
 
 Thank you for your suggestions in advance
 
 Jeff
 

Try using the base function by() or ddply() from Hadley Wickham's plyr
package:

  require( plyr )

  tstData - structure(list(Name = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 4L),
.Label = c(A, 
B, C, D), class = factor), Value = c(1L, 2L, 3L, 4L, 
8L, 2L, 3L)), .Names = c(Name, Value), class = data.frame, row.names =
c(NA, 
-7L))

  lastRows - ddply( tstData, 'Name', function( group ){

return(
  data.frame( Value = tail( group[['Value']], n = 1 ) )
)

  })

  lastRows
Name Value
  1A 3
  2B 8
  3C 2
  4D 3


Hope this helps!


-Charlie


-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://old.nabble.com/extracting-the-last-row-of-each-group-in-a-data-frame-tp26378194p26378404.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] object not found inside step() function