Re: [R] Calculates the mean/median from grouped data in R?

2013-05-07 Thread David Winsemius

On May 7, 2013, at 11:15 PM, jpm miao wrote:

> Let me revise the data as below. 
> It is survey data. The respondent must answer the outlook of the oil price, 
> say, in three months. Respondent A might answer that the price will be in the 
> 80-90 interval, while B might answer 100-110. I think there should be a 
> function that finds out the mean and the median of the data with the 
> assumption that all data points inside each interval are evenly distributed. 
> Are there such functions in R?
> 

I doubt it. You could always write one yourself, and my suggestion would be to 
assume that all the items in an interval sit at its midpoint. (It would be "good 
exercise" for developing your R skills.) The first step is constructing a 
reproducible example to work on.

-- 
David.

> 
> 
> 70-80     4
> 80-90     5
> 90-100    8
> 100-110   7
> 110-120   3
> 
> 
> 
> 2013/5/8 David Winsemius 
> 
> On May 7, 2013, at 8:40 PM, jpm miao wrote:
> 
> > Is there a function in R that calculate the mean and median for a grouped
> > data?
> > For example, a survey shows the oil price outlook in the future. How can I
> > calculate the mean/median?
> > (Of course, I understand that the groups "below 80" and "above 110" must be
> > defined more specifically)
> >
> >  below 80 4  80-90 5  90-100 8  100-110 7  above 110 3
> >
> 
> Why would all groups need to be defined more specifically? How do we know 
> whether items in the 80-90 range are evenly distributed?  could be 80, 
> 81, 82, 84, 85 or all 89's.
> 
> David Winsemius
> Alameda, CA, USA
> 
> 

David Winsemius
Alameda, CA, USA
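
A minimal sketch of the midpoint approach suggested above, using the revised
counts from this thread. The variable names (lower, upper, freq) are
illustrative and do not come from the original posts. The median is found by
linear interpolation inside the median interval, which is one way to apply the
poster's assumption that values are evenly spread within each interval.

```r
# Interval bounds and respondent counts from the revised data
lower <- c(70, 80, 90, 100, 110)
upper <- c(80, 90, 100, 110, 120)
freq  <- c(4, 5, 8, 7, 3)

# Grouped mean: treat every response as sitting at its interval midpoint
mid <- (lower + upper) / 2
grouped_mean <- sum(mid * freq) / sum(freq)

# Grouped median: locate the interval holding the n/2-th response,
# then interpolate linearly within it
n     <- sum(freq)
cum   <- cumsum(freq)
k     <- which(cum >= n / 2)[1]
below <- if (k > 1) cum[k - 1] else 0
grouped_median <- lower[k] + (n / 2 - below) / freq[k] * (upper[k] - lower[k])

grouped_mean    # 95
grouped_median  # 95.625
```

Under the even-spread assumption the mean is the same as the midpoint mean, so
only the median needs the interpolation step.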



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] State space models with regime switching

2013-05-07 Thread Matthieu Stigler
Hi

Indeed the tsDyn package implements two types of regime switching models,
where AR coefficients switch between two regimes, with abrupt transition
(setar) or smooth transition (lstar) from one regime to the other. Note
however that these are simply AR models in the standard form, not in the
state-space formulation.

Best
Matthieu
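
To make the "abrupt transition" idea concrete, here is a small base-R
simulation of a two-regime threshold AR(1) process. It is only a sketch: the
coefficients (0.8 and -0.5), the threshold of 0 and the variable names are
arbitrary assumptions for illustration, and it does not use or reproduce the
tsDyn interface.

```r
set.seed(42)
n <- 200
y <- numeric(n)          # series starts at 0
threshold <- 0

for (t in 2:n) {
  # The AR coefficient switches abruptly depending on the previous value
  phi  <- if (y[t - 1] <= threshold) 0.8 else -0.5
  y[t] <- phi * y[t - 1] + rnorm(1)
}

# Regime that generated each observation from the second one onwards
regime <- ifelse(head(y, -1) <= threshold, "low", "high")
table(regime)
```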

>Hello,
>
>Maybe package tsDyn. It implements SETAR and LSTAR models, among others.
>
>Hope this helps,
>
>Rui Barradas
>
>Em 06-05-2013 21:28, David Hoppe escreveu:
>> Hello everyone,
>>
>> I'm new to this mailing list, but i hope this is the right place to
>> post my question. I'm trying to do some time series analysis with
>> state space models in R. So far I used the packages dse and dlm. I was
>> wondering if there is a package, which allows for regime switching
>> state space models. I did a lot of searching, but I couldn't find
>> anything.
>>
>> Thanks for your answers!
>>
>> David



[R] make hyperlink in R

2013-05-07 Thread Bill Hyman
Dear all,

Does anybody know how to make a hyperlink in R? If I want to output 
"http://www.r-project.org/" as a hyperlink, how can I do it starting from the 
plain text 'http://www.r-project.org/'? 


Thank you!

Bill


Re: [R] Calculates the mean/median from grouped data in R?

2013-05-07 Thread jpm miao
Let me revise the data as below.
It is survey data. The respondent must answer the outlook of the oil price,
say, in three months. Respondent A might answer that the price will be in
the 80-90 interval, while B might answer 100-110. I think there should be a
function that finds out the mean and the median of the data with the
assumption that all data points inside each interval are evenly
distributed. Are there such functions in R?



70-80     4
80-90     5
90-100    8
100-110   7
110-120   3



2013/5/8 David Winsemius 

>
> On May 7, 2013, at 8:40 PM, jpm miao wrote:
>
> > Is there a function in R that calculate the mean and median for a grouped
> > data?
> > For example, a survey shows the oil price outlook in the future. How can
> I
> > calculate the mean/median?
> > (Of course, I understand that the groups "below 80" and "above 110" must
> be
> > defined more specifically)
> >
> >  below 80 4  80-90 5  90-100 8  100-110 7  above 110 3
> >
>
> Why would all groups need to be defined more specifically? How do we know
> whether items in the 80-90 range are evenly distributed?  could be 80,
> 81, 82, 84, 85 or all 89's.
>
> David Winsemius
> Alameda, CA, USA
>
>



[R] How to calculate Highest Posterior Density (HPD) of coefficients in a simple regression (lm) in R?

2013-05-07 Thread Richard Asturia
Hi!

I am trying to calculate HPD intervals for the coefficients of regression
models fitted with lm or lmrob in R, much as can be done by combining the
mcmcsamp and HPDinterval functions for multilevel models fitted with lmer.
Can anyone point me in the right direction on which packages to use or how
to implement this?

Thanks for your time!

R.



Re: [R] is.numeric () FALSE

2013-05-07 Thread Jim Lemon

On 05/08/2013 01:38 PM, Alannah wrote:

Hi there, I am reading into R a dataset with 30 variables. It is  in csv file
format but have also tried txt. While my dataset loads without warning, when
I tried to use Geomorph package (my dataset is from a 3D model) I get a
warning that is.atomic(x) is not true. I understand this is a broad problem
with my dataset now being read as numbers. Thus, is.numeric comes back
FALSE.

How do I fix this problem so my numeric dataset is read as numeric?


Hi Alannah,
The "str" function might be helpful. Say your dataset is named 
"alannahdat". "str" will tell you what is in the dataset:


str(alannahdat)

The output is a listing of the components with information about the 
"class" of each. If you see that one or more components of the dataset 
are not what you expect, you can then trace them back into your CSV data 
file to see what is going wrong. Often a single typographic error in a 
numeric field will cause the entire field to be read as a factor rather 
than numeric.


Jim
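
A small illustration of the point above, with made-up data: one mistyped entry
(the letter "O" instead of a zero) is enough to stop a column from being read
as numeric. The data, the names csv_text and dat, and the explicit
stringsAsFactors = TRUE setting are assumptions for the sketch.

```r
# Inline CSV with a typo in the last row of column y ("4.O", letter O)
csv_text <- "x,y\n1.5,2.0\n2.5,3.1\n3.5,4.O\n"
dat <- read.csv(text = csv_text, stringsAsFactors = TRUE)

str(dat)            # x is num, y is a Factor
is.numeric(dat$y)   # FALSE

# Find the entries that cannot be converted to numbers
bad <- is.na(suppressWarnings(as.numeric(as.character(dat$y))))
as.character(dat$y[bad])   # the offending value(s) to look for in the file
```

Once the offending entries are corrected in the source file, re-reading it
should give a numeric column.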



Re: [R] Bootstrapped 1-sided confidence intervals

2013-05-07 Thread Joshua Wiley
Hi Janh,

I do not believe that a "one sided" or "two sided" bootstrap makes any
sense.  It is just a resampling procedure that constructs an empirical
distribution.  If you wish to examine the point where 5% of the
distribution falls in one tail instead of the ends of both tails being 5%,
you could simply look at the 90% CI, which will have 5% below and 5% above.

Cheers,

Josh
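
The suggestion above can be sketched in base R with a percentile bootstrap.
The simulated data, the statistic (the mean) and the number of resamples are
assumptions chosen for illustration. The lower end of the 90% percentile
interval is the same number as a one-sided lower 95% bound.

```r
set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)   # illustrative sample

# Empirical bootstrap distribution of the mean
boot_means <- replicate(2000, mean(sample(x, replace = TRUE)))

# Two-sided 90% percentile interval: 5% in each tail
ci90 <- quantile(boot_means, c(0.05, 0.95))

# One-sided 95% lower bound: all 5% in the lower tail
lower95 <- quantile(boot_means, 0.05)

ci90
lower95
```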



On Tue, May 7, 2013 at 8:21 PM, Janh Anni  wrote:

> Hello All,
>
> Does anyone know if there’s a function for computing 1-sided confidence
> intervals for bootstrapped statistics (mean, median, percentiles,
> etc.)?  Thanks
> in advance
>
> Janh
>
>


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com



Re: [R] is.numeric () FALSE

2013-05-07 Thread Jeff Newmiller
This is an imponderable question, since we would have to be psychic to know 
what you are doing wrong without seeing what you are actually doing.

I can hypothesize that you are testing whether a data frame is numeric, and can 
warn you that a data frame will NEVER be numeric. The individual columns may or 
may not be numeric, but without a sample of your data I am stuck there.

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
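
A short demonstration of the hypothesis above, using a made-up data frame
(the names df and col_is_num are illustrative): the data frame as a whole is
not numeric, while its columns can be tested one by one.

```r
df <- data.frame(a = c(1.5, 2.5), b = c(3L, 4L), c = c("x", "y"))

is.numeric(df)        # FALSE: a data frame is never numeric

# Test each column separately
col_is_num <- sapply(df, is.numeric)
col_is_num            # a and b are TRUE, c is FALSE
```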
---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Alannah  wrote:

>Hi there, I am reading into R a dataset with 30 variables. It is  in
>csv file
>format but have also tried txt. While my dataset loads without warning,
>when
>I tried to use Geomorph package (my dataset is from a 3D model) I get a
>warning that is.atomic(x) is not true. I understand this is a broad
>problem
>with my dataset now being read as numbers. Thus, is.numeric comes back
>FALSE.
>
>How do I fix this problem so my numeric dataset is read as numeric?
>
>Cheers
>
>
>
>
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/is-numeric-FALSE-tp4666530.html
>Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] is.numeric () FALSE

2013-05-07 Thread David Winsemius

On May 7, 2013, at 8:38 PM, Alannah wrote (from Nabble):

> Hi there, I am reading into R a dataset with 30 variables. It is  in csv file
> format but have also tried txt. While my dataset loads without warning, when
> I tried to use Geomorph package (my dataset is from a 3D model) I get a
> warning that is.atomic(x) is not true. I understand this is a broad problem
> with my dataset now being read as numbers. Thus, is.numeric comes back
> FALSE.
> 
> How do I fix this problem so my numeric dataset is read as numeric?

You post the code you used for input, and a couple of lines of data from the 
file. Obviously, if "x" is the result of a read operation, it would be a list 
rather than a numeric vector.

(And in responding please avoid the typical Nabble practice of not including 
context.)

-- 

David Winsemius
Alameda, CA, USA
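
As a sketch of why is.atomic(x) fails on the result of a read operation, and
of one possible remedy: a data frame is a list, while a matrix built from
all-numeric columns is atomic. The data here are made up, and whether the
package function in question expects a matrix or some other structure is an
assumption to check against its help page.

```r
x <- data.frame(p1 = c(0.1, 0.2), p2 = c(1.1, 1.2))

is.list(x)     # TRUE: data frames are lists of columns
is.atomic(x)   # FALSE

# If every column is numeric, as.matrix() gives a numeric matrix
m <- as.matrix(x)
is.atomic(m)   # TRUE
is.numeric(m)  # TRUE
```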



Re: [R] Calculates the mean/median from grouped data in R?

2013-05-07 Thread David Winsemius

On May 7, 2013, at 8:40 PM, jpm miao wrote:

> Is there a function in R that calculate the mean and median for a grouped
> data?
> For example, a survey shows the oil price outlook in the future. How can I
> calculate the mean/median?
> (Of course, I understand that the groups "below 80" and "above 110" must be
> defined more specifically)
> 
>  below 80 4  80-90 5  90-100 8  100-110 7  above 110 3
> 

Why would all groups need to be defined more specifically? How do we know 
whether items in the 80-90 range are evenly distributed? They could be 80, 81, 
82, 84, 85 or all 89's.

David Winsemius
Alameda, CA, USA



Re: [R] Bootstrapped 1-sided confidence intervals

2013-05-07 Thread David Winsemius

On May 7, 2013, at 8:37 PM, Pascal Oettli wrote:

> Hello,
> 
> You already asked that question on May 7, 2013. And David Winsemius already 
> responded to you:
> https://stat.ethz.ch/pipermail/r-help/2013-May/353044.html

Indeed. My response was intended to drive the poster to the code and then, 
failing success at modifying the code, to at least imply the need for a 
justification for the procedure. It's just possible that the failure of the 
authors to include this as an option was by design rather than oversight.


> 
> Regards,
> Pascal
> 
> 
> On 05/08/2013 12:21 PM, Janh Anni wrote:
>> Hello All,
>> 
>> Does anyone know if there’s a function for computing 1-sided confidence
>> intervals for bootstrapped statistics (mean, median, percentiles,
>> etc.)?  Thanks
>> in advance
>> 
>> Janh


David Winsemius
Alameda, CA, USA



[R] how to get samples from rtmvnorm with large dimensions

2013-05-07 Thread wslz208
Hi, dear all, 

I wish to draw one sample (a 2500-dimensional vector) from a truncated
multivariate normal distribution, so I chose the R function rtmvnorm() to do
this. But the error message shows that, for this function, the dimension must
not exceed 1000. Could you help me find a way to sample from a truncated
multivariate normal distribution whose sigma has dimension 2500*2500?

Here is part of my code for sampling: 

rg[k,]<- rtmvnorm(n=1, as.vector(SG1u), SG1, lower=rep(0,2500),
upper=rep(Inf,2500), algorithm="rejection")
and the error information is:
 "Error: mvt(lower = lower, upper = upper, df = 0, corr = corr, delta =
mean,  : 
  only dimensions 1 <= n <= 1000 allowed" 

Thanks a lot. 


[R] is.numeric () FALSE

2013-05-07 Thread Alannah
Hi there, I am reading into R a dataset with 30 variables. It is  in csv file
format but have also tried txt. While my dataset loads without warning, when
I tried to use Geomorph package (my dataset is from a 3D model) I get a
warning that is.atomic(x) is not true. I understand this is a broad problem
with my dataset now being read as numbers. Thus, is.numeric comes back
FALSE.

How do I fix this problem so my numeric dataset is read as numeric?

Cheers





--
View this message in context: 
http://r.789695.n4.nabble.com/is-numeric-FALSE-tp4666530.html
Sent from the R help mailing list archive at Nabble.com.



[R] Calculates the mean/median from grouped data in R?

2013-05-07 Thread jpm miao
Is there a function in R that calculate the mean and median for a grouped
data?
For example, a survey shows the oil price outlook in the future. How can I
calculate the mean/median?
(Of course, I understand that the groups "below 80" and "above 110" must be
defined more specifically)

  below 80 4  80-90 5  90-100 8  100-110 7  above 110 3



[R] CRAN RSS feeds about updates and package checks

2013-05-07 Thread Gábor Csárdi
Dear All,

I have put together a simple web service that notifies you about new and
updated R packages and/or if CRAN package checks fail for your packages. It
is all done via RSS feeds and there is a short description at
http://cranky.igraph.org

The feeds are dynamically created, so you don't have to subscribe for
updates and checks about all packages, but you can follow a single package,
packages by a given author or maintainer, the (reverse) dependencies of a
package, etc. You can even get a feed with packages whose description
contains a keyword. Here are some simple examples:

All packages:
http://cranky.igraph.org/feed/news

Packages that depend on the ggplot2 package:
http://cranky.igraph.org/feed/news/uses/ggplot2

Packages 'ggplot2' depends on:
http://cranky.igraph.org/feed/news/usedby/ggplot2

Packages by a single person:
http://cranky.igraph.org/feed/news/author/youknowitall

Packages that contain 'network' in the description field:
http://cranky.igraph.org/feed/news/description/network

All package checks:
http://cranky.igraph.org/feed/checks

All packages checks for ggplot2 dependencies:
http://cranky.igraph.org/feed/checks/usedby/ggplot2

I have created this for my own needs, and have been using it for a couple
of weeks now. I thought it might be useful for others as well. It could be
a lot more flexible, e.g. a complex search feature would be great, but I
think it is already useful as it is.

Note that this is independent of CRAN and the R website, so please don't
bother R-core or CRAN maintainers about possible bugs or questions. Contact
me directly instead or report a bug on GitHub; see http://cranky.igraph.org.

Best,
Gabor



Re: [R] Bootstrapped 1-sided confidence intervals

2013-05-07 Thread Pascal Oettli

Hello,

You already asked that question on May 7, 2013. And David Winsemius 
already responded to you:

https://stat.ethz.ch/pipermail/r-help/2013-May/353044.html

Regards,
Pascal


On 05/08/2013 12:21 PM, Janh Anni wrote:

Hello All,

Does anyone know if there’s a function for computing 1-sided confidence
intervals for bootstrapped statistics (mean, median, percentiles,
etc.)?  Thanks
in advance

Janh



[R] Bootstrapped 1-sided confidence intervals

2013-05-07 Thread Janh Anni
Hello All,

Does anyone know if there’s a function for computing 1-sided confidence
intervals for bootstrapped statistics (mean, median, percentiles,
etc.)?  Thanks
in advance

Janh



Re: [R] How can I find negative items from a vector with a short command?

2013-05-07 Thread Jorge I Velez
f [ f < 0 ]
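
For completeness, the one-liner applied to the vector from the question.
Logical indexing returns the negative values themselves, and which() returns
their positions; the names neg and pos are illustrative.

```r
d <- 1:7
f <- (-2)^d            # -2 4 -8 16 -32 64 -128

neg <- f[f < 0]        # values:    -2 -8 -32 -128
pos <- which(f < 0)    # positions:  1  3   5    7
```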


On Wed, May 8, 2013 at 11:54 AM, jpm miao  wrote:

> Hi,
>
>I have a vector f with some negative columns. I remember that there is
> an easy expression that can find out negative items. Can someone tell me
> how I can do it?
>
>It seems to be
>f[i such that f[i]<0 ...]
>
>Thanks,
>
> Miao
>
> > d<-1:7
> > f<-(-2)^d
> > f
> [1]   -2    4   -8   16  -32   64 -128
>


Re: [R] How can I find negative items from a vector with a short command?

2013-05-07 Thread Pascal Oettli

Hi,

The solution can easily be found on the Internet.

By the way, the following does what you are looking for:

> f[f<0]

Regards,
Pascal


On 05/08/2013 10:54 AM, jpm miao wrote:

Hi,

I have a vector f with some negative columns. I remember that there is
an easy expression that can find out negative items. Can someone tell me
how I can do it?

It seems to be
f[i such that f[i]<0 ...]

Thanks,

Miao


d<-1:7
f<-(-2)^d
f

[1]   -2    4   -8   16  -32   64 -128



[R] How can I find negative items from a vector with a short command?

2013-05-07 Thread jpm miao
Hi,

   I have a vector f with some negative columns. I remember that there is
an easy expression that can find out negative items. Can someone tell me
how I can do it?

   It seems to be
   f[i such that f[i]<0 ...]

   Thanks,

Miao

> d<-1:7
> f<-(-2)^d
> f
[1]   -2    4   -8   16  -32   64 -128



[R] How to use big.matrix to read factor columns

2013-05-07 Thread li li
I have a big data set that includes character variables of many different
values.  I'm trying to read the data as big.matrix and then use
biglm.big.matrix to build linear models.  However, since big.matrix will
convert all character vectors to factors and the character labels will be
lost, I decided to create a lookup table outside of R for my character
columns and use numbers to represent different levels for R.  However, I do
not know how to tell big.matrix these columns should be considered factors
instead of numerics.  Please help.  thanks.



[R] how to read numeric vector as factors using read.table.ffdf

2013-05-07 Thread li li
I have a big data set that includes character variables of many different
values. I'm trying to use ff to read the data and then use biglm.big.matrix
to build linear models. However, since big.matrix will convert all
character vectors to factors and the character labels will be lost, I
decided to create a lookup table outside of R for my character columns and
use numbers to represent different levels for R. However, I do not know how
to tell read.table.ffdf these columns should be considered factors instead
of numerics. Please help. thanks.
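
In plain base R, integer codes plus a lookup table can be turned back into a
factor as sketched below. This only shows the base-R idea; the lookup table,
the codes and all names are made up, and how read.table.ffdf or
biglm.big.matrix accept an equivalent declaration is not covered here and
should be checked in their help pages.

```r
# Hypothetical lookup table created outside R: code -> label
lookup <- data.frame(code  = 1:3,
                     label = c("red", "green", "blue"),
                     stringsAsFactors = FALSE)

# Numeric codes as they would appear in the data column
codes <- c(2, 1, 3, 3, 2)

# Declare the column as a factor, restoring the labels
colour <- factor(codes, levels = lookup$code, labels = lookup$label)

levels(colour)            # "red" "green" "blue"
as.character(colour)      # "green" "red" "blue" "blue" "green"

# A model formula now expands the column into dummy variables
ncol(model.matrix(~ colour))   # 3: intercept plus two dummies
```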



Re: [R] R help for creating expression data of Differentially expressed genes

2013-05-07 Thread arun
HI,
Assuming that "out_data.txt" is the output you expected:

# Read both input files and the expected output
dat1 <- read.table("data1.txt", header=TRUE, stringsAsFactors=FALSE)
dat2 <- read.table("data2.txt", header=TRUE, stringsAsFactors=FALSE)
out_dat <- read.table("out_data.txt", header=TRUE, stringsAsFactors=FALSE)
# Join the first four columns of dat1 to dat2 on the shared ID column
out_dat2 <- merge(dat1[,1:4], dat2, by="ID")
identical(out_dat, out_dat2)
#[1] TRUE
A.K.
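
A self-contained sketch of the same merge() idea with tiny made-up tables (the
names genes and expr and their contents are assumptions, not the poster's
files). By default merge() keeps only IDs present in both tables; all.x = TRUE
keeps every row of the first table and fills the gaps with NA.

```r
genes <- data.frame(ID    = c("g1", "g2", "g3"),
                    Name  = c("A", "B", "C"),
                    Score = c(0.1, 0.2, 0.3),
                    stringsAsFactors = FALSE)
expr  <- data.frame(ID = c("g3", "g1", "g4"),
                    S1 = c(5.1, 2.2, 9.9),
                    S2 = c(4.8, 2.5, 9.1),
                    stringsAsFactors = FALSE)

# Default: only IDs found in both tables (g1 and g3)
inner <- merge(genes[, 1:2], expr, by = "ID")

# Keep every row of the first table; g2 gets NA for S1 and S2
left <- merge(genes[, 1:2], expr, by = "ID", all.x = TRUE)
```

If the output must contain all rows of the first file even when some IDs are
missing from the second, the all.x = TRUE form is the one to try.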






From: Vivek Das 
To: arun  
Cc: R help  
Sent: Tuesday, May 7, 2013 6:07 PM
Subject: Re: R help for creating expression data of Differentially expressed 
genes



HI Arun,

My data sets are in the attached sample files. I hope they give a better idea 
of what I want to do with the two files and the kind of script I am trying to 
write. I hope you can give me some suggestions. I am new to R, so I am having 
trouble using the different functions for this task.

Any help with this would be greatly appreciated.



--

Vivek Das
PhD Student in Computational Biology
Giuseppe Testa's Lab
European School of Molecular Medicine
IFOM-IEO Campus
Via Adamello, 16
Milan, Italy

emails: vivek@ieo.eu
            vchris...@yahoo.co.in
            vd4mm...@gmail.com



On Tue, May 7, 2013 at 10:36 PM, arun  wrote:

Hi Vivek,
>
>May be this helps:
>set.seed(35)
> dat1<- cbind(ID=1:8, 
>as.data.frame(matrix(sample(1:50,8*7,replace=TRUE),ncol=7)))
>
>set.seed(38)
>dat2<- cbind(ID= sample(1:20,8,replace=FALSE), 
>as.data.frame(matrix(sample(1:50,8*33,replace=TRUE),ncol=33)))
>colnames(dat2)[-1]<-gsub("V","X",colnames(dat2)[-1])
> merge(dat1[,1:2],dat2[,1:31],by="ID")
>#  ID V1 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20
>#1  1 43 44  4 33 47 29 43 31 15  2  34  42   5  18  22  36  34  44   3  45   9
>#2  3 28  4 18 45 24  5 20 30 16 49  34  33   5  24  49  31  10  45  21  26  20
>#3  6  5 16  1  5  2 26  6 40 16 15  50  26  37  22  25  39  16  24  29  50  42
>#4  7 25 26 39 16 29  5 40 15 27 46  16  38  36  42   8   3  29   7  13  18  38
>#5  8 30  3 41 25 38 24 41 44 23  2  45  33  10  18  20  49  19  23  42  25   5
>#  X21 X22 X23 X24 X25 X26 X27 X28 X29 X30
>#1  14  27   3  21   6  44  33  42  10  29
>#2  48  13   8  47  18   9  23   9  44   3
>#3  25  14  31  19  14   6  26  13   6  49
>#4  43  28  15   6   9  19  43  21  41  21
>#5   1  27  18   3  42   5  16  39  46  47
>
>A.K.
>
>
>
>- Original Message -
>
>From: Vivek Das 
>To: arun 
>Cc:
>
>Sent: Tuesday, May 7, 2013 3:45 PM
>Subject: R help for creating expression data of Differentially expressed genes
>
>Hi Arun,
>
>I need some help with R scripting. I have two data files, one containing 
>seven columns and the other containing 33. Both files have a unique identifier, 
>ID. I want to create another file that has the first two columns 
>of the first file and the 31 columns of the second file, matched on 
>ID. The first file holds the gene IDs and gene names of around 500 genes, 
>and I want the output file to contain all of those plus the other attributes. 
>That is, the output should have all attributes matching the 
>IDs of the first file, so that I get 500 rows with all the 
>attributes of the second file. I am new to R and am having trouble with the 
>merge function. If you can help, it will be great.
>
>Regards,
>Vivek
>
>Sent from my iPad
>
>On 07/mag/2013, at 21:13, arun  wrote:
>
>> HI Ye,
>>
>> For the NA in ID column,
>>
>>
>>
>> Hi
>> dat1<- read.table(text="
>> ObsNumber     ID          Weight
>>      1                 0001         12
>>      2                 0001          13
>>      3                 0001           14
>>      4                  0002         16
>>       5                 0002         17
>>      6                   N/A          18 
>> ",sep="",header=TRUE,colClass=c("numeric","character","numeric"),na.strings="N/A")
>>  unlist(lapply(split(dat1,dat1$ID),function(x) 
>>with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
>> #[1] "0001_1" "0001_2" "0001_3" "0002_1" "0002_2"
>> A.K.
>> 
>> From: Ye Lin 
>> To: arun 
>> Cc: R help 
>> Sent: Tuesday, May 7, 2013 2:54 PM
>> Subject: Re: [R] create unique ID for each group
>>
>>
>>
>> Thanks A.K. But I have "NA" in ID column, so when I apply the code, it gives 
>> me error saying the replacement as less rows than the data has. Anyway for 
>> ID=N/A, return sth like "N/A_1" in order as well?
>>
>>
>>
>>
>>
>>
>> On Tue, May 7, 2013 at 11:17 AM, arun  wrote:
>>
>> H,
>>> Sorry, a mistake:
>>> dat1$UniqueID<-unlist(lapply(split(dat1,dat1$ID),function(x) 
>>> with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
>>> dat1
>>>  # ObsNumber   ID Weight UniqueID
>>> #1         1 0001     12   0001_1
>>> #2         2 0001     13   0001_2
>>> #3         3 0001     14   0001_3
>>> #4         4 0002     16   0002_1
>>> #5         

Re: [R] R help for creating expression data of Differentially expressed genes

2013-05-07 Thread Vivek Das
HI Arun,

My data sets are in the attached sample files. I hope they give a better
idea of what I want to do with the two files and the kind of script I am
trying to write. I hope you can give me some suggestions. I am new to R, so
I am having trouble using the different functions for this task.

Any help with this would be greatly appreciated.


--

Vivek Das
PhD Student in Computational Biology
Giuseppe Testa's Lab
European School of Molecular Medicine
IFOM-IEO Campus
Via Adamello, 16
Milan, Italy

emails: vivek@ieo.eu
vchris...@yahoo.co.in
vd4mm...@gmail.com


On Tue, May 7, 2013 at 10:36 PM, arun  wrote:

> Hi Vivek,
>
> May be this helps:
> set.seed(35)
>  dat1<- cbind(ID=1:8,
> as.data.frame(matrix(sample(1:50,8*7,replace=TRUE),ncol=7)))
>
> set.seed(38)
> dat2<- cbind(ID= sample(1:20,8,replace=FALSE),
> as.data.frame(matrix(sample(1:50,8*33,replace=TRUE),ncol=33)))
> colnames(dat2)[-1]<-gsub("V","X",colnames(dat2)[-1])
>  merge(dat1[,1:2],dat2[,1:31],by="ID")
> #  ID V1 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18
> X19 X20
> #1  1 43 44  4 33 47 29 43 31 15  2  34  42   5  18  22  36  34  44   3
> 45   9
> #2  3 28  4 18 45 24  5 20 30 16 49  34  33   5  24  49  31  10  45  21
> 26  20
> #3  6  5 16  1  5  2 26  6 40 16 15  50  26  37  22  25  39  16  24  29
> 50  42
> #4  7 25 26 39 16 29  5 40 15 27 46  16  38  36  42   8   3  29   7  13
> 18  38
> #5  8 30  3 41 25 38 24 41 44 23  2  45  33  10  18  20  49  19  23  42
> 25   5
> #  X21 X22 X23 X24 X25 X26 X27 X28 X29 X30
> #1  14  27   3  21   6  44  33  42  10  29
> #2  48  13   8  47  18   9  23   9  44   3
> #3  25  14  31  19  14   6  26  13   6  49
> #4  43  28  15   6   9  19  43  21  41  21
> #5   1  27  18   3  42   5  16  39  46  47
> A.K.
>
>
>
> - Original Message -
> From: Vivek Das 
> To: arun 
> Cc:
> Sent: Tuesday, May 7, 2013 3:45 PM
> Subject: R help for creating expression data of Differentially expressed
> genes
>
> Hi Arun,
>
> I need some help with R scripting. I have two data files, one
> containing seven columns and the other containing 33. Both files have a
> unique identifier, ID. I want to create another file that has
> the first two columns of the first file and the 31 columns of the
> second file, matched on ID. The first file holds the gene IDs
> and gene names of around 500 genes, and I want the output file to contain
> all of those plus the other attributes. That is, the output should have
> all attributes matching the IDs of the first file, so that I get
> 500 rows with all the attributes of the second file. I am new to R
> and am having trouble with the merge function. If you can help, it will be
> great.
>
> Regards,
> Vivek
>
> Sent from my iPad
>

[R] How to use "SparseM-conversions" to convert a dCgMatrix into a matrix.csr ?

2013-05-07 Thread Yi Yuan
Hi all,

I want to convert a dgCMatrix from package Matrix into a matrix.csr from
package SparseM, and I found this link:
http://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/SparseM-conv.html

But there is no information about usage/description/arguments, so how do I
use this SparseM-conversions method? Is it a function?

By the way, I already tried the function as.spam.matrix.csr from package spam,
and it didn't work:
Error in as.spam.matrix.csr(train.sparse) :
  Wrong object passed to 'as.spam.matrix.csr'.
I don't know why. My dgCMatrix was produced by the sparseMatrix() function
from package Matrix, and its class is:
> class(train.sparse)
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"

Could anyone help me figure out how to convert the train.sparse matrix into a
matrix.csr?

Thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R help for creating expression data of Differentially expressed genes

2013-05-07 Thread arun
Hi Vivek,

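Maybe this helps:
# dat1: an ID column plus 7 value columns (V1..V7)
set.seed(35)
dat1 <- cbind(ID=1:8, as.data.frame(matrix(sample(1:50,8*7,replace=TRUE),ncol=7)))

# dat2: 8 IDs drawn from 1:20 plus 33 value columns, renamed X1..X33
set.seed(38)
dat2 <- cbind(ID=sample(1:20,8,replace=FALSE), as.data.frame(matrix(sample(1:50,8*33,replace=TRUE),ncol=33)))
colnames(dat2)[-1] <- gsub("V","X",colnames(dat2)[-1])
# keep the first two columns of dat1 and the first 31 of dat2, matched on ID
merge(dat1[,1:2],dat2[,1:31],by="ID")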
#  ID V1 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20
#1  1 43 44  4 33 47 29 43 31 15  2  34  42   5  18  22  36  34  44   3  45   9
#2  3 28  4 18 45 24  5 20 30 16 49  34  33   5  24  49  31  10  45  21  26  20
#3  6  5 16  1  5  2 26  6 40 16 15  50  26  37  22  25  39  16  24  29  50  42
#4  7 25 26 39 16 29  5 40 15 27 46  16  38  36  42   8   3  29   7  13  18  38
#5  8 30  3 41 25 38 24 41 44 23  2  45  33  10  18  20  49  19  23  42  25   5
#  X21 X22 X23 X24 X25 X26 X27 X28 X29 X30
#1  14  27   3  21   6  44  33  42  10  29
#2  48  13   8  47  18   9  23   9  44   3
#3  25  14  31  19  14   6  26  13   6  49
#4  43  28  15   6   9  19  43  21  41  21
#5   1  27  18   3  42   5  16  39  46  47
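A note on the merge() call above: by default merge() returns only the IDs that occur in both data frames, which is why the output has five rows rather than eight. If every row of the first file should be kept (the roughly 500 genes in the question), all.x=TRUE is one way to do it. A minimal sketch on made-up toy data; the names genes and expr are placeholders, not objects from the original post:

```r
# Hypothetical toy data: 'genes' stands in for the first file, 'expr' for the second.
genes <- data.frame(ID = 1:5, name = paste0("gene", 1:5))
expr  <- data.frame(ID = c(2L, 4L, 6L), X1 = c(10, 20, 30), X2 = c(1.5, 2.5, 3.5))

# Default merge: only IDs present in both data frames survive.
inner <- merge(genes, expr, by = "ID")

# all.x = TRUE: every row of 'genes' is kept; unmatched rows get NA in X1, X2.
left <- merge(genes, expr, by = "ID", all.x = TRUE)

nrow(inner)
nrow(left)
```

Whether the unmatched rows should be kept or dropped depends on what the output file is for, so treat all.x=TRUE as an option rather than a fix.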
A.K.



- Original Message -
From: Vivek Das 
To: arun 
Cc: 
Sent: Tuesday, May 7, 2013 3:45 PM
Subject: R help for creating expression data of Differentially expressed genes

Hi Arun,

I need some help with R scripting. I have two data files, one containing
seven columns and the other containing 33. Both files have a unique identifier,
ID. I want to create another file that has the first two columns of
the first file and the 31 columns of the second file, matched on the basis
of ID. The first file has the gene IDs and gene names of around 500 genes, and I
want an output file that has all of those and the other attributes as well.
That is, I want the output file to contain all attributes matching the IDs of
the first file, so that I get 500 rows with all the attributes of the
second file. I am new to R and am having trouble with the merge function. If you
can help, that would be great.

Regards,
Vivek

Sent from my iPad


Re: [R] Some unrelated questions.

2013-05-07 Thread Keith S Weintraub
Jim,
   Thanks for your comments.
KW

--

On May 6, 2013, at 5:48 PM, Jim Lemon  wrote:

> see inline
> 
> On 05/07/2013 02:14 AM, Keith S Weintraub wrote:
>> Folks,
>> 
>> I have been working on an R project that has a few dozen functions.
>> 
>> I have some questions that are only tangentially related and might only be a 
>> difference in style.
>> 
>> 1. Some of my functions take single-row data.frames as input parameters 
>> lists. I don't force the user of the function to provide all of the 
>> parameters. I use code within a function to test if a particular parameter 
>> (column-name) exists in the data.frame and if so use the value of that 
>> parameter in a test. If the parameter doesn't exist in the data.frame then 
>> some default behavior applies like so:
>> 
>>if("rollDown" %in% names(runParams)) rollDown<-runParams[["rollDown"]]
>>   else rollDown<-0
>> 
>> Is this good style? What are the pitfalls? Is there a better way?
>> 
> Whether it is good style or not, you must have been reading my mind. This is 
> more or less what I am working on to streamline the increasing number of 
> arguments in functions in some of the packages I maintain. At the moment I am 
> trying to work out whether it is easier to have one big function to complete 
> all the arguments or a set of smaller functions for related groups of 
> arguments.
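One possible way to keep the "use the value if supplied, otherwise a default" pattern in a single place is to merge the supplied parameters over a list of defaults with modifyList(). This is only a sketch: default_params, resolve_params and nSims are made-up names for illustration, and only rollDown comes from the original post.

```r
# Hypothetical defaults; only rollDown appears in the original post.
default_params <- list(rollDown = 0, nSims = 1000)

# Merge a single-row data.frame (or list) of parameters over the defaults.
resolve_params <- function(runParams, defaults = default_params) {
  supplied <- as.list(runParams)
  unknown <- setdiff(names(supplied), names(defaults))
  # A misspelled column name would otherwise be ignored silently.
  if (length(unknown) > 0)
    warning("ignoring unknown parameter(s): ", paste(unknown, collapse = ", "))
  modifyList(defaults, supplied[names(supplied) %in% names(defaults)])
}

runParams <- data.frame(rollDown = 2)   # single-row data.frame, as in the post
p <- resolve_params(runParams)
p$rollDown   # supplied value
p$nSims      # falls back to the default
```

The main pitfall of the original if/else style is that a typo in a column name quietly selects the default; the warning above is one way to surface that.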
> 
>> One nice thing about this method is that if I need to add a new parameter I 
>> don't have to change the signature of the function.
>> 
>> 2. What is a good way to organize a project with dozens of functions in R?
>> 
> Creating a package is an easy and well documented way of
> 
> 1) keeping all the functions together
> 2) checking that everything works
> 3) maintaining a record of the evolution of the project
> 
> Even a handful of functions can benefit from packaging.
> 
>> 3. My project involves a fair amount of simulation. I am not talking hours 
>> but some of my "runs" take up to 30 minutes.
>> 
>> Suppose I have a "control" function that calls a number of other functions 
>> that might benefit from compilation (using the compiler package). Is it 
>> better to compile the called functions inside or outside the control 
>> function?
>> 
>> Is there a good "idiom" or standardized way of turning compilation of the 
>> called functions on and off? What about debugging (I use the debug package)?
>> 
>> I am perfectly happy with pointers to articles, books and code.
>> 
> Jim
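On question 3, one simple arrangement is to compile each worker function once, where it is defined, rather than inside the control function on every call, and to guard the compilation with a single switch. A minimal sketch using cmpfun() from the compiler package; use_compiled, maybe_compile and slow_sum are hypothetical names. Note that since R 3.4.0 the byte-code JIT is enabled by default, so explicit compilation often makes little difference on current versions; compiler::enableJIT(0) turns the JIT off for a session.

```r
library(compiler)

# Toy worker function standing in for a simulation step.
slow_sum <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + i
  s
}

use_compiled <- TRUE   # hypothetical project-wide switch

# Return the byte-compiled function, or the original one when the switch is off.
maybe_compile <- function(f, compile = use_compiled) {
  if (compile) cmpfun(f) else f
}

# Compile once, outside the control function.
fast_sum <- maybe_compile(slow_sum)
fast_sum(10)
```

Setting use_compiled to FALSE gives back the plain functions, which can be convenient while stepping through code.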
> 



Re: [R] create unique ID for each group

2013-05-07 Thread arun


Hi,
Try this:

dat1 <- read.table(text="
ObsNumber ID Weight
1 0001 12
2 0001 13
3 0001 14
4 0002 16
5 0002 17
6 N/A  18
7 0003 19
8 N/A  20
9 0003 21
", sep="", header=TRUE, colClasses=c("numeric","character","numeric"), na.strings="N/A")
dat2 <- read.table(text="
ID   Height
0001 3.2
0001 2.6
0001 3.2
0002 2.2
0002 2.6
", sep="", header=TRUE, colClasses=c("character","numeric"))

# Number the rows within each ID; rows with a missing ID keep UniqueID = NA.
dat1[!is.na(dat1$ID),"UniqueID"] <- unlist(lapply(split(dat1,dat1$ID), function(x)
  with(x, as.character(interaction(ID, seq_len(nrow(x)), sep="_")))), use.names=FALSE)

dat2$UniqueID <- unlist(lapply(split(dat2,dat2$ID), function(x)
  with(x, as.character(interaction(ID, seq_len(nrow(x)), sep="_")))), use.names=FALSE)
library(plyr)
join(dat1, dat2, by="UniqueID", type="left")
#  ObsNumber   ID Weight UniqueID   ID Height
#1         1 0001     12   0001_1 0001    3.2
#2         2 0001     13   0001_2 0001    2.6
#3         3 0001     14   0001_3 0001    3.2
#4         4 0002     16   0002_1 0002    2.2
#5         5 0002     17   0002_2 0002    2.6
#6         6 <NA>     18     <NA> <NA>     NA
#7         7 0003     19   0003_1 <NA>     NA
#8         8 <NA>     20     <NA> <NA>     NA
#9         9 0003     21   0003_2 <NA>     NA
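One caveat about the split()/unlist() assignment, given that the IDs are said elsewhere in the thread not to be in the same order in both files: split() groups rows by sorted ID, so the unlisted result comes back in group order, which matches the row order only when each ID's rows are already sorted that way. A small sketch on made-up data (d, by_group and by_row are illustrative names) comparing it with ave(), which writes each value back to the row it came from:

```r
# Toy data with interleaved IDs (not from the thread).
d <- data.frame(ID = c("0002", "0001", "0002", "0001"), stringsAsFactors = FALSE)

# split() orders the groups by ID, so the result follows group order...
by_group <- unlist(lapply(split(d, d$ID), function(x)
  paste(x$ID, seq_len(nrow(x)), sep = "_")), use.names = FALSE)

# ...whereas ave() keeps the original row order.
by_row <- paste(d$ID, ave(seq_along(d$ID), d$ID, FUN = seq_along), sep = "_")

by_group
by_row
```

Assigning by_group to a column of d would mislabel the rows here, while by_row lines up with them.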
A.K.



From: Ye Lin 
To: arun  
Sent: Tuesday, May 7, 2013 4:05 PM
Subject: Re: [R] create unique ID for each group



Yes, I need to keep the N/A records.



On Tue, May 7, 2013 at 1:00 PM, arun  wrote:


>
>Hi
>Do you need an output like this?
> merge(dat1,dat2,by="UniqueID",all.x=TRUE)
>  UniqueID ObsNumber ID.x Weight ID.y Height
>1   0001_1         1 0001     12 0001    3.2
>2   0001_2         2 0001     13 0001    2.6
>3   0001_3         3 0001     14 0001    3.2
>4   0002_1         4 0002     16 0002    2.2
>5   0002_2         5 0002     17 0002    2.6
>6     <NA>         6 <NA>     18 <NA>     NA
>
>when you use:
>
> dat1
>  ObsNumber   ID Weight UniqueID
>1         1 0001     12   0001_1
>2         2 0001     13   0001_2
>3         3 0001     14   0001_3
>4         4 0002     16   0002_1
>5         5 0002     17   0002_2
>6         6 <NA>     18     <NA>
>
> dat2
>    ID Height UniqueID
>1 0001    3.2   0001_1
>2 0001    2.6   0001_2
>3 0001    3.2   0001_3
>4 0002    2.2   0002_1
>5 0002    2.6   0002_2
>
>
>
>
>
>From: Ye Lin 
>To: arun 
>Sent: Tuesday, May 7, 2013 3:41 PM
>
>Subject: Re: [R] create unique ID for each group
>
>
>
>If ID="N/A", then there would not be any match when merging, and N/A can be returned.
>
>I use merge(dat1, dat2, by="UniqueID", all.x=TRUE); then an extra row is
>added to the output for each case in dat1 that has no matching case in dat2.
>
>I just have to keep the records in dat1 even when ID=N/A.
>
>
>
>
>On Tue, May 7, 2013 at 12:38 PM, arun  wrote:
>
>Also, another problem might be where do you assign those rows with missing ID. 
> It could be the missing value for any ID.
>>For example in this case:
>>
>>dat1<- read.table(text="
>>ObsNumber ID Weight
>>1 0001 12
>>2 0001 13
>>3 0001 14
>>4 0002 16
>>5 0002 17
>>6 N/A  18
>>7 0003 19
>>8 0003 20
>>",sep="",header=TRUE,colClasses=c("numeric","character","numeric"),na.strings="N/A")
>> dat1
>>  ObsNumber   ID Weight
>>1         1 0001     12
>>2         2 0001     13
>>3         3 0001     14
>>4         4 0002     16
>>5         5 0002     17
>>6         6 <NA>     18
>>7         7 0003     19
>>8         8 0003     20
>>
>>
>> The missing ID could be either "0002" or "0003". 
>>
>>
>>
>>
>>
>>
>>- Original Message -
>>From: arun 
>>To: Ye Lin 
>>Cc:
>>
>>Sent: Tuesday, May 7, 2013 3:32 PM
>>Subject: Re: [R] create unique ID for each group
>>
>>If you modify with na.strings="N/A", IDs with missing values will be read 
>>correctly.  Otherwise, it is just a character string.  BTW, if you need rows 
>>with NAs, then what will be your UniqueIDs you expect for those rows?
>>
>>
>>
>>
>>
>>
>>
>>
>>From: Ye Lin 
>>To: arun 
>>Sent: Tuesday, May 7, 2013 3:25 PM
>>Subject: Re: [R] create unique ID for each group
>>
>>
>>
>>I do need rows with "NA". I already read the data in R, so do I assume I need 
>>to modify dat1 first with na.strings="N/A" ?
>>
>>
>>
>>On Tue, May 7, 2013 at 12:16 PM, arun  wrote:
>>
>>Hi,
>>>Do you need that row with "N/A".  The code I sent will remove that row.  If 
>>>you don't use "na.strings="N/A", then it is not read NA, but some other 
>>>character.  Tha

Re: [R] create unique ID for each group

2013-05-07 Thread William Dunlap
>I want to merge dat1 and dat2 based on "ID" in order, I know "match" only
>returns the first match it finds. So I am thinking create unique ID col in
>dat1 and dat2, then merge.

You can make a new within-group sequence number with ave():

> dat1<- read.table(text="
ObsNumber ID  Weight
 1 0001 12
 2 0001  13
 3 0001   14
 4  0002 16
  5 0002 17
 6   N/A  18   
",sep="",header=TRUE,colClass=c("numeric","character","numeric"),na.strings="N/A")

> dat1$withinIDSeq <- ave(rep(NA_real_,nrow(dat1)), dat1$ID, FUN=seq_along)
> dat1$withinIDSeq
[1]  1  2  3  1  2 NA

Use merge() with the 'by' columns being ID and withinIDSeq, or paste them
together yourself and use the result as your single 'by' column.
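The two-column merge described above can be sketched as follows. The data mirror the thread, except that dat2 is deliberately shuffled to show that the IDs need not arrive in the same order in both files; add_seq() is a hypothetical helper name, and all.x=TRUE is an assumption made here so that the row with the missing ID is kept.

```r
dat1 <- data.frame(ID = c("0001", "0001", "0001", "0002", "0002", NA),
                   Weight = c(12, 13, 14, 16, 17, 18),
                   stringsAsFactors = FALSE)
# Same heights as in the thread, but with the IDs interleaved.
dat2 <- data.frame(ID = c("0002", "0001", "0001", "0002", "0001"),
                   Height = c(2.2, 3.2, 2.6, 2.6, 3.2),
                   stringsAsFactors = FALSE)

# Hypothetical helper: add the within-ID sequence number computed with ave().
add_seq <- function(d) {
  d$withinIDSeq <- ave(rep(NA_real_, nrow(d)), d$ID, FUN = seq_along)
  d
}
dat1 <- add_seq(dat1)
dat2 <- add_seq(dat2)

# Match the k-th row of each ID in dat1 with the k-th row of that ID in dat2.
merged <- merge(dat1, dat2, by = c("ID", "withinIDSeq"), all.x = TRUE)
merged
```

The row with the missing ID has no counterpart in dat2, so it comes through with Height set to NA.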

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Ye Lin
> Sent: Tuesday, May 07, 2013 11:38 AM
> To: Chris Stubben
> Cc: R help
> Subject: Re: [R] create unique ID for each group
> 
> In each category, the order is the same. For example, the first match in
> dat2 should return to the first record in dat2
> 
> 
> On Tue, May 7, 2013 at 11:31 AM, Chris Stubben  wrote:
> 
> > Yes, I tried, but the order of the IDs in dat1 and dat2 is not exactly the
> >> same, I simplify the data here. So in dat2, it may have records for
> >> ID=0002
> >> first then ID=0001, also I have more than two categories under ID col
> >>
> >
> > I should have looked at the question more closely, sorry.   Unique ids in
> > raw datasets
> > are pretty important, especially if observations are split into different
> > files and you are trying to join them later.  How do you know for ID 0001
> > and obs 1 that height is  3.2 and not 2.6, especially if order in the two
> > files are "not exactly the same".
> >
> >
> > Chris
> >
> >
> >
> > --
> >
> > Chris Stubben
> >
> > Los Alamos National Lab
> > Bioscience Division
> > MS M888
> > Los Alamos, NM 87545
> >


Re: [R] Extracting elements from a matrix using a vector containing indices

2013-05-07 Thread David Winsemius

On May 7, 2013, at 11:52 AM, Mark Coletti wrote:

> On Tue, May 7, 2013 at 4:53 AM, peter dalgaard  wrote:
> 
>> 
>> On May 7, 2013, at 06:39 , Mark Coletti wrote:
>> 
>>> I have a matrix of data that has a corresponding vector of indices.  I
>>> would like to use those indices to extract specific matrix elements into
>> a
>>> new vector.  In other words, I have an R X C matrix with a corresponding
>>> vector of C elements that have numbers mapping into specific elements of
>>> the matrix.  I'd like to generate a new vector C long that contains those
>>> elements.
>> 
>> As Berend says, you're not specifying the problem very clearly. Are you
>> looking for something like this (indexing with a matrix)?
>> 
>> > M <- matrix(round(rnorm(20,20,10)),4,5)
>> > M
>>      [,1] [,2] [,3] [,4] [,5]
>> [1,]   29   19   18   13    0
>> [2,]   24   24   11   25   10
>> [3,]   20    9   12   11   24
>> [4,]   28    3   16   17   32
>> > Ix <- sample(1:4,5,replace=TRUE)
>> > Ix
>> [1] 3 2 2 4 3
>> > M[cbind(Ix,1:5)]
>> [1] 20 24 11 17 24
> 
> 
> I apologize for not being clearer with my question.  Regardless, you were
> able to suss out what I wanted -- using a vector of y coordinates to pull
> individual elements out by column from a matrix.  Your use of cbind() to
> effect this works.
> 
> And using cbind() to extract matrix elements in that way is very
> non-intuitive.  There is no way I would have thought of that on my own.

So read ?Extract more carefully.

It's a very important help page, and failing to read it (ideally at least 5 times)
results in many, many questions on this list.
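For readers who want to see the rule from ?Extract at work: when a matrix is indexed by a two-column numeric matrix, each row of the index is taken as one (row, column) pair. A small sketch with a deterministic matrix, so the result can be checked by hand; the names picked and same are illustrative only:

```r
M <- matrix(1:20, nrow = 4, ncol = 5)   # M[i, j] equals i + 4 * (j - 1)
Ix <- c(3, 2, 2, 4, 3)                  # one row index per column

# Each row of cbind(Ix, 1:5) is a (row, column) pair.
picked <- M[cbind(Ix, seq_len(ncol(M)))]

# The same elements through column-major linear indices.
same <- M[Ix + nrow(M) * (seq_len(ncol(M)) - 1)]

picked
```

Both forms return one element per column; the matrix form is usually the easier one to read.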

-- 

David Winsemius
Alameda, CA, USA



Re: [R] create unique ID for each group

2013-05-07 Thread arun
Hi Ye,

For the NA in the ID column: with na.strings="N/A" the missing ID is read as NA,
and split() drops that row, so only five IDs are returned:

dat1 <- read.table(text="
ObsNumber ID Weight
1 0001 12
2 0001 13
3 0001 14
4 0002 16
5 0002 17
6 N/A  18
", sep="", header=TRUE, colClasses=c("numeric","character","numeric"), na.strings="N/A")
unlist(lapply(split(dat1,dat1$ID), function(x)
  with(x, as.character(interaction(ID, seq_len(nrow(x)), sep="_")))), use.names=FALSE)
#[1] "0001_1" "0001_2" "0001_3" "0002_1" "0002_2"
A.K.

From: Ye Lin 
To: arun  
Cc: R help  
Sent: Tuesday, May 7, 2013 2:54 PM
Subject: Re: [R] create unique ID for each group



Thanks A.K. But I have "NA" in the ID column, so when I apply the code, it gives me
an error saying the replacement has fewer rows than the data has. Is there any way,
for ID=N/A, to return something like "N/A_1" in order as well?






On Tue, May 7, 2013 at 11:17 AM, arun  wrote:

Hi,
>Sorry, a mistake:
>dat1$UniqueID<-unlist(lapply(split(dat1,dat1$ID),function(x)
>with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
>dat1
>#  ObsNumber   ID Weight UniqueID
>#1         1 0001     12   0001_1
>#2         2 0001     13   0001_2
>#3         3 0001     14   0001_3
>#4         4 0002     16   0002_1
>#5         5 0002     17   0002_2
>
>dat2$UniqueID<-unlist(lapply(split(dat2,dat2$ID),function(x)
>with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
>
>A.K.
>
>
>
>
>
>- Original Message -
>
>From: arun 
>To: Ye Lin 
>Cc: R help 
>Sent: Tuesday, May 7, 2013 2:10 PM
>Subject: Re: [R] create unique ID for each group
>
>
>
>Hi,
>
>Try this:
>dat1<- read.table(text="
>ObsNumber ID Weight
>1 0001 12
>2 0001 13
>3 0001 14
>4 0002 16
>5 0002 17
>",sep="",header=TRUE,colClasses=c("numeric","character","numeric"))
>dat2<- read.table(text="
>ID Height
>0001 3.2
>0001 2.6
>0001 3.2
>0002 2.2
>0002 2.6
>",sep="",header=TRUE,colClasses=c("character","numeric"))
>dat1$UniqueID<-with(dat1,as.character(interaction(ID,ObsNumber,sep="_")))
>dat2$UniqueID<-with(dat2,as.character(interaction(ID,rownames(dat2),sep="_")))
>dat2
>#    ID Height UniqueID
>#1 0001    3.2   0001_1
>#2 0001    2.6   0001_2
>#3 0001    3.2   0001_3
>#4 0002    2.2   0002_4
>#5 0002    2.6   0002_5
>A.K.
>
>
>
>- Original Message -
>From: Ye Lin 
>To: R help 
>Cc:
>Sent: Tuesday, May 7, 2013 1:54 PM
>Subject: [R] create unique ID for each group
>
>Hey All,
>
>I have a dataset (dat1) like this:
>
>ObsNumber   ID     Weight
>1           0001   12
>2           0001   13
>3           0001   14
>4           0002   16
>5           0002   17
>
>And another dataset (dat2) like this:
>
>ID     Height
>0001   3.2
>0001   2.6
>0001   3.2
>0002   2.2
>0002   2.6
>
>I want to merge dat1 and dat2 based on "ID", in order. I know "match" only
>returns the first match it finds, so I am thinking of creating a unique ID column in
>dat1 and dat2 and then merging. But I don't know how to do that so that it looks like
>this:
>
>dat1:
>
>ObsNumber   ID     Weight   UniqueID
>1           0001   12       0001_1
>2           0001   13       0001_2
>3           0001   14       0001_3
>4           0002   16       0002_1
>5           0002   17       0002_2
>
>dat2:
>
>ID     Height   UniqueID
>0001   3.2      0001_1
>0001   2.6      0001_2
>0001   3.2      0001_3
>0002   2.2      0002_1
>0002   2.6      0002_2
>
>Or, if it is possible to merge dat1 and dat2 by matching "ID" but returning the
>matches in order, that would be great!
>
>Thanks for your help!


Re: [R] how to calculate the mean in a period of time?

2013-05-07 Thread arun
Hi,
Your question is still not clear.
Maybe this helps:

dat2 <- read.table(text="
patient_id t scores
1 0 1.6
1 1 2.6
1 2 2.2
1 3 1.8
2 0 2.3
2 2 2.5
2 4 2.6
2 5 1.5
", sep="", header=TRUE)

library(plyr)
# fill in the missing months for each patient (their scores become NA)
dat2New <- ddply(dat2, .(patient_id), summarize, t=seq(min(t),max(t)))
res <- join(dat2New, dat2, by=c("patient_id","t"), type="full")
# period 1 covers t = 0..3, period 2 covers t = 4..6, and so on
res1 <- do.call(rbind, lapply(split(res,res$patient_id), function(x) {
  x1 <- x[x$t!=0,]
  do.call(rbind, lapply(split(x1,((x1$t-1)%/%3)+1), function(y) {
    y1 <- if(any(y$t==1)) rbind(x[x$t==0,],y) else y
    data.frame(patient_id=unique(y1$patient_id), scores=mean(y1$scores,na.rm=TRUE))
  }))
}))
row.names(res1) <- 1:nrow(res1)
# seq_along (not seq) so that a patient with a single period is numbered correctly
res1$period <- with(res1, ave(patient_id,patient_id,FUN=seq_along))
res1
#  patient_id scores period
#1  1   2.05  1
#2  2   2.40  1
#3  2   2.05  2
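The same table can also be reached without plyr by computing the period directly from t and averaging with aggregate(). This is a sketch under the period definition inferred from the expected output (t = 0..3 is period 1, t = 4..6 is period 2, and so on); that definition is an assumption, since the question does not state it explicitly.

```r
dat2 <- data.frame(patient_id = c(1, 1, 1, 1, 2, 2, 2, 2),
                   t          = c(0, 1, 2, 3, 0, 2, 4, 5),
                   scores     = c(1.6, 2.6, 2.2, 1.8, 2.3, 2.5, 2.6, 1.5))

# Months 1-3 -> period 1, months 4-6 -> period 2, ...; t = 0 is folded into period 1.
dat2$period <- pmax((dat2$t - 1) %/% 3 + 1, 1)

# Mean score per patient and period.
res <- aggregate(scores ~ patient_id + period, data = dat2, FUN = mean)
res <- res[order(res$patient_id, res$period), ]
res
```

Months with no measurement simply contribute nothing to the mean, so there is no need to fill them in first.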


A.K.




From: GUANGUAN LUO 
To: arun  
Sent: Tuesday, May 7, 2013 11:29 AM
Subject: Re: how to calculate the mean in a period of time?



Yes, as you said, it's probably not continuous.


2013/5/7 arun 

Hi,
>Your question is not clear.  You mentioned calculating the mean over 3 months,
>but in fact you averaged the scores for t=0,1,2,3 as the first 3 months, then possibly
>4,5,6 as the next.  So it is not exactly three months, is it?
>
>
>Dear R experts,
>sorry to trouble you again.
>My data is like this now :
>patient_id      t         scores
>1                      0                1.6
>1                      1                2.6
>1                      2                 2.2
>1                      3                 1.8
>2                      0                  2.3
>2                       2                 2.5
>2                      4                  2.6
>2                       5                 1.5
>
>I want to calculate the mean over each period of 3 months, to get a table like this:
>
>patient_id     period     scores
>1              1          2.05      (1.6+2.6+2.2+1.8)/4
>2              1          2.4       (2.3+2.5)/2
>2              2          2.05      (2.6+1.5)/2
>
>Thank you in advance.
>



Re: [R] create unique ID for each group

2013-05-07 Thread Ye Lin
Thanks A.K. But I have "NA" in the ID column, so when I apply the code, it
gives me an error saying the replacement has fewer rows than the data has.
Is there any way, for ID=N/A, to return something like "N/A_1" in order as well?





On Tue, May 7, 2013 at 11:17 AM, arun  wrote:

> Hi,
> Sorry, a mistake:
> dat1$UniqueID<-unlist(lapply(split(dat1,dat1$ID),function(x)
> with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
> dat1
> #  ObsNumber   ID Weight UniqueID
> #1         1 0001     12   0001_1
> #2         2 0001     13   0001_2
> #3         3 0001     14   0001_3
> #4         4 0002     16   0002_1
> #5         5 0002     17   0002_2
>
> dat2$UniqueID<-unlist(lapply(split(dat2,dat2$ID),function(x)
> with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
> A.K.
>
>
>
>
>
> - Original Message -
> From: arun 
> To: Ye Lin 
> Cc: R help 
> Sent: Tuesday, May 7, 2013 2:10 PM
> Subject: Re: [R] create unique ID for each group
>
>
>
> Hi,
>
> Try this:
> dat1<- read.table(text="
> ObsNumber ID Weight
> 1 0001 12
> 2 0001 13
> 3 0001 14
> 4 0002 16
> 5 0002 17
> ",sep="",header=TRUE,colClasses=c("numeric","character","numeric"))
> dat2<- read.table(text="
> ID Height
> 0001 3.2
> 0001 2.6
> 0001 3.2
> 0002 2.2
> 0002 2.6
> ",sep="",header=TRUE,colClasses=c("character","numeric"))
> dat1$UniqueID<-with(dat1,as.character(interaction(ID,ObsNumber,sep="_")))
> dat2$UniqueID<-with(dat2,as.character(interaction(ID,rownames(dat2),sep="_")))
> dat2
> #    ID Height UniqueID
> #1 0001    3.2   0001_1
> #2 0001    2.6   0001_2
> #3 0001    3.2   0001_3
> #4 0002    2.2   0002_4
> #5 0002    2.6   0002_5
> A.K.
>
>
>
> - Original Message -
> From: Ye Lin 
> To: R help 
> Cc:
> Sent: Tuesday, May 7, 2013 1:54 PM
> Subject: [R] create unique ID for each group
>
> Hey All,
>
> I have a dataset (dat1) like this:
>
> ObsNumber   ID     Weight
> 1           0001   12
> 2           0001   13
> 3           0001   14
> 4           0002   16
> 5           0002   17
>
> And another dataset (dat2) like this:
>
> ID     Height
> 0001   3.2
> 0001   2.6
> 0001   3.2
> 0002   2.2
> 0002   2.6
>
> I want to merge dat1 and dat2 based on "ID", in order. I know "match" only
> returns the first match it finds, so I am thinking of creating a unique ID column in
> dat1 and dat2 and then merging. But I don't know how to do that so that it looks like
> this:
>
> dat1:
>
> ObsNumber   ID     Weight   UniqueID
> 1           0001   12       0001_1
> 2           0001   13       0001_2
> 3           0001   14       0001_3
> 4           0002   16       0002_1
> 5           0002   17       0002_2
>
> dat2:
>
> ID     Height   UniqueID
> 0001   3.2      0001_1
> 0001   2.6      0001_2
> 0001   3.2      0001_3
> 0002   2.2      0002_1
> 0002   2.6      0002_2
>
> Or, if it is possible to merge dat1 and dat2 by matching "ID" but returning the
> matches in order, that would be great!
>
> Thanks for your help!
>


Re: [R] Extracting elements from a matrix using a vector containing indices

2013-05-07 Thread Mark Coletti
On Tue, May 7, 2013 at 4:53 AM, peter dalgaard  wrote:

>
> On May 7, 2013, at 06:39 , Mark Coletti wrote:
>
> > I have a matrix of data that has a corresponding vector of indices.  I
> > would like to use those indices to extract specific matrix elements into
> a
> > new vector.  In other words, I have an R X C matrix with a corresponding
> > vector of C elements that have numbers mapping into specific elements of
> > the matrix.  I'd like to generate a new vector C long that contains those
> > elements.
>
> As Berend says, you're not specifying the problem very clearly. Are you
> looking for something like this (indexing with a matrix)?
>
> > M <- matrix(round(rnorm(20,20,10)),4,5)
> > M
>      [,1] [,2] [,3] [,4] [,5]
> [1,]   29   19   18   13    0
> [2,]   24   24   11   25   10
> [3,]   20    9   12   11   24
> [4,]   28    3   16   17   32
> > Ix <- sample(1:4,5,replace=TRUE)
> > Ix
> [1] 3 2 2 4 3
> > M[cbind(Ix,1:5)]
> [1] 20 24 11 17 24


I apologize for not being clearer with my question.  Regardless, you were
able to suss out what I wanted -- using a vector of y coordinates to pull
individual elements out by column from a matrix.  Your use of cbind() to
effect this works.

And using cbind() to extract matrix elements in that way is very
non-intuitive.  There is no way I would have thought of that on my own.
 Thanks!



Re: [R] create unique ID for each group

2013-05-07 Thread Ye Lin
In each category, the order is the same. For example, the first match in
dat2 should return to the first record in dat2


On Tue, May 7, 2013 at 11:31 AM, Chris Stubben  wrote:

> Yes, I tried, but the order of the IDs in dat1 and dat2 is not exactly the
>> same, I simplify the data here. So in dat2, it may have records for
>> ID=0002
>> first then ID=0001, also I have more than two categories under ID col
>>
>
> I should have looked at the question more closely, sorry.   Unique ids in
> raw datasets
> are pretty important, especially if observations are split into different
> files and you are trying to join them later.  How do you know for ID 0001
> and obs 1 that height is  3.2 and not 2.6, especially if order in the two
> files are "not exactly the same".
>
>
> Chris
>
>
>
> --
>
> Chris Stubben
>
> Los Alamos National Lab
> Bioscience Division
> MS M888
> Los Alamos, NM 87545
>
>



Re: [R] create unique ID for each group

2013-05-07 Thread Chris Stubben

> Yes, I tried, but the order of the IDs in dat1 and dat2 is not exactly the
> same, I simplify the data here. So in dat2, it may have records for ID=0002
> first then ID=0001, also I have more than two categories under ID col


I should have looked at the question more closely, sorry.  Unique IDs in raw
datasets are pretty important, especially if observations are split into
different files and you are trying to join them later.  How do you know, for
ID 0001 and obs 1, that height is 3.2 and not 2.6, especially if the order in
the two files is "not exactly the same"?


Chris



--

Chris Stubben

Los Alamos National Lab
Bioscience Division
MS M888
Los Alamos, NM 87545



[R] Question about fitting a periodic model in glmm

2013-05-07 Thread Marc Girondot
I would like to fit a periodic (annual) model with a GLMM. Here is the
script I am using:

# Generate "dummy" periodic counts with effect of a covariate co
# of course I plan to use this script on my own data !
d <- 1:500
co <- rnorm(500, 10, 2)
yco <- (1+sin(2*pi*(d+100)/365))*10*co/10+co
y <- floor(rnorm(500, yco, 10))
df1 <- data.frame(days=d, number=y, covariate=co, ID=1)
df1[df1$number<0, "number"] <- 0

# Just to look that all is ok:
plot(df1$days, df1$number, type="l", ylim=c(0,80), bty="n")
plot(df1$number, df1$covariate, bty="n")

# days is a fixed effect (I choose the days of observation)
# covariate is a random effect
# I fit the periodic effect of days as:
#   sin( 2*3.1416 * (days) / 365) + cos( 2*3.1416 * (days) / 365)

library(MASS)
fit <- glmmPQL ( number ~ covariate +
   sin( 2*3.1416 * (days) / 365) +
   cos( 2*3.1416 * (days) / 365),
 family=quasipoisson(link = "log"), data=df1,
 random = ~ 1+covariate | ID)

# test for the effects
library(spida)
wald( fit, list("covariate", "days"))

# predictions: all is good !
plot(df1$days, df1$number, type="l", ylim=c(0, 80), bty="n", main="ID=1")
par(new=TRUE)
newd1 <- data.frame(days=d, covariate=5, ID=1)
p1 <- predict(fit, newd1)
plot(d, exp(p1), type="l", col="red", ylim=c(0, 80),
 bty="n", axes=FALSE, xlab="", ylab="")
par(new=TRUE)
newd1 <- data.frame(days=d, covariate=10, ID=1)
p1 <- predict(fit, newd1)
plot(d, exp(p1), type="l", col="green", ylim=c(0, 80),
 bty="n", axes=FALSE, xlab="", ylab="")
par(new=TRUE)
newd1 <- data.frame(days=d, covariate=15, ID=1)
p1 <- predict(fit, newd1)
plot(d, exp(p1), type="l", col="blue", ylim=c(0, 80),
 bty="n", axes=FALSE, xlab="", ylab="")
legend("topleft", legend=c("covariate=5", "covariate=10", 
"covariate=15"), lty=1, col=c("red", "green", "blue"))


Now my questions:
- The periodic effect has two components, sin and cos, both depending on
"days". The Wald test therefore gives me two p-values for the "days" effect
(one for sin and one for cos). How can I combine these two p-values to
get a single test of the overall periodic effect?
- If I want to set up an interaction between the periodic effect of "days"
and "covariate", how can I do it, given that "days" appears in two terms
(sin and cos)?


A third, related question: is it possible to fit a GLMM with a negative
binomial distribution? I have not found a way yet...


... or perhaps you have a better way to do all of that!
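
One possible way to approach the first two questions, sketched here with a plain Poisson glm() on simulated data rather than the glmmPQL fit in the script (an assumption made only to keep the example self-contained; the names dat, sin_d, cos_d, fit_full, fit_null and fit_int are illustrative). The idea is to test the sin and cos coefficients jointly with a 2-df Wald statistic, or to compare nested models, and to write the interaction as covariate * (sin_d + cos_d):

```r
set.seed(1)
d  <- 1:500
co <- rnorm(500, 10, 2)

# Simulated counts with a known annual cycle (coefficients are made up).
dat <- data.frame(days = d, covariate = co,
                  sin_d = sin(2 * pi * d / 365),
                  cos_d = cos(2 * pi * d / 365))
dat$number <- rpois(500, exp(1 + 0.05 * dat$covariate +
                             0.5 * dat$sin_d + 0.3 * dat$cos_d))

fit_full <- glm(number ~ covariate + sin_d + cos_d,
                family = poisson, data = dat)

# Joint Wald test of the two periodic coefficients:
# W = b' V^-1 b is compared with a chi-square on 2 degrees of freedom.
idx     <- c("sin_d", "cos_d")
b       <- coef(fit_full)[idx]
V       <- vcov(fit_full)[idx, idx]
W       <- as.numeric(t(b) %*% solve(V) %*% b)
p_joint <- pchisq(W, df = length(idx), lower.tail = FALSE)

# Alternative: likelihood-ratio comparison with the model without the cycle.
fit_null <- glm(number ~ covariate, family = poisson, data = dat)
lrt <- anova(fit_null, fit_full, test = "Chisq")

# Interaction of the covariate with the whole periodic effect:
# expands to covariate + sin_d + cos_d + covariate:sin_d + covariate:cos_d.
fit_int <- glm(number ~ covariate * (sin_d + cos_d),
               family = poisson, data = dat)
```

For a glmmPQL fit, the same Wald statistic could presumably be built from fixef(fit) and vcov(fit) in place of coef() and vcov() above, and for a negative binomial mixed model glmer.nb() in lme4 may be worth a look. Neither suggestion has been checked against this data set.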

Thanks a lot.

Marc Girondot



Re: [R] create unique ID for each group

2013-05-07 Thread Ye Lin
Yes, I tried, but the order of the IDs in dat1 and dat2 is not exactly the
same; I simplified the data here. In dat2, records for ID=0002 may come
before those for ID=0001, and there are more than two categories in the ID
column.


On Tue, May 7, 2013 at 10:57 AM, Chris Stubben  wrote:

> > I want to merge dat1 and dat2 based on "ID" in order
>
> Have you tried merge(dat1, dat2) ?
>
> If ID is the common column (and no others), then that should be all you
> need to join (see ?merge).  And then order if needed.
>
> Chris
>
>
>
> --
>
> Chris Stubben
>
> Los Alamos National Lab
> Bioscience Division
> MS M888
> Los Alamos, NM 87545
>
>



Re: [R] create unique ID for each group

2013-05-07 Thread arun
Hi,
Sorry, a mistake:
dat1$UniqueID<-unlist(lapply(split(dat1,dat1$ID),function(x)
with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
dat1
#  ObsNumber   ID Weight UniqueID
#1         1 0001     12   0001_1
#2         2 0001     13   0001_2
#3         3 0001     14   0001_3
#4         4 0002     16   0002_1
#5         5 0002     17   0002_2

dat2$UniqueID<-unlist(lapply(split(dat2,dat2$ID),function(x)
with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
A.K.
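
A variation on the same idea, as a sketch only: ave() can number the rows within each ID in the original row order, so it does not matter whether the IDs are sorted the same way in both data frames. The helper add_uid() and the deliberately reordered dat2 below are made up for illustration:

```r
dat1 <- data.frame(ObsNumber = 1:5,
                   ID = c("0001", "0001", "0001", "0002", "0002"),
                   Weight = c(12, 13, 14, 16, 17),
                   stringsAsFactors = FALSE)

# dat2 lists ID 0002 first on purpose, to mimic files in different order.
dat2 <- data.frame(ID = c("0002", "0002", "0001", "0001", "0001"),
                   Height = c(2.2, 2.6, 3.2, 2.6, 3.2),
                   stringsAsFactors = FALSE)

# Hypothetical helper: append a running counter within each ID.
add_uid <- function(d) {
  k <- ave(seq_along(d$ID), d$ID, FUN = seq_along)  # 1, 2, ... within each ID
  d$UniqueID <- paste(d$ID, k, sep = "_")
  d
}

dat1 <- add_uid(dat1)
dat2 <- add_uid(dat2)

# Join the n-th record of each ID in dat1 to the n-th record of that ID in dat2.
merged <- merge(dat1, dat2, by = c("ID", "UniqueID"))
merged
```

This still relies on the within-ID order being meaningful in both files, which is the point raised elsewhere in the thread.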







Re: [R] create unique ID for each group

2013-05-07 Thread arun


Hi,

Try this:
dat1<- read.table(text="
ObsNumber ID  Weight
 1 0001 12
 2 0001  13
 3 0001   14
 4  0002 16
  5 0002 17
",sep="",header=TRUE,colClass=c("numeric","character","numeric"))
dat2<- read.table(text="
ID   Height
0001    3.2
0001 2.6
0001 3.2
0002 2.2
0002  2.6
",sep="",header=TRUE,colClass=c("character","numeric")) 
dat1$UniqueID<-with(dat1,as.character(interaction(ID,ObsNumber,sep="_")))
 dat2$UniqueID<-with(dat2,as.character(interaction(ID,rownames(dat2),sep="_")))
 dat2
#    ID Height UniqueID
#1 0001    3.2   0001_1
#2 0001    2.6   0001_2
#3 0001    3.2   0001_3
#4 0002    2.2   0002_4
#5 0002    2.6   0002_5
A.K.



- Original Message -
From: Ye Lin 
To: R help 
Cc: 
Sent: Tuesday, May 7, 2013 1:54 PM
Subject: [R] create unique ID for each group

Hey All,

I have a dataset(dat1) like this:

ObsNumber     ID          Weight
     1                 0001         12
     2                 0001          13
     3                 0001           14
     4                  0002         16
      5                 0002         17

And another dataset(dat2) like this:

ID               Height
0001            3.2
0001             2.6
0001             3.2
0002             2.2
0002              2.6

I want to merge dat1 and dat2 based on "ID" in order. I know "match" only
returns the first match it finds, so I am thinking of creating a unique ID
column in dat1 and dat2 and then merging. But I don't know how to do that so
that it looks like this:

dat1:

ObsNumber     ID          Weight  UniqueID
     1                 0001         12         0001_1
     2                 0001          13        0001_2
     3                 0001           14       0001_3
     4                  0002         16         0002_1
      5                 0002         17         0002_2

dat2:

ID               Height   UniqueID
0001            3.2          0001_1
0001             2.6         0001_2
0001             3.2         0001_3
0002             2.2         0002_1
0002              2.6        0002_2

Or if it is possible to merge dat1 and dat2 by matching "ID" but return the
match in order that would be great!

Thanks for your help!



Re: [R] recode categorial vars into binary data

2013-05-07 Thread Chris Stubben


> First off, stop using cbind() when it is not needed. You will not see the reason
> when the columns are all numeric but you will start experiencing pain and puzzlement
> when the arguments are of mixed classes. The data.frame function will do what you want.
> (Where do people pick up this practice anyway?)


Maybe from help( data.frame)?

It's in most of the  examples and is not needed ...

L3 <- LETTERS[1:3]
(d <- data.frame(cbind(x=1, y=1:10), fac=sample(L3, 10, replace=TRUE)))

## The same with automatic column names:

data.frame(cbind(  1,   1:10), sample(L3, 10, replace=TRUE))



Chris




--

Chris Stubben

Los Alamos National Lab
Bioscience Division
MS M888
Los Alamos, NM 87545



Re: [R] create unique ID for each group

2013-05-07 Thread Chris Stubben

> I want to merge dat1 and dat2 based on "ID" in order

Have you tried merge(dat1, dat2) ?

If ID is the common column (and no others), then that should be all you 
need to join (see ?merge).  And then order if needed.


Chris



--

Chris Stubben

Los Alamos National Lab
Bioscience Division
MS M888
Los Alamos, NM 87545



[R] create unique ID for each group

2013-05-07 Thread Ye Lin
Hey All,

I have a dataset(dat1) like this:

ObsNumber ID   Weight
1         0001 12
2         0001 13
3         0001 14
4         0002 16
5         0002 17

And another dataset(dat2) like this:

ID   Height
0001 3.2
0001 2.6
0001 3.2
0002 2.2
0002 2.6

I want to merge dat1 and dat2 based on "ID" in order. I know "match" only
returns the first match it finds, so I am thinking of creating a unique ID
column in dat1 and dat2 and then merging. But I don't know how to do that so
that it looks like this:

dat1:

ObsNumber ID   Weight UniqueID
1         0001 12     0001_1
2         0001 13     0001_2
3         0001 14     0001_3
4         0002 16     0002_1
5         0002 17     0002_2

dat2:

ID   Height UniqueID
0001 3.2    0001_1
0001 2.6    0001_2
0001 3.2    0001_3
0002 2.2    0002_1
0002 2.6    0002_2

Or if it is possible to merge dat1 and dat2 by matching "ID" but return the
match in order that would be great!

Thanks for your help!



Re: [R] recode categorial vars into binary data

2013-05-07 Thread David Winsemius

On May 7, 2013, at 9:20 AM, D. Alain wrote:

> Dear R-List, 
> 
> I would like to recode categorial variables into binary data, so that all 
> values above median are coded 1 and all values below 0, separating each var 
> into two equally large groups (e.g. good performers = 0 vs. bad performers 
> =1).
> 
> I have not succeeded so far in finding a nice solution to do that in R. I 
> thought there might be a better way than ordering each column and recoding 
> the first 50% into 0 and the second into 1. If I use ifelse I have a problem 
> with cases that share the same rank being all median. 
> 
> e.g.
> df<-as.data.frame(cbind(snr=c(1,2,3,4,5,6,7,8,9,10),k1=c(1,1,4,2,3,2,2,5,2,2),k2=c(1,2,3,2,1,2,1,3,3,2),result=c(4,3,5,4,2,6,4,4,2,3)))

First off, stop using cbind() when it is not needed. You will not see the 
reason when the columns are all numeric but you will start experiencing pain 
and puzzlement when the arguments are of mixed classes. The data.frame function 
will do what you want. (Where do people pick up this practice anyway?)
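
A small sketch of the mixed-class problem, using made-up columns (snr, grp) and object names (via_cbind, direct) that are not from the original post. cbind() builds a matrix first, and a matrix holds a single type, so one character column drags the numeric ones along with it:

```r
# Going through cbind(): everything becomes character before the
# data frame is even created.
via_cbind <- as.data.frame(cbind(snr = 1:3, grp = c("a", "b", "c")),
                           stringsAsFactors = FALSE)

# Calling data.frame() directly keeps each column's own class.
direct <- data.frame(snr = 1:3, grp = c("a", "b", "c"),
                     stringsAsFactors = FALSE)

sapply(via_cbind, class)
sapply(direct, class)
```

With all-numeric columns, as in the posted example, the two routes give the same result, which is why the habit goes unnoticed.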



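# rank(), not order(), gives each value's position in the sorted column;
# ties.method = "first" breaks ties so that the two halves are equal in size
df[,2] <- as.numeric( rank(df[,2], ties.method = "first") > length(df[,2])/2 )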




> 
> now I want to recode k1 and k2 so that I have half of the values recoded 0 
> and half recoded 1, split around the median point. The median of k1 is 2 
> which would lead to unequal groupsize if used 2 as cutoff, so all values k1=2 
> should be recoded 1 or 0 randomly until both categories have the same length.
> 
> something like
> 
> df.rec<-as.data.frame(cbind(snr=c(1,2,3,4,5,6,7,8,9,10),k1=c(0,0,1,0,1,1,0,1,0,1),k2=c(0,1,1,0,0,1,0,1,1,0),result=c(4,3,5,4,2,6,4,4,2,3)))
> 
> Can anyone help?
> 
> Thank you in advance.
> 
> Best wishes.
> Alain  

David Winsemius
Alameda, CA, USA



Re: [R] recode categorial vars into binary data

2013-05-07 Thread Rui Barradas

Hello,

First of all, you don't need as.data.frame(cbind(...)). It's much better 
to simply do data.frame(...).
As for the conversion, the following function doesn't use randomness but 
gets the job done




df <- data.frame(snr=c(1,2,3,4,5,6,7,8,9,10),
k1=c(1,1,4,2,3,2,2,5,2,2),
k2=c(1,2,3,2,1,2,1,3,3,2),
result=c(4,3,5,4,2,6,4,4,2,3))

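fun <- function(x){
    n <- length(x)
    y <- rep(NA, n)
    y[x < median(x)] <- 0
    y[x > median(x)] <- 1
    # values equal to the median: send just enough of them to group 0
    # to fill it up to n/2; the rest go to group 1
    w <- which(x == median(x))
    y[w[seq_len(n/2 - length(which(x < median(x))))]] <- 0
    y[is.na(y)] <- 1
    y
}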

fun(df$k1)
fun(df$k2)
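
If the ties at the median really should be assigned at random, as the original question asked, one possible sketch uses rank() with ties.method = "random". The function name split_at_median is made up for illustration, and set.seed() is only there to make the run repeatable:

```r
# Code the upper half of the ranks as 1 and the lower half as 0;
# tied values are ranked in random order, so values at the median
# fall on either side by chance.
split_at_median <- function(x) {
  as.integer(rank(x, ties.method = "random") > length(x) / 2)
}

set.seed(42)
k1 <- c(1, 1, 4, 2, 3, 2, 2, 5, 2, 2)
r1 <- split_at_median(k1)
table(r1)
```

For an even number of observations both groups have the same size; with an odd number, group 1 gets the extra observation.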



Hope this helps,

Rui Barradas

Em 07-05-2013 17:20, D. Alain escreveu:

Dear R-List,

I would like to recode categorial variables into binary data, so that all 
values above median are coded 1 and all values below 0, separating each var 
into two equally large groups (e.g. good performers = 0 vs. bad performers =1).

I have not succeeded so far in finding a nice solution to do that in R. I 
thought there might be a better way than ordering each column and recoding the 
first 50% into 0 and the second into 1. If I use ifelse I have a problem with 
cases that share the same rank being all median.

e.g.
df<-as.data.frame(cbind(snr=c(1,2,3,4,5,6,7,8,9,10),k1=c(1,1,4,2,3,2,2,5,2,2),k2=c(1,2,3,2,1,2,1,3,3,2),result=c(4,3,5,4,2,6,4,4,2,3)))

now I want to recode k1 and k2 so that I have half of the values recoded 0 and 
half recoded 1, split around the median point. The median of k1 is 2 which 
would lead to unequal groupsize if used 2 as cutoff, so all values k1=2 should 
be recoded 1 or 0 randomly until both categories have the same length.

something like

df.rec<-as.data.frame(cbind(snr=c(1,2,3,4,5,6,7,8,9,10),k1=c(0,0,1,0,1,1,0,1,0,1),k2=c(0,1,1,0,0,1,0,1,1,0),result=c(4,3,5,4,2,6,4,4,2,3)))

Can anyone help?

Thank you in advance.

Best wishes.
Alain


Re: [R] How does one set up logical functions?

2013-05-07 Thread Gabor Grothendieck
On Tue, May 7, 2013 at 11:06 AM, Gabor Grothendieck
 wrote:
> On Tue, May 7, 2013 at 10:02 AM, Neotropical bat risk assessments
>  wrote:
>> Hi all,
>>
>> I am trying to set up logical function(s) to deal with two adjustments
>> to a blood glucose value.
>> I have been dinking around in Excel and assume this will be much easier
>> in R.
>>
>> DF is date-time, BG value in mg/dL, test strip
>> 4/3/13 19:20   105  Aviva-491350
>> 4/4/13 21:03    74  Aviva-491350
>> 4/6/13 17:40    81  Aviva-491640
>> 4/6/13 17:40    82  Aviva-491350
>> 4/6/13 22:48   106  Aviva-491640
>> 4/6/13 22:48   102  Aviva-491350
>> 4/7/13 5:32     87  Aviva-491350
>> 4/7/13 5:32    103  Aviva-491640
>>
>>
>> What I need are the high and low ranges based on "acceptable" standards
>> of the measured values.
>>
>> The logical expressions need to be
>> IF BG =>100 then "High limit" would = (BG+(BG*.15))
>> IF BG =>100 then "Low limit" would = (BG-(BG*.15))
>> and
>> IF BG <100 then "High limit" would = (BG+15)
>> IF BG <100 then "Low limit" would = (BG-15)
>>
>> The standards are written as: 95% of the individual glucose results
>> shall fall within ±15 mg/dL of the reference results at glucose
>> concentrations less than 100 mg/dL and within ±15% at glucose
>> concentrations greater than or equal to 100 mg/dL.
>>
>> Then I need to plot the measured value and also show the high & low
>> "acceptable" values.
>>
>
> Here it is using ggplot2:
>

Here it is again with some fixes and also reading the input data so
it's all self-contained:

library(ggplot2)
library(gridExtra)
library(grid)  # provides gpar(), in case gridExtra does not attach grid

Lines <- "date time BG test_strip
4/3/13 19:20  105 Aviva-491350
4/4/13 21:03   74 Aviva-491350
4/6/13 17:40   81 Aviva-491640
4/6/13 17:40   82 Aviva-491350
4/6/13 22:48  106 Aviva-491640
4/6/13 22:48  102 Aviva-491350
4/7/13 5:32    87 Aviva-491350
4/7/13 5:32   103 Aviva-491640"

DF <- read.table(text = Lines, header = TRUE)

DF2 <- transform(DF,
   datetime = as.POSIXct(paste(date, time), format = "%m/%d/%y %H:%M"),
   lower = ifelse(BG < 100, BG - 15, BG * 0.85),
   upper = ifelse(BG < 100, BG + 15, BG * 1.15))

ggplot(DF2, aes(datetime, BG)) +
   geom_point() +
   geom_line() +
   geom_smooth(aes(ymin = lower, ymax = upper), stat = "identity") +
   geom_linerange(aes(ymin = lower, ymax = upper)) +
   annotation_custom(tableGrob(DF2, gp = gpar(cex = 0.5)), ymin = 120) +
   coord_cartesian(ylim = c(60, 150)) +
   xlab("") +
   ylab("Blood Glucose") +
   ggtitle("Blood Glucose Levels")
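
The limit rule on its own, without the plotting, can be sketched as a small function. The name bg_limits is made up for illustration; the rule is the one stated in the question (plus or minus 15 mg/dL below 100, plus or minus 15% at 100 and above):

```r
# Acceptable range around each measured blood glucose value.
bg_limits <- function(bg) {
  data.frame(BG    = bg,
             lower = ifelse(bg < 100, bg - 15, bg * 0.85),
             upper = ifelse(bg < 100, bg + 15, bg * 1.15))
}

lim <- bg_limits(c(105, 74, 81, 82, 106, 102, 87, 103))
lim
```

ifelse() is vectorised, so the whole column is handled in one call and no explicit IF/THEN loop is needed.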



--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Balanced design throws "design unbalanced, cannot proceed" error

2013-05-07 Thread David Winsemius

On May 7, 2013, at 8:33 AM, Krysta Chauncey wrote:

> I think this means an unequal sample in different conditions. But it seems
> to mean something else. . .
> 
> I have a data set like below
> 
> particip  group  device  width  length  accep  thresh  rating  d-rating
> 1         RA     Dingo   nom    nom     Y      5       8       3
> 1         RA     Dingo   nom    long    Y      4       6       2
> 1         RA     Dingo   fat    nom     Y      4       6       2
> 1         RA     Dingo   fat    long    N      6       4       -2
> 
> and I'm running an ANOVA on it like so
> 
> aov.AMIDS_d <- aov(d.rating ~ group*device*width*length +
> Error(particip/(device*width*length))+group,data.AMIDS_d)
> 
> This works ok until I try to print the condition means like so
> 
> print(model.tables(aov.AMIDS_d,"means"),digits=3)
> 
> and it says
> 
> Error in model.tables.aovlist(aov.AMIDS_d, "means") : design is
> unbalanced so cannot proceed
> 
> According to the design, it ought to be balanced, so I need to check my
> data structure. I tried
> 
> table(data.AMIDS_d[,2:5])

In this table there is a variable named "devicegroup". In the model there are
terms named 'device' and 'group' but none named 'devicegroup'. This error
could have been identified 4 days ago on SO if you had edited the question to
include the results of that table operation. It would have been even better to
also offer str(aov.AMIDS_d).
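
A direct balance check is also possible with replications() from the stats package, whose help page gives !is.list(replications(formula, data)) as a test for balance. The sketch below uses a made-up design (6 participants, 2 per group, each seeing all 2 x 2 x 2 conditions once), not the poster's data:

```r
# Hypothetical fully crossed within-subject design.
dat <- expand.grid(particip = factor(1:6),
                   device   = c("Dingo", "SNAR"),
                   width    = c("nom", "fat"),
                   length   = c("nom", "long"))

# Participants 1-2 are NR, 3-4 are NV, 5-6 are RA.
dat$group <- factor(c("NR", "NV", "RA"))[(as.integer(dat$particip) - 1) %/% 2 + 1]

# Balanced: replications() returns a plain vector of replicate counts.
reps_ok     <- replications(~ group * device * width * length, data = dat)
is_balanced <- !is.list(reps_ok)

# Drop one observation: some terms are no longer equally replicated,
# and replications() returns a list instead.
reps_bad               <- replications(~ group * device * width * length,
                                       data = dat[-1, ])
is_balanced_after_drop <- !is.list(reps_bad)
```

Running the same call on the real data frame, with the terms of the actual model, would show which term (if any) R considers unequally replicated.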

-- 
David.

> 
> to give a table of observations per condition and got this
> 
> , , width = fat, length = long
>
> devicegroup Dingo SNAR
>   NR    12   12
>   NV    12   12
>   RA    12   12
>
> , , width = nom, length = long
>
> devicegroup Dingo SNAR
>   NR    12   12
>   NV    12   12
>   RA    12   12
>
> , , width = fat, length = nom
>
> devicegroup Dingo SNAR
>   NR    12   12
>   NV    12   12
>   RA    12   12
>
> , , width = nom, length = nom
>
> devicegroup Dingo SNAR
>   NR    12   12
>   NV    12   12
>   RA    12   12
> 
> which looks both correct and balanced. So what am I missing, where's this
> error coming from?
> 

David Winsemius
Alameda, CA, USA



[R] recode categorial vars into binary data

2013-05-07 Thread D. Alain
Dear R-List, 

I would like to recode categorial variables into binary data, so that all 
values above median are coded 1 and all values below 0, separating each var 
into two equally large groups (e.g. good performers = 0 vs. bad performers =1).

I have not succeeded so far in finding a nice solution to do that in R. I 
thought there might be a better way than ordering each column and recoding the 
first 50% into 0 and the second into 1. If I use ifelse I have a problem with 
cases that share the same rank being all median. 

e.g.
df<-as.data.frame(cbind(snr=c(1,2,3,4,5,6,7,8,9,10),k1=c(1,1,4,2,3,2,2,5,2,2),k2=c(1,2,3,2,1,2,1,3,3,2),result=c(4,3,5,4,2,6,4,4,2,3)))

now I want to recode k1 and k2 so that I have half of the values recoded 0 and 
half recoded 1, split around the median point. The median of k1 is 2 which 
would lead to unequal groupsize if used 2 as cutoff, so all values k1=2 should 
be recoded 1 or 0 randomly until both categories have the same length.

something like

df.rec<-as.data.frame(cbind(snr=c(1,2,3,4,5,6,7,8,9,10),k1=c(0,0,1,0,1,1,0,1,0,1),k2=c(0,1,1,0,0,1,0,1,1,0),result=c(4,3,5,4,2,6,4,4,2,3)))

Can anyone help?

Thank you in advance.

Best wishes.
Alain  


Re: [R] Plot device stretched on Mac - any advice?

2013-05-07 Thread Gunnar
Hi Jeff,

Thanks - I've received an answer from another forum (It had to do with DPI in 
Quartz - rather than a simple aspect ratio).

Gunnar

On 7 May 2013, at 16:43, Jeff Newmiller [via R] wrote:

> Mac OSX-specific questions should be directed to r-sig-mac. 
> 
> Also, you should search before posting... this has been answered many times. 
> I recommend search terms "R aspect ratio". 
> --- 
> Jeff NewmillerThe .   .  Go Live... 
> DCN:<[hidden email]>Basics: ##.#.   ##.#.  Live Go... 
>   Live:   OO#.. Dead: OO#..  Playing 
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with 
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k 
> --- 
> Sent from my phone. Please excuse my brevity. 
> 
> Gunnar <[hidden email]> wrote: 
> 
> >Hi, 
> > 
> >I am using R 3.0.0 on MacOSX 10.7.5 and I am having a problem with 
> >visualizing 
> >data in R. When I open a simple plot, 
> > 
> >plot (1,1); 
> > 
> >
> > 
> > 
> > 
> >I get a rectangular window with the labels stretched (see example). I 
> >have 
> >searched for a solution online, but have been unable to find an answer 
> >on 
> >how to make the default device appear as a square. If you have any 
> >suggestions which functions/parameters to look up or how to solve this, 
> >it 
> >would be greatly appreciated! 
> > 
> >Thanks, 
> >Gunnar 
> > 
> > 
> > 







[R] Balanced design throws "design unbalanced, cannot proceed" error

2013-05-07 Thread Krysta Chauncey
 I think this means an unequal sample in different conditions. But it seems
to mean something else. . .

I have a data set like below

particip  group  device  width  length  accep  thresh  rating  d-rating
1         RA     Dingo   nom    nom     Y      5       8       3
1         RA     Dingo   nom    long    Y      4       6       2
1         RA     Dingo   fat    nom     Y      4       6       2
1         RA     Dingo   fat    long    N      6       4       -2

and I'm running an ANOVA on it like so

aov.AMIDS_d <- aov(d.rating ~ group*device*width*length +
Error(particip/(device*width*length))+group,data.AMIDS_d)

This works ok until I try to print the condition means like so

print(model.tables(aov.AMIDS_d,"means"),digits=3)

and it says

Error in model.tables.aovlist(aov.AMIDS_d, "means") : design is
unbalanced so cannot proceed

According to the design, it ought to be balanced, so I need to check my
data structure. I tried

table(data.AMIDS_d[,2:5])

to give a table of observations per condition and got this

, , width = fat, length = long

 devicegroup Dingo SNAR
   NR    12   12
   NV    12   12
   RA    12   12

, , width = nom, length = long

 devicegroup Dingo SNAR
   NR    12   12
   NV    12   12
   RA    12   12

, , width = fat, length = nom

 devicegroup Dingo SNAR
   NR    12   12
   NV    12   12
   RA    12   12

, , width = nom, length = nom

 devicegroup Dingo SNAR
   NR    12   12
   NV    12   12
   RA    12   12

which looks both correct and balanced. So what am I missing, where's this
error coming from?



Re: [R] Plot device stretched on Mac - any advice?

2013-05-07 Thread Jeff Newmiller
Mac OSX-specific questions should be directed to r-sig-mac.

Also, you should search before posting... this has been answered many times. I 
recommend search terms "R aspect ratio".
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Gunnar  wrote:

>Hi,
>
>I am using R 3.0.0 on MacOSX 10.7.5 and I am having a problem with
>visualizing
>data in R. When I open a simple plot,
>
>plot (1,1);
>
>
>
>
>I get a rectangular window with the labels stretched (see example). I
>have
>searched for a solution online, but have been unable to find an answer
>on
>how to make the default device appear as a square. If you have any
>suggestions which functions/parameters to look up or how to solve this,
>it
>would be greatly appreciated!
>
>Thanks,
>Gunnar
>
>
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/Plot-device-stretched-on-Mac-any-advice-tp4666461.html
>Sent from the R help mailing list archive at Nabble.com.
>



Re: [R] Merge two dataframe with "by", and problems with the common field

2013-05-07 Thread William Dunlap
> If the "a" columns are distinct, then at least one of them needs a new name 
> in the
> merged table, and the simplest option is to rename the columns appropriately 
> in d1 and
> d2 (since they apparently represent different data anyway).

You can also use the 'suffixes' argument to merge to control the naming of the
common column names that are not used as 'by' columns:
  > merge(d1,d2,by="b", suffixes=c("", ".y"))
    b a c d a.y  f
  1 4 1 5 6   1  8
  2 5 2 6 7   2  9
  3 6 3 7 8   3 10
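
For anyone who wants to try this, here is a self-contained sketch that rebuilds d1 and d2 from the values shown in the original question and compares the default suffixes with the empty first suffix. The object names d3_default and d3_keep are made up for illustration.

```r
# Rebuild the example data frames from the original question.
d1 <- data.frame(a = 1:3, b = 4:6, c = 5:7)
d2 <- data.frame(d = 6:8, a = 1:3, f = 8:10, b = 4:6)

# Default suffixes rename both clashing columns: a.x and a.y.
d3_default <- merge(d1, d2, by = "b")

# An empty first suffix keeps d1's column under its original name "a".
d3_keep <- merge(d1, d2, by = "b", suffixes = c("", ".y"))

names(d3_default)
names(d3_keep)
d3_keep["a"]
```

With the empty suffix, d3_keep["a"] works again, while d2's copy remains available as a.y.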

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Jeff Newmiller
> Sent: Tuesday, May 07, 2013 12:10 AM
> To: jpm miao; r-help
> Subject: Re: [R] Merge two dataframe with "by", and problems with the common 
> field
> 
> Either d1$a and d2$a are always the same, or they are not.
> 
> If they are already the same, you can either omit one of them in the merge:
> 
> merge(d1, d2[,-2], by="b")
> 
> or you can use a set of columns for your by:
> 
> merge(d1,d2, by=c("a","b"))
> 
> If the "a" columns are distinct, then at least one of them needs a new name 
> in the
> merged table, and the simplest option is to rename the columns appropriately 
> in d1 and
> d2 (since they apparently represent different data anyway).
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live Go...
>   Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> ---
> Sent from my phone. Please excuse my brevity.
> 
> jpm miao  wrote:
> 
> >Hi,
> >
> > From time to time I merge two dataframes with possibly a common field.
> >Then the common field is no longer present,but what are present
> >fieldname.x
> >and fieldname.y. How can I fix the problem so that I can still call by
> >the
> >orignal fieldname? If you don't understand my problem, please see the
> >example below.
> >
> >   Thanks
> >
> >Miao
> >
> >
> >> d1
> >  a b c
> >1 1 4 5
> >2 2 5 6
> >3 3 6 7
> >> d2
> >  d a  f b
> >1 6 1  8 4
> >2 7 2  9 5
> >3 8 3 10 6
> >> d3<-merge(d1, d2, by="b")
> >> d3
> >  b a.x c d a.y  f
> >1 4   1 5 6   1  8
> >2 5   2 6 7   2  9
> >3 6   3 7 8   3 10
> >> d3["a"]
> >Error in `[.data.frame`(d3, "a") : undefined columns selected
> >> d3["a.x"]
> >  a.x
> >1   1
> >2   2
> >3   3
> >
> 



Re: [R] How does one set up logical functions?

2013-05-07 Thread Gabor Grothendieck
On Tue, May 7, 2013 at 10:02 AM, Neotropical bat risk assessments
 wrote:
> Hi all,
>
> I am trying to set up logical function(s) to deal with two adjustments
> to a blood glucose value.
> I have been dinking around in Excel and assume this will be much easier
> in R.
>
> DF is date-time, BG value in mg/dL,test strip
> 4/3/13 19:20  105  Aviva-491350
> 4/4/13 21:03   74  Aviva-491350
> 4/6/13 17:40   81  Aviva-491640
> 4/6/13 17:40   82  Aviva-491350
> 4/6/13 22:48  106  Aviva-491640
> 4/6/13 22:48  102  Aviva-491350
> 4/7/13 5:32    87  Aviva-491350
> 4/7/13 5:32   103  Aviva-491640
>
>
> What I need are the high and low ranges based on "acceptable" standards
> of the measured values.
>
> The logical expressions need to be
> IF BG =>100 then "High limit" would = (BG+(BG*.15))
> IF BG =>100 then "Low limit" would = (BG-(BG*.15))
> and
> IF BG <100 then "High limit" would = (BG+15)
> IF BG <100 then "Low limit" would = (BG-15)
>
> The standards are written as: 95% of the individual glucose results
> shall fall within ±15 mg/dL of the reference results at glucose
> concentrations less than 100 mg/dL and within ±15% at glucose
> concentrations greater than or equal to 100 mg/dL.
>
> Then I need to plot the measured value and also show the high & low
> "acceptable" values.
>

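Here it is using ggplot2:


library(ggplot2)
library(gridExtra)
library(grid)  # provides gpar()

# DF is assumed to hold the date-time strings in its first column
# and the glucose readings in a column named BG
DF2 <- transform(DF,
datetime = as.POSIXct(DF[[1]], format = "%m/%d/%y %H:%M"),
lower = ifelse(BG < 100, BG - 15, BG * 0.85),
upper = ifelse(BG < 100, BG + 15, BG * 1.15))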


ggplot(DF2, aes(datetime, BG)) +
   geom_point() +
   geom_line() +
   geom_smooth(aes(ymin = lower, ymax = upper), stat = "identity") +
   geom_linerange(aes(ymin = lower, ymax = upper)) +
   annotation_custom(tableGrob(DF2, gp = gpar(cex = 0.5)), ymin = 120) +
   coord_cartesian(ylim = c(60, 150)) +
   xlab("") +
   ylab("Blood Glucose") +
   ggtitle("Blood Glucose Levels")


--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



[R] Plot device stretched on Mac - any advice?

2013-05-07 Thread Gunnar
Hi,

I am using R 3.0.0 on Mac OS X 10.7.5 and I am having a problem visualizing
data in R. When I open a simple plot,

plot (1,1);


 

I get a rectangular window with the labels stretched (see example). I have
searched for a solution online, but have been unable to find an answer on
how to make the default device appear as a square. If you have any
suggestions which functions/parameters to look up or how to solve this, it
would be greatly appreciated!

Thanks,
Gunnar



--
View this message in context: 
http://r.789695.n4.nabble.com/Plot-device-stretched-on-Mac-any-advice-tp4666461.html
Sent from the R help mailing list archive at Nabble.com.



[R] Orthogonal transformation option in pgmm-plm

2013-05-07 Thread Eva Yamila da Silva Catela
Hi,
I'm a pgmm (plm) user and would like to know if an orthogonal transformation
is available, as in Stata's xtabond2. Can someone help me? Thanks! Kind
regards,
Eva



[R] extracting the residuals from models working with ordinal multinomial data

2013-05-07 Thread Jesús Fernández Moya
Hello

I am having some problems extracting the residuals from models
fitted to ordinal multinomial data.

With either the polr() function or the plsRglm() function,
the residuals are "NULL". I guess this is because the data are
multinomial, but I do not know how to solve it.

I have read the following on the internet:
"can you tell us how residuals would be defined in principle for a
model with categorical responses? If you do "your model"$fitted.values
you obtain a matrix of probabilities. You could define residuals in
terms of correct prediction (defining the most likely outcome as the
prediction, as in the default predict method for polr objects) -- or
you could compute an n-by-n table of true values and predicted values.
Alternatively you could reduce the ordinal data back to an integer
scale and compute a mean outcome as the prediction ... but I can't see
that there's any unique way to define the residuals in the first
place."

Although this seems clear, I don't understand how to do it. Maybe some of
you can help me out with a more extensive explanation and/or with some
commands to do it?

Thank you very much

jesús
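
The options listed in the quoted passage above could be sketched roughly as follows. This is only an illustration, not an agreed definition of residuals for ordinal models: it uses the housing example from the polr() help page in MASS rather than the poster's data, and the names confusion, expected_score and score_resid are made up here. The cross-tabulation ignores the Freq weights.

```r
library(MASS)  # polr() and the housing data used in its help page

# Illustrative model from ?polr, not the poster's data.
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

probs <- fit$fitted.values             # one row of category probabilities per row of data
pred  <- predict(fit, type = "class")  # most likely category per row

# Option 1: cross-tabulate observed against predicted categories.
confusion <- table(observed = housing$Sat, predicted = pred)

# Option 2: score the ordered levels 1..K and subtract the expected score.
scores <- seq_len(ncol(probs))
expected_score <- as.vector(probs %*% scores)
score_resid <- as.integer(housing$Sat) - expected_score

confusion
head(score_resid)
```

Which of these (if any) is appropriate depends on what the residuals are needed for, as the quoted passage already warns.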



[R] How does one set up logical functions?

2013-05-07 Thread Bruce Miller
Hi all,

I am trying to set up logical function(s) to deal with two adjustments 
to a blood glucose value.
I have been dinking around in Excel and assume this will be much easier 
in R.

DF is date-time, BG value in mg/dL,test strip
4/3/13 19:20  105  Aviva-491350
4/4/13 21:03   74  Aviva-491350
4/6/13 17:40   81  Aviva-491640
4/6/13 17:40   82  Aviva-491350
4/6/13 22:48  106  Aviva-491640
4/6/13 22:48  102  Aviva-491350
4/7/13 5:32    87  Aviva-491350
4/7/13 5:32   103  Aviva-491640


What I need are the high and low ranges based on "acceptable" standards 
of the measured values.

The logical expressions need to be
IF BG =>100 then "High limit" would = (BG+(BG*.15))
IF BG =>100 then "Low limit" would = (BG-(BG*.15))
and
IF BG <100 then "High limit" would = (BG+15)
IF BG <100 then "Low limit" would = (BG-15)

The standards are written as: 95% of the individual glucose results 
shall fall within ±15 mg/dL of the reference results at glucose 
concentrations less than 100 mg/dL and within ±15% at glucose 
concentrations greater than or equal to 100 mg/dL.

Then I need to plot the measured value and also show the high & low 
"acceptable" values.

Thanks for any who respond.

Bruce




Re: [R] Superimpose exponential density function to histogram

2013-05-07 Thread Manta
Can't believe it was that



--
View this message in context: 
http://r.789695.n4.nabble.com/Superimpose-exponential-density-function-to-histogram-tp4666468p4666480.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Superimpose exponential density function to histogram

2013-05-07 Thread Rui Barradas

Hello,

Try

curve(dexp(x, rate=lambda), add=TRUE)
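
To see why the literal name x matters, here is a self-contained sketch. The simulated durations and the seed are assumptions standing in for the poster's data. The rate is computed with the closed-form maximum likelihood estimate 1/mean, which fitdistr(durations, "exponential") from MASS should reproduce.

```r
# Simulated durations stand in for the poster's data (an assumption).
set.seed(1)
durations <- rexp(5000, rate = 1 / 300)

# Closed-form MLE of the exponential rate.
lambda <- 1 / mean(durations)

hist(durations, breaks = 50, prob = TRUE, main = "",
     xlab = "Duration (Seconds)", ylab = "Density")

# curve() evaluates its expression on a grid of values bound to the name x,
# so the density must be written in terms of x, not the data vector.
curve(dexp(x, rate = lambda), add = TRUE, col = "red")
```

Writing dexp(durations, rate = lambda) fails because the expression contains no x for curve() to vary, and rexp() draws random numbers rather than evaluating the density.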

Hope this helps,

Rui Barradas

On 07-05-2013 14:18, Manta wrote:

Dear all,

I have a large vector of durations (in seconds) and I create an histogram as
follows:

hist(durations,breaks=500,xlim=c(0,2000),main="",xlab="Duration
(Seconds)",ylab="Frequency (%)",prob=TRUE)

Next, I would like to superimpose the exponential distribution with the
maximum likelihood estimate of lambda. To do so, I first calculate get the
estimate of lambda and then try to add the curve in the following way (with
error):

library(MASS)
lambda=fitdistr(durations,"exponential")$estimate
curve(rexp(1,rate=lambda),add=TRUE)

Error in curve(rexp(1,rate=lambda),add=TRUE) : 'expr' must be a
function, or a call or an expression containing 'x'.

So I need to have an 'x', OK. But doing this does not work either:
curve(dexp(durations,rate=lambda),add=TRUE)

What am I doing wrong?





--
View this message in context: 
http://r.789695.n4.nabble.com/Superimpose-exponential-density-function-to-histogram-tp4666468.html
Sent from the R help mailing list archive at Nabble.com.






Re: [R] How does one set up logical functions?

2013-05-07 Thread jim holtman
Try this:

> input <- read.table(text = "date time BG test
+ 4/3/13 19:20 105 Aviva-491350
+ 4/4/13 21:03 74 Aviva-491350
+ 4/6/13 17:40 81 Aviva-491640
+ 4/6/13 17:40 82 Aviva-491350
+ 4/6/13 22:48 106 Aviva-491640
+ 4/6/13 22:48 102 Aviva-491350
+ 4/7/13 5:32 87 Aviva-491350
+ 4/7/13 5:32 103 Aviva-491640", as.is = TRUE, header = TRUE)
> # set limits
> input$High <- ifelse(input$BG >= 100
+ , input$BG * 1.15
+ , input$BG + 15
+ )
> input$Low <- ifelse(input$BG >= 100
+ , input$BG * 0.85
+ , input$BG - 15
+ )
> input
date  time  BG test   High   Low
1 4/3/13 19:20 105 Aviva-491350 120.75 89.25
2 4/4/13 21:03  74 Aviva-491350  89.00 59.00
3 4/6/13 17:40  81 Aviva-491640  96.00 66.00
4 4/6/13 17:40  82 Aviva-491350  97.00 67.00
5 4/6/13 22:48 106 Aviva-491640 121.90 90.10
6 4/6/13 22:48 102 Aviva-491350 117.30 86.70
7 4/7/13  5:32  87 Aviva-491350 102.00 72.00
8 4/7/13  5:32 103 Aviva-491640 118.45 87.55
>
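
Since the question asked about setting up a function, the same rule could also be wrapped up as one. This is a minimal sketch: the name bg_limits and the extra test values 100 and 99 are not from the thread and are used only to show the behaviour at the threshold.

```r
# Hypothetical helper (the name bg_limits is not from the thread).
bg_limits <- function(bg) {
  pct <- bg >= 100                           # TRUE where the 15% rule applies
  data.frame(
    BG   = bg,
    Low  = ifelse(pct, bg * 0.85, bg - 15),  # otherwise the fixed 15 mg/dL band
    High = ifelse(pct, bg * 1.15, bg + 15)
  )
}

limits <- bg_limits(c(105, 74, 100, 99))
limits
```

Note that a reading of exactly 100 falls under the percentage rule, matching the "greater than or equal to 100 mg/dL" wording of the standard.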



On Tue, May 7, 2013 at 10:02 AM, Neotropical bat risk assessments <
neotropical.b...@gmail.com> wrote:

> Hi all,
>
> I am trying to set up logical function(s) to deal with two adjustments
> to a blood glucose value.
> I have been dinking around in Excel and assume this will be much easier
> in R.
>
> DF is date-time, BG value in mg/dL,test strip
> 4/3/13 19:20  105  Aviva-491350
> 4/4/13 21:03   74  Aviva-491350
> 4/6/13 17:40   81  Aviva-491640
> 4/6/13 17:40   82  Aviva-491350
> 4/6/13 22:48  106  Aviva-491640
> 4/6/13 22:48  102  Aviva-491350
> 4/7/13 5:32    87  Aviva-491350
> 4/7/13 5:32   103  Aviva-491640
>
>
> What I need are the high and low ranges based on "acceptable" standards
> of the measured values.
>
> The logical expressions need to be
> IF BG =>100 then "High limit" would = (BG+(BG*.15))
> IF BG =>100 then "Low limit" would = (BG-(BG*.15))
> and
> IF BG <100 then "High limit" would = (BG+15)
> IF BG <100 then "Low limit" would = (BG-15)
>
> The standards are written as: 95% of the individual glucose results
> shall fall within ±15 mg/dL of the reference results at glucose
> concentrations less than 100 mg/dL and within ±15% at glucose
> concentrations greater than or equal to 100 mg/dL.
>
> Then I need to plot the measured value and also show the high & low
> "acceptable" values.
>
> Thanks for any who respond.
>
> Bruce
>
>
> --
> Bruce W. Miller, PhD.
> Neotropical bat risk assessments
>
> If we lose the bats, we may lose much of the tropical vegetation and the
> lungs of the planet
>
> Using acoustic sampling to map species distributions for >15 years.
>
> Providing Interactive identification keys to the vocal signatures of New
> World Bats
>
> For various project details see:
>
> https://sites.google.com/site/batsoundservices/
>
>
>
>


-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] How does one set up logical functions?

2013-05-07 Thread Rui Barradas

Hello,

See if the following is what you want.


dat <-
structure(list(DF = c("4/3/13 19:20", "4/4/13 21:03", "4/6/13 17:40",
"4/6/13 17:40", "4/6/13 22:48", "4/6/13 22:48", "4/7/13 5:32",
"4/7/13 5:32"), BG = c(105L, 74L, 81L, 82L, 106L, 102L, 87L,
103L), test_strip = c("Aviva-491350", "Aviva-491350", "Aviva-491640",
"Aviva-491350", "Aviva-491640", "Aviva-491350", "Aviva-491350",
"Aviva-491640")), .Names = c("DF", "BG", "test_strip"),
class = "data.frame", row.names = c(NA, -8L))

# The absolute (+/- 15 mg/dL) rule applies to BG values below 100
idx <- dat$BG < 100
HighLimit <- LowLimit <- numeric(nrow(dat))
HighLimit[idx] <- dat$BG[idx] + 15
LowLimit[idx] <- dat$BG[idx] - 15
HighLimit[!idx] <- dat$BG[!idx] + dat$BG[!idx]*0.15
LowLimit[!idx] <- dat$BG[!idx] - dat$BG[!idx]*0.15

x <- as.POSIXct(dat$DF, format = "%m/%d/%y %H:%M")
yl <- range(c(dat$BG, HighLimit, LowLimit))
plot(x, dat$BG, ylim = yl, type = "b")
lines(x, HighLimit)
lines(x, LowLimit)


Hope this helps,

Rui Barradas

On 07-05-2013 15:02, Neotropical bat risk assessments wrote:

Hi all,

I am trying to set up logical function(s) to deal with two adjustments
to a blood glucose value.
I have been dinking around in Excel and assume this will be much easier
in R.

DF is date-time, BG value in mg/dL,test strip
4/3/13 19:20  105  Aviva-491350
4/4/13 21:03   74  Aviva-491350
4/6/13 17:40   81  Aviva-491640
4/6/13 17:40   82  Aviva-491350
4/6/13 22:48  106  Aviva-491640
4/6/13 22:48  102  Aviva-491350
4/7/13 5:32    87  Aviva-491350
4/7/13 5:32   103  Aviva-491640


What I need are the high and low ranges based on "acceptable" standards
of the measured values.

The logical expressions need to be
IF BG =>100 then "High limit" would = (BG+(BG*.15))
IF BG =>100 then "Low limit" would = (BG-(BG*.15))
and
IF BG <100 then "High limit" would = (BG+15)
IF BG <100 then "Low limit" would = (BG-15)

The standards are written as: 95% of the individual glucose results
shall fall within ±15 mg/dL of the reference results at glucose
concentrations less than 100 mg/dL and within ±15% at glucose
concentrations greater than or equal to 100 mg/dL.

Then I need to plot the measured value and also show the high & low
"acceptable" values.

Thanks for any who respond.

Bruce









Re: [R] Using as.integer(NA) in the .C function

2013-05-07 Thread cgenolin
Damn... I am reading the WRE, but I am only at page 83. I started trying to
play with NAOK too early.

Anyway, exactly the same function for numeric instead of integer will give
different results:

--- 8<  C code ---
void hein2(double *a, double *b, double* c){
*c = (*a + *b);
}
--- 8< ---
--- 8< - R code ---
 .C("hein2",as.numeric(NA),as.numeric(1),as.numeric(1),NAOK=TRUE)[[3]]
[1] NA
--- 8< - 
That's why I find the results of "hein" strange...

Christophe
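
An R-side sketch of why the two versions probably differ. It relies on two documented facts about R's internals: NA_integer_ is stored as the smallest 32-bit integer, and NA_real_ is a special NaN. The variable names are made up for illustration, and the explanation of the C behaviour is an interpretation rather than something confirmed in the thread.

```r
# R stores NA_integer_ as the smallest 32-bit integer, INT_MIN = -2^31.
# Plain C arithmetic knows nothing about that sentinel, so *a + *b in the
# integer version simply computes INT_MIN + 1.
int_min <- -2^31            # kept as a double: an R integer cannot hold this value
c_style_sum <- int_min + 1  # the number printed by .C("hein", ...)
format(c_style_sum, scientific = FALSE)

# R's own integer arithmetic checks for the sentinel and returns NA.
r_style_sum <- NA_integer_ + 1L

# NA_real_ is a special NaN, and floating-point addition propagates NaN,
# which is why the double version comes back as missing.
double_sum <- NA_real_ + 1
```

On the C side, the usual approach described in Writing R Extensions is to compare integer inputs against NA_INTEGER before doing arithmetic and to set the result to NA_INTEGER explicitly.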



--
View this message in context: 
http://r.789695.n4.nabble.com/Using-as-integer-NA-in-the-C-function-tp4666470p4666476.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Using as.integer(NA) in the .C function

2013-05-07 Thread Berend Hasselman

On 07-05-2013, at 15:30, cgenolin  wrote:

> Hi the list,
> I am including some C code in a R program using the .C interface. I want to
> deal with NA values, but the result is strange:
> 
> --- 8<  C code ---
> void hein(int *a, int *b, int* c){
>  *c = (*a + *b);
> }
> --- 8< ---
> --- 8< - R code ---
>> .C("hein",as.integer(NA),as.integer(1),as.integer(1),NAOK=TRUE)[[3]]
> [1] -2147483647
> --- 8< -
> The result should be NA, isn't it?

Why?

> What wrong il my code?


Read the manual "Writing R Extensions".
See section 5.10.3 and section 6.4 in that manual (I'm referring to the pdf 
available on CRAN for R-3.0.0).
There may be more references in the manual but these were the first I found by 
searching, which you should have done.

Berend



[R] How does one set up logical functions?

2013-05-07 Thread Neotropical bat risk assessments
Hi all,

I am trying to set up logical function(s) to deal with two adjustments 
to a blood glucose value.
I have been dinking around in Excel and assume this will be much easier 
in R.

DF is date-time, BG value in mg/dL,test strip
4/3/13 19:20  105  Aviva-491350
4/4/13 21:03   74  Aviva-491350
4/6/13 17:40   81  Aviva-491640
4/6/13 17:40   82  Aviva-491350
4/6/13 22:48  106  Aviva-491640
4/6/13 22:48  102  Aviva-491350
4/7/13 5:32    87  Aviva-491350
4/7/13 5:32   103  Aviva-491640


What I need are the high and low ranges based on "acceptable" standards 
of the measured values.

The logical expressions need to be
IF BG =>100 then "High limit" would = (BG+(BG*.15))
IF BG =>100 then "Low limit" would = (BG-(BG*.15))
and
IF BG <100 then "High limit" would = (BG+15)
IF BG <100 then "Low limit" would = (BG-15)

The standards are written as: 95% of the individual glucose results 
shall fall within ±15 mg/dL of the reference results at glucose 
concentrations less than 100 mg/dL and within ±15% at glucose 
concentrations greater than or equal to 100 mg/dL.

Then I need to plot the measured value and also show the high & low 
"acceptable" values.

Thanks for any who respond.

Bruce


-- 
Bruce W. Miller, PhD.
Neotropical bat risk assessments

If we lose the bats, we may lose much of the tropical vegetation and the lungs 
of the planet

Using acoustic sampling to map species distributions for >15 years.

Providing Interactive identification keys to the vocal signatures of New World 
Bats

For various project details see:

https://sites.google.com/site/batsoundservices/




[R] Using as.integer(NA) in the .C function

2013-05-07 Thread cgenolin
Hi the list,
I am including some C code in a R program using the .C interface. I want to
deal with NA values, but the result is strange:

--- 8<  C code ---
void hein(int *a, int *b, int* c){
  *c = (*a + *b);
}
--- 8< ---
--- 8< - R code ---
> .C("hein",as.integer(NA),as.integer(1),as.integer(1),NAOK=TRUE)[[3]]
[1] -2147483647
--- 8< -
The result should be NA, shouldn't it? What is wrong with my code?

Christophe



--
View this message in context: 
http://r.789695.n4.nabble.com/Using-as-integer-NA-in-the-C-function-tp4666470.html
Sent from the R help mailing list archive at Nabble.com.



[R] Tinn-R news

2013-05-07 Thread Jose Claudio Faria
Dear Tinn-R users,

A new version of the Editor/GUI/IDE Tinn-R (2.4.1.6) was released today.
News at: http://sourceforge.net/p/tinn-r/news/2013/05/tinn-r-2416-released/

The project now has its proper page:
http://nbcgib.uesc.br/lec/software/des/editores/tinn-r/en

Download is available in both:
1-  WebPage: http://nbcgib.uesc.br/lec/software/des/editores/tinn-r/en
2-  SourceForge: http://sourceforge.net/projects/tinn-r/

All the best,
--
///\\\///\\\///\\\///\\\///\\\///\\\///\\\///\\\
Jose Claudio Faria
Estatistica
UESC/DCET/Brasil
joseclaudio.faria at gmail.com
Telefones:
55(73)3680.5545 - UESC
55(73)9100.7351 - TIM
55(73)8817.6159 - OI
///\\\///\\\///\\\///\\\///\\\///\\\///\\\///\\\



[R] Superimpose exponential density function to histogram

2013-05-07 Thread Manta
Dear all,

I have a large vector of durations (in seconds) and I create an histogram as
follows:

hist(durations,breaks=500,xlim=c(0,2000),main="",xlab="Duration
(Seconds)",ylab="Frequency (%)",prob=TRUE)

Next, I would like to superimpose the exponential distribution with the
maximum likelihood estimate of lambda. To do so, I first calculate get the
estimate of lambda and then try to add the curve in the following way (with
error):

library(MASS)
lambda=fitdistr(durations,"exponential")$estimate
curve(rexp(1,rate=lambda),add=TRUE)

Error in curve(rexp(1,rate=lambda),add=TRUE) : 'expr' must be a
function, or a call or an expression containing 'x'.

So I need to have an 'x', OK. But doing this does not work either:
curve(dexp(durations,rate=lambda),add=TRUE)

What am I doing wrong?





--
View this message in context: 
http://r.789695.n4.nabble.com/Superimpose-exponential-density-function-to-histogram-tp4666468.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Problem with biomaRt::getSequence.

2013-05-07 Thread Pascal Oettli

Hi,

Do you have permission to write inside /mnt/ephemeral0/mysqltmp/?

Regards,
Pascal

On 05/07/2013 06:41 PM, Mohammad Tanvir Ahamed wrote:

Hi,
I can run the code some days ago . But cant run now.

Problem 1: Output is ok
ensembl = useDataset("hsapiens_gene_ensembl",mart=ensembl)
utr5 = getSequence(chromosome=3, start=185514033, end=185535839, 
type="entrezgene",seqType="5utr", mart=ensembl)
Output :

   5utr  entrezgene
  
Sequence unavailable  10644
  
GGAGCGCCGGGTACCGGGCCGAGCCGCGGGCTCTCAAGAGACGG  10644
3 
GGCGGAGGAGGAGGAGAGACGAGGGCAGCGGAGGAGGCGAGGAGCGCCGGGTACCGGGCCGAGCCGCGGGCTCTCAAGAGACGG
  10644
  
CGGAGGAGGCGAGGAGCGCCGGGTACCGGGCCGAGCCGCGGGCTCTCAAGAGACGG  10644
  No UTR is annotated 
for this transcript  10644

Problem 2:Problem is here

protein = getSequence(id=c(100, 5728),type="entrezgene",seqType="peptide", 
mart=ensembl)

Error in getBM(c(seqType, type), filters = type, values = id, mart = mart,  :
   Query ERROR: caught BioMart::Exception::Database: Error during query 
execution: Can't create/write to file '/mnt/ephemeral0/mysqltmp/#sql_40a_0.MYI' 
(Errcode: 2)

I need help please.

/...Tanvir Ahamed






[R] Problem with biomaRt::getSequence.

2013-05-07 Thread Mohammad Tanvir Ahamed
Hi,
I could run this code a few days ago, but I can't run it now.

Problem 1: Output is ok
ensembl = useDataset("hsapiens_gene_ensembl",mart=ensembl)
utr5 = getSequence(chromosome=3, start=185514033, end=185535839, 
type="entrezgene",seqType="5utr", mart=ensembl) 
Output:
                                                                                  5utr entrezgene
1                                                                 Sequence unavailable      10644
2                                         GGAGCGCCGGGTACCGGGCCGAGCCGCGGGCTCTCAAGAGACGG      10644
3 GGCGGAGGAGGAGGAGAGACGAGGGCAGCGGAGGAGGCGAGGAGCGCCGGGTACCGGGCCGAGCCGCGGGCTCTCAAGAGACGG      10644
4                             CGGAGGAGGCGAGGAGCGCCGGGTACCGGGCCGAGCCGCGGGCTCTCAAGAGACGG      10644
5                                              No UTR is annotated for this transcript      10644
 
Problem 2:Problem is here

protein = getSequence(id=c(100, 5728),type="entrezgene",seqType="peptide", 
mart=ensembl)

Error in getBM(c(seqType, type), filters = type, values = id, mart = mart,  : 
  Query ERROR: caught BioMart::Exception::Database: Error during query 
execution: Can't create/write to file '/mnt/ephemeral0/mysqltmp/#sql_40a_0.MYI' 
(Errcode: 2)

I need help please.

/...Tanvir Ahamed



Re: [R] Extracting elements from a matrix using a vector containing indices

2013-05-07 Thread peter dalgaard

On May 7, 2013, at 06:39 , Mark Coletti wrote:

> I have a matrix of data that has a corresponding vector of indices.  I
> would like to use those indices to extract specific matrix elements into a
> new vector.  In other words, I have an R X C matrix with a corresponding
> vector of C elements that have numbers mapping into specific elements of
> the matrix.  I'd like to generate a new vector C long that contains those
> elements.
> 
> My first approach is to iteratively grind through each matrix column to
> manually extract the element corresponding to that column as referenced by
> the appropriate index, and then appending that to a vector.  However, my R
> instincts are that any time one has to craft a for loop to do something in
> R that you are very likely Doing It Wrong; there's probably a very simple
> one liner in R that will do all that that for loop would do.
> 
> I know there must be a simple "R way" to do this, but I remain flummoxed on
> how to do so.  I've been feebly poking at the plyr package's maplyr() and
> colwise() since I have a matrix and want to extract an array from it, and
> this is inherently a column wise operation, so ... hmm.
> 
> Should I just give up and use a for loop?


As Berend says, you're not specifying the problem very clearly. Are you looking 
for something like this (indexing with a matrix)?

> M <- matrix(round(rnorm(20,20,10)),4,5)
> M
     [,1] [,2] [,3] [,4] [,5]
[1,]   29   19   18   13    0
[2,]   24   24   11   25   10
[3,]   20    9   12   11   24
[4,]   28    3   16   17   32
> Ix <- sample(1:4,5,replace=TRUE)
> Ix
[1] 3 2 2 4 3
> M[cbind(Ix,1:5)]
[1] 20 24 11 17 24
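
The same idea with a deterministic matrix, so the result can be checked by hand. This is only a sketch: the values and the names picked and looped are made up here, and the loop is included to show that matrix indexing gives the same answer as the element-by-element approach described in the question.

```r
# Deterministic stand-in for the random matrix above.
M  <- matrix(1:20, nrow = 4, ncol = 5)
Ix <- c(3, 2, 2, 4, 3)          # one row index per column

# A two-column index matrix picks element (Ix[j], j) for each column j.
picked <- M[cbind(Ix, seq_len(ncol(M)))]

# The explicit per-column extraction gives the same result.
looped <- vapply(seq_len(ncol(M)), function(j) M[Ix[j], j], integer(1))

picked
identical(picked, looped)
```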


-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] R does not subset

2013-05-07 Thread John Field

This typically occurs because of sloppy manual data entry outside of R. To
relieve further analysis pain, you can manually clean the data (usually only
effective for one-time analyses) or use R to fix problems right after
loading the data (there are multiple methods for doing this... I prefer
using ?sub on character data before creating the factor).



str_trim in package stringr is great for this.
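
As a rough illustration of the cleanup step (a sketch with made-up data, not the original poster's), stray whitespace can be stripped in base R before the factor is created. The vector `raw` is an assumption; `stringr::str_trim(raw)` should give the same result if that package is installed.

```r
# Hypothetical character data with stray leading/trailing blanks
raw <- c("yes", "yes ", " no", "no")

# Untrimmed, the blanks produce spurious extra levels
messy_levels <- levels(factor(raw))

# Strip leading and trailing whitespace with base R, then build the factor
clean <- gsub("^[[:space:]]+|[[:space:]]+$", "", raw)
f <- factor(clean)

print(messy_levels)
print(levels(f))
```

After trimming, subsetting such as `f == "yes"` matches all the intended rows.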
John









[R] recommended workflow for creating functions (was: Re: [Rd] Patch proposal for R style consistency (concerning deparse.c))

2013-05-07 Thread Liviu Andronic
(moving to r-help)

Dear all,
I think Paul is raising a useful question here: What is the
recommended workflow for creating a new function?

R prides itself on letting users create and use home-brewed
functions: they are easy to maintain and re-use, don't clutter the
global environment with intermediary objects, and have clear input and
output elements. All fine points, and I genuinely see the advantage in
using functions, but after several years of using R I still stumble at
step 1: building a function.

The thing is that I'm not a whizz programmer, so it takes me a lot of
time and mind gymnastics to imagine what the intermediary objects
within a function, when executed, would look like. Thus I need to see
and examine most intermediary objects while building the function. The
alternative is to create the input objects and use only the first
element while building the function, but this clutters the global
workspace and I need to worry about overwriting existing objects.
Another way would be to use browser() within the function, but this
seems clunky to me.

How do you deal with this? Is there a neat way to build a new function
in a "sandbox", where I can access all objects in the global workspace
but cannot overwrite them? Am I approaching this in the wrong way?
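
One possibility, offered as a sketch rather than a recommended workflow, is base R's local(): it evaluates a block in a fresh environment whose parent is the calling environment, so global objects can be read while plain `<-` assignments stay inside the block. The names x, tmp and result below are purely illustrative.

```r
x <- 1:5                      # an object living in the workspace

result <- local({
  x   <- x * 2                # reads the outer x, assigns a local copy
  tmp <- cumsum(x)            # intermediary object, visible only in here
  tmp[length(tmp)]            # value of the block
})

print(result)
print(x)                      # the outer x is unchanged
```

Note that `<<-` and assign() with an explicit environment can still modify outer objects, so this guards only against ordinary assignments.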

Opinions welcome. Regards,
Liviu


On Sun, May 5, 2013 at 7:37 PM, Paul Johnson  wrote:
> It quite often happens that when I'm developing a function, I have to step
> through line by line to see what's going on.  That won't work for me if I
> have "else" by itself at the beginning of a line.
>
> As an Objective-C programmer, I very much think it looks nicer to write
>
> if ( )
> {
>     blah
> }
> else
> {
>     blah
> }
>
> But in R, I can't step through that, so I don't write it that way.
>
> The fact that you don't run into that makes me think that I'm preparing
> functions incorrectly.  If you don't check your code line by line, what is
> your work flow?
>
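
To illustrate the behaviour Paul describes (a sketch; the function name sign_label and the test strings are made up): at top level an `if` statement is complete once its closing brace has been read, so a following line that starts with `else` is a syntax error. Inside a function body or any `{ }` block the parser keeps reading, and the same layout is accepted.

```r
# Inside a function body, 'else' may start its own line
sign_label <- function(x) {
  if (x > 0)
  {
    "positive"
  }
  else
  {
    "non-positive"
  }
}

# The same layout as top-level code, kept as text so it can be parsed safely
top_level <- "if (TRUE)\n{\n  1\n}\nelse\n{\n  2\n}"

can_parse <- function(txt) {
  tryCatch({ parse(text = txt); TRUE }, error = function(e) FALSE)
}

parses_bare    <- can_parse(top_level)                        # top level
parses_wrapped <- can_parse(paste0("{\n", top_level, "\n}"))  # inside { }

print(c(bare = parses_bare, wrapped = parses_wrapped))
```

This is why pasting such code line by line into the console fails, while sourcing the whole function definition works.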



Re: [R] pR2 stumped

2013-05-07 Thread Achim Zeileis

On Mon, 6 May 2013, ivo welch wrote:


Dear R experts:  I am stumped.  I am trying to pick off the mcfadden
R^2 for a probit.  simple, me thinks---except my code works only in my
main program, but not in my sub!?  I am probably doing something
obviously wrong, but I have stared at my code for a while now and I
feel even more stupid than usual.  am I doing something obviously
wrong here?  advice, as always, appreciated.  /iaw


pscl:::pR2.polr calls update(object, ~ 1) to obtain the fitted 
log-likelihood of the null model (intercept only). If the calls are nested 
deeply enough, re-evaluating the model call does not work because it 
happens in the wrong environment.


A workaround could be to do the updating within your summarize() function 
directly and then call pscl:::pR2Work(llh, llhNull, n) directly. Or if you 
really just need the McFadden pseudo R-squared, it is then probably 
simplest to compute 1 - llh/llhNull yourself...
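
A minimal sketch of that last suggestion, assuming the null model can be fitted directly with polr() and an intercept-only formula (the helper name mcfadden_r2 is made up here, and pscl is not used at all). Because both fits happen inside the function, where `p` is visible, no call has to be re-evaluated in another environment.

```r
library(MASS)  # provides polr()

# Hypothetical helper: McFadden pseudo R-squared, 1 - llh/llhNull
mcfadden_r2 <- function(p) {
  full <- polr(factor(r1) ~ factor(r2), data = p, method = "probit")
  null <- polr(factor(r1) ~ 1,          data = p, method = "probit")
  llh     <- as.numeric(logLik(full))   # log-likelihood of the fitted model
  llhNull <- as.numeric(logLik(null))   # log-likelihood, intercepts only
  1 - llh / llhNull
}

# Same toy data as in the original post
d <- data.frame(r1 = factor(c("a", "a", "b", "b", "c", "c")),
                r2 = factor(c("a", "b", "c", "b", "c", "a")))

r2 <- mcfadden_r2(d)
print(r2)
```

With only six observations polr() may emit warnings; the point is just that the computation does not depend on where summarize() is called from.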


Best,
Z



library(MASS)
library(pscl)

summarize <- function(p) {
 pp <- polr( factor(r1) ~ factor(r2), Hess=TRUE, data=p, method="probit" )
 print(pp)
 cat(" (sub) now we do the pseudo R^2\n")
 print(pR2(pp))  ## it fails in the summarize sub program, wanting 'p'  huh?
 cat(" ok\n")
}


d <- data.frame(r1=factor( c("a", "a", "b", "b", "c", "c") ),
r2=factor( c("a", "b", "c", "b", "c", "a") ))


### it works in the main program
pp <- (polr( factor(r1) ~ factor(r2), Hess=TRUE, data=d, method="probit" ))
print(pp)
cat(" (main) now we do the pseudo R^2\n")
print(pR2(pp))
cat(" ok\n")
summarize(d)

Error in eval(expr, envir, enclos) : object 'p' not found.


Ivo Welch (ivo.we...@gmail.com)



Re: [R] Merge two dataframe with "by", and problems with the common field

2013-05-07 Thread Jeff Newmiller
Either d1$a and d2$a are always the same, or they are not.

If they are already the same, you can either omit one of them in the merge:

merge(d1, d2[,-2], by="b")

or you can use a set of columns for your by:

merge(d1,d2, by=c("a","b"))

If the "a" columns are distinct, then at least one of them needs a new name in 
the merged table, and the simplest option is to rename the columns 
appropriately in d1 and d2 (since they apparently represent different data 
anyway).
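
The two options for the identical-columns case can be sketched with the data from the original post. The object names same_a, m1 and m2 are illustrative; dropping the column by name rather than by position is a small variation on the d2[,-2] shown above.

```r
d1 <- data.frame(a = 1:3, b = 4:6, c = 5:7)
d2 <- data.frame(d = 6:8, a = 1:3, f = 8:10, b = 4:6)

# First check whether the two "a" columns really agree after merging on "b"
d3     <- merge(d1, d2, by = "b")
same_a <- all(d3$a.x == d3$a.y)

# Option 1: drop the duplicate "a" from d2 before merging
m1 <- merge(d1, d2[, setdiff(names(d2), "a")], by = "b")

# Option 2: merge on both shared columns
m2 <- merge(d1, d2, by = c("a", "b"))

print(names(m1))
print(names(m2))
```

Either way the result has a single column called "a", so `m1["a"]` and `m2["a"]` work as expected.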
---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

jpm miao  wrote:

>Hi,
>
> From time to time I merge two data frames that share a common field.
>After the merge the common field is gone, and fieldname.x and fieldname.y
>appear instead. How can I fix this so that I can still refer to the
>original field name? If my problem is unclear, please see the example
>below.
>
>   Thanks
>
>Miao
>
>
>> d1
>  a b c
>1 1 4 5
>2 2 5 6
>3 3 6 7
>> d2
>  d a  f b
>1 6 1  8 4
>2 7 2  9 5
>3 8 3 10 6
>> d3<-merge(d1, d2, by="b")
>> d3
>  b a.x c d a.y  f
>1 4   1 5 6   1  8
>2 5   2 6 7   2  9
>3 6   3 7 8   3 10
>> d3["a"]
>Error in `[.data.frame`(d3, "a") : undefined columns selected
>> d3["a.x"]
>  a.x
>1   1
>2   2
>3   3
>



Re: [R] Merge two dataframe with "by", and problems with the common field

2013-05-07 Thread Rainer Schuermann
Not sure whether this really helps you but at least it works for your sample:

d3 <- merge( d1, d2, by = c(  "a", "b" )  )

> d3
  a b c d  f
1 1 4 5 6  8
2 2 5 6 7  9
3 3 6 7 8 10

Rgds,
Rainer


On Tuesday 07 May 2013 14:33:12 jpm miao wrote:
> Hi,
> 
> From time to time I merge two data frames that share a common field.
> After the merge the common field is gone, and fieldname.x and fieldname.y
> appear instead. How can I fix this so that I can still refer to the
> original field name? If my problem is unclear, please see the example
> below.
> 
>Thanks
> 
> Miao
> 
> 
> > d1
>   a b c
> 1 1 4 5
> 2 2 5 6
> 3 3 6 7
> > d2
>   d a  f b
> 1 6 1  8 4
> 2 7 2  9 5
> 3 8 3 10 6
> > d3<-merge(d1, d2, by="b")
> > d3
>   b a.x c d a.y  f
> 1 4   1 5 6   1  8
> 2 5   2 6 7   2  9
> 3 6   3 7 8   3 10
> > d3["a"]
> Error in `[.data.frame`(d3, "a") : undefined columns selected
> > d3["a.x"]
>   a.x
> 1   1
> 2   2
> 3   3
> 



Re: [R] Merge two dataframe with "by", and problems with the common field

2013-05-07 Thread Jim Lemon

On 05/07/2013 04:33 PM, jpm miao wrote:

Hi,

From time to time I merge two data frames that share a common field.
After the merge the common field is gone, and fieldname.x and fieldname.y
appear instead. How can I fix this so that I can still refer to the
original field name? If my problem is unclear, please see the example
below.

Thanks

Miao



> d1
  a b c
1 1 4 5
2 2 5 6
3 3 6 7
> d2
  d a  f b
1 6 1  8 4
2 7 2  9 5
3 8 3 10 6
> d3<-merge(d1, d2, by="b")
> d3
  b a.x c d a.y  f
1 4   1 5 6   1  8
2 5   2 6 7   2  9
3 6   3 7 8   3 10
> d3["a"]
Error in `[.data.frame`(d3, "a") : undefined columns selected
> d3["a.x"]
  a.x
1   1
2   2
3   3


Hi jpm miao,
Because you have a column named "a" in both data frames, the merge 
function adds ".x" and ".y" to the fields with common names. You could 
change the name of one column, for example, change the name of the "a" 
column in d2 to "e". You could also drop one of the "a" columns in this 
case as the two columns are identical.


d3<-merge(d1, d2[,c("d","f","b")], by="b")
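
The renaming alternative mentioned above might look like this (a sketch using the data from the post; the new name "e" follows the suggestion in the text and d2r is an illustrative name). It is the variant to use when the two "a" columns hold different data and both must be kept.

```r
d1 <- data.frame(a = 1:3, b = 4:6, c = 5:7)
d2 <- data.frame(d = 6:8, a = 1:3, f = 8:10, b = 4:6)

# Rename d2's "a" to "e" so the merge has no name clash to resolve
d2r <- d2
names(d2r)[names(d2r) == "a"] <- "e"

d3 <- merge(d1, d2r, by = "b")

print(names(d3))
print(d3["a"])
```

Since no two non-key columns share a name, merge() has nothing to disambiguate and adds no ".x"/".y" suffixes.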

Jim
