Re: [R] processing matrix equation derived from rows of two matrices

2013-04-12 Thread arun
Hi,

May be this helps:

 tb[1,]%*%(((val-rep(meansb79[1,],5))^2)/6)
#    [,1]
#[1,] 1.47619
tryvarb-c(1,2,3,4,4,4,4)
 var(tryvarb)
#[1] 1.47619

tb[2,]%*%(((val-rep(meansb79[2,],5))^2)/6)
# [,1]
#[1,] 1.904762

sapply(seq_len(nrow(tb)),function(i) tb[i,]%*%(((val-rep(meansb79[i,],5))^2/6)))
# [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619
# [8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143
#[15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143
#[22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619 1.8095238
#[29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571 1.1428571 1.000
#[36] 0.4761905

#or
mat1-matrix(((val-rep(meansb79,each=5))^2/6),ncol=5,byrow=TRUE)

diag(t(apply(tb,1,function(x) x))%*% apply(mat1,1,function(x) x))
# [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619
 #[8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143
#[15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143
#[22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619 1.8095238
#[29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571 1.1428571 1.000
#[36] 0.4761905

#or
 diag(apply(tb,1,function(x) x%*% t(mat1)))
# [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619
 #[8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143
#[15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143
#[22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619 1.8095238
#[29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571 1.1428571 1.000
#[36] 0.4761905



A.K.


I have a matrix (tb) that represents all samples of size n=7 from a group of 
N=9, with scores c(1,2,3,4,4,4,4,5,5).  I want to 
calculate the variance for every sample.  Here is my code.  The 
bottom shows the matrix equations and an attempt to process it for each 
row. I got the strategies from reading the r-help, but neither works. (I
 do not understand the syntax well enough.) Any suggestions.  (I need to
 do may :additional matrices in the same way.) Thanks. Jan 

require(combinat) 
n=7 
base79-combn(c(1,2,3,4,4,4,4,5,5), n, tabulate,nbins=5) #find 
all samples,n=7 (gives cols of  7 cases  in sample like 1 1 1 4 0 ) 
tb-t(base79) 
val-c(1,2,3,4,5)  #values on the scale 
meansb79-t(base79)%*% (val/7) 

tb[ ,1])%*%(((val-rep(meansb79[1,],5))^2)/6)   #computes the sample variance 
for the first sample 
#check 
tryvarb-c(1,2,3,4,4,4,4) 
var(tryvarb) 
  
#Now I try to get the variance for each sample (row) in tb, but neither of the 
following attempts work. 

trybase - apply(tb,1,function(i) 
  t(base79[,i])%*%(((val-rep(meansb79[i,],5))^2)/6)) 

#or 

domatrix.f - function(tb, meansb79) { 
     a - nrow(A); b - nrow(B); 
    C - matrix(NA, nrow=a, ncol=b); 
    for (i in 1:a) 
      for (j in 1:b) 
        C[i,j] - t(A[,i])%*%(((val-rep(B[i,],5))^2)/6) } 
domatrix.f(tb, meansb79)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating %d/%m/%Y %H:%M:%S format from separate date and time columns

2013-04-12 Thread Cat Cowie
Hi R forum,

Each row of my data (below) show a new contact event between animals.
In order to ultimately look at the patterns of intervals between
contacts, I need to calculate a contact end time. The contact starts
at the date and time shown in V4 and V5, and lasts for the duration
shown IN SECONDS in V6:


 data2- read.csv(file=file.choose(), header=F, sep= )
 head(data2)
  V1  V2  V3 V4   V5 V6
1  3 PO4 CO1 2011-04-29 07:27:21 28
2  3 PO4 CO1 2011-04-24 05:57:39 20
3  3 PO4 CO1 2011-04-14 10:29:49  4
4  3 PO4 CO1 2011-04-16 07:27:31 63
5  3 PO4 CO1 2011-04-18 15:46:20  1
6  3 PO4 CO1 2011-04-18 15:45:57  1

To start with I have tried to make the start data and time into one new column:

startt- strptime((paste(data2$V4, data2$V5)), %d/%m/%Y %H:%M:%S)


This executes without any warnings, but returns a full column of NA
values. It would be great to fix this, and then to know how to
correctly add column V6 as seconds to the resulting column.


The problem is further exacerbated by an error with dput() with this
data. It's a large dataset of over 9000 rows, and when I call:

dput(head(data2,50))

It returns dput(), but for all the data (i.e. not the first 50 rows).
This of course does not fit on the workstation screen and therefore I
cannot find out what class it has assigned to any of the data. The
times appear sorted, suggesting they are being classed as a factor?
Sorry I can't provide dput() data!

Thanks, Cat

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating %d/%m/%Y %H:%M:%S format from separate date andtimecolumns

2013-04-12 Thread Gerrit Eichner

Hello, Cat,

see inline below.

 Hth  --  Gerrit

On Fri, 12 Apr 2013, Cat Cowie wrote:


Hi R forum,

Each row of my data (below) show a new contact event between animals.
In order to ultimately look at the patterns of intervals between
contacts, I need to calculate a contact end time. The contact starts
at the date and time shown in V4 and V5, and lasts for the duration
shown IN SECONDS in V6:



data2- read.csv(file=file.choose(), header=F, sep= )
head(data2)

 V1  V2  V3 V4   V5 V6
1  3 PO4 CO1 2011-04-29 07:27:21 28
2  3 PO4 CO1 2011-04-24 05:57:39 20
3  3 PO4 CO1 2011-04-14 10:29:49  4
4  3 PO4 CO1 2011-04-16 07:27:31 63
5  3 PO4 CO1 2011-04-18 15:46:20  1
6  3 PO4 CO1 2011-04-18 15:45:57  1

To start with I have tried to make the start data and time into one new column:

startt- strptime((paste(data2$V4, data2$V5)), %d/%m/%Y %H:%M:%S)


Shouldn't you use - instead of / as it is used in data$V4?




This executes without any warnings, but returns a full column of NA
values. It would be great to fix this, and then to know how to
correctly add column V6 as seconds to the resulting column.


The problem is further exacerbated by an error with dput() with this
data. It's a large dataset of over 9000 rows, and when I call:

dput(head(data2,50))

It returns dput(), but for all the data (i.e. not the first 50 rows).
This of course does not fit on the workstation screen and therefore I
cannot find out what class it has assigned to any of the data. The
times appear sorted, suggesting they are being classed as a factor?
Sorry I can't provide dput() data!

Thanks, Cat

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Assessing the fit of a nonlinear model to a new dataset

2013-04-12 Thread Rebecca Lester
Thank you all for your useful suggestions.

Jean, I had already tried to use the maxiter command in that way, but it simply 
told me that the model had not converged.

Based on Profs Ripley and Nash's comments, I have opted to use an alternative 
approach, and am creating a distribution of plausible models, and comparing my 
sum of squared residuals to those arising from that distribution. This seems to 
work.

Thanks again.

Cheers,

Rebecca

-
Rebecca Lester
Lecturer, Freshwater Ecology
School of Life  Environmental Sciences
Deakin University
PO Box 423, Warrnambool, 3280

Ph: +61 3 5563 3330
Fax: +61 3 5563 3462


-Original Message-
From: Prof J C Nash (U30A) [mailto:nas...@uottawa.ca] 
Sent: Saturday, 6 April 2013 7:42 AM
To: r-help@r-project.org; Rebecca Lester
Subject: Re: Assessing the fit of a nonlinear model to a new dataset

Given nls has a lot of C code (and is pretty complicated), I doubt you'll find 
much joy doing that.

nlxb from my nlmrt package is all in R, but you'll need to do quite a bit of 
work at each stage. I don't form the J' J matrix, and do a Marquardt 
approximation by adding appropriate rows to the J matrix then do a qr 
decomposition on that.

In any event, the Hessian (which  J' J is only a rather poor appriximation to) 
is what you want, and it may not be positive definite at the iterates, so you 
have infinite standard errors. Well, if the curvature was 0, they'd be 
infinite. Since the curvature is negative, maybe the SEs are more than 
infinite, if that has any meaning.

I have one problem for which I generated 1000 starting points and 75% had the 
Hessian indefinite. That is a simple logistic nonlinear regression, albeit with 
nasty data.

JN


 Message: 90 Date: Fri, 5 Apr 2013 05:06:57 + From: Rebecca Lester 
 rebecca.les...@deakin.edu.au To: r-help@r-project.org 
 r-help@r-project.org Subject: [R] Assessing the fit of a nonlinear 
 model to a new dataset Message-ID: 
 5a72faa65583bc45a816a698a960e92788612...@mbox-f-3.du.deakin.edu.au 
 Content-Type: text/plain Hi all, I am attempting to apply a nonlinear 
 model developed using nls to a new dataset and assess the fit of that 
 model. At the moment, I am using the fitted model from my fit dataset 
 as the starting point for an nls fit for my test dataset (see below). 
 I would like to be able to view the t-statistic and p-values for each 
 of the iterations using the trace function, but have not yet worked 
 out how to do this. Any other suggestions are also welcome. Many 
 thanks, Rebecca
 model.wa - nls(y ~ A*(x^B), start=list(A=107614,B=-0.415)) # create 
 nls() power model for WA data
 summary(model.wa) # model summary

 Formula: y ~ A * (x^B)

 Parameters:
 Estimate Std. Error t value Pr(|t|)
 A  7.644e+04  1.240e+04   6.165 4.08e-06 ***
 B -3.111e-01  4.618e-02  -6.736 1.15e-06 ***
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Residual standard error: 5605 on 21 degrees of freedom

 Number of iterations to convergence: 6 Achieved convergence tolerance: 
 7.184e-06
   (6 observations deleted due to missingness)


 model.vic - nls(y.vic ~ A*(x.vic^B), start = list(A = 7.644e+04, B = 
 -3.111e-01), trace = T)
 3430193778 :  76440.-0.3111
 2634092902 :  48251.9235397-0.2552481
 2614516166 :  27912.1921354-0.1772322
 2521588892 :  32718.3764594-0.1862611
 2521233646 :  32476.4536126-0.1836836
 2521230904 :  32553.0767231-0.1841362
 2521230824 :  32540.063480-0.184059
 2521230822 :  32542.2970040-0.1840721




 Important Notice: The contents of this email are intended solely for the 
 named addressee and are confidential; any unauthorised use, reproduction or 
 storage of the contents is expressly prohibited. If you have received this 
 email in error, please delete it and any attachments immediately and advise 
 the sender by return email or telephone.

 Deakin University does not warrant that this email and any attachments are 
 error or virus free.

   [[alternative HTML version deleted]]



 --


Important Notice: The contents of this email are intended solely for the named 
addressee and are confidential; any unauthorised use, reproduction or storage 
of the contents is expressly prohibited. If you have received this email in 
error, please delete it and any attachments immediately and advise the sender 
by return email or telephone.

Deakin University does not warrant that this email and any attachments are 
error or virus free.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating %d/%m/%Y %H:%M:%S format from separate date andtimecolumns

2013-04-12 Thread Cat Cowie
Hi Gerrit,

Thanks for your quick reply - this, in combination with reversing the
date as it is shown in my data, worked perfectly.

startt- strptime((paste(data2$V4, data2$V5)), %Y-%m-%d %H:%M:%S)

Apologies for the silly question it seems, but you've saved me a lot
of time. Always learning
Thanks, Cat



On 12 April 2013 15:00, Gerrit Eichner
gerrit.eich...@math.uni-giessen.de wrote:
 Hello, Cat,

 see inline below.

  Hth  --  Gerrit


 On Fri, 12 Apr 2013, Cat Cowie wrote:

 Hi R forum,

 Each row of my data (below) show a new contact event between animals.
 In order to ultimately look at the patterns of intervals between
 contacts, I need to calculate a contact end time. The contact starts
 at the date and time shown in V4 and V5, and lasts for the duration
 shown IN SECONDS in V6:


 data2- read.csv(file=file.choose(), header=F, sep= )
 head(data2)

  V1  V2  V3 V4   V5 V6
 1  3 PO4 CO1 2011-04-29 07:27:21 28
 2  3 PO4 CO1 2011-04-24 05:57:39 20
 3  3 PO4 CO1 2011-04-14 10:29:49  4
 4  3 PO4 CO1 2011-04-16 07:27:31 63
 5  3 PO4 CO1 2011-04-18 15:46:20  1
 6  3 PO4 CO1 2011-04-18 15:45:57  1

 To start with I have tried to make the start data and time into one new
 column:

 startt- strptime((paste(data2$V4, data2$V5)), %d/%m/%Y %H:%M:%S)


 Shouldn't you use - instead of / as it is used in data$V4?



 This executes without any warnings, but returns a full column of NA
 values. It would be great to fix this, and then to know how to
 correctly add column V6 as seconds to the resulting column.


 The problem is further exacerbated by an error with dput() with this
 data. It's a large dataset of over 9000 rows, and when I call:

 dput(head(data2,50))

 It returns dput(), but for all the data (i.e. not the first 50 rows).
 This of course does not fit on the workstation screen and therefore I
 cannot find out what class it has assigned to any of the data. The
 times appear sorted, suggesting they are being classed as a factor?
 Sorry I can't provide dput() data!

 Thanks, Cat

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating %d/%m/%Y %H:%M:%S format from separate date and time columns

2013-04-12 Thread Pascal Oettli
Hi,

The following should work:

 x - paste(paste(data2$V4, data2$V5), data2$V6, sep='.')
 startt - strptime(x, %Y-%m-%d %H:%M:%OS)

Regards,
Pascal


2013/4/12 Cat Cowie cat.e.co...@gmail.com

 Hi R forum,

 Each row of my data (below) show a new contact event between animals.
 In order to ultimately look at the patterns of intervals between
 contacts, I need to calculate a contact end time. The contact starts
 at the date and time shown in V4 and V5, and lasts for the duration
 shown IN SECONDS in V6:


  data2- read.csv(file=file.choose(), header=F, sep= )
  head(data2)
   V1  V2  V3 V4   V5 V6
 1  3 PO4 CO1 2011-04-29 07:27:21 28
 2  3 PO4 CO1 2011-04-24 05:57:39 20
 3  3 PO4 CO1 2011-04-14 10:29:49  4
 4  3 PO4 CO1 2011-04-16 07:27:31 63
 5  3 PO4 CO1 2011-04-18 15:46:20  1
 6  3 PO4 CO1 2011-04-18 15:45:57  1

 To start with I have tried to make the start data and time into one new
 column:

 startt- strptime((paste(data2$V4, data2$V5)), %d/%m/%Y %H:%M:%S)


 This executes without any warnings, but returns a full column of NA
 values. It would be great to fix this, and then to know how to
 correctly add column V6 as seconds to the resulting column.


 The problem is further exacerbated by an error with dput() with this
 data. It's a large dataset of over 9000 rows, and when I call:

 dput(head(data2,50))

 It returns dput(), but for all the data (i.e. not the first 50 rows).
 This of course does not fit on the workstation screen and therefore I
 cannot find out what class it has assigned to any of the data. The
 times appear sorted, suggesting they are being classed as a factor?
 Sorry I can't provide dput() data!

 Thanks, Cat

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ordination Plotting: Warning: Species scores not available

2013-04-12 Thread Jari Oksanen
Micah Bennett micahgbennett at yahoo.com writes:

 
 Hi,
 
 I am working with a species-by-trait .csv file (columns=traits,
rows=species) and get the following
 warning message when trying to plot results of both metaMDS and pcoa: 
 
 Warning message:
 In ordiplot(x, choices = choices, type = type, display = display,  :
   Species scores not available
 
 I am using a Gower's transformation in both procedures within the metaMDS
or pcoa functions, and I know I am
 not using something already made into a distance matrix. There is missing
trait data for some species, but
 there must be a way to plot the trait locations/vectors. There has been at
least one posting about this in
 years past but it did not receive a satisfactory response.
 
 Any help would be much appreciated.
 
 -Micah
   [[alternative HTML version deleted]]

Dear Micah Bennett,

I don't know why you say that there must be a way: there is no obligation
to have this feature, but someone can write code to do so. However, as far
as I know, this cannot be directly.

Both metaMDS() and PCoA (which PCoA? Which function, which package?) work 
with dissimilarities, and dissimilarities have no information about
original column variables. If you calculate dissimilarities within 
metaMDS, the function knows the original column values, and may be able
add scores for those. However, it is not able to add those scores if you
have missing values. Actually, I was not able to run metaMDS with missing
values unless I suppressed calculating column scores (wascores = FALSE).
So there is no direct way of getting those. I don't know how you calcualted
PCoA, but the standard R function cmdscale (or its variant wcmdscale in 
vegan) only accept dissimilarities as input and are unable to add scores
for original cover values. Perhaps there is a function called pcoa() 
somewhere (labdsv?) which can calculate the dissimilarities internally and
is able to add column scores, but consult appropriate documentation in
that package.

You may be able to calculate column scores after the analysis. The
default scores in vegan::metaMDS() are wascores() which do not handle
missing values. No hope there. However, if you want to have fitted 
vectors or class centroids, you can use envfit(..., na.rm = TRUE) which
will use the complete.cases() subset of the ordination scores for
fitting the vectors or class centroids. If you want to have NA-handling
in wascores, you better submit new code to vegan and we will incorporate
that in the next release.

Best wishes, Jari Oksanen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating %d/%m/%Y %H:%M:%S format from separate date andtimecolumns

2013-04-12 Thread Jeff Newmiller
avoid using F, spell it out

Use an appropriate TZ... e.g. use Sys.setenv( TZ=Etc/GMT+5 ) before 
converting if your time is local standard time year round

Use difftime to add to time values for consistent results

endt - startt + as.difftime( data2$V6, units=secs )

Use the str function to explore your data
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Cat Cowie cat.e.co...@gmail.com wrote:

Hi Gerrit,

Thanks for your quick reply - this, in combination with reversing the
date as it is shown in my data, worked perfectly.

startt- strptime((paste(data2$V4, data2$V5)), %Y-%m-%d %H:%M:%S)

Apologies for the silly question it seems, but you've saved me a lot
of time. Always learning
Thanks, Cat



On 12 April 2013 15:00, Gerrit Eichner
gerrit.eich...@math.uni-giessen.de wrote:
 Hello, Cat,

 see inline below.

  Hth  --  Gerrit


 On Fri, 12 Apr 2013, Cat Cowie wrote:

 Hi R forum,

 Each row of my data (below) show a new contact event between
animals.
 In order to ultimately look at the patterns of intervals between
 contacts, I need to calculate a contact end time. The contact starts
 at the date and time shown in V4 and V5, and lasts for the duration
 shown IN SECONDS in V6:


 data2- read.csv(file=file.choose(), header=F, sep= )
 head(data2)

  V1  V2  V3 V4   V5 V6
 1  3 PO4 CO1 2011-04-29 07:27:21 28
 2  3 PO4 CO1 2011-04-24 05:57:39 20
 3  3 PO4 CO1 2011-04-14 10:29:49  4
 4  3 PO4 CO1 2011-04-16 07:27:31 63
 5  3 PO4 CO1 2011-04-18 15:46:20  1
 6  3 PO4 CO1 2011-04-18 15:45:57  1

 To start with I have tried to make the start data and time into one
new
 column:

 startt- strptime((paste(data2$V4, data2$V5)), %d/%m/%Y %H:%M:%S)


 Shouldn't you use - instead of / as it is used in data$V4?



 This executes without any warnings, but returns a full column of NA
 values. It would be great to fix this, and then to know how to
 correctly add column V6 as seconds to the resulting column.


 The problem is further exacerbated by an error with dput() with this
 data. It's a large dataset of over 9000 rows, and when I call:

 dput(head(data2,50))

 It returns dput(), but for all the data (i.e. not the first 50
rows).
 This of course does not fit on the workstation screen and therefore
I
 cannot find out what class it has assigned to any of the data. The
 times appear sorted, suggesting they are being classed as a factor?
 Sorry I can't provide dput() data!

 Thanks, Cat

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bnlearn: how to compute boot strength with mmhc and a blacklist

2013-04-12 Thread Marco Scutari
Dear Leonore,

On Wed, Apr 10, 2013 at 5:51 PM, Leonore Wigger leonore.wig...@unil.ch wrote:
 Question: I have specified a blacklist. I would have expected this to
 completely disallow the arcs on the blacklist. But the result shows that
 some of the blacklisted arcs have a strength  0 (rows 7,8,10,11). It
 seems that only the arc that was blacklisted in both directions was
 actually banned (x1-x2, in rows 9 and 12). What is the reason for this?
 Is there a way to completely disallow all blacklisted arcs, such that
 their strength is 0.0? Or is there a compelling reason why that should
 not be done?

Because by default boot.strength() runs with cpdag = TRUE. This
means that reversible arcs can have positive strength in both
directions. You should set cpdag = FALSE to get the result you are
expecting. In that case the probabilities of the arc directions should
be taken with a grain of salt, as they can be influenced by many
things (optimized = TRUE/FALSE, order of the variables in the data
set) unless you are doing causal modelling.

 This code makes 50 different networks from the same data, then uses them
 as input for custom.strength. The networks are constructed using the
 algorithm hc. A different network is produced every time hc is
 invoked because a random starting network is supplied to the parameter
 start. I would like to do the same thing, but use mmhc instead of
 hc. However, in my hands, the networks that are constructed by mmhc
 are all identical, and I am not sure how to introduce a random element
 into the construction. Question: Which parameters do I need to give to
 mmhc in order to obtain a different network every time it is run on
 the same data set?

This is not surprising, because mmhc() does not have a start
argument, so it's starting from the same network over and over. There
is no way to provide a random seed to mmhc(), so the only way to
perturb it is through bootstrap.

Marco

-- 
Marco Scutari, Ph.D.
Research Associate, Genetics Institute (UGI)
University College London (UCL), United Kingdom

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pairwise ANOSIM test

2013-04-12 Thread Joon-Taek Yoo
Hello

I am studying ANOSIM (analysis of similarities) using a package(vegan)
provided in r statistics.
I am wondering how to do pairwise ANOSIM test among three groups.
Can anybody please explain this question.

Thanks.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] werttemberg

2013-04-12 Thread Ali Zanaty
 http://aragon-vida.com/11a-hdnsdsjreunx-1.php 










 well, good night and thank you very much 
932d4a1dff1375c06208c4cf2a7675e527431024e7d241382978df47d0f8c43f 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error loading workspaces after upgrade

2013-04-12 Thread Tamas Barjak
Dear Members,
I upgraded R from 2.15.3 to 3.0.0 (binary) on my WinXP and now I can't
load my workspaces. The error message is:

Error: requested primitive type is not consistent with cached value
During startup - Warning message:
unable to restore saved data in .RData

Any idea to fix that?

Thanks, Tamas

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error loading workspaces after upgrade

2013-04-12 Thread PIKAL Petr
Hm. Sometimes workspace cannot be loaded when some packages which vere used for 
creationd objects in workspaces are missing, but in that case the error message 
is different.

So you either can delete .RData but it obviously **delete** all objects or you 
need to find which package is missing from your R3.0.0 installation. 

If you want to keep only some objects from your workspace return back to 2.15.3 
and save them explicitely

see
?save

Regards
Petr

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Tamas Barjak
 Sent: Friday, April 12, 2013 1:14 PM
 To: r-help@r-project.org
 Subject: [R] Error loading workspaces after upgrade
 
 Dear Members,
 I upgraded R from 2.15.3 to 3.0.0 (binary) on my WinXP and now I can't
 load my workspaces. The error message is:
 
 Error: requested primitive type is not consistent with cached value
 During startup - Warning message:
 unable to restore saved data in .RData
 
 Any idea to fix that?
 
 Thanks, Tamas
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bnlearn: how to compute boot strength with mmhc and a blacklist

2013-04-12 Thread Leonore Wigger
Thanks a lot, Marco. This works. And your remark regarding less reliable 
arc strengths with cpdag=FALSE was very helpful.

Leonore

Le 12/04/2013 11:08, Marco Scutari a écrit :

Dear Leonore,

On Wed, Apr 10, 2013 at 5:51 PM, Leonore Wigger leonore.wig...@unil.ch wrote:

Question: I have specified a blacklist. I would have expected this to
completely disallow the arcs on the blacklist. But the result shows that
some of the blacklisted arcs have a strength  0 (rows 7,8,10,11). It
seems that only the arc that was blacklisted in both directions was
actually banned (x1-x2, in rows 9 and 12). What is the reason for this?
Is there a way to completely disallow all blacklisted arcs, such that
their strength is 0.0? Or is there a compelling reason why that should
not be done?

Because by default boot.strength() runs with cpdag = TRUE. This
means that reversible arcs can have positive strength in both
directions. You should set cpdag = FALSE to get the result you are
expecting. In that case the probabilities of the arc directions should
be taken with a grain of salt, as they can be influenced by many
things (optimized = TRUE/FALSE, order of the variables in the data
set) unless you are doing causal modelling.


This code makes 50 different networks from the same data, then uses them
as input for custom.strength. The networks are constructed using the
algorithm hc. A different network is produced every time hc is
invoked because a random starting network is supplied to the parameter
start. I would like to do the same thing, but use mmhc instead of
hc. However, in my hands, the networks that are constructed by mmhc
are all identical, and I am not sure how to introduce a random element
into the construction. Question: Which parameters do I need to give to
mmhc in order to obtain a different network every time it is run on
the same data set?

This is not surprising, because mmhc() does not have a start
argument, so it's starting from the same network over and over. There
is no way to provide a random seed to mmhc(), so the only way to
perturb it is through bootstrap.

Marco



--
Leonore Wigger
University of Lausanne
Genomic Technologies Facility
Génopode Building
1015 Lausanne
Switzerland

++41 (0)21 692 41 16

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A strange behaviour in the graphical function curve

2013-04-12 Thread Julio Sergio
Berend Hasselman bhh at xs4all.nl writes:

 
 Yes. curve expects the function you give it to return a vector if the input 
argument is a vector.
 This is clearly documented for the argument expr of curve.

Thanks a lot, Berend! 

In fact, I didn't read carefully the documentation of curve. Anyway, I 
wouldn't know how to transform my function into one that would be accepted by 
curve, so your feedback has been very useful for me.

 -Sergio.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RODBC MSSQL query with date - time tag

2013-04-12 Thread Rob Steenburgh

Greetings:

I am trying to query an MSSQL database which contains time series data 
using RODBC.


Using Server Management Studio, I can retrieve the data with this query:

Select * from tb_ace_mag_1m where time_tag='2012-01-08 00:00:00' AND 
time_tag'2012-01-08 03:00:00'


However, when I try to accomplish this using R:
 sqlQuery(channel1,Select * from tb_ace_mag_1m where 
time_tag='2012-01-08 00:00:00' AND time_tag'2012-01-08 03:00:00', max=10)


I obtain no results:

 [1] time_taginsert_time dsflag  numpts  gse_bx gse_by  
gse_bz  gse_lat gse_lon gsm_bx gsm_by  gsm_bz  
gsm_lat gsm_lon bt

0 rows (or 0-length row.names)

If I omit the time_tag parameter, I get the following:

 sqlQuery(channel1,select * from tb_ace_mag_1m, max=10)
  time_tag insert_time dsflag numpts gse_bx
gse_by   gse_bz gse_lat  gse_lon  gsm_bx gsm_by 
gsm_bzgsm_lat  gsm_lon   bt
1  2013-01-21 15:23:00 2013-01-21 15:28:26  0 37  0.34268624 
-2.814154 -0.460940033  -9.2350445 276.9429  0.33351120 -2.713921 
-0.8790606 -17.822054 277.0059 2.872170
2  2013-01-21 15:24:00 2013-01-21 15:29:29  0 60  0.45056427 
-2.862180 -0.404557437  -7.9486165 278.9461  0.44123372 -2.769705 
-0.8321871 -16.526619 279.0516 2.925534
3  2013-01-21 15:25:00 2013-01-21 15:30:00  0 15  0.16728164 
-2.787083 -1.013695598 -19.9538670 273.4348  0.15818150 -2.602005 
-1.4240593 -28.647099 273.4789 2.970420


How do I properly format the query in RODBC to obtain the results I seek?
I was unable to discover a solution in the archives, although it appears 
I'm not the only one who has struggled with date-time queries.


Thanks,
Rob Steenburgh
NOAA/NWS Space Weather Prediction Center

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] produce a map in ARCGIS from a GAM output

2013-04-12 Thread Lauria, Valentina
Hi,

I am trying to predict the habitat suitability of lobster with GAMs. I need to 
produce a map in ArcGis of the predicted densities. I did some search and the 
function predict.gam seems to do the job.

Is this the right way to do it? when you apply the function you get the 
predicted values (different from your input data points) but I do not 
understand what is the procedure with the spatial coordinates, I mean what lat 
and long have the new points? Can I use the same ones of my observations?

Thank you very much for your time,
Valentina

Dr. Valentina Lauria
Postdoctoral researcher
Room 118, Martin Ryan Institute
Department of Earth and Ocean Sciences
National University of Ireland, Galway
Ireland

www.nephrops.euhttp://www.nephrops.eu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] odfWeave: Some questions about potential formatting options

2013-04-12 Thread Milan Bouchet-Valat
Le jeudi 11 avril 2013 à 13:40 -0700, Paul Miller a écrit :
Hello All,
 
 Learning to use the odfWeave package. I really like the package. It has good 
 documentation, makes some very nice looking tables, and seems to have lots of 
 options for customizing output.
 
 There are a few things I'd like to do that don't seem to be covered in the 
 documentation though. So I'm not sure if they're possible or not. 
 
 Here's a list of some things I'd like to be able to do:
 
 1. Make titles generated by odfTableCaption/odfFigureCaption bold 
 
 2. Add footnotes to tables (using something other than odfCat)
 
 3. Control the width of columns
 
 4. Control the alignment of columns (first column left and centered otherwise)
 
 Is it possible to do any or all of these things using odfWeave?
 For points 1 and 4, you have to use a style, and edit the style via 
 LibreOffice to make the text bold or change its alignement. See tableStyles() 
 for the latter.


For point 3 (column widths), I needed to make small changes to odfWeave to 
support that part of the ODF spec. I've sent them to Max Kuhn for review, but 
so far he probably could not find the time to study them.

If you are interested, you can grab the modified version of the package from 
here :
http://nalimilan.perso.neuf.fr/transfert/odfWeave.tar.gz

Then you can set the width like that:

currentDefs - getStyleDefs()

currentDefs$firstColumn$type - Table Column
currentDefs$firstColumn$columnWidth - 5 cm

currentDefs$secondColumn$type - Table Column
currentDefs$secondColumn$columnWidth - 3 cm

setStyleDefs(currentDefs)

Then you can use these styles this way:
odfTable(df, styles=style,
 colStyles=c(firstColumn, secondColumn))


For point 2, I'm not sure what you mean. Do you want to include a legend below 
the table, or do you want to use real footnotes?


Regards

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problems with the DSL package

2013-04-12 Thread Vini
Hello,

I'm trying to use R with an distributed File system (HDFS) and when I launch
a MapReduce job with DSL package, I've got this message :
*Streaming Command Failed!
Error in !unlist(lapply(chunks, is.null)) : invalid argument type*

I am using the Cloudera CDH4 Virtual Machine, R version is 2.15.3 and Hadoop
0.20.2 CDH3u4.

Commands lines :
/library(DSL)
storage-DStorage(type=HDFS, base_dir=/user/cloudera/dc)
dl - DList( line1 = This is the first line., line2 = Now, the second
line., DStorage=storage )
res - DLapply( dl, function(x) unlist(strsplit(x,  )) )/

Somebody knows something about this problem?

Thanks and Regards, 

Vincent.





--
View this message in context: 
http://r.789695.n4.nabble.com/Problems-with-the-DSL-package-tp4664059.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] gc()

2013-04-12 Thread Keith S Weintraub
Folks,
  I have some computations in a function that create some large matrices.

I have been in the habit in these circumstances to call null out a matrix 
once used and call gc().

Some pseudo code:

theFunction-function(x, y, z, 1) {
   myMatrix  - black.box.function(x, y, z, 1)
   interesting.data-myFunc(myMatrix)
   myMatrix-NULL
   gc()
   
   interesting.data/pi
}

Would it be different if I chose to replace myMatrix-NULL with rm(myMatrix)?

The gc docs are fine but I was wondering if there were any other tips or links 
on the subject.

Thanks for your time,
KW

--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] produce a map in ARCGIS from a GAM output

2013-04-12 Thread Barry Rowlingson
On Fri, Apr 12, 2013 at 10:37 AM, Lauria, Valentina
valentina.lau...@nuigalway.ie wrote:
 Hi,

 I am trying to predict the habitat suitability of lobster with GAMs. I need 
 to produce a map in ArcGis of the predicted densities. I did some search and 
 the function predict.gam seems to do the job.

 Is this the right way to do it? when you apply the function you get the 
 predicted values (different from your input data points) but I do not 
 understand what is the procedure with the spatial coordinates, I mean what 
 lat and long have the new points? Can I use the same ones of my observations?

 I suspect you'll get a better response if you:

 a) ask on R-sig-geo, the mailing list for all things R and mappy
 b) try and ask with a reproducible example - it doesn't have to be
your data, but something similar, and small, and that we can run to
get a better idea of your problem. Show us how you are running
predict.gam, at least, and what your data looks like.

To get data from R into ArcGIS you probably want to either use the
raster package to create a grid of values, or the rgdal package to
create a shapefile of point locations and estimates. Exactly how you
get that from your predict.gam results is a bit guesswork until we
know a bit more about how you are running predict.gam

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding unequal tables

2013-04-12 Thread arun
Hi,
table.x- as.table(as.matrix(read.table(text=
A    B C    D
2 4  6 5
,sep=,header=TRUE)))
table.y- as.table(as.matrix(read.table(text=
C  D
5   1
,sep=,header=TRUE))) 
library(plyr)
library(reshape2)
vec1-rowSums(join(melt(table.x)[,-1], 
melt(table.y)[,-1],by=Var2)[,-1],na.rm=TRUE)
names(vec1)- colnames(table.x)
 table.xy-as.table(vec1)
table.xy
# A  B  C  D 
# 2  4 11  6 

#or
 dat1-merge(table.x,table.y,all=TRUE)[,-1]
table.xy-as.table(with(dat1,tapply(Freq,list(Var2),FUN=sum)))
 table.xy
# A  B  C  D 
# 2  4 11  6 
 str(table.xy)
# 'table' int [1:4(1d)] 2 4 11 6
# - attr(*, dimnames)=List of 1
#  ..$ : chr [1:4] A B C D

#or
 datNew-join(as.data.frame(table.x),as.data.frame(table.y),type=full)[,-1]
res-aggregate(Freq~Var2,data=datNew,sum)[,2]
 names(res)- colnames(table.x)
 table.xy-as.table(res)
table.xy
# A  B  C  D 
# 2  4 11  6 
A.K.


This is probably really simple, but I'm missing something: 

I have two tables generated with the table() function that have 
overlapping columns. I want to add the entries where the column is the 
same. For instance: 

table.x 
A    B     C    D 
2     4      6     5 

table.y 

C  D 
5   1 

Output needed: 
A   B   C    D 
2    4    11  6

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Stat question: How to deal w/ negative outliers?

2013-04-12 Thread ramoss
Hello all,

I have a question:  I am using the interquantile method to spot outliers 
it gives me values of  say 234  -120 or for the higher  lower benchmarks.
I don't have any issues w/ the higher end.  However I don't have any
negative values. My lowest possible value is 0. Should I consider 0 as an
outlier?

Thanks ahead  for your thoughts



--
View this message in context: 
http://r.789695.n4.nabble.com/Stat-question-How-to-deal-w-negative-outliers-tp4664068.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Read the data from a text file and reshape the data

2013-04-12 Thread Janesh Devkota
Hi Arun,

Thank you so much for your answer. It surely does help. Having to know
different approaches for the same problem has given me more insights on R
language. I appreciate your time and effort.

Best,

Janesh Devkota


On Thu, Apr 11, 2013 at 10:00 PM, arun smartpink...@yahoo.com wrote:

 Hi,
 May be this helps:
  lines1- readLines(WAT_DEP.DAT.part)
 indx- which(grepl([*],lines1))
 indx2-indx[seq(from=indx[2],length(indx),by=2)]+1
 lines2-str_trim(lines1[indx2],side=left)
 dat1-read.table(text=lines2,sep=,header=FALSE)

 library(stringr)
 lst1- lapply(split(indx,((seq_along(indx)-1)%/%2)+1), function(x)
 {x1-seq(max(x)+2,max(x)+2+49,by=1); x2- 
 str_trim(lines1[x1][!is.na(lines1[x1])],side=left);
 x3-as.vector(as.matrix(read.table(text=x2[x2!=],header=FALSE,sep=)));
 x4- if(length(x3) 500) c(x3,rep(NA,500-length(x3))) else x3;x4 })
 dat2- as.data.frame(do.call(cbind,lst1))
 colnames(dat2)-paste(t,colnames(dat2),_,dat1[,2],sep=)
  dim(dat2)
 #[1] 500 622
 dat2[1:3,1:3]
 #  t1_0.00208 t2_0.00417 t3_0.00625
 #1  3.224  4.124  4.502
 #2  3.205  4.118  4.500
 #3  3.189  4.114  4.498

 A.K.

 - Original Message -
 From: Janesh Devkota janesh.devk...@gmail.com
 To: r-help@r-project.org
 Cc:
 Sent: Thursday, April 11, 2013 5:55 PM
 Subject: [R] Read the data from a text file and reshape the data

 I have a data set for different time intervals. The data has three comment
 lines before data for each time interval. For each time interval there are
 500 data points. I want to change the dataset such that I have the
 following
 format:



 t1t2t3   

 0.00208 0.00417 0.00625 .

 a1   a2   a3 ...

 b1   b2   b3 ...

 c1c2c3 .

 ...

 



 The link to the file is as follows:
 https://www.dropbox.com/s/hc8n3qcai1mlxca/WAT_DEP.DAT



 As you will see on the file, time for each interval is the second data of
 the third line before the data starts. For the first time, t= 0.00208. I
 need to change the data in several rows into one column. At last I need to
 create a dataframe with the format shown above. In the sample above, a1,
 b1,
 c1 are the data for time t1, and so on.



 The sample data is as follows:





 ** N:SNAPSHOTTIME  DELT[S]

 ** WATER DEPTH [M]: (HP(L),L=2,LA)

   18000.00208   0.1

  3.224 3.221 3.220 3.217 3.216 3.214 3.212
 3.210 3.209 3.207

  3.205 3.203 3.202 3.200 3.199 3.197 3.196
 3.193 3.192 3.190

  3.189 3.187 3.186 3.184 3.184 3.182 3.181
 3.179 3.178 3.176

  3.175 3.174 3.173 3.171 3.170 3.169 3.168
 3.167 3.166 3.164

  3.164 3.162 3.162 3.160 3.160 3.158 3.158
 3.156 3.156 3.155

  3.154 3.153 3.152 3.151 3.150 3.150 3.149
 3.149 3.147 3.147

  3.146 3.146 3.145 3.145 3.144 3.144 3.143
 3.143 3.142 3.142

  3.141 3.142 3.141 3.141 3.140 3.141 3.140
 3.140 3.139 3.140

  3.139 3.140 3.139 3.140 3.139 3.140 3.139
 3.140 3.139 3.140

  3.139 3.140 3.140 3.140 3.140 3.141 3.141
 3.142 3.141 3.142

  3.142 3.142 3.143 3.143 3.144 3.144 3.145
 3.145 3.146 3.146

  3.147 3.148 3.149 3.149 3.150 3.150 3.152
 3.152 3.153 3.154

  3.155 3.156 3.157 3.158 3.159 3.160 3.161
 3.162 3.163 3.164

  3.165 3.166 3.168 3.169 3.170 3.171 3.173
 3.174 3.176 3.176

  3.178 3.179 3.181 3.182 3.184 3.185 3.187
 3.188 3.190 3.191

  3.194 3.195 3.196 3.198 3.199 3.202 3.203
 3.205 3.207 3.209

  3.210 3.213 3.214 3.217 3.218 3.221 3.222
 3.225 3.226 3.229

  3.231 3.233 3.235 3.238 3.239 3.242 3.244
 3.247 3.248 3.251

  3.253 3.256 3.258 3.261 3.263 3.266 3.268
 3.271 3.273 3.276

  3.278 3.281 3.283 3.286 3.289 3.292 3.294
 3.297 3.299 3.303

  3.305 3.307 3.311 3.313 3.317 3.319 3.322
 3.325 3.328 3.331

  3.334 3.337 3.340 3.343 3.347 3.349 3.353
 3.356 3.359 3.362

  3.366 3.369 3.372 3.375 3.379 3.382 3.386
 3.388 3.392 3.395

  3.399 3.402 3.406 3.409 3.413 

Re: [R] A strange behaviour in the graphical function curve

2013-04-12 Thread Julio Sergio
Berend Hasselman bhh at xs4all.nl writes:

 
 Your function miBeta returns a scalar when the argument mu is a vector.
 Use Vectorize to vectorize it. Like this
 
   VmiBeta - Vectorize(miBeta,vectorize.args=c(mu))
   VmiBeta(c(420,440))
 
 and draw the curve with this
 
   curve(VmiBeta,xlim=c(370,430), xlab=mu, ylab=L(mu))
 
 Berend
 

Taking into account what you have pointed out, I reprogrammed my function 
as follows, as an alternative solution to yours:

   zetas - function(alpha) {z - qnorm(alpha/2); c(z,-z)}

   # First transformation function
   Tzx - function(z, sigma_p, mu_p) sigma_p*z + mu_p

   # Second transformation function
   Txz - function(x, sigma_p, mu_p) (x - mu_p)/sigma_p

   BetaG - function(mu, alpha, n, sigma, mu_0) {
 lasZ - zetas(alpha) # Zs corresponding to alpha
 sigma_M - sigma/sqrt(n) # sd of my distribution
 lasX - Tzx(lasZ, sigma_M, mu_0) # Transformed Zs into Xs
 # Now I consider mu to be a vector composed of m's 
 NewZ - lapply(mu, function(m) Txz(lasX, sigma_M, m))
 # NewZ is a list, the same length as mu, with 2D vectors
 # The result will be a vector, the same length as mu (and NewZ)
 sapply(NewZ, function(zz) pnorm(zz[2]) - pnorm(zz[1]))
   }

   miBeta - function(mu) BetaG(mu, 0.05, 36, 48, 400)

   plot(miBeta,xlim=c(370,430), xlab=mu, ylab=L(mu))

I hope this is useful to people following this discussion,

  -Sergio.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multivariate AICc

2013-04-12 Thread Vinicius Bastazini
 Dear all,
I would like to find out if anyone is aware of the existence of
 implemented functions, or codes,  for running multivariate AICc as it was
developed by* *Fujikoshi  Satoh (1997)* *. I've been searching for it,
 but goggle, rseek, and all other searching mechanism returns nothing.
Thanks!
Fujikoshi, Y.,  Satoh, K. (1997). Modified AIC and Cp in multivariate
linear regression. *Biometrika*, *84*(3), 707-716.

-- 
Vinicius A. G. Bastazini
Laboratório de Ecologia Quantitativa
PPG-Ecologia UFRGS

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] produce a map in ARCGIS from a GAM output

2013-04-12 Thread Lauria, Valentina
Dear Barry,

Thank you very much for your reply. I am trying to create a map in GIS of the 
habitat prediction of lobster using environmental predictors. I am attaching 
here a brief example of my script and data.

I am OK for the GAM fitting and evaluation (residuals plot etc) but I am having 
problems to create a prediction map (raster of my GAM output) in GIS. In 
particular when I run predict.gam I get a set of values which I guess are the 
probability of lobster densities in function of my environmental predictors, 
but I do not know how to geo-reference these points. I will read about the 
raster package as I have to extrapolate my predictions for a larger area (where 
we have no data). Any help and suggestion is very much appreciated. 

Best Regards,
Valentina

Dr. Valentina Lauria
Postdoctoral researcher
Room 118, Martin Ryan Institute
Department of Earth and Ocean Sciences
National University of Ireland, Galway
Ireland

www.nephrops.eu


From: b.rowling...@gmail.com [b.rowling...@gmail.com] on behalf of Barry 
Rowlingson [b.rowling...@lancaster.ac.uk]
Sent: 12 April 2013 14:15
To: Lauria, Valentina
Cc: r-help@R-project.org
Subject: Re: [R] produce a map in ARCGIS from a GAM output

On Fri, Apr 12, 2013 at 10:37 AM, Lauria, Valentina
valentina.lau...@nuigalway.ie wrote:
 Hi,

 I am trying to predict the habitat suitability of lobster with GAMs. I need 
 to produce a map in ArcGis of the predicted densities. I did some search and 
 the function predict.gam seems to do the job.

 Is this the right way to do it? when you apply the function you get the 
 predicted values (different from your input data points) but I do not 
 understand what is the procedure with the spatial coordinates, I mean what 
 lat and long have the new points? Can I use the same ones of my observations?

 I suspect you'll get a better response if you:

 a) ask on R-sig-geo, the mailing list for all things R and mappy
 b) try and ask with a reproducible example - it doesn't have to be
your data, but something similar, and small, and that we can run to
get a better idea of your problem. Show us how you are running
predict.gam, at least, and what your data looks like.

To get data from R into ArcGIS you probably want to either use the
raster package to create a grid of values, or the rgdal package to
create a shapefile of point locations and estimates. Exactly how you
get that from your predict.gam results is a bit guesswork until we
know a bit more about how you are running predict.gam

Barry
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] processing matrix equation derived from rows of two matrices

2013-04-12 Thread becksj
Thanks so much Arun.  If you have time, what reference do you use to learn 
seq_len(nrow…etc and other aspects of your code?

Again. Thanks very much.  Jan Beckstrand

 

From: arun kirshna [via R] [mailto:ml-node+s789695n4664037...@n4.nabble.com] 
Sent: Friday, April 12, 2013 2:01 AM
To: Beckstrand, Janis, NCOD
Subject: Re: processing matrix equation derived from rows of two matrices

 

Hi, 

May be this helps: 

 tb[1,]%*%(((val-rep(meansb79[1,],5))^2)/6) 
#[,1] 
#[1,] 1.47619 
tryvarb-c(1,2,3,4,4,4,4) 
 var(tryvarb) 
#[1] 1.47619 

tb[2,]%*%(((val-rep(meansb79[2,],5))^2)/6) 
# [,1] 
#[1,] 1.904762 

sapply(seq_len(nrow(tb)),function(i) 
tb[i,]%*%(((val-rep(meansb79[i,],5))^2/6))) 
# [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619 
# [8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143 
#[15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143 
#[22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619 1.8095238 
#[29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571 1.1428571 1.000 
#[36] 0.4761905 

#or 
mat1-matrix(((val-rep(meansb79,each=5))^2/6),ncol=5,byrow=TRUE) 

diag(t(apply(tb,1,function(x) x))%*% apply(mat1,1,function(x) x)) 
# [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619 
 #[8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143 
#[15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143 
#[22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619 1.8095238 
#[29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571 1.1428571 1.000 
#[36] 0.4761905 

#or 
 diag(apply(tb,1,function(x) x%*% t(mat1))) 
# [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619 
 #[8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143 
#[15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143 
#[22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619 1.8095238 
#[29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571 1.1428571 1.000 
#[36] 0.4761905 



A.K. 


I have a matrix (tb) that represents all samples of size n=7 from a group of 
N=9, with scores c(1,2,3,4,4,4,4,5,5).  I want to 
calculate the variance for every sample.  Here is my code.  The 
bottom shows the matrix equations and an attempt to process it for each 
row. I got the strategies from reading the r-help, but neither works. (I 
 do not understand the syntax well enough.) Any suggestions.  (I need to 
 do may :additional matrices in the same way.) Thanks. Jan 

require(combinat) 
n=7 
base79-combn(c(1,2,3,4,4,4,4,5,5), n, tabulate,nbins=5) #find 
all samples,n=7 (gives cols of  7 cases  in sample like 1 1 1 4 0 ) 
tb-t(base79) 
val-c(1,2,3,4,5)  #values on the scale 
meansb79-t(base79)%*% (val/7) 

tb[ ,1])%*%(((val-rep(meansb79[1,],5))^2)/6)   #computes the sample variance 
for the first sample 
#check 
tryvarb-c(1,2,3,4,4,4,4) 
var(tryvarb) 
  
#Now I try to get the variance for each sample (row) in tb, but neither of the 
following attempts work. 

trybase - apply(tb,1,function(i) 
  t(base79[,i])%*%(((val-rep(meansb79[i,],5))^2)/6)) 

#or 

domatrix.f - function(tb, meansb79) { 
 a - nrow(A); b - nrow(B); 
C - matrix(NA, nrow=a, ncol=b); 
for (i in 1:a) 
  for (j in 1:b) 
C[i,j] - t(A[,i])%*%(((val-rep(B[i,],5))^2)/6) } 
domatrix.f(tb, meansb79) 

__ 
[hidden email] mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code. 





If you reply to this email, your message will be added to the discussion below:

http://r.789695.n4.nabble.com/processing-matrix-equation-derived-from-rows-of-two-matrices-tp4664033p4664037.html
 

To unsubscribe from processing matrix equation derived from rows of two 
matrices, click here 
http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_codenode=4664033code=amFuaXMuYmVja3N0cmFuZEB2YS5nb3Z8NDY2NDAzM3wtNjAyODIyMzIx
 .
NAML 
http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewerid=instant_html%21nabble%3Aemail.namlbase=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespacebreadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml
  





--
View this message in context: 
http://r.789695.n4.nabble.com/processing-matrix-equation-derived-from-rows-of-two-matrices-tp4664033p4664070.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 

Re: [R] A strange behaviour in the graphical function curve

2013-04-12 Thread Jeff Newmiller

See below

On Fri, 12 Apr 2013, Julio Sergio wrote:


Berend Hasselman bhh at xs4all.nl writes:



Your function miBeta returns a scalar when the argument mu is a vector.
Use Vectorize to vectorize it. Like this

  VmiBeta - Vectorize(miBeta,vectorize.args=c(mu))
  VmiBeta(c(420,440))

and draw the curve with this

  curve(VmiBeta,xlim=c(370,430), xlab=mu, ylab=L(mu))

Berend



Taking into account what you have pointed out, I reprogrammed my function
as follows, as an alternative solution to yours:

  zetas - function(alpha) {z - qnorm(alpha/2); c(z,-z)}

  # First transformation function
  Tzx - function(z, sigma_p, mu_p) sigma_p*z + mu_p

  # Second transformation function
  Txz - function(x, sigma_p, mu_p) (x - mu_p)/sigma_p

  BetaG - function(mu, alpha, n, sigma, mu_0) {
lasZ - zetas(alpha) # Zs corresponding to alpha
sigma_M - sigma/sqrt(n) # sd of my distribution
lasX - Tzx(lasZ, sigma_M, mu_0) # Transformed Zs into Xs
# Now I consider mu to be a vector composed of m's
NewZ - lapply(mu, function(m) Txz(lasX, sigma_M, m))
# NewZ is a list, the same length as mu, with 2D vectors
# The result will be a vector, the same length as mu (and NewZ)
sapply(NewZ, function(zz) pnorm(zz[2]) - pnorm(zz[1]))
  }

  miBeta - function(mu) BetaG(mu, 0.05, 36, 48, 400)

  plot(miBeta,xlim=c(370,430), xlab=mu, ylab=L(mu))

I hope this is useful to people following this discussion,

 -Sergio.


Adding to what you have defined above, consider the benefit that true
vectorization provides:

BetaGv - function(mu, alpha, n, sigma, mu_0) {
  lasZ - zetas( alpha ) # Zs corresponding to alpha
  sigma_M - sigma/sqrt( n ) # sd of my distribution
  lasX - Tzx( lasZ, sigma_M, mu_0 ) # Transformed Zs into Xs
  # Now fold lasX and mu into matrices where columns are defined by lasX
  # and rows are defined by mu
  lasX_M - matrix( lasX, nrow=length(mu), ncol=2, byrow=TRUE )
  mu_M   - matrix( mu,   nrow=length(mu), ncol=2 ) )
  # Compute newZ
  NewZ - Txz( lasX_M, sigma_M, mu_M )
  # NewZ is a matrix, where the first column corresponds to lower Z, and
  # the second column corresponds to upper Z
  # The result will be a vector, the same length as mu
  pnorm(NewZ[,2]) - pnorm(NewZ[,1])
}

miBetav - function(mu) BetaGv(mu, 0.05, 36, 48, 400)

system.time( miBeta( seq( 370, 430, length.out=1e5 ) ) )
system.time( miBetav( seq( 370, 430, length.out=1e5 ) ) )

On my machine, I get:


system.time( miBeta( seq( 370, 430, length.out=1e5 ) ) )

   user  system elapsed
  1.300   0.024   1.476

system.time( miBetav( seq( 370, 430, length.out=1e5 ) ) )

   user  system elapsed
  0.076   0.000   0.073

So a bit of matrix manipulation speeds things up considerably.

(That being said, I have a feeling that some algorithmic optimization
could speed things up even more, using theoretical considerations.)

---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A strange behaviour in the graphical function curve

2013-04-12 Thread David Winsemius


On Apr 12, 2013, at 7:58 AM, Julio Sergio wrote:


Berend Hasselman bhh at xs4all.nl writes:



Your function miBeta returns a scalar when the argument mu is a  
vector.

Use Vectorize to vectorize it. Like this

 VmiBeta - Vectorize(miBeta,vectorize.args=c(mu))
 VmiBeta(c(420,440))

and draw the curve with this

 curve(VmiBeta,xlim=c(370,430), xlab=mu, ylab=L(mu))

Berend



Taking into account what you have pointed out, I reprogrammed my  
function

as follows, as an alternative solution to yours:

  zetas - function(alpha) {z - qnorm(alpha/2); c(z,-z)}

  # First transformation function
  Tzx - function(z, sigma_p, mu_p) sigma_p*z + mu_p

  # Second transformation function
  Txz - function(x, sigma_p, mu_p) (x - mu_p)/sigma_p

  BetaG - function(mu, alpha, n, sigma, mu_0) {
lasZ - zetas(alpha) # Zs corresponding to alpha
sigma_M - sigma/sqrt(n) # sd of my distribution
lasX - Tzx(lasZ, sigma_M, mu_0) # Transformed Zs into Xs
# Now I consider mu to be a vector composed of m's
NewZ - lapply(mu, function(m) Txz(lasX, sigma_M, m))
# NewZ is a list, the same length as mu, with 2D vectors
# The result will be a vector, the same length as mu (and NewZ)
sapply(NewZ, function(zz) pnorm(zz[2]) - pnorm(zz[1]))
  }

  miBeta - function(mu) BetaG(mu, 0.05, 36, 48, 400)

  plot(miBeta,xlim=c(370,430), xlab=mu, ylab=L(mu))

I hope this is useful to people following this discussion,


This could be completely tangential to your problem with vectorization  
of arguments to 'curve'. Feel free to ignore. The second of your  
functions looks to be doing the same as the R function 'scale'. I  
would have expected it to be applied first and then to have an  
'unscale' operation performed to restore (but I am not aware that  
there is an inbuilt function with that feature):


umat - sweep( mat, 2, attr(m2, 'scaled:scale'), '*')

umat - sweep( umat, 2, attr(m2, 'scaled:center'), '+')

# Or perhaps this where 'sx' was the result of a 'scale' call:

scale(sx, -attr(sx, 'scaled:center'), 1/attr(sx,'scaled:scale') ), - 
attr(sx, 'scaled:center')


--

David





 -Sergio.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Stat question: How to deal w/ negative outliers?

2013-04-12 Thread Ista Zahn
Hi,

This is not an R question and is therefore not appropriate for this
list. You should post to a statistics forum such as
http://stats.stackexchange.com/. But the answer is NO!

Best,
Ista

On Fri, Apr 12, 2013 at 9:49 AM, ramoss ramine.mossad...@finra.org wrote:
 Hello all,

 I have a question:  I am using the interquantile method to spot outliers 
 it gives me values of  say 234  -120 or for the higher  lower benchmarks.
 I don't have any issues w/ the higher end.  However I don't have any
 negative values. My lowest possible value is 0. Should I consider 0 as an
 outlier?

 Thanks ahead  for your thoughts



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Stat-question-How-to-deal-w-negative-outliers-tp4664068.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Stat question: How to deal w/ negative outliers?

2013-04-12 Thread S Ellison
 

 -Original Message-
 I have a question:  I am using the interquantile method to 
 spot outliers  it gives me values of  say 234  -120 or for 
 the higher  lower benchmarks.
 I don't have any issues w/ the higher end.  However I don't 
 have any negative values. My lowest possible value is 0. 
 Should I consider 0 as an outlier?
If your limits have been appropriately set and the lower outlier bound is -120, 
obviously not.

However, if your data are constrained positive and unusually small values are 
as likely to be erroneous as unusually large values, you probably should have 
set the bounds differently.

Try taking logs and see if that gives you a more or less symmetric 
distribution? 

S Ellison

***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] processing matrix equation derived from rows of two matrices

2013-04-12 Thread arun
Hi Janis,
For example 
The goal here is to get results per row 
BTW, your original posting had tb[,1] 
tb[,1]%*%(((val-rep(meansb79[1,],5))^2)/6) 
#Error in tb[, 1] %*% (((val - rep(meansb79[1, ], 5))^2)/6) : 
 # non-conformable arguments
I guess it should be:

tb[1,]%*%(((val-rep(meansb79[1,],5))^2)/6) 
#    [,1]
#[1,] 1.47619

If you can split the codes  in the first solution:
seq_len(nrow(tb))
# [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#[26] 26 27 28 29 30 31 32 33 34 35 36
 sapply(seq_len(nrow(tb)),function(i) i)
 #[1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#[26] 26 27 28 29 30 31 32 33 34 35 36
#you can also do with lapply (makes it more easy to understand)
lapply(seq_len(nrow(tb)),function(i) i)[1:2]
#[[1]]
#[1] 1

#[[2]]
#[1] 2

 lapply(seq_len(nrow(tb)),function(i) tb[i,])[1:2]
#[[1]]
#[1] 1 1 1 4 0

#[[2]]
#[1] 1 1 1 3 1
 tb[1:2,]
 #    [,1] [,2] [,3] [,4] [,5]
#[1,]    1    1    1    4    0
#[2,]    1    1    1    3    1

lapply(seq_len(nrow(tb)),function(i) (((val-rep(meansb79[i,],5))^2/6)))[1:2]
#[[1]]
#[1] 0.765306122 0.217687075 0.003401361 0.122448980 0.574829932

#[[2]]
#[1] 0.87074830 0.27551020 0.01360544 0.08503401 0.48979592

(((val-rep(meansb79[1:2,],5))^2/6))
# [1] 0.765306122 0.275510204 0.003401361 0.085034014 0.574829932 0.870748299
 #[7] 0.217687075 0.013605442 0.122448980 0.489795918

lapply(seq_len(nrow(tb)),function(i)tb[i,] 
%*%(((val-rep(meansb79[i,],5))^2/6)))[1:2]
#[[1]]
 #   [,1]
#[1,] 1.47619

#[[2]]
 #    [,1]
#[1,] 1.904762
sapply(seq_len(nrow(tb)),function(i)tb[i,] 
%*%(((val-rep(meansb79[i,],5))^2/6)))[1:2]
#[1] 1.476190 1.904762

Hope it helps.
A.K.





- Original Message -
From: Beckstrand, Janis, NCOD janis.beckstr...@va.gov
To: smartpink...@yahoo.com
Cc: 
Sent: Friday, April 12, 2013 1:05 PM
Subject: FW: [R] processing matrix equation derived from rows of two matrices


Thanks so much Arun.  If you have time, what reference(s) do you use to
learn the techniques like seq_len(nrow etc and other aspects of your
code?

Again. Thanks very much.  Jan Beckstrand



From: arun kirshna [via R]
[mailto:ml-node+s789695n4664037...@n4.nabble.com]
Sent: Friday, April 12, 2013 2:01 AM
To: Beckstrand, Janis, NCOD
Subject: Re: processing matrix equation derived from rows of two
matrices



Hi, 

May be this helps: 

tb[1,]%*%(((val-rep(meansb79[1,],5))^2)/6) 
#        [,1] 
#[1,] 1.47619
tryvarb-c(1,2,3,4,4,4,4)
var(tryvarb)
#[1] 1.47619 

tb[2,]%*%(((val-rep(meansb79[2,],5))^2)/6) 
#         [,1] 
#[1,] 1.904762 

sapply(seq_len(nrow(tb)),function(i)
tb[i,]%*%(((val-rep(meansb79[i,],5))^2/6)))
# [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143
1.9047619 # [8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619
2.2857143 2.2857143 #[15] 2.2857143 1.9523810 1.9523810 2.2857143
2.2857143 2.2857143 2.2857143 #[22] 1.6190476 1.6190476 1.9047619
1.9047619 1.9047619 1.9047619 1.8095238 #[29] 0.9047619 0.9047619
1.1428571 1.1428571 1.1428571 1.1428571 1.000 #[36] 0.4761905 

#or
mat1-matrix(((val-rep(meansb79,each=5))^2/6),ncol=5,byrow=TRUE) 

diag(t(apply(tb,1,function(x) x))%*% apply(mat1,1,function(x) x)) # [1]
1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619
#[8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143
2.2857143 #[15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143
2.2857143 2.2857143 #[22] 1.6190476 1.6190476 1.9047619 1.9047619
1.9047619 1.9047619 1.8095238 #[29] 0.9047619 0.9047619 1.1428571
1.1428571 1.1428571 1.1428571 1.000 #[36] 0.4761905 

#or
diag(apply(tb,1,function(x) x%*% t(mat1))) # [1] 1.4761905 1.9047619
1.9047619 1.9047619 1.9047619 2.2857143 1.9047619  #[8] 1.9047619
2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143 #[15]
2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143
#[22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619
1.8095238 #[29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571
1.1428571 1.000 #[36] 0.4761905 



A.K. 


I have a matrix (tb) that represents all samples of size n=7 from a 
group of N=9, with scores c(1,2,3,4,4,4,4,5,5).  I want to calculate 
the variance for every sample.  Here is my code.  The
bottom shows the matrix equations and an attempt to process it for each
row. I got the strategies from reading the r-help, but neither works.
(I  do not understand the syntax well enough.) Any suggestions.  (I need
to  do may :additional matrices in the same way.) Thanks. Jan 

require(combinat)
n=7
base79-combn(c(1,2,3,4,4,4,4,5,5), n, tabulate,nbins=5) #find
all samples,n=7 (gives cols of  7 cases  in sample like 1 1 1 4 0 ) 
tb-t(base79)
val-c(1,2,3,4,5)  #values on the scale meansb79-t(base79)%*% (val/7)

tb[ ,1])%*%(((val-rep(meansb79[1,],5))^2)/6)   #computes the sample
variance for the first sample 
#check 
tryvarb-c(1,2,3,4,4,4,4)
var(tryvarb)
  
#Now I try to get the variance for each sample (row) in tb, but neither
of the following 

Re: [R] A strange behaviour in the graphical function

2013-04-12 Thread Julio Sergio
Jeff Newmiller jdnewmil at dcn.davis.ca.us writes:

 
  system.time( miBeta( seq( 370, 430, length.out=1e5 ) ) )
 user  system elapsed
1.300   0.024   1.476
  system.time( miBetav( seq( 370, 430, length.out=1e5 ) ) )
 user  system elapsed

This is very interesting, Jeff. Of course, I appreciate very much
your contribution.


  -Sergio.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Solving an integral in R gives the error “The integral is probably divergent”

2013-04-12 Thread peter dalgaard
But is it supposed to be t^{-3/2} or t^{-0.5}?? The formula has the former and 
the code the latter, and the integral is clearly divergent with the former. 

-pd

On Apr 12, 2013, at 04:51 , Thomas Lumley wrote:

 I don't get an error message (after I correct the missing line break after
 the comment
 
 b- sapply(a, Cfun, upper=1)
 b
  [1]  1.583458e-54  7.768026e-50  2.317562e-45  4.206260e-41  4.645737e-37
 3.123801e-33  1.279358e-29  3.193257e-26  4.860876e-23
 [10]  4.516582e-20  2.564400e-17  8.908932e-15  1.896996e-12  2.481084e-10
 1.998561e-08  9.946570e-07  3.067751e-05  5.862075e-04
 [19]  6.818952e-03  4.297061e-02  0.00e+00  3.175122e-01  3.723022e-01
 2.364930e-01  9.144836e-02  2.190878e-02  3.252754e-03
 [28]  2.983763e-04  1.685692e-05  5.849602e-07  1.244158e-08  1.619155e-10
 1.287603e-12  6.250149e-15  1.850281e-17  3.338241e-20
 [37]  3.668412e-23  2.454192e-26  9.991546e-30  2.474577e-33  3.727226e-37
 3.413319e-41  1.900112e-45  6.428505e-50  1.321588e-54
 [46]  1.650722e-59  1.252524e-64  5.772750e-70  1.615916e-75  2.746972e-81
 2.835655e-87  1.777399e-93 6.764271e-100 1.562923e-106
 [55] 2.192373e-113 1.866955e-120 9.651205e-128 3.028623e-135 5.769185e-143
 6.670835e-151 4.682023e-159 1.994643e-167 5.157808e-176
 [64] 8.095084e-185 7.711162e-194 4.458042e-203 1.564139e-212 3.330362e-222
 4.302974e-232 3.373500e-242 1.604721e-252 4.631224e-263
 [73] 8.108474e-274 8.611898e-285 5.547745e-296  0.00e+00  0.00e+00
 0.00e+00  0.00e+00  0.00e+00  0.00e+00
 [82]  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00
 0.00e+00  0.00e+00  0.00e+00  0.00e+00
 [91]  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00
 0.00e+00  0.00e+00  0.00e+00  0.00e+00
 [100]  0.00e+00
 
 
  -thomas
 
 
 
 On Tue, Apr 9, 2013 at 3:14 PM, Janesh Devkota 
 janesh.devk...@gmail.comwrote:
 
 I am trying to solve an integral in R. However, I am getting an error when
 I am trying to solve for that integral.
 
 The equation that I am trying to solve is as follows:
 
 $$ C_m = \frac{{abs{x}}e^{2x}}{\pi^{1/2}}\int_0^t t^{-3/2}e^{-x^2/t-t}dt $$
 
 [image: enter image description here]
 
 The code that I am using is as follows:
 
 a - seq(from=-10, by=0.5,length=100)
 ## Create a function to compute integrationCfun - function(XX, upper){
  integrand - function(x)x^(-0.5)*exp((-XX^2/x)-x)
  integrated - integrate(integrand, lower=0, upper=upper)$value
  (final - abs(XX)*pi^(-0.5)*exp(2*XX)*integrated) }
 
 
 b- sapply(a, Cfun, upper=1)
 
 The error that I am getting is as follows:
 
 Error in integrate(integrand, lower = 0, upper = upper) :
  the integral is probably divergent
 
 Does this mean I cannot solve the integral ?
 
 Any possible ways to fix this problem will be highly appreciated.The
 question can be found on
 
 http://stackoverflow.com/questions/15892586/solving-an-integral-in-r-gives-error-the-integral-is-probably-divergent
 also.
 
 Thanks.
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 -- 
 Thomas Lumley
 Professor of Biostatistics
 University of Auckland
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error loading workspaces after upgrade

2013-04-12 Thread Duncan Murdoch

On 12/04/2013 7:14 AM, Tamas Barjak wrote:

Dear Members,
I upgraded R from 2.15.3 to 3.0.0 (binary) on my WinXP and now I can't
load my workspaces. The error message is:

Error: requested primitive type is not consistent with cached value
During startup - Warning message:
unable to restore saved data in .RData

Any idea to fix that?


Tamas sent me a copy of the bad file, and I've got a workaround:

The problem is an object called .SavedPlots.  It was produced by the 
Windows graphics device when it was recording the history.  It can't be 
loaded into R 3.0.0, because some of the internal organization of things 
has changed.


The workaround is to restart 2.15.3, load the workspace, then

rm(.SavedPlots)

and save the workspace.  After that things should be fine.

I'll try to get something in place in R-patched so that this doesn't 
cause the whole .RData load to fail.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Solving an integral in R gives the error The integral is probably divergent

2013-04-12 Thread Janesh Devkota
Hi Peter, 

It is supposed to be t^{3/2} instead of t^{0.5}. The code had a typo. Yes, I
figured out that when x =0, the solution is divergent because it is like
dividing something by zero. 

Thanks. 

Best, 
Janesh

-Original Message-
From: peter dalgaard [mailto:pda...@gmail.com] 
Sent: Friday, April 12, 2013 1:32 PM
To: Thomas Lumley
Cc: Janesh Devkota; groep R-help
Subject: Re: [R] Solving an integral in R gives the error The integral is
probably divergent

But is it supposed to be t^{-3/2} or t^{-0.5}?? The formula has the former
and the code the latter, and the integral is clearly divergent with the
former. 

-pd

On Apr 12, 2013, at 04:51 , Thomas Lumley wrote:

 I don't get an error message (after I correct the missing line break 
 after the comment
 
 b- sapply(a, Cfun, upper=1)
 b
  [1]  1.583458e-54  7.768026e-50  2.317562e-45  4.206260e-41  
 4.645737e-37
 3.123801e-33  1.279358e-29  3.193257e-26  4.860876e-23 [10]  
 4.516582e-20  2.564400e-17  8.908932e-15  1.896996e-12  2.481084e-10
 1.998561e-08  9.946570e-07  3.067751e-05  5.862075e-04 [19]  
 6.818952e-03  4.297061e-02  0.00e+00  3.175122e-01  3.723022e-01
 2.364930e-01  9.144836e-02  2.190878e-02  3.252754e-03 [28]  
 2.983763e-04  1.685692e-05  5.849602e-07  1.244158e-08  1.619155e-10
 1.287603e-12  6.250149e-15  1.850281e-17  3.338241e-20 [37]  
 3.668412e-23  2.454192e-26  9.991546e-30  2.474577e-33  3.727226e-37
 3.413319e-41  1.900112e-45  6.428505e-50  1.321588e-54 [46]  
 1.650722e-59  1.252524e-64  5.772750e-70  1.615916e-75  2.746972e-81
 2.835655e-87  1.777399e-93 6.764271e-100 1.562923e-106 [55] 
 2.192373e-113 1.866955e-120 9.651205e-128 3.028623e-135 5.769185e-143
 6.670835e-151 4.682023e-159 1.994643e-167 5.157808e-176 [64] 
 8.095084e-185 7.711162e-194 4.458042e-203 1.564139e-212 3.330362e-222
 4.302974e-232 3.373500e-242 1.604721e-252 4.631224e-263 [73] 
 8.108474e-274 8.611898e-285 5.547745e-296  0.00e+00  0.00e+00
 0.00e+00  0.00e+00  0.00e+00  0.00e+00 [82]  
 0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00
 0.00e+00  0.00e+00  0.00e+00  0.00e+00 [91]  
 0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00
 0.00e+00  0.00e+00  0.00e+00  0.00e+00 [100]  
 0.00e+00
 
 
  -thomas
 
 
 
 On Tue, Apr 9, 2013 at 3:14 PM, Janesh Devkota
janesh.devk...@gmail.comwrote:
 
 I am trying to solve an integral in R. However, I am getting an error 
 when I am trying to solve for that integral.
 
 The equation that I am trying to solve is as follows:
 
 $$ C_m = \frac{{abs{x}}e^{2x}}{\pi^{1/2}}\int_0^t 
 t^{-3/2}e^{-x^2/t-t}dt $$
 
 [image: enter image description here]
 
 The code that I am using is as follows:
 
 a - seq(from=-10, by=0.5,length=100) ## Create a function to compute 
 integrationCfun - function(XX, upper){  integrand - 
 function(x)x^(-0.5)*exp((-XX^2/x)-x)
  integrated - integrate(integrand, lower=0, upper=upper)$value  
 (final - abs(XX)*pi^(-0.5)*exp(2*XX)*integrated) }
 
 
 b- sapply(a, Cfun, upper=1)
 
 The error that I am getting is as follows:
 
 Error in integrate(integrand, lower = 0, upper = upper) :
  the integral is probably divergent
 
 Does this mean I cannot solve the integral ?
 
 Any possible ways to fix this problem will be highly appreciated.The 
 question can be found on
 
 http://stackoverflow.com/questions/15892586/solving-an-integral-in-r-
 gives-error-the-integral-is-probably-divergent
 also.
 
 Thanks.
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 --
 Thomas Lumley
 Professor of Biostatistics
 University of Auckland
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000
Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Why copying columns of a data.frame becomes numeric?

2013-04-12 Thread C W
Dear list,

I want the 1st, 2nd, 5th, and 6th columns of mtcars.  After copying them,
the columns become numeric class rather than data frame.

But, when I copy rows, they data frame retains its class.  Why is this?  I
don't see why copying rows vs columns is so different.

 class(mtcars)
[1] data.frame
 head(mtcars)
   mpg cyl disp  hp dratwt  qsec vs am gear carb
Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  144
Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  144
Datsun 71022.8   4  108  93 3.85 2.320 18.61  1  141
Hornet 4 Drive21.4   6  258 110 3.08 3.215 19.44  1  031
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  032
Valiant   18.1   6  225 105 2.76 3.460 20.22  1  031
 a - mtcars$mpg
 class(a)
[1] numeric
 b - mtcars[1:5, ]
 class(b)
[1] data.frame


Thanks a lot,
Mike

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why copying columns of a data.frame becomes numeric?

2013-04-12 Thread Sarah Goslee
It's another reason not to use $ for extraction.

By default, R reduces dimensionality when subsetting, so mtcars$mpg
actually returns a numeric vector. With $, there's no way to override
the default behavior.

 data(mtcars)
 a - mtcars$mpg
 class(a)
[1] numeric
 dim(a)
NULL

 a - mtcars[, mpg, drop=FALSE]
 class(a)
[1] data.frame
 dim(a)
[1] 32  1

Sarah

On Fri, Apr 12, 2013 at 3:32 PM, C W tmrs...@gmail.com wrote:
 Dear list,

 I want the 1st, 2nd, 5th, and 6th columns of mtcars.  After copying them,
 the columns become numeric class rather than data frame.

 But, when I copy rows, they data frame retains its class.  Why is this?  I
 don't see why copying rows vs columns is so different.

 class(mtcars)
 [1] data.frame
 head(mtcars)
mpg cyl disp  hp dratwt  qsec vs am gear carb
 Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  144
 Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  144
 Datsun 71022.8   4  108  93 3.85 2.320 18.61  1  141
 Hornet 4 Drive21.4   6  258 110 3.08 3.215 19.44  1  031
 Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  032
 Valiant   18.1   6  225 105 2.76 3.460 20.22  1  031
 a - mtcars$mpg
 class(a)
 [1] numeric
 b - mtcars[1:5, ]
 class(b)
 [1] data.frame


 Thanks a lot,
 Mike


-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why copying columns of a data.frame becomes numeric?

2013-04-12 Thread William Dunlap
 class(mtcars[1:4,]) # select some rows
[1] data.frame
 class(mtcars[,1:4]) # select some columns
[1] data.frame
 class(mtcars[,3]) # select one column
[1] numeric
 class(mtcars[, 3, drop=FALSE]) # select one column
[1] data.frame

I cannot give a definitive reason why it is done this way, but you do need
some way to get from a data.frame to a column it contains and some way
to get from a data.frame to a single-column data.frame.  The above methods
do give you that choice.

Also note that rows and columns of data.frame are intrinsically different.
A column is generally a vector of one type while a row is a list of 1-long
vectors.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of C W
 Sent: Friday, April 12, 2013 12:33 PM
 To: r-help
 Subject: [R] Why copying columns of a data.frame becomes numeric?
 
 Dear list,
 
 I want the 1st, 2nd, 5th, and 6th columns of mtcars.  After copying them,
 the columns become numeric class rather than data frame.
 
 But, when I copy rows, they data frame retains its class.  Why is this?  I
 don't see why copying rows vs columns is so different.
 
  class(mtcars)
 [1] data.frame
  head(mtcars)
mpg cyl disp  hp dratwt  qsec vs am gear carb
 Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  144
 Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  144
 Datsun 71022.8   4  108  93 3.85 2.320 18.61  1  141
 Hornet 4 Drive21.4   6  258 110 3.08 3.215 19.44  1  031
 Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  032
 Valiant   18.1   6  225 105 2.76 3.460 20.22  1  031
  a - mtcars$mpg
  class(a)
 [1] numeric
  b - mtcars[1:5, ]
  class(b)
 [1] data.frame
 
 
 Thanks a lot,
 Mike
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error loading workspaces after upgrade

2013-04-12 Thread Tamas Barjak
YES, it's work! Fine! Thank you! I wait for the R-patch.

Regards,
Tamas


2013/4/12 Duncan Murdoch murdoch.dun...@gmail.com

 On 12/04/2013 7:14 AM, Tamas Barjak wrote:

 Dear Members,
 I upgraded R from 2.15.3 to 3.0.0 (binary) on my WinXP and now I can't
 load my workspaces. The error message is:

 Error: requested primitive type is not consistent with cached value
 During startup - Warning message:
 unable to restore saved data in .RData

 Any idea to fix that?


 Tamas sent me a copy of the bad file, and I've got a workaround:

 The problem is an object called .SavedPlots.  It was produced by the
 Windows graphics device when it was recording the history.  It can't be
 loaded into R 3.0.0, because some of the internal organization of things
 has changed.

 The workaround is to restart 2.15.3, load the workspace, then

 rm(.SavedPlots)

 and save the workspace.  After that things should be fine.

 I'll try to get something in place in R-patched so that this doesn't cause
 the whole .RData load to fail.

 Duncan Murdoch




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why copying columns of a data.frame becomes numeric?

2013-04-12 Thread Berend Hasselman

On 12-04-2013, at 21:32, C W tmrs...@gmail.com wrote:

 Dear list,
 
 I want the 1st, 2nd, 5th, and 6th columns of mtcars.  After copying them,
 the columns become numeric class rather than data frame.
 
 But, when I copy rows, they data frame retains its class.  Why is this?  I
 don't see why copying rows vs columns is so different.
 
 class(mtcars)
 [1] data.frame
 head(mtcars)
   mpg cyl disp  hp dratwt  qsec vs am gear carb
 Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  144
 Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  144
 Datsun 71022.8   4  108  93 3.85 2.320 18.61  1  141
 Hornet 4 Drive21.4   6  258 110 3.08 3.215 19.44  1  031
 Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  032
 Valiant   18.1   6  225 105 2.76 3.460 20.22  1  031
 a - mtcars$mpg
 class(a)
 [1] numeric

Here you are assigning a single column of mtcars, which is a numeric vector, to 
another object. So that is a numeric vector.

 b - mtcars[1:5, ]
 class(b)
 [1] data.frame
 

Here you are assigning a couple of rows of the complete dataframe and the 
result is a dataframe.

If you want the  1st, 2nd, 5th, and 6th columns of mtcars in a new datafrmae 
why don't you do this:

 a - mtcars[,c(1,2,5,6)]

then

 class(a)
[1] data.frame


Berend

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to change the date into an interval of date?

2013-04-12 Thread arun
Hi,

I am not sure I understand your question correctly.
dat1- read.table(text=
id    responsed_at number_of_connection 
 scores
1  12-01-2010   1   
   2
1  15-02-2010   2   
   3
1   01-04-2010  3   
   2
,sep=,header=TRUE,stringsAsFactors=FALSE) 
datNew- dat1
datNew$responsed_at- gsub(^.*\\-(.*\\-.*),\\1, datNew$responsed_at)
dat2-data.frame(responsed_at=format(seq(from=as.Date(01-01-2010,format=%d-%m-%Y),to=as.Date(01-04-2010,format=%d-%m-%Y),by=month),%m-%Y))
res-merge(datNew,dat2,all=TRUE,by=responsed_at)
 res$responsed_at[is.na(res$id)]-NA
res$responsed_at[!is.na(res$responsed_at)]- 
paste(gsub((.*)\\-.*\\-.*,\\1,dat1$responsed_at),-,res$responsed_at[!is.na(res$responsed_at)],sep=)
res$number_of_month- seq_len(nrow(res))-1

res
#  responsed_at id number_of_connection scores number_of_month
#1   12-01-2010  1    1  2   0
#2   15-02-2010  1    2  3   1
#3 NA NA   NA NA   2
#4   01-04-2010  1    3  2   3




Dear experts: 

 I have my table like this: 

id            responsed_at                 number of connection                
  scores 
1                  12-01-2010                                   1              
                                2 
1                  15-02-2010                                   2              
                                3 
1                   01-04-2010                                  3              
                                2 

I want to have one table like this: 

id             responsed_at              number of month               number 
of connection               scors 
1               12-01-2010                                   0      
                                           1                            
               2 
1                15-02-2010                                  1      
                                            2                            
               3 
1                 NA                                                
 2                                                NA                    
                   NA 
1                01-04-2010                                   3      
                                            3                            
              2 

I want to plus a column indicating the number of month when patients 
responsed.  How can I realize that in R? 
The rule for number of month is the date in one month, ex: from the 
first day to the last day of one month count one number of month. 

Thank you in avance.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why copying columns of a data.frame becomes numeric?

2013-04-12 Thread C W
What if my data is much larger, and I don't know what column number but
know its name?  Do I have to grep its by name?
How about subset()?  Is that what people commonly use?

Mike

On Fri, Apr 12, 2013 at 3:44 PM, Berend Hasselman b...@xs4all.nl wrote:


 On 12-04-2013, at 21:32, C W tmrs...@gmail.com wrote:

  Dear list,
 
  I want the 1st, 2nd, 5th, and 6th columns of mtcars.  After copying them,
  the columns become numeric class rather than data frame.
 
  But, when I copy rows, they data frame retains its class.  Why is this?
  I
  don't see why copying rows vs columns is so different.
 
  class(mtcars)
  [1] data.frame
  head(mtcars)
mpg cyl disp  hp dratwt  qsec vs am gear carb
  Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  144
  Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  144
  Datsun 71022.8   4  108  93 3.85 2.320 18.61  1  141
  Hornet 4 Drive21.4   6  258 110 3.08 3.215 19.44  1  031
  Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  032
  Valiant   18.1   6  225 105 2.76 3.460 20.22  1  031
  a - mtcars$mpg
  class(a)
  [1] numeric

 Here you are assigning a single column of mtcars, which is a numeric
 vector, to another object. So that is a numeric vector.

  b - mtcars[1:5, ]
  class(b)
  [1] data.frame
 

 Here you are assigning a couple of rows of the complete dataframe and the
 result is a dataframe.

 If you want the  1st, 2nd, 5th, and 6th columns of mtcars in a new
 datafrmae why don't you do this:

  a - mtcars[,c(1,2,5,6)]

 then

  class(a)
 [1] data.frame


 Berend




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why copying columns of a data.frame becomes numeric?

2013-04-12 Thread Berend Hasselman

On 12-04-2013, at 21:56, C W tmrs...@gmail.com wrote:

 What if my data is much larger, and I don't know what column number but know 
 its name?  Do I have to grep its by name?
 How about subset()?  Is that what people commonly use?
 

Continuing with mtcars and the desired columns

b - mtcars[,c(mpg,cyl,drat,wt)]

Berend

 Mike
 
 On Fri, Apr 12, 2013 at 3:44 PM, Berend Hasselman b...@xs4all.nl wrote:
 
 On 12-04-2013, at 21:32, C W tmrs...@gmail.com wrote:
 
  Dear list,
 
  I want the 1st, 2nd, 5th, and 6th columns of mtcars.  After copying them,
  the columns become numeric class rather than data frame.
 
  But, when I copy rows, they data frame retains its class.  Why is this?  I
  don't see why copying rows vs columns is so different.
 
  class(mtcars)
  [1] data.frame
  head(mtcars)
mpg cyl disp  hp dratwt  qsec vs am gear carb
  Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  144
  Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  144
  Datsun 71022.8   4  108  93 3.85 2.320 18.61  1  141
  Hornet 4 Drive21.4   6  258 110 3.08 3.215 19.44  1  031
  Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  032
  Valiant   18.1   6  225 105 2.76 3.460 20.22  1  031
  a - mtcars$mpg
  class(a)
  [1] numeric
 
 Here you are assigning a single column of mtcars, which is a numeric vector, 
 to another object. So that is a numeric vector.
 
  b - mtcars[1:5, ]
  class(b)
  [1] data.frame
 
 
 Here you are assigning a couple of rows of the complete dataframe and the 
 result is a dataframe.
 
 If you want the  1st, 2nd, 5th, and 6th columns of mtcars in a new datafrmae 
 why don't you do this:
 
  a - mtcars[,c(1,2,5,6)]
 
 then
 
  class(a)
 [1] data.frame
 
 
 Berend
 
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Removing rows that are duplicates but column values are in reversed order

2013-04-12 Thread arun
Hi,
From your example data, 

dat1- read.table(text=
id1   id2   value
a  b   10
c  d    11
b a 10
c  e 12 
,sep=,header=TRUE,stringsAsFactors=FALSE)
#it is easier to get the output you wanted
dat1[!duplicated(dat1$value),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#4   c   e    12

But, if you have cases like the one below (assuming that all those instances 
were there is reversed order have the same value)
dat2- read.table(text=
id1   id2   value
a  b   10
c  d    11
b a 10
e  c 12 
c  e 12 
,sep=,header=TRUE,stringsAsFactors=FALSE)
 dat2[apply(dat2[,-3],1,function(x) {x1- order(x); x1[1]x1[2]}),]
 # id1 id2 value
#1   a   b    10
#2   c   d    11
#5   c   e    12


#or you have cases like these:

dat3- read.table(text=
id1   id2   value
a  b   10
c  d    11
b a 10
a  b    10
e  c 12 
c  e 12
c  d 11 
,sep=,header=TRUE,stringsAsFactors=FALSE)

 dat3New-dat3[apply(dat3[,-3],1,function(x) {x1- order(x); x1[1]x1[2]}),]
dat3New[!duplicated(dat3New$value),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#6   c   e    12
A.K.




Hi everybody, 

I was hoping that someone could help me with this problem. I 
have a table with 3 columns. Some rows contain duplicates where the 
identifiers in columns 1 and 2 are in reverse order, but the value 
associated with the row is the same. 

For example: 

id1   id2   value 
a      b       10 
c      d        11 
b     a         10 
c      e         12 

Rows 1 and 3 are duplicates (have the same value). I would like 
to retain only row 1 and delete row 3. Final table should look like 
this: 

id1   id2   value 
a      b       10 
c      d        11 
c      e         12 

Thanks in advance for any help provided. 

Vince

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why copying columns of a data.frame becomes numeric?

2013-04-12 Thread C W
I was thinking,
b - subset(mtcars, select=c(mpg, cyl, drat, wt))

Yours is even simpler. :)

Thanks
Mike

On Fri, Apr 12, 2013 at 4:06 PM, Berend Hasselman b...@xs4all.nl wrote:


 On 12-04-2013, at 21:56, C W tmrs...@gmail.com wrote:

  What if my data is much larger, and I don't know what column number but
 know its name?  Do I have to grep its by name?
  How about subset()?  Is that what people commonly use?
 

 Continuing with mtcars and the desired columns

 b - mtcars[,c(mpg,cyl,drat,wt)]

 Berend

  Mike
 
  On Fri, Apr 12, 2013 at 3:44 PM, Berend Hasselman b...@xs4all.nl wrote:
 
  On 12-04-2013, at 21:32, C W tmrs...@gmail.com wrote:
 
   Dear list,
  
   I want the 1st, 2nd, 5th, and 6th columns of mtcars.  After copying
 them,
   the columns become numeric class rather than data frame.
  
   But, when I copy rows, they data frame retains its class.  Why is
 this?  I
   don't see why copying rows vs columns is so different.
  
   class(mtcars)
   [1] data.frame
   head(mtcars)
 mpg cyl disp  hp dratwt  qsec vs am gear carb
   Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  144
   Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  144
   Datsun 71022.8   4  108  93 3.85 2.320 18.61  1  141
   Hornet 4 Drive21.4   6  258 110 3.08 3.215 19.44  1  031
   Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  032
   Valiant   18.1   6  225 105 2.76 3.460 20.22  1  031
   a - mtcars$mpg
   class(a)
   [1] numeric
 
  Here you are assigning a single column of mtcars, which is a numeric
 vector, to another object. So that is a numeric vector.
 
   b - mtcars[1:5, ]
   class(b)
   [1] data.frame
  
 
  Here you are assigning a couple of rows of the complete dataframe and
 the result is a dataframe.
 
  If you want the  1st, 2nd, 5th, and 6th columns of mtcars in a new
 datafrmae why don't you do this:
 
   a - mtcars[,c(1,2,5,6)]
 
  then
 
   class(a)
  [1] data.frame
 
 
  Berend
 
 
 



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Search for common character strings within a column

2013-04-12 Thread arun
Hi,
May be this helps:
Not sure how you wanted to select those two letters.  


dat1- read.table(text=
   Seq,Output
 A B B C D A C,Yes
 B C A C B D A C,Yes
C D A A C D,No
,sep=,,header=TRUE,stringsAsFactors=FALSE)
library(stringr)
lapply(str_split(str_trim(dat1$Seq), )[dat1$Output==Yes],function(x) 
{x1-t(combn(x,2)); apply(x1,1,paste0,collapse=)})
#[[1]]
# [1] AB AB AC AD AA AC BB BC BD BA BC BC BD BA BC
#[16] CD CA CC DA DC AC

#[[2]]
# [1] BC BA BC BB BD BA BC CA CC CB CD CA CC AC AB
#[16] AD AA AC CB CD CA CC BD BA BC DA DC AC

res- sapply(str_split(str_trim(dat1$Seq), )[dat1$Output==Yes],function(x) 
{x1-t(combn(x,2)); x2-table(apply(x1,1,paste0,collapse=)); 
x2[which.max(x2)]})
res
#BC BC 
# 4  4
 
dat1$MaxCombn-NA
 dat1$MaxCombn[dat1$Output==Yes]- names(res)
 dat1
#   Seq Output MaxCombn
#1    A B B C D A C    Yes   BC
#2  B C A C B D A C    Yes   BC
#3  C D A A C D No NA
A.K.


I have a dataset (data) that consists of two columns: Seq and output. 
Each entry in Seq is a combination of As,Bs,Cs and Ds and ranges from 5 – 30 
characters in length. Each sequence is associated with an output of 
either yes or no such that: 

      Seq            Output 
(1) A B B C D A C          Yes 
(2) B C A C B D A CYes 
(3) C D A A C DNo 

etc, etc. 

I want to find which 2 letter (A B, A C, A D, etc) strings are 
most associated with each output. Essentially I want to find which 2 
letter combinations occur most frequently in the column Seq, when the 
output is Yes. I’m new to R and can’t figure out a solution to this 
problem. 

Any help greatly appreciated! 

Cheers, 

AB

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] split date and time

2013-04-12 Thread Ye Lin
Hi R experts,

For example I have a dataset looks like this:


Number   TimeStamp   Value
1  1/1/2013 0:00 1
2 1/1/2013 0:01 2
3 1/1/2013 0:03 3

How can I split the TimeStamp Column into two and return a new table like
this:

Number   Date   Time   Value
1   1/1/2013 0:00 1
2 1/1/2013 0:01 2
3 1/1/2013 0:03 3

Thank!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Search for common character strings within a column

2013-04-12 Thread arun
HI,
In case if you wanted to take BC and CB as the same.
dat1- read.table(text=
   Seq,Output
 A B B C D A C,Yes
 B C A C B D A C,Yes
C D A A C D,No
,sep=,,header=TRUE,stringsAsFactors=FALSE)
lapply(str_split(str_trim(dat1$Seq), )[dat1$Output==Yes],function(x) 
{x1-t(combn(x,2)); 
x2-sapply(strsplit(apply(x1,1,paste0,collapse=),),function(x) 
paste(x[order(x)],collapse=)); table(x2)}) 
[[1]]
#x2
#AA AB AC AD BB BC BD CC CD 
# 1  4  4  2  1  4  2  1  2 

#[[2]]
#x2
#AA AB AC AD BB BC BD CC CD 
# 1  4  6  2  1  6  2  3  3 
dat1$MaxCombn- NA
res1-sapply(str_split(str_trim(dat1$Seq), )[dat1$Output==Yes],function(x) 
{x1-t(combn(x,2)); 
x2-sapply(strsplit(apply(x1,1,paste0,collapse=),),function(x) 
paste(x[order(x)],collapse=)); x3-table(x2); x3[x3%in% max(x3)]}) 

dat1$MaxCombn[dat1$Output==Yes]-lapply(res1,names)
 dat1
#   Seq Output   MaxCombn
#1    A B B C D A C    Yes AB, AC, BC
#2  B C A C B D A C    Yes AC, BC
#3  C D A A C D No NA

A.K.




- Original Message -
From: arun smartpink...@yahoo.com
To: R help r-help@r-project.org
Cc: 
Sent: Friday, April 12, 2013 4:37 PM
Subject: Re: Search for common character strings within a column 

Hi,
May be this helps:
Not sure how you wanted to select those two letters.  


dat1- read.table(text=
   Seq,Output
 A B B C D A C,Yes
 B C A C B D A C,Yes
C D A A C D,No
,sep=,,header=TRUE,stringsAsFactors=FALSE)
library(stringr)
lapply(str_split(str_trim(dat1$Seq), )[dat1$Output==Yes],function(x) 
{x1-t(combn(x,2)); apply(x1,1,paste0,collapse=)})
#[[1]]
# [1] AB AB AC AD AA AC BB BC BD BA BC BC BD BA BC
#[16] CD CA CC DA DC AC

#[[2]]
# [1] BC BA BC BB BD BA BC CA CC CB CD CA CC AC AB
#[16] AD AA AC CB CD CA CC BD BA BC DA DC AC

res- sapply(str_split(str_trim(dat1$Seq), )[dat1$Output==Yes],function(x) 
{x1-t(combn(x,2)); x2-table(apply(x1,1,paste0,collapse=)); 
x2[which.max(x2)]})
res
#BC BC 
# 4  4
 
dat1$MaxCombn-NA
 dat1$MaxCombn[dat1$Output==Yes]- names(res)
 dat1
#   Seq Output MaxCombn
#1    A B B C D A C    Yes   BC
#2  B C A C B D A C    Yes   BC
#3  C D A A C D No NA
A.K.


I have a dataset (data) that consists of two columns: Seq and output. 
Each entry in Seq is a combination of As,Bs,Cs and Ds and ranges from 5 – 30 
characters in length. Each sequence is associated with an output of 
either yes or no such that: 

      Seq                  Output 
(1) A B B C D A C             Yes 
(2) B C A C B D A C    Yes 
(3) C D A A C D         No 

etc, etc. 

I want to find which 2 letter (A B, A C, A D, etc) strings are 
most associated with each output. Essentially I want to find which 2 
letter combinations occur most frequently in the column Seq, when the 
output is Yes. I’m new to R and can’t figure out a solution to this 
problem. 

Any help greatly appreciated! 

Cheers, 

AB


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] split date and time

2013-04-12 Thread arun
Hi,
dat1- read.table(text=
Number,TimeStamp,Value
1,1/1/2013 0:00,1
2,1/1/2013 0:01,2
3,1/1/2013 0:03,3
,sep=,,header=TRUE,stringsAsFactors=FALSE)
dat2-data.frame(Number=dat1[,1],do.call(rbind,strsplit(dat1[,2], )), 
Value=dat1[,3])
 names(dat2)[2:3]- c(Date,Time)
 dat2
#  Number Date Time Value
#1  1 1/1/2013 0:00 1
#2  2 1/1/2013 0:01 2
#3  3 1/1/2013 0:03 3
A.K.




- Original Message -
From: Ye Lin ye...@lbl.gov
To: R help r-help@r-project.org
Cc: 
Sent: Friday, April 12, 2013 5:49 PM
Subject: [R] split date and time

Hi R experts,

For example I have a dataset looks like this:


Number   TimeStamp   Value
1              1/1/2013 0:00 1
2 1/1/2013 0:01 2
3 1/1/2013 0:03 3

How can I split the TimeStamp Column into two and return a new table like
this:

Number   Date           Time   Value
1               1/1/2013 0:00 1
2 1/1/2013 0:01 2
3 1/1/2013 0:03 3

Thank!

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fw: split date and time

2013-04-12 Thread arun




- Forwarded Message -
From: arun smartpink...@yahoo.com
To: Ye Lin ye...@lbl.gov
Cc: 
Sent: Friday, April 12, 2013 6:25 PM
Subject: Re: [R] split date and time



Hi Ye,

Is this okay?

dat2-cbind(dat1[,-2],do.call(rbind,strsplit(dat1[,2], 
)),stringsAsFactors=FALSE)
 dat2
#  Number Value    1    2
#1  1 1 1/1/2013 0:00
#2  2 2 1/1/2013 0:01
#3  3 3 1/1/2013 0:03
 colnames(dat2)[3:4]- c(Date,Time)
 dat2
#  Number Value Date Time
#1  1 1 1/1/2013 0:00
#2  2 2 1/1/2013 0:01
#3  3 3 1/1/2013 0:03
str(dat2)
#'data.frame':    3 obs. of  4 variables:
 #$ Number: int  1 2 3
 #$ Value : int  1 2 3
 #$ Date  : chr  1/1/2013 1/1/2013 1/1/2013
 #$ Time  : chr  0:00 0:01 0:03




From: Ye Lin ye...@lbl.gov
To: arun smartpink...@yahoo.com 
Sent: Friday, April 12, 2013 6:16 PM
Subject: Re: [R] split date and time





What if I have many columns, for example I have 50 columns in dat1, and say the 
TimeStamp column is the 10th, anyway to do that instead of listing all the 
other columns when building dat2?


Thanks!





On Fri, Apr 12, 2013 at 3:08 PM, arun smartpink...@yahoo.com wrote:

Hi,
dat1- read.table(text=
Number,TimeStamp,Value
1,1/1/2013 0:00,1
2,1/1/2013 0:01,2
3,1/1/2013 0:03,3
,sep=,,header=TRUE,stringsAsFactors=FALSE)
dat2-data.frame(Number=dat1[,1],do.call(rbind,strsplit(dat1[,2], )), 
Value=dat1[,3])
 names(dat2)[2:3]- c(Date,Time)
 dat2
#  Number Date Time Value
#1  1 1/1/2013 0:00 1
#2  2 1/1/2013 0:01 2
#3  3 1/1/2013 0:03 3
A.K.





- Original Message -
From: Ye Lin ye...@lbl.gov
To: R help r-help@r-project.org
Cc:
Sent: Friday, April 12, 2013 5:49 PM
Subject: [R] split date and time

Hi R experts,

For example I have a dataset looks like this:


Number   TimeStamp   Value
1              1/1/2013 0:00 1
2 1/1/2013 0:01 2
3 1/1/2013 0:03 3

How can I split the TimeStamp Column into two and return a new table like
this:

Number   Date           Time   Value
1               1/1/2013 0:00 1
2 1/1/2013 0:01 2
3 1/1/2013 0:03 3

Thank!

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fw: split date and time

2013-04-12 Thread Ye Lin
Thanks!


On Fri, Apr 12, 2013 at 3:30 PM, arun smartpink...@yahoo.com wrote:





 - Forwarded Message -
 From: arun smartpink...@yahoo.com
 To: Ye Lin ye...@lbl.gov
 Cc:
 Sent: Friday, April 12, 2013 6:25 PM
 Subject: Re: [R] split date and time



 Hi Ye,

 Is this okay?

 dat2-cbind(dat1[,-2],do.call(rbind,strsplit(dat1[,2],
 )),stringsAsFactors=FALSE)
  dat2
 #  Number Value12
 #1  1 1 1/1/2013 0:00
 #2  2 2 1/1/2013 0:01
 #3  3 3 1/1/2013 0:03
  colnames(dat2)[3:4]- c(Date,Time)
  dat2
 #  Number Value Date Time
 #1  1 1 1/1/2013 0:00
 #2  2 2 1/1/2013 0:01
 #3  3 3 1/1/2013 0:03
 str(dat2)
 #'data.frame':3 obs. of  4 variables:
  #$ Number: int  1 2 3
  #$ Value : int  1 2 3
  #$ Date  : chr  1/1/2013 1/1/2013 1/1/2013
  #$ Time  : chr  0:00 0:01 0:03



 
 From: Ye Lin ye...@lbl.gov
 To: arun smartpink...@yahoo.com
 Sent: Friday, April 12, 2013 6:16 PM
 Subject: Re: [R] split date and time





 What if I have many columns, for example I have 50 columns in dat1, and
 say the TimeStamp column is the 10th, anyway to do that instead of
 listing all the other columns when building dat2?


 Thanks!





 On Fri, Apr 12, 2013 at 3:08 PM, arun smartpink...@yahoo.com wrote:

 Hi,
 dat1- read.table(text=
 Number,TimeStamp,Value
 1,1/1/2013 0:00,1
 2,1/1/2013 0:01,2
 3,1/1/2013 0:03,3
 ,sep=,,header=TRUE,stringsAsFactors=FALSE)
 dat2-data.frame(Number=dat1[,1],do.call(rbind,strsplit(dat1[,2], )),
 Value=dat1[,3])
  names(dat2)[2:3]- c(Date,Time)
  dat2
 #  Number Date Time Value
 #1  1 1/1/2013 0:00 1
 #2  2 1/1/2013 0:01 2
 #3  3 1/1/2013 0:03 3
 A.K.
 
 
 
 
 
 - Original Message -
 From: Ye Lin ye...@lbl.gov
 To: R help r-help@r-project.org
 Cc:
 Sent: Friday, April 12, 2013 5:49 PM
 Subject: [R] split date and time
 
 Hi R experts,
 
 For example I have a dataset looks like this:
 
 
 Number   TimeStamp   Value
 1  1/1/2013 0:00 1
 2 1/1/2013 0:01 2
 3 1/1/2013 0:03 3
 
 How can I split the TimeStamp Column into two and return a new table
 like
 this:
 
 Number   Date   Time   Value
 1   1/1/2013 0:00 1
 2 1/1/2013 0:01 2
 3 1/1/2013 0:03 3
 
 Thank!
 
 [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] kNNimpute error

2013-04-12 Thread Aimee Kopolow
Hi all,

I'm trying to use kNNimpute in the imputation package to fill in missing
precipitation data for a data frame I have.


Example is:

okee:

Date rainfall
1997-05-010
1997-05-020
1997-05-03NA
1997-05-040
1997-05-050
...
2007-04-01NA
2007-04-02NA
2007-04-03NA
2007-04-04NA
2007-04-050
...  ..
where there are large swatches (30 days) of data missing in the ten year
time series.

I tried newokee-kNNImpute(okee, k=30, verbose = T) hoping it would impute
data for the rows with NA values according to weighted means of closest 30
non-NA neighbours and I got the following message back:

[1] imputing on 270 missing values with matrix size 7304
[1] Computing distance matrix...
[1] Distance matrix complete
[1] Imputing row   70
Error in which(missing.matrix[rowIndex, ]) : subscript out of bounds
In addition: Warning message:
In dist(x, upper = T) : NAs introduced by coercion


What syntax do I need to impute the precipitation data? Failing that, do
you have another recommendation of a method to use? Statistics is not my
strong point.

thank you for any help you are able to give,
Aimee.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Batch open netcdf files and get variables

2013-04-12 Thread Enhao Du
Hi, I'm new to R. I have some daily soil moisture data for the year 1979 in 
netcdf format such as these
sm19790101.1.nc
sm19790102.1.nc
.
.
.
sm19791231.1.nc

I need to average a variable called sm to monthly resolution. I've done these

days = formatC(1:31, width=2, flag=0)
ncfiles = lapply(days, function(d){
filename = paste(sm197901, d, .1.nc, sep=)
#print(filename)
open.nc(filename)
})

to open the files in January. Questions is: how may I get the variable from 
each opened file. Rnetcdf package has the var.get.nc() that only read object of 
class Netcdf returned from open.nc(). Any help would be appreciated.

Enhao
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] CCA report species environment correlation

2013-04-12 Thread SRuhl
Hi everyone,
I did a CCA with R in the vegan package and got all the outputs I need/want
to report except for one, that is the species environment correlation
values. I only know of them because of one of the sources I read up on about
CCAs reported it and I have the suspicion that it´s a value reported when
you do a CCA in SPSS (which I have never done) and that just doesnßt come up
in R.
Can anyone tell me how to find it or calculate it myself? Do I need to run
posthoc stats or additional tests on my results?
Thanks in advance!



--
View this message in context: 
http://r.789695.n4.nabble.com/CCA-report-species-environment-correlation-tp4664089.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] remove colinear variables

2013-04-12 Thread jul 2-pom
Hello

I have a dataset with 151 dependent variables and estimated in 6 different
conditions, and each condition is repeated 3 times. It si proteomic and
therefore expensive, so few repetitions.
I want to conduct a stepwise discriminant analysis to identify the variables
that really matter. To do so, I will use greedy.wilks. and stepclass.
However, I need to remove colinear variables to run these functions. 
Can anyone tell me how to identify the colinear variables to then remove
them?

Thank you in advance
julien.



--
View this message in context: 
http://r.789695.n4.nabble.com/remove-colinear-variables-tp4664077.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] model frame and formula mismatch in model.matrix()

2013-04-12 Thread jul 2-pom
Hello everyone,

I am trying to fit the following model
All X. variables are continuous, while the conditions are categoricals.

model - lm(X2
+X3+X4+X5+X6+X7+X8+X9+X10+X11+X12+X13+X14+X15+X16+X17+X18+X19+X20+X21+X22+X23+X24+X25+X26+X27+X28+X29+X30+X31+X32+X33+X34+X35+X36+X37+X38+X39+X40+X41+X42+X43+X44+X45+X46+X47+X48+X49+X50+X51+X52+X53+X54+X55+X56+X57+X58+X59+X60+X61+X62+X63+X64+X65+X66+X67+X68+X69+X70+X71+X72+X73+X74+X75+X76+X77+X78+X79+X80+X81+X82+X83+X84+X85+X86+X87+X88+X89+X90+X91+X92+X93+X94+X95+X96+X97+X98+X99+X100+X101+X102+X103+X104+X105+X106+X107+X108+X109+X110+X111+X112+X113+X114+X115+X116+X117+X118+X119+X120+X121+X122+X123+X124+X125+X126+X127+X128+X129+X130+X131+X132+X133+X134+X135+X136+X137+X138+X139+X140+X141+X142+X143+X144+X145+X146+X147+X148+X149+X150+X151+X152
~ conditions, data = proteo)

but I get this error message:

Error in model.matrix.default(mt, mf, contrasts) : 
  model frame and formula mismatch in model.matrix()

can someone help me please?

Julien.



--
View this message in context: 
http://r.789695.n4.nabble.com/model-frame-and-formula-mismatch-in-model-matrix-tp4664093.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Removing rows that are duplicates but column values are in reversed order

2013-04-12 Thread vpr3
Thanks very much for your rapid help Arun.

Vince


On Apr 12, 2013, at 4:10 PM, arun kirshna [via R] wrote:

Hi,
From your example data,

dat1- read.table(text=
id1   id2   value
a  b   10
c  d11
b a 10
c  e 12
,sep=,header=TRUE,stringsAsFactors=FALSE)
#it is easier to get the output you wanted
dat1[!duplicated(dat1$value),]
#  id1 id2 value
#1   a   b10
#2   c   d11
#4   c   e12

But, if you have cases like the one below (assuming that all those instances 
were there is reversed order have the same value)
dat2- read.table(text=
id1   id2   value
a  b   10
c  d11
b a 10
e  c 12
c  e 12
,sep=,header=TRUE,stringsAsFactors=FALSE)
 dat2[apply(dat2[,-3],1,function(x) {x1- order(x); x1[1]x1[2]}),]
 # id1 id2 value
#1   a   b10
#2   c   d11
#5   c   e12


#or you have cases like these:

dat3- read.table(text=
id1   id2   value
a  b   10
c  d11
b a 10
a  b10
e  c 12
c  e 12
c  d 11
,sep=,header=TRUE,stringsAsFactors=FALSE)

 dat3New-dat3[apply(dat3[,-3],1,function(x) {x1- order(x); x1[1]x1[2]}),]
dat3New[!duplicated(dat3New$value),]
#  id1 id2 value
#1   a   b10
#2   c   d11
#6   c   e12
A.K.




Hi everybody,

I was hoping that someone could help me with this problem. I
have a table with 3 columns. Some rows contain duplicates where the
identifiers in columns 1 and 2 are in reverse order, but the value
associated with the row is the same.


For example:

id1   id2   value
a  b   10
c  d11
b a 10
c  e 12

Rows 1 and 3 are duplicates (have the same value). I would like
to retain only row 1 and delete row 3. Final table should look like
this:

id1   id2   value
a  b   10
c  d11
c  e 12

Thanks in advance for any help provided.

Vince

__
[hidden email]x-msg://23/user/SendEmail.jtp?type=nodenode=4664105i=0 
mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



If you reply to this email, your message will be added to the discussion below:
http://r.789695.n4.nabble.com/Removing-rows-that-are-duplicates-but-column-values-are-in-reversed-order-tp4664069p4664105.html
To unsubscribe from Removing rows that are duplicates but column values are in 
reversed order, click 
herehttp://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_codenode=4664069code=dnByM0Bjb3JuZWxsLmVkdXw0NjY0MDY5fDEzODMwOTA4MTI=.
NAMLhttp://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewerid=instant_html%21nabble%3Aemail.namlbase=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespacebreadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml





--
View this message in context: 
http://r.789695.n4.nabble.com/Removing-rows-that-are-duplicates-but-column-values-are-in-reversed-order-tp4664069p4664107.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] BLAS with glmnet

2013-04-12 Thread Brian Pasley
I'm using a multithreaded BLAS library with R and I see the expected
speed improvements with matrix multiplication, svd, etc.  However,
glmnet continues to use only a single CPU.  Since this package is
compiled from Fortran, is this the expected behavior or is there a way
to compile the glmnet package so that it uses the multithreaded BLAS
library?
Thank you

Brian

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with handling of attributes in xmlToList in XML package

2013-04-12 Thread santiago gil
Hello all,

I have a problem with the way attributes are dealt with in the
function xmlToList(), and I haven't been able to figure it out for
days now.

Say I have a document (produced by nmap) like this:

 mydoc - 'host starttime=1365204834 endtime=1365205860status 
 state=up reason=echo-reply reason_ttl=127/
address addr=XXX.XXX.XXX.XXX addrtype=ipv4/
portsport protocol=tcp portid=135state state=open
reason=syn-ack reason_ttl=127/service name=msrpc
product=Microsoft Windows RPC ostype=Windows method=probed
conf=10cpecpe:/o:microsoft:windows/cpe/service/port
port protocol=tcp portid=139state state=open
reason=syn-ack reason_ttl=127/service name=netbios-ssn
method=probed conf=10//port
/ports
times srtt=647 rttvar=71 to=10/
/host'

I want to store this as a list of lists, so I do:

mytree-xmlTreeParse(mydoc)
myroot-xmlRoot(mytree)
mylist-xmlToList(myroot)

Now my problem is that when I want to fetch the attributes of the
services running of each port, the behavior is not consistent:

 mylist[[ports]][[1]][[service]]$.attrs[name]
   name
msrpc
 mylist[[ports]][[2]][[service]]$.attrs[name]
Error in trash_list[[ports]][[2]][[service]]$.attrs :
  $ operator is invalid for atomic vectors

I understand that the way they are dfined in the documnt is not the
same, but I think there still should be a consistent behavior. I've
tried many combination of parameters for xmlTreeParse() but nothing
has helped me. I can't find a way to call up the name of the service
consistently regardless of whether the node has children or not. Any
tips?

All the best,


S.G.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RJSONIO Installation

2013-04-12 Thread John Henry
Trying to install rjsonio and I've run into a couple of issues.
1. When installing from R, it's starts to download, then I get
* installing *source* package 'RJSONIO' ...
ERROR: configuration failed for package 'RJSONIO'
* removing 'C:/Program Files/R/R-3.0.0/library/RJSONIO'

2. I've tried similar steps on the cmd(I'm win7).
I receive the same message, starts to install then, configuration failed.

3. Looking into the 'read_me' doc that comes with rjsonio it mentions
libjson C libraries.
They give a special command if you don't have this installed, which I tried,
but it didn't work.

At this point I want to generate the libjson libraries to try the other
'read_me' suggestions or just get it to work from R.
FYI, on the R CMD I've tried --binary and I've tried --build, neither work.
Please help, running out of options.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparison of Date format

2013-04-12 Thread arun


Hi,
 In the example you provided, it looks like the dates in Date2 happens first.  
So, I changed it a bit.  

DataA- read.table(text=
ID,Status,Date1,Date2            
1,A,3-Feb-01,15-May-01         
1,B,15-May-01,16-May-01         
1,A,16-May-01,3-Sep-01                     
1,B,3-Sep-01,13-Sep-01                     
1,C,13-Sep-01,26-Feb-04                     
2,A,9-Feb-01,24-May-01         
2,B,24-May-01,25-May-01                     
2,A,25-May-01,16-Mar-02                     
2,A,6-Mar-02,18-Mar-02
2,A,14-Sep-01,6-Mar-02         
,sep=,,header=TRUE,stringsAsFactors=FALSE)
library(stringr)
DataA[,3]- str_trim(DataA[,3])
DataA[,4]- str_trim(DataA[,4])
DataB- read.table(text=
ID     Date.Accident         
1   3-Sep-01  
1     20-Jan-05 
1   26-Feb-04        
2     6-Mar-02
,sep=,header=TRUE,stringsAsFactors=FALSE)

 
lst1-lapply(seq_len(nrow(DataB)),function(i) {x1-unlist(mapply(function(x,y) 
which(x==y),DataA[,3:4],DataB[i,2]));x2-if(length(x1)==2) 
DataA[x1[which.min(x1)],!names(DataA)%in%names(x1[which.max(x1)])] else 
if(length(x1)==1) DataA[x1,c(ID,Status,names(x1))] else NULL})

 lst2-lapply(lst1,data.frame)
lst2-lst2[lapply(lst2,nrow)!=0]
 lst2
#[[1]]
#  ID Status    Date2
#3  1  A 3-Sep-01

#[[2]]
#  ID Status Date2
#5  1  C 26-Feb-04

#[[3]]
#  ID Status    Date1
#9  2  A 6-Mar-02
library(plyr)
 dataNew-do.call(rbind,lapply(lst2,function(x) {colnames(x)[3]- 
colnames(DataB)[2];x}))
res-join(dataNew,DataB,by=c(Date.Accident,ID),type=right)
 res
#  Date.Accident ID Status
#1  3-Sep-01  1  A
#2 20-Jan-05  1   NA
#3 26-Feb-04  1  C
#4  6-Mar-02  2  A



#or you can split by ID
lst1New-lapply(unique(DataA$ID),function(i){x1- DataA[DataA$ID==i,]; x2- 
DataB[DataB$ID==i,]; do.call(rbind,lapply(seq_len(nrow(x2)),function(i) {x3- 
unlist(mapply(function(x,y) which(x==y), x1[,3:4],x2[i,2])); x4- 
if(length(x3)==2) x1[x3[which.min(x3)],!names(x1)%in%names(x3[which.max(x3)])] 
else if(length(x3)==1) x1[x3,c(ID,Status,names(x3))] else NULL})) })


 lst1New
#[[1]]
 # ID Status Date2
#3  1  A  3-Sep-01
#5  1  C 26-Feb-04

#[[2]]
 # ID Status    Date1
#9  2  A 6-Mar-02
 dataNew1- do.call(rbind,lapply(lst1New,function(x) {colnames(x)[3]- 
colnames(DataB)[2];x}))
 res1- join(dataNew1,DataB,by=c(Date.Accident,ID),type=right)
 res1
#  Date.Accident ID Status
#1  3-Sep-01  1  A
#2 20-Jan-05  1   NA
#3 26-Feb-04  1  C
#4  6-Mar-02  2  A
A.K.



 From: farnoosh sheikhi farnoosh...@yahoo.com
To: smartpink...@yahoo.com smartpink...@yahoo.com 
Sent: Friday, April 12, 2013 5:40 PM
Subject: Comparison of Date format 
 




 Hi there,

Hope all is well.
I have a complicated data and I need to create a new variable based on the date 
and description of the data.
I really appreciate if you can help me.
Here is how data look like:
DataA 
 DataB 
 
 
       ID Status Date1 Date2 
        ID Date.Accident 
 Result 
1   A 3-Feb-01 15-May-01 
 1 3-Sep-01 
 A 
1    B 15-May-01 16-May-01 
 1 20-Jan-05 
 NA 
1    A 16-May-01 3-Sep-01 
 
 
 
 
 
1    B 3-Sep-01 13-Sep-01 
 
 
 
 
 
1     C 13-Sep-01 26-Feb-04 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2    A 9-Feb-01 24-May-01 
 2 6-Mar-02 
 A 
2    B 24-May-01 25-May-01 
 
 
 
 
 
2    A 25-May-01 6-Mar-02 
 
 
 
 
 
2     A 6-Mar-02 18-Mar-02 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
I want to compare dataA to B for each ID. if Date 1 or Date 2 matches to 
Date.Accident return the result as status in dataA as a new result in Data B.
The trick here is I have two dates that are matched, but I want the status of 
the one that happen first. The sample size of each data is not the same.

I really appreciate your time and help.
Thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Solving an integral in R gives the error “The integral is probably divergent”

2013-04-12 Thread Thomas Lumley
On Sat, Apr 13, 2013 at 6:31 AM, peter dalgaard pda...@gmail.com wrote:

 But is it supposed to be t^{-3/2} or t^{-0.5}?? The formula has the former
 and the code the latter, and the integral is clearly divergent with the
 former.



I didn't look at that -- I just assumed that the code would reproduce the
reported error.

-thomas



 -pd

 On Apr 12, 2013, at 04:51 , Thomas Lumley wrote:

  I don't get an error message (after I correct the missing line break
 after
  the comment
 
  b- sapply(a, Cfun, upper=1)
  b
   [1]  1.583458e-54  7.768026e-50  2.317562e-45  4.206260e-41
  4.645737e-37
  3.123801e-33  1.279358e-29  3.193257e-26  4.860876e-23
  [10]  4.516582e-20  2.564400e-17  8.908932e-15  1.896996e-12
  2.481084e-10
  1.998561e-08  9.946570e-07  3.067751e-05  5.862075e-04
  [19]  6.818952e-03  4.297061e-02  0.00e+00  3.175122e-01
  3.723022e-01
  2.364930e-01  9.144836e-02  2.190878e-02  3.252754e-03
  [28]  2.983763e-04  1.685692e-05  5.849602e-07  1.244158e-08
  1.619155e-10
  1.287603e-12  6.250149e-15  1.850281e-17  3.338241e-20
  [37]  3.668412e-23  2.454192e-26  9.991546e-30  2.474577e-33
  3.727226e-37
  3.413319e-41  1.900112e-45  6.428505e-50  1.321588e-54
  [46]  1.650722e-59  1.252524e-64  5.772750e-70  1.615916e-75
  2.746972e-81
  2.835655e-87  1.777399e-93 6.764271e-100 1.562923e-106
  [55] 2.192373e-113 1.866955e-120 9.651205e-128 3.028623e-135
 5.769185e-143
  6.670835e-151 4.682023e-159 1.994643e-167 5.157808e-176
  [64] 8.095084e-185 7.711162e-194 4.458042e-203 1.564139e-212
 3.330362e-222
  4.302974e-232 3.373500e-242 1.604721e-252 4.631224e-263
  [73] 8.108474e-274 8.611898e-285 5.547745e-296  0.00e+00
  0.00e+00
  0.00e+00  0.00e+00  0.00e+00  0.00e+00
  [82]  0.00e+00  0.00e+00  0.00e+00  0.00e+00
  0.00e+00
  0.00e+00  0.00e+00  0.00e+00  0.00e+00
  [91]  0.00e+00  0.00e+00  0.00e+00  0.00e+00
  0.00e+00
  0.00e+00  0.00e+00  0.00e+00  0.00e+00
  [100]  0.00e+00
 
 
   -thomas
 
 
 
  On Tue, Apr 9, 2013 at 3:14 PM, Janesh Devkota janesh.devk...@gmail.com
 wrote:
 
  I am trying to solve an integral in R. However, I am getting an error
 when
  I am trying to solve for that integral.
 
  The equation that I am trying to solve is as follows:
 
  $$ C_m = \frac{{abs{x}}e^{2x}}{\pi^{1/2}}\int_0^t
 t^{-3/2}e^{-x^2/t-t}dt $$
 
  [image: enter image description here]
 
  The code that I am using is as follows:
 
  a - seq(from=-10, by=0.5,length=100)
  ## Create a function to compute integrationCfun - function(XX, upper){
   integrand - function(x)x^(-0.5)*exp((-XX^2/x)-x)
   integrated - integrate(integrand, lower=0, upper=upper)$value
   (final - abs(XX)*pi^(-0.5)*exp(2*XX)*integrated) }
 
 
  b- sapply(a, Cfun, upper=1)
 
  The error that I am getting is as follows:
 
  Error in integrate(integrand, lower = 0, upper = upper) :
   the integral is probably divergent
 
  Does this mean I cannot solve the integral ?
 
  Any possible ways to fix this problem will be highly appreciated.The
  question can be found on
 
 
 http://stackoverflow.com/questions/15892586/solving-an-integral-in-r-gives-error-the-integral-is-probably-divergent
  also.
 
  Thanks.
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
  --
  Thomas Lumley
  Professor of Biostatistics
  University of Auckland
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 --
 Peter Dalgaard, Professor,
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com











-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparison of Date format

2013-04-12 Thread arun
HI,
In cases like below:
DataA- read.table(text=
ID,Status,Date1,Date2            
1,A,3-Feb-01,15-May-01         
1,B,15-May-01,16-May-01         
1,A,16-May-01,3-Sep-01                     
1,B,3-Sep-01,13-Sep-01                     
1,C,13-Sep-01,26-Feb-04                     
2,A,9-Feb-01,24-May-01         
2,B,24-May-01,25-May-01                     
2,A,25-May-01,16-Mar-02                     
2,A,6-Mar-02,18-Mar-02
2,A,14-Sep-01,6-Mar-02    
,sep=,,header=TRUE,stringsAsFactors=FALSE)
library(stringr)
DataA[,3]- str_trim(DataA[,3])
DataA[,4]- str_trim(DataA[,4])
DataB- read.table(text=
ID     Date.Accident         
1   3-Sep-01  
1     20-Jan-05 
1   26-Feb-04
2   25-May-01        
2     6-Mar-02
,sep=,header=TRUE,stringsAsFactors=FALSE)
DataB[,2]- str_trim(DataB[,2])

#Needed modification in the 2nd method:
dataNew2-do.call(rbind,lapply(unique(DataA$ID),function(i){x1- 
DataA[DataA$ID==i,]; x2- DataB[DataB$ID==i,]; 
do.call(rbind,lapply(seq_len(nrow(x2)),function(i) {x3- 
unlist(mapply(function(x,y) which(x==y), x1[,3:4],x2[i,2])); x4- 
if(length(x3)==2) x1[x3[which.min(x3)],!names(x1)%in%names(x3[which.max(x3)])] 
else if(length(x3)==1) x1[x3,c(ID,Status,names(x3))] else NULL; 
if(length(x4)0) {colnames(x4)[3]- colnames(DataB)[2]; x4} else NULL})) }))
library(plyr)
res2- join(dataNew2,DataB,by=c(Date.Accident,ID),type=right)
res2
#  Date.Accident ID Status
#1  3-Sep-01  1  A
#2 20-Jan-05  1   NA
#3 26-Feb-04  1  C
#4 25-May-01  2  B
#5  6-Mar-02  2  A

Using the first method:
lst1-lapply(seq_len(nrow(DataB)),function(i) {x1-unlist(mapply(function(x,y) 
which(x==y),DataA[,3:4],DataB[i,2]));x2-if(length(x1)==2) 
DataA[x1[which.min(x1)],!names(DataA)%in%names(x1[which.max(x1)])] else 
if(length(x1)==1) DataA[x1,c(ID,Status,names(x1))] else NULL})

 lst2-lapply(lst1,data.frame)
lst2-lst2[lapply(lst2,nrow)!=0]
library(plyr)
 dataNew-do.call(rbind,lapply(lst2,function(x) {colnames(x)[3]- 
colnames(DataB)[2];x}))
res-join(dataNew,DataB,by=c(Date.Accident,ID),type=right)
 identical(res,res2)
#[1] TRUE
A.K.


- Original Message -
From: arun smartpink...@yahoo.com
To: farnoosh sheikhi farnoosh...@yahoo.com
Cc: R help r-help@r-project.org
Sent: Saturday, April 13, 2013 12:41 AM
Subject: Re: Comparison of Date format 



Hi,
 In the example you provided, it looks like the dates in Date2 happens first.  
So, I changed it a bit.  

DataA- read.table(text=
ID,Status,Date1,Date2            
1,A,3-Feb-01,15-May-01         
1,B,15-May-01,16-May-01         
1,A,16-May-01,3-Sep-01                     
1,B,3-Sep-01,13-Sep-01                     
1,C,13-Sep-01,26-Feb-04                     
2,A,9-Feb-01,24-May-01         
2,B,24-May-01,25-May-01                     
2,A,25-May-01,16-Mar-02                     
2,A,6-Mar-02,18-Mar-02
2,A,14-Sep-01,6-Mar-02         
,sep=,,header=TRUE,stringsAsFactors=FALSE)
library(stringr)
DataA[,3]- str_trim(DataA[,3])
DataA[,4]- str_trim(DataA[,4])
DataB- read.table(text=
ID     Date.Accident         
1   3-Sep-01  
1     20-Jan-05 
1   26-Feb-04        
2     6-Mar-02
,sep=,header=TRUE,stringsAsFactors=FALSE)

 
lst1-lapply(seq_len(nrow(DataB)),function(i) {x1-unlist(mapply(function(x,y) 
which(x==y),DataA[,3:4],DataB[i,2]));x2-if(length(x1)==2) 
DataA[x1[which.min(x1)],!names(DataA)%in%names(x1[which.max(x1)])] else 
if(length(x1)==1) DataA[x1,c(ID,Status,names(x1))] else NULL})

 lst2-lapply(lst1,data.frame)
lst2-lst2[lapply(lst2,nrow)!=0]
 lst2
#[[1]]
#  ID Status    Date2
#3  1  A 3-Sep-01

#[[2]]
#  ID Status Date2
#5  1  C 26-Feb-04

#[[3]]
#  ID Status    Date1
#9  2  A 6-Mar-02
library(plyr)
 dataNew-do.call(rbind,lapply(lst2,function(x) {colnames(x)[3]- 
colnames(DataB)[2];x}))
res-join(dataNew,DataB,by=c(Date.Accident,ID),type=right)
 res
#  Date.Accident ID Status
#1  3-Sep-01  1  A
#2 20-Jan-05  1   NA
#3 26-Feb-04  1  C
#4  6-Mar-02  2  A



#or you can split by ID
lst1New-lapply(unique(DataA$ID),function(i){x1- DataA[DataA$ID==i,]; x2- 
DataB[DataB$ID==i,]; do.call(rbind,lapply(seq_len(nrow(x2)),function(i) {x3- 
unlist(mapply(function(x,y) which(x==y), x1[,3:4],x2[i,2])); x4- 
if(length(x3)==2) x1[x3[which.min(x3)],!names(x1)%in%names(x3[which.max(x3)])] 
else if(length(x3)==1) x1[x3,c(ID,Status,names(x3))] else NULL})) })


 lst1New
#[[1]]
 # ID Status Date2
#3  1  A  3-Sep-01
#5  1  C 26-Feb-04

#[[2]]
 # ID Status    Date1
#9  2  A 6-Mar-02
 dataNew1- do.call(rbind,lapply(lst1New,function(x) {colnames(x)[3]- 
colnames(DataB)[2];x}))
 res1- join(dataNew1,DataB,by=c(Date.Accident,ID),type=right)
 res1
#  Date.Accident ID Status
#1  3-Sep-01  1  A
#2 20-Jan-05  1   NA
#3 26-Feb-04  1  C
#4  6-Mar-02  2  A
A.K.



From: farnoosh sheikhi farnoosh...@yahoo.com
To: smartpink...@yahoo.com smartpink...@yahoo.com 
Sent: Friday, April 12, 2013 5:40 PM
Subject: Comparison of Date format