date:20130424

Hi,
It is not that clear.
If VAR1 is a match between columns AB001A, AB0002A, VAR2  between AB001A, AB362 
and VAR3 between AB0002A and AB362:
Also, I assume row8 match would be taken as 1.


dat1- read.table(text=
  S.No AB001A AB0002A AB362
   1   -/-    C/C   A/A    
    2   C/C    C/C   A/A    
    3   C/C    C/C   A/A    
    4   C/C    C/C   A/A    
    5   C/C    C/C   A/A    
    6   C/C    C/C   A/A    
    7   C/C    C/C   A/A    
    8   -/-    -/-   -/-    
    9   C/C    C/C   A/A    
    10  C/C    C/C   A/A    
    11  -/-    C/C   A/A    
    12  C/C    C/C   A/A    
    13  C/C    C/C   A/A    
    14  C/C    C/C   A/A    
    16  C/C    -/-   A/A    
    17   -/-    C/C   A/A    
    18   C/C    C/C   A/A    
    19  C/C    C/C   A/A
,sep=,header=TRUE,stringsAsFactors=FALSE)

library(plyr)
res-mutate(dat1,VAR1=1*(AB001A==AB0002A),VAR2=1*(AB001A==AB362),VAR3=1*(AB0002A==AB362),SUM=rowSums(cbind(VAR1,VAR2,VAR3)),MATCH=(SUM/3)*100,Rank=rank(MATCH)
 head(res)
#  S.No AB001A AB0002A AB362 VAR1 VAR2 VAR3 SUM    MATCH Rank
#1    1    -/- C/C   A/A    0    0    0   0  0.0  2.5
#2    2    C/C C/C   A/A    1    0    0   1 33.3 11.0
#3    3    C/C C/C   A/A    1    0    0   1 33.3 11.0
#4    4    C/C C/C   A/A    1    0    0   1 33.3 11.0
#5    5    C/C C/C   A/A    1    0    0   1 33.3 11.0
#6    6    C/C C/C   A/A    1    0    0   1 33.3 11.0

#or


 
res-mutate(dat1,VAR1=1*(AB001A==AB0002A),VAR2=1*(AB001A==AB362),VAR3=1*(AB0002A==AB362),SUM=rowSums(cbind(VAR1,VAR2,VAR3)),MATCH=(SUM/3)*100,Rank=rank(MATCH,ties.method=min))
 head(res)
#  S.No AB001A AB0002A AB362 VAR1 VAR2 VAR3 SUM    MATCH Rank
#1    1    -/- C/C   A/A    0    0    0   0  0.0    1
#2    2    C/C C/C   A/A    1    0    0   1 33.3    5
#3    3    C/C C/C   A/A    1    0    0   1 33.3    5
#4    4    C/C C/C   A/A    1    0    0   1 33.3    5
#5    5    C/C C/C   A/A    1    0    0   1 33.3    5
#6    6    C/C C/C   A/A    1    0    0   1 33.3    5
A.K.







Hi to all bloggers, 
 my data looks like this, 

S. No   AB001A  AB0002A AB362   VAR1    VAR2    VAR3    SUM %Match  Rank 
   1   -/-        C/C   A/A                         
    2   C/C        C/C   A/A                         
    3   C/C        C/C   A/A                         
    4   C/C        C/C   A/A                         
    5   C/C        C/C   A/A                         
    6   C/C        C/C   A/A                         
    7   C/C        C/C   A/A                         
    8   -/-        -/-   -/-                         
    9   C/C        C/C   A/A                         
    10  C/C        C/C   A/A                         
    11  -/-        C/C   A/A                         
    12  C/C        C/C   A/A                         
    13  C/C        C/C   A/A                         
    14  C/C        C/C   A/A                         
    16  C/C        -/-   A/A                         
    17   -/-        C/C   A/A                         
    18   C/C        C/C   A/A                         
    19  C/C        C/C   A/A                         
I want to match obs 3 with obs 2 if it exactly matched then score 
will be 1 else 0, that will be stored in var1 for AB001a, in var2 for 
ab0002a and in var3 for ab362 and i want to calculate sum of all the 1's
 and observation match percent and their rank (top ten matchers), I did 
this successfully in excel but it took me lot of time, i used if 
condition in excel like (=if(A3=A$2,1,0) and then i dragged among all 
obs and i did sum of all obs, their %match and rank. My question is how 
can i do this in R? can i use match package for this? or other packages 
will help me? my data is so big with 5,15,567 obs. can any one guide me 
how to do this in sas because i want to reduce my time to analyze my 
data. Thanking you Regards,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Regression on stratified count data

2013-04-24 Thread peter dalgaard


On Apr 24, 2013, at 06:15 , meng wrote:

 Hi all:
 For stratified count data,how to perform regression analysis?
 
 My data:
 age case oc count
 1   1 121
 1   1 226
 1   2 117
 1   2 259
 2   1 118
 2   1 288
 2   2 1 7
 2   2 2   95
 
 age:
 1:40y
 2:40y
 
 case:
 1:patient
 2:health
 
 oc:
 1:use drug
 2:not use drug
 
 My purpose:
 Anaysis whether case and oc are correlated, and age is a stratified variable.
 
 My solution:
 1,Mantel-Haenszel test by using function mantelhaen.test
 2,loglinear regression by using function 
 glm(count~case*oc,family=poisson).But I don't know how to handle variable 
 age,which is the stratified variable.

The canonical way is to fit the model without 2nd order interaction: 

count~case*oc*age-case:oc:case  . 

(It may take the back of an envelope or two to realize that this is equivalent 
to the common OR assumption of the MH test.) 

Alternatively, use logistic regression 

glm(case ~ oc + age, family=binomial, weight=count, data=dd)

(NB: it is important that case is a factor here!)

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Simulate user input in R

2013-04-24 Thread Vahe nr

Dear all,

Can we simulate user input in R ?
for example if we have a function which needs an input from the user to
continue its work, can we automate this step (simulate the input...)

Here is the sample:
choose between one of the grouping factor available :
c(Village, Country)
we need to enter Village or Country to continue.can I automatically
pass one of the values without user input.

Regards,
 Vahe

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Understanding how nls() works

2013-04-24 Thread Patrick Connolly

I'm trying to understand what goes on in the process that nls() uses.
This converges without much drama:

 rate.nls - nls(Dev ~ exp(rho * (T)) - exp(rho*Tmax - (Tmax - T)/del)+ 
+ lam , data = xx, trace = TRUE,
+ start = c(rho = 0.15, del = 7, lam = .02, Tmax = 30))
599.7841 :   0.15  7.00  0.02 30.00 
4.69849 :   0.14673457  6.83443060 -0.01417782 29.94392535 
0.06971343 :   0.14787121  6.75095966 -0.01281743 28.88851217 
0.01197127 :   0.1460683  6.8334550 -0.0128118 30.1599472 
0.008279834 :   0.14406976  6.92819920 -0.01281783 31.63914711 
0.002267513 :   0.1437983  6.9426081 -0.0127973 31.9153204 
0.002264977 :   0.14377323  6.94385235 -0.01279224 31.95091153 
0.002264977 :   0.14378546  6.94326629 -0.01278643 31.95045828 
0.002264977 :   0.14378124  6.94346882 -0.01278843 31.95065902 
0.002264977 :   0.14378271  6.94339837 -0.01278772 31.95059050 

If I make one parameter further away from optimum, I get this:

 rate.nls - nls(Dev ~ exp(rho * (T)) - exp(rho*Tmax - (Tmax - T)/del)+ 
+ lam , data = xx, trace = TRUE, 
+ nls.control(warnOnly = TRUE),
+ start = c(rho = 0.15, del = 7, lam = 1.02, Tmax = 30))
59.80968 :   0.15  7.00  1.02 30.00 
4.69849 :   0.14673457  6.83443060 -0.01417782 29.94392535 
0.06973771 :   0.14787046  6.75099517 -0.01281668 28.88853959 
0.0119916 :   0.1460709  6.8333430 -0.0128098 30.1600040 
0.008369133 :   0.14406664  6.92834092 -0.01281926 31.63978393 
0.002267664 :   0.1438008  6.9424901 -0.0127959 31.9151258 
0.002264977 :   0.1437771  6.9436655 -0.0127904 31.9507300 
0.002264977 :   0.14378253  6.94340678 -0.01278781 31.95059843 
0.002264977 :   0.1437849  6.9432936 -0.0127867 31.9504860 
Warning message:
In nls(Dev ~ exp(rho * (T)) - exp(rho * Tmax - (Tmax - T)/del) +  :
  step factor 0.000488281 reduced below 'minFactor' of 0.000976562


There's no surprise that the traces will be different, but I can't
understand why the first iteration gives exactly the same result in
both cases.  *Then* they diverge.  What is the explanation?  (Maybe if
I knew what the step factor refers to, it could be clearer but I
suspect it doesn't come in at that stage.)

I can't think of a way of making a minimal working example but I'd
guess that it's not necessary to explain the procedure.  I could
supply my xx dataframe if it's indispensible to the explanation.

TIA

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Simulate user input in R

2013-04-24 Thread Jeff Newmiller

Your example is not reproducible, so it is open to misinterpretation.

Normally the approach taken in R is not to wait for user input, but have the 
user call the function with the desired values. One interpretation of your 
question is that the function you are using is broken by design, and you should 
re-write it.

---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Vahe nr vne...@gmail.com wrote:

Dear all,

Can we simulate user input in R ?
for example if we have a function which needs an input from the user to
continue its work, can we automate this step (simulate the input...)

Here is the sample:
choose between one of the grouping factor available :
c(Village, Country)
we need to enter Village or Country to continue.can I automatically
pass one of the values without user input.

Regards,
 Vahe

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Simulate user input in R

2013-04-24 Thread Vahe nr

Thanks Jeff for your reply...
The code is not written by me it is the NDVITS package which is used in R.
TimeSeriesAnalysis {ndvits}
And at that point it requires my input.

Regards,
 Vahe


On Wed, Apr 24, 2013 at 11:04 AM, Jeff Newmiller
jdnew...@dcn.davis.ca.uswrote:

 Your example is not reproducible, so it is open to misinterpretation.

 Normally the approach taken in R is not to wait for user input, but have
 the user call the function with the desired values. One interpretation of
 your question is that the function you are using is broken by design, and
 you should re-write it.

 ---
 Jeff NewmillerThe .   .  Go Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live
 Go...
   Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
 ---
 Sent from my phone. Please excuse my brevity.

 Vahe nr vne...@gmail.com wrote:

 Dear all,
 
 Can we simulate user input in R ?
 for example if we have a function which needs an input from the user to
 continue its work, can we automate this step (simulate the input...)
 
 Here is the sample:
 choose between one of the grouping factor available :
 c(Village, Country)
 we need to enter Village or Country to continue.can I automatically
 pass one of the values without user input.
 
 Regards,
  Vahe
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] News package

2013-04-24 Thread Aswathy Nair

Hi,

Is there any package available in R to download news content?


Thanks in advance.
Ashy

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] pglm package: fitted values and residuals

2013-04-24 Thread alfonso . carfora


I'm using the package pglm and I'have estimated a random probit model.
I need to save in a vector the fitted values and the residuals of the  
model but I can not do it.


I tried with the command fitted.values using the following procedure  
without results:


library(pglm)

m1_S-pglm(Feed ~ Cons_PC_1 + imp_gen_1 + LGDP_PC_1 + lnEI_1 +  
SH_Ren_1,data,family=binomial(probit),model=random,method=bfgs,index=c(Year,IDCountry))


m1_S$fitted.values
residuals(m1)


Can someone help me about it?

Thanks


**  
IL MERITO DEGLI STUDENTI VIENE RICONOSCIUTO

Il 5 per mille all'Universita' degli Studi di Napoli Parthenope incrementa le 
borse di studio agli studenti - codice fiscale 80018240632
http://www.uniparthenope.it/index.php/5xmille 


http://www.uniparthenope.it/index.php/it/avvisi-sito-di-ateneo/2943-la-parthenope-premia-il-tuo-voto-di-diploma-ed-il-tuo-imegno-con-i-proventi-del-5-per-mille

Questa informativa e' inserita in automatico dal sistema al fine esclusivo 
della realizzazione dei fini istituzionali dell'ente.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’

2013-04-24 Thread Liviu Andronic

Dear all,
I've bumped into the: Error in loadNamespace(name) : there is no
package called ‘R.utils’ error. I've already read a bit on this (
http://www.cybaea.net/Blogs/Data/A-warning-on-the-R-save-format.html )
but I have a follow-up question.

Given a workspace that automatically loads a package that I don't
really need/want (e.g. ‘R.utils’), how do I identify which object
requires this package to load? I would like to avoid loading ‘R.utils’
every time I open an R session.

Regards,
Liviu


 sessionInfo()
R version 2.15.3 (2013-03-01)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  utils stats
methods   base

other attached packages:
 [1] R.utils_1.23.2R.oo_1.13.0   R.methodsS3_1.4.2
tables_0.7.57 reshape2_1.2.2
 [6] car_2.0-15nnet_7.3-6MASS_7.3-23
Hmisc_3.10-1  survival_2.37-2
[11] foreign_0.8-53

loaded via a namespace (and not attached):
[1] cluster_1.14.3   grid_2.15.3  lattice_0.20-13  plyr_1.8
 rstudio_0.97.312
[6] stringr_0.6.2tools_2.15.3


-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Looking for a better code for my problem.

2013-04-24 Thread Christofer Bogaso

Hello again,

Let say I have following data:

Dat - structure(list(AA = structure(c(3L, 1L, 2L, 1L, 2L, 3L, 3L, 2L,
3L, 1L, 1L, 3L, 3L, 2L, 2L, 3L, 2L, 1L, 1L, 1L), .Label = c(A,
B, C), class = factor), BB = structure(c(2L, 3L, 2L, 2L,
2L, 3L, 2L, 2L, 2L, 1L, 1L, 2L, 3L, 1L, 3L, 2L, 1L, 2L, 2L, 3L
), .Label = c(a, b, c), class = factor), CC = 1:20), .Names = c(AA,
BB, CC), row.names = c(NA, -20L), class = data.frame)

Now I want to select a subset of that 'Dat', for which:
1. First column will contain ALL A
2. First column will contain those B for which BB = b in second column.

Therefore I tries following:

 Only_A - Dat[Dat[, 'AA'] == A, ]
 Only_B - Dat[Dat[, 'AA'] == B, ]
 rbind(Only_A, Only_B[Only_B[, 'BB'] == b, ])
   AA BB CC
2   A  c  2
4   A  b  4
10  A  a 10
11  A  a 11
18  A  b 18
19  A  b 19
20  A  c 20
3   B  b  3
5   B  b  5
8   B  b  8


However I believe there must be some better code to achieve that which
is tidier, i.e. there must be some 1-liner code.

Can somebody suggest any better approach if possible?

Thanks and regards,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Looking for a better code for my problem.

2013-04-24 Thread Rui Barradas


Hello,

You can easily make of the following a one-liner. Note that the order of 
rows is not the same as in your code, so identical() will return FALSE.



idx - Dat[, 'AA'] == A | (Dat[, 'AA'] == B  Dat[, 'BB'] == b)
res2 - Dat[idx, ]


Hope this helps,

Rui Barradas

Em 24-04-2013 11:21, Christofer Bogaso escreveu:

Hello again,

Let say I have following data:

Dat - structure(list(AA = structure(c(3L, 1L, 2L, 1L, 2L, 3L, 3L, 2L,
3L, 1L, 1L, 3L, 3L, 2L, 2L, 3L, 2L, 1L, 1L, 1L), .Label = c(A,
B, C), class = factor), BB = structure(c(2L, 3L, 2L, 2L,
2L, 3L, 2L, 2L, 2L, 1L, 1L, 2L, 3L, 1L, 3L, 2L, 1L, 2L, 2L, 3L
), .Label = c(a, b, c), class = factor), CC = 1:20), .Names = c(AA,
BB, CC), row.names = c(NA, -20L), class = data.frame)

Now I want to select a subset of that 'Dat', for which:
1. First column will contain ALL A
2. First column will contain those B for which BB = b in second column.

Therefore I tries following:


Only_A - Dat[Dat[, 'AA'] == A, ]
Only_B - Dat[Dat[, 'AA'] == B, ]
rbind(Only_A, Only_B[Only_B[, 'BB'] == b, ])

AA BB CC
2   A  c  2
4   A  b  4
10  A  a 10
11  A  a 11
18  A  b 18
19  A  b 19
20  A  c 20
3   B  b  3
5   B  b  5
8   B  b  8


However I believe there must be some better code to achieve that which
is tidier, i.e. there must be some 1-liner code.

Can somebody suggest any better approach if possible?

Thanks and regards,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] extract function extracting only NA values


Hello,

I have five raster files in ASCII format. With four of them I have no 
problem extracting values based on a set of X and Y coordinates. 
Unfortunately with one of the files all I managed to extract is NA 
values. To verify the problem I have opened the raster with ArcMap and 
there are no NA values where I am extracting. I have also plotted the 
spatial point class on top of the raster in R and it does correspond to 
the correct locations.


These are some of the commands I am using, and as I already pointed out 
that works perfectly with other raster files.


sp-SpatialPoints(xysp)
xy$rasterimg-extract(rasterimg,sp)

Can anyone help? At this point I am rather clueless about this.

Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Regression on stratified count data

2013-04-24 Thread Achim Zeileis


On Wed, 24 Apr 2013, meng wrote:


Hi all:
For stratified count data,how to perform regression analysis?

My data:
age case oc count
1   1 121
1   1 226
1   2 117
1   2 259
2   1 118
2   1 288
2   2 1 7
2   2 2   95

age:
1:40y
2:40y

case:
1:patient
2:health

oc:
1:use drug
2:not use drug

My purpose:
Anaysis whether case and oc are correlated, and age is a stratified variable.

My solution:
1,Mantel-Haenszel test by using function mantelhaen.test
2,loglinear regression by using function glm(count~case*oc,family=poisson).But I don't 
know how to handle variable age,which is the stratified variable.


Instead of using glm(family = poisson) it is also convenient to use 
loglm() from package MASS for the associated convenience table.


The code below shows how to set up the contingency table, visualize it 
using package vcd, and then fit two models using loglm. The models 
considered are conditional independence of case and drug given age and the 
no three-way interaction already suggested by Peter. Both models are 
also accompanied by visualizations of the residuals.


## contingency table with nice labels
tab - expand.grid(drug = 1:2, case = 1:2, age = 1:2)
tab$count - c(21, 26, 17, 59, 18, 88, 7, 95)
tab$age - factor(tab$age, levels = 1:2, labels = c(40, 40))
tab$case - factor(tab$case, levels = 1:2, labels = c(patient, 
healthy))

tab$drug - factor(tab$drug, levels = 1:2, labels = c(yes, no))
tab - xtabs(count ~ age + drug + case, data = tab)

## visualize case explained by drug given age
library(vcd)
mosaic(case ~ drug | age, data = tab,
  split_vertical = c(TRUE, TRUE, FALSE))

## test wheter drug and case are independent given age
m1 - loglm(~ age/(drug + case), data = tab)
m1

## visualize corresponding residuals from independence model
mosaic(case ~ drug | age, data = tab,
  split_vertical = c(TRUE, TRUE, FALSE),
  residuals_type = deviance,
  gp = shading_hcl, gp_args = list(interpolate = 1.2))
mosaic(case ~ drug | age, data = tab,
  split_vertical = c(TRUE, TRUE, FALSE),
  residuals_type = pearson,
  gp = shading_hcl, gp_args = list(interpolate = 1.2))

## test whether there is no three-way interaction
## (i.e., dependence of case on drug is the same for both age groups)
m2 - loglm(~ (age + drug + case)^2, data = tab)
m2

## visualization (with default pearson residuals)
mosaic(case ~ drug | age, data = tab,
  expected = ~ (age + drug + case)^2,
  split_vertical = c(TRUE, TRUE, FALSE),
  gp = shading_hcl, gp_args = list(interpolate = 1.2))



Many thanks for your help.

My best.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Looking for a better code for my problem.

2013-04-24 Thread Jorge I Velez

Try

subset(Dat, AA == A | (AA == B  BB == b))

HTH,
Jorge.-


On Wed, Apr 24, 2013 at 8:21 PM, Christofer Bogaso 
bogaso.christo...@gmail.com wrote:

 Hello again,

 Let say I have following data:

 Dat - structure(list(AA = structure(c(3L, 1L, 2L, 1L, 2L, 3L, 3L, 2L,
 3L, 1L, 1L, 3L, 3L, 2L, 2L, 3L, 2L, 1L, 1L, 1L), .Label = c(A,
 B, C), class = factor), BB = structure(c(2L, 3L, 2L, 2L,
 2L, 3L, 2L, 2L, 2L, 1L, 1L, 2L, 3L, 1L, 3L, 2L, 1L, 2L, 2L, 3L
 ), .Label = c(a, b, c), class = factor), CC = 1:20), .Names =
 c(AA,
 BB, CC), row.names = c(NA, -20L), class = data.frame)

 Now I want to select a subset of that 'Dat', for which:
 1. First column will contain ALL A
 2. First column will contain those B for which BB = b in second column.

 Therefore I tries following:

  Only_A - Dat[Dat[, 'AA'] == A, ]
  Only_B - Dat[Dat[, 'AA'] == B, ]
  rbind(Only_A, Only_B[Only_B[, 'BB'] == b, ])
AA BB CC
 2   A  c  2
 4   A  b  4
 10  A  a 10
 11  A  a 11
 18  A  b 18
 19  A  b 19
 20  A  c 20
 3   B  b  3
 5   B  b  5
 8   B  b  8


 However I believe there must be some better code to achieve that which
 is tidier, i.e. there must be some 1-liner code.

 Can somebody suggest any better approach if possible?

 Thanks and regards,

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extract function extracting only NA values

So I think I might have found what is causing this problem. The values I 
have in this raster


 summary(ps0011yme)
layer
Min.-9458.911
1st Qu.   1955256.000
Median   10618870.000
3rd Qu.  79577730.000
Max.167780500.000
NA's0.000

From ArcMap though I get different values
min -69826224 and max 167780496

So maybe when I am importing the rest are in R something goes wrong and 
all the values below a certain threshold are considered NA. Is this a 
bug or do I get to use a specific parameter for the raster function?



On 4/24/2013 13:10, Fabio Berzaghi wrote:

Hello,

I have five raster files in ASCII format. With four of them I have no 
problem extracting values based on a set of X and Y coordinates. 
Unfortunately with one of the files all I managed to extract is NA 
values. To verify the problem I have opened the raster with ArcMap and 
there are no NA values where I am extracting. I have also plotted the 
spatial point class on top of the raster in R and it does correspond 
to the correct locations.


These are some of the commands I am using, and as I already pointed 
out that works perfectly with other raster files.


sp-SpatialPoints(xysp)
xy$rasterimg-extract(rasterimg,sp)

Can anyone help? At this point I am rather clueless about this.

Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Bootstrapping a 11x10 matrix

2013-04-24 Thread Adams, Jean

Herbejie Rose,

You could use the boot() function in the R package boot.  For example:

# example data matrix
m - matrix(sample(11*10), ncol=10)

# function to calculate column means for indexed rows of matrix
myfun - function(data, i) {
apply(data[i, ], 2, mean)
}

# 1000 bootstrap samples
library(boot)
myboot - boot(m, myfun, R=1000)
myboot
dim(myboot$t)

Jean



On Tue, Apr 23, 2013 at 10:41 AM, Herbejie Rose Cuizon 
herbejier...@gmail.com wrote:

 Good evening! This is Herbejie Rose

 I need your help for me to compute the following:

 I just want to ask on how to bootstrap a 11x10 matrix  to obtain 1000
 bootstrap samples and compute the 10 dimensional mean per bootstrap sample.
 the 11x10 dimension of the matrix has 11 subjects and 10 variables. Just
 want to have 11x10 per bootrap sample with replacement and compute the 10
 dimensional mean.


 Hope to have a positive response on this matter..


 Thank you so much.



 Sincerely yours,


 Herbejie Rose

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] News package

2013-04-24 Thread R. Michael Weylandt

On Wed, Apr 24, 2013 at 9:02 AM, Aswathy Nair ashy4...@gmail.com wrote:
 Hi,

 Is there any package available in R to download news content?

What news source are you looking for?

You could, e.g., use the twitteR package, but to my knowledge for
things like Google News or the BBC you'll need to likely roll your own
with the XML and RCurl packages.

MW

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’


On 13-04-24 5:46 AM, Liviu Andronic wrote:

Dear all,
I've bumped into the: Error in loadNamespace(name) : there is no
package called ‘R.utils’ error. I've already read a bit on this (
http://www.cybaea.net/Blogs/Data/A-warning-on-the-R-save-format.html )
but I have a follow-up question.

Given a workspace that automatically loads a package that I don't
really need/want (e.g. ‘R.utils’), how do I identify which object
requires this package to load? I would like to avoid loading ‘R.utils’
every time I open an R session.


That's not easy, because the code in R that triggers that error has no 
idea of the name of the object it is loading.


You could try a binary search to find out, but it will be tedious:
1. Install R.utils.
2. Load the workspace successfully.
3. Delete half the objects, and save it.
4. Uninstall R.utils, and see if you can load the workspace.

At this point you'll know if there's an object needing R.utils still 
left or not, and you can repeat the steps until you find a single object 
that causes the problem.  (But it might not be the only one, so deleting 
it from the original workspace might not solve your problem.)


A better approach is to *never* save and load workspaces unless you know 
exactly what is in them.  Always reply no to the question about saving 
your workspace (or set that as the default).  If you accidentally end up 
with a workspace being loaded, delete it.


Duncan Murdoch



Regards,
Liviu



sessionInfo()

R version 2.15.3 (2013-03-01)
Platform: i686-pc-linux-gnu (32-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8
  [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  utils stats
methods   base

other attached packages:
  [1] R.utils_1.23.2R.oo_1.13.0   R.methodsS3_1.4.2
tables_0.7.57 reshape2_1.2.2
  [6] car_2.0-15nnet_7.3-6MASS_7.3-23
Hmisc_3.10-1  survival_2.37-2
[11] foreign_0.8-53

loaded via a namespace (and not attached):
[1] cluster_1.14.3   grid_2.15.3  lattice_0.20-13  plyr_1.8
  rstudio_0.97.312
[6] stringr_0.6.2tools_2.15.3




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’

2013-04-24 Thread Martin Morgan


On 04/24/2013 06:03 AM, Duncan Murdoch wrote:

On 13-04-24 5:46 AM, Liviu Andronic wrote:

Dear all,
I've bumped into the: Error in loadNamespace(name) : there is no
package called ‘R.utils’ error. I've already read a bit on this (
http://www.cybaea.net/Blogs/Data/A-warning-on-the-R-save-format.html )
but I have a follow-up question.

Given a workspace that automatically loads a package that I don't
really need/want (e.g. ‘R.utils’), how do I identify which object
requires this package to load? I would like to avoid loading ‘R.utils’
every time I open an R session.


That's not easy, because the code in R that triggers that error has no idea of
the name of the object it is loading.


Maybe traceback() can provide some hints? I did, more or less arbitrarily

library(rms)
a = list(fun=ie.setup)
save(a, file=/tmp/a.rda)
remove.packages(rms)

and then in a new session

 load(/tmp/a.rda)
Error in loadNamespace(name) : there is no package called 'rms'
 traceback()
7: stop(e)
6: value[[3L]](cond)
5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
4: tryCatchList(expr, classes, parentenv, handlers)
3: tryCatch(loadNamespace(name), error = function(e) stop(e))
2: getNamespace(c(rms, 3.6-3))
1: load(/tmp/a.rda)

with the line numbered 2 giving me the necessary hint.

Martin




You could try a binary search to find out, but it will be tedious:
1. Install R.utils.
2. Load the workspace successfully.
3. Delete half the objects, and save it.
4. Uninstall R.utils, and see if you can load the workspace.

At this point you'll know if there's an object needing R.utils still left or
not, and you can repeat the steps until you find a single object that causes the
problem.  (But it might not be the only one, so deleting it from the original
workspace might not solve your problem.)

A better approach is to *never* save and load workspaces unless you know exactly
what is in them.  Always reply no to the question about saving your workspace
(or set that as the default).  If you accidentally end up with a workspace being
loaded, delete it.

Duncan Murdoch



Regards,
Liviu



sessionInfo()

R version 2.15.3 (2013-03-01)
Platform: i686-pc-linux-gnu (32-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8
  [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  utils stats
methods   base

other attached packages:
  [1] R.utils_1.23.2R.oo_1.13.0   R.methodsS3_1.4.2
tables_0.7.57 reshape2_1.2.2
  [6] car_2.0-15nnet_7.3-6MASS_7.3-23
Hmisc_3.10-1  survival_2.37-2
[11] foreign_0.8-53

loaded via a namespace (and not attached):
[1] cluster_1.14.3   grid_2.15.3  lattice_0.20-13  plyr_1.8
  rstudio_0.97.312
[6] stringr_0.6.2tools_2.15.3




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] about the loglm and glm---Re:Re: Regression on stratified count data

2013-04-24 Thread Achim Zeileis


On Wed, 24 Apr 2013, meng wrote:


Hi,Achim:
Can all the analysis you mentioned via loglm be performed via
glm(...,family=poisson)?


Yes.

## transform table back to data.frame
df - as.data.frame(tab)

## fit models: conditional independence, no-three way interaction,
## and saturated
g1 - glm(Freq ~ age/(drug + case), data = df, family = poisson)
g2 - glm(Freq ~ (age + drug + case)^2, data = df, family = poisson)
g3 - glm(Freq ~ age * drug * case, data = df, family = poisson)

## likelihood-ratio tests (against saturated)
anova(g1, g3, test = Chisq)
anova(g2, g3, test = Chisq)

## compare fitted frequencies (which are essentially equal)
all.equal(as.numeric(fitted(g1)),
  as.data.frame(as.table(fitted(m1)))$Freq)
all.equal(as.numeric(fitted(g2)),
  as.data.frame(as.table(fitted(m2)))$Freq)

The difference is mainly that loglm() has a specialized user interface and 
that it uses a different optimizer (iterative proportional fitting rather 
than iterative reweighted least squares).


Best,
Z


Many thanks.







At 2013-04-24 19:37:10,Achim Zeileis achim.zeil...@uibk.ac.at wrote:
On Wed, 24 Apr 2013, meng wrote:

 Hi all:
 For stratified count data,how to perform regression analysis?

 My data:
 age case oc count
 1   1 121
 1   1 226
 1   2 117
 1   2 259
 2   1 118
 2   1 288
 2   2 1 7
 2   2 2   95

 age:
 1:40y
 2:40y

 case:
 1:patient
 2:health

 oc:
 1:use drug
 2:not use drug

 My purpose:
 Anaysis whether case and oc are correlated, and age is a stratified varia
ble.

 My solution:
 1,Mantel-Haenszel test by using function mantelhaen.test
 2,loglinear regression by using function glm(count~case*oc,family=poisson
).But I don't know how to handle variable age,which is the stratified vari
able.

Instead of using glm(family = poisson) it is also convenient to use 
loglm() from package MASS for the associated convenience table.

The code below shows how to set up the contingency table, visualize it 
using package vcd, and then fit two models using loglm. The models 
considered are conditional independence of case and drug given age and the 
no three-way interaction already suggested by Peter. Both models are 
also accompanied by visualizations of the residuals.

## contingency table with nice labels
tab - expand.grid(drug = 1:2, case = 1:2, age = 1:2)
tab$count - c(21, 26, 17, 59, 18, 88, 7, 95)
tab$age - factor(tab$age, levels = 1:2, labels = c(40, 40))
tab$case - factor(tab$case, levels = 1:2, labels = c(patient, 
healthy))
tab$drug - factor(tab$drug, levels = 1:2, labels = c(yes, no))
tab - xtabs(count ~ age + drug + case, data = tab)

## visualize case explained by drug given age
library(vcd)
mosaic(case ~ drug | age, data = tab,
   split_vertical = c(TRUE, TRUE, FALSE))

## test wheter drug and case are independent given age
m1 - loglm(~ age/(drug + case), data = tab)
m1

## visualize corresponding residuals from independence model
mosaic(case ~ drug | age, data = tab,
   split_vertical = c(TRUE, TRUE, FALSE),
   residuals_type = deviance,
   gp = shading_hcl, gp_args = list(interpolate = 1.2))
mosaic(case ~ drug | age, data = tab,
   split_vertical = c(TRUE, TRUE, FALSE),
   residuals_type = pearson,
   gp = shading_hcl, gp_args = list(interpolate = 1.2))

## test whether there is no three-way interaction
## (i.e., dependence of case on drug is the same for both age groups)
m2 - loglm(~ (age + drug + case)^2, data = tab)
m2

## visualization (with default pearson residuals)
mosaic(case ~ drug | age, data = tab,
   expected = ~ (age + drug + case)^2,
   split_vertical = c(TRUE, TRUE, FALSE),
   gp = shading_hcl, gp_args = list(interpolate = 1.2))


 Many thanks for your help.

 My best.
    [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.h
tml
 and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extract function extracting only NA values


ok so i imported the raster as a dataframe and everything is fine
   Min.   :-69826220
   Max.   :167780500

so it's something with values lower than - that are interpreted as 
NA by the raster function



On 4/24/2013 13:52, Fabio Berzaghi wrote:
So I think I might have found what is causing this problem. The values 
I have in this raster


 summary(ps0011yme)
layer
Min.-9458.911
1st Qu.   1955256.000
Median   10618870.000
3rd Qu.  79577730.000
Max.167780500.000
NA's0.000

From ArcMap though I get different values
min -69826224 and max 167780496

So maybe when I am importing the rest are in R something goes wrong 
and all the values below a certain threshold are considered NA. Is 
this a bug or do I get to use a specific parameter for the raster 
function?



On 4/24/2013 13:10, Fabio Berzaghi wrote:

Hello,

I have five raster files in ASCII format. With four of them I have no 
problem extracting values based on a set of X and Y coordinates. 
Unfortunately with one of the files all I managed to extract is NA 
values. To verify the problem I have opened the raster with ArcMap 
and there are no NA values where I am extracting. I have also plotted 
the spatial point class on top of the raster in R and it does 
correspond to the correct locations.


These are some of the commands I am using, and as I already pointed 
out that works perfectly with other raster files.


sp-SpatialPoints(xysp)
xy$rasterimg-extract(rasterimg,sp)

Can anyone help? At this point I am rather clueless about this.

Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to make a raster image in R from my own data set

2013-04-24 Thread Kristi Glover

Hi R-user,
I was trying to make a raster map with WGS84 projection in R, but I could not 
make it. I found one data set in Google that data is almost the same format as 
of mine. I wanted to make a raster map of temperature with 1 degree spatial 
resolution for the global scale. 
I could make it in GIS software but I do have many variables (to be many raster 
images) and ultimately I am importing them to R for further analysis. 
Therefore, I wanted to make them in R, if possible. 

It would be great if you give some hints on how script look like  in creating a 
raster map from my own data set (I have provided link for your references, this 
is an example data set).

I am really appropriating for your help.

#--
#create a raster map from scratch

install.packages(raster, dependencies=TRUE)
library(raster)  # raster data
install.packages(rgdal, dependencies=TRUE)
library(rgdal)  # input/output, projections
install.packages(rgeos, dependencies=TRUE)
library(rgeos)  # geometry ops
install.packages(spdep, dependencies=TRUE)
library(spdep)  # spatial dependence
install.packages(pastecs, dependencies=TRUE)
library(pastecs)
pts-read.table.url(https://www.betydb.org//miscanthusyield.csv;, header=T, 
sep=,)
proj4string(pts)=- CRS(+proj=longlat +datum=WGS84)
#---

Cheers,
Kristi
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’


On 13-04-24 10:12 AM, Martin Morgan wrote:

On 04/24/2013 06:03 AM, Duncan Murdoch wrote:

On 13-04-24 5:46 AM, Liviu Andronic wrote:

Dear all,
I've bumped into the: Error in loadNamespace(name) : there is no
package called ‘R.utils’ error. I've already read a bit on this (
http://www.cybaea.net/Blogs/Data/A-warning-on-the-R-save-format.html )
but I have a follow-up question.

Given a workspace that automatically loads a package that I don't
really need/want (e.g. ‘R.utils’), how do I identify which object
requires this package to load? I would like to avoid loading ‘R.utils’
every time I open an R session.


That's not easy, because the code in R that triggers that error has no idea of
the name of the object it is loading.


Maybe traceback() can provide some hints? I did, more or less arbitrarily

library(rms)
a = list(fun=ie.setup)
save(a, file=/tmp/a.rda)
remove.packages(rms)

and then in a new session

   load(/tmp/a.rda)
Error in loadNamespace(name) : there is no package called 'rms'
   traceback()
7: stop(e)
6: value[[3L]](cond)
5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
4: tryCatchList(expr, classes, parentenv, handlers)
3: tryCatch(loadNamespace(name), error = function(e) stop(e))
2: getNamespace(c(rms, 3.6-3))
1: load(/tmp/a.rda)

with the line numbered 2 giving me the necessary hint.


That tells you that some object needs the rms package, but I don't see 
how you would know it is a that is the problem.  We already knew that 
rms was needed from the error message.


What I've done sometimes in debugging is to change that error to a 
warning in the getNamespace() function, and add some tracing code to the 
serialization code to print the names of objects as they are loaded. 
(This goes in ReadItem in src/main/serialize.c.)


I wouldn't expect Liviu to make those changes, but perhaps a verbose 
option could be added to load(), so that it could be available to users.


Duncan Murdoch



Martin




You could try a binary search to find out, but it will be tedious:
1. Install R.utils.
2. Load the workspace successfully.
3. Delete half the objects, and save it.
4. Uninstall R.utils, and see if you can load the workspace.

At this point you'll know if there's an object needing R.utils still left or
not, and you can repeat the steps until you find a single object that causes the
problem.  (But it might not be the only one, so deleting it from the original
workspace might not solve your problem.)

A better approach is to *never* save and load workspaces unless you know exactly
what is in them.  Always reply no to the question about saving your workspace
(or set that as the default).  If you accidentally end up with a workspace being
loaded, delete it.

Duncan Murdoch



Regards,
Liviu



sessionInfo()

R version 2.15.3 (2013-03-01)
Platform: i686-pc-linux-gnu (32-bit)

locale:
   [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8
   [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
   [7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  utils stats
methods   base

other attached packages:
   [1] R.utils_1.23.2R.oo_1.13.0   R.methodsS3_1.4.2
tables_0.7.57 reshape2_1.2.2
   [6] car_2.0-15nnet_7.3-6MASS_7.3-23
Hmisc_3.10-1  survival_2.37-2
[11] foreign_0.8-53

loaded via a namespace (and not attached):
[1] cluster_1.14.3   grid_2.15.3  lattice_0.20-13  plyr_1.8
   rstudio_0.97.312
[6] stringr_0.6.2tools_2.15.3




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’

2013-04-24 Thread Liviu Andronic

Dear Duncan,


On Wed, Apr 24, 2013 at 4:57 PM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
 What I've done sometimes in debugging is to change that error to a warning
 in the getNamespace() function, and add some tracing code to the
 serialization code to print the names of objects as they are loaded. (This
 goes in ReadItem in src/main/serialize.c.)

 I wouldn't expect Liviu to make those changes, but perhaps a verbose
 option could be added to load(), so that it could be available to users.

That would be useful, indeed. As it stands save()/load() workspaces in
R seems very fragile and could easily trip users when working on
multiple machines or sharing their workspace with a colleague. In such
cases it is important to be able to quickly pinpoint the offending
object.

Regards,
Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’

2013-04-24 Thread Liviu Andronic

Dear Duncan,


On Wed, Apr 24, 2013 at 3:03 PM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
 A better approach is to *never* save and load workspaces unless you know
 exactly what is in them.  Always reply no to the question about saving
 your workspace (or set that as the default).  If you accidentally end up
 with a workspace being loaded, delete it.

I must admit that I'm a bit surprised by this. I was always under the
impression that saving/restoring workspaces was the proper workflow in
R. If you use R interactively (e.g., not by running scripts), how else
would you store your data, intermediary results, etc., while working
on a project? Am I missing something?

Regards,
Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’

2013-04-24 Thread R. Michael Weylandt

On Wed, Apr 24, 2013 at 4:08 PM, Liviu Andronic landronim...@gmail.com wrote:
 Dear Duncan,


 On Wed, Apr 24, 2013 at 3:03 PM, Duncan Murdoch
 murdoch.dun...@gmail.com wrote:
 A better approach is to *never* save and load workspaces unless you know
 exactly what is in them.  Always reply no to the question about saving
 your workspace (or set that as the default).  If you accidentally end up
 with a workspace being loaded, delete it.

 I must admit that I'm a bit surprised by this. I was always under the
 impression that saving/restoring workspaces was the proper workflow in
 R. If you use R interactively (e.g., not by running scripts), how else
 would you store your data, intermediary results, etc., while working
 on a project? Am I missing something?

I save the objects themselves (rather than the whole workspace) using
saveRDS() and readRDS() -- which I *think* are considered better
practice than save() and load() because they don't force names on you.

Cheers,
MW

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problem with raster function with values lower than -9999

As I have reported in a previous thread, I have been having problems 
importing a ASCII raster that has values that go from Min. :-69826220 
to  Max.   :167780500. The problem I am encountering is that when I use 
the raster function to import the ASCII file then every value smaller 
than - is reported as NA and the minimum value is -9458.


Is this a bug of the function and is there a workaround? When I import 
the same ASCII file as a data frame everything is fine and I get the 
whole range of values.


Any help is appreciated

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’

2013-04-24 Thread Hadley Wickham

 I must admit that I'm a bit surprised by this. I was always under the
 impression that saving/restoring workspaces was the proper workflow in
 R. If you use R interactively (e.g., not by running scripts), how else
 would you store your data, intermediary results, etc., while working
 on a project? Am I missing something?

You don't _store_ intermediate results - you recreate them from your
script.  If they are time consuming, then you can use readRDS/saveRDS
to cache the results. If you don't start with a clean workspace
frequently, it's difficult to tell whether or not your code is
reproducible.

Hadley

--
Chief Scientist, RStudio
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] the joy of spreadsheets (off-topic)

2013-04-24 Thread peter dalgaard

In case you haven't noticed, this is making the rounds in the media, including
a handful of references to R. See e.g.

http://news.slashdot.org/story/13/04/17/0215211/excel-error-contributes-to-problems-with-austerity-study

I suppose we can't fortune()'ify anonymous quotes, but I kind of like this
exchange:

Bacon Bits: SPSS and R are very good at statistical analysis. Quantrix,
MapleSoft, IBM Algorithmics, and other software is for financial data modeling.
None of those is particularly appropriate for sharing data in a useful format
with peers. Excel is.

Hatta: R is extremely appropriate for sharing data in a useful format with
peers. It's completely free for one. But more importantly, it saves every
single step of your analysis. Send someone an Excel file, and who knows what
they've done to the data. Send someone your R project directory and they can
see exactly what you did.

The problem with sending R files to your peers isn't that the R files aren't
useful. It's that your peers aren't.

On Apr 16, 2013, at 19:25 , Sarah Goslee wrote:

Given that we occasionally run into problems with comparing Excel
results to R results, and other spreadsheet-induced errors, I thought
this might be of interest.

http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems

The punchline:

If this error turns out to be an actual mistake Reinhart-Rogoff made,
well, all I can hope is that future historians note that one of the
core empirical points providing the intellectual foundation for the
global move to austerity in the early 2010s was based on someone
accidentally not updating a row formula in Excel.

Ouch.

(Note: I know nothing about the site, the author of the article, or
the study in question. I was pointed to it by someone else. But if
true: highly problematic.)

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com

Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’

2013-04-24 Thread John Kane

 -Original Message-
 From: h.wick...@gmail.com
 Sent: Wed, 24 Apr 2013 10:48:01 -0500
 To: landronim...@gmail.com
 Subject: Re: [R] identify object that causes Error in
 loadNamespace(name) : there is no package called ‘R.utils’

 I must admit that I'm a bit surprised by this. I was always under the
 impression that saving/restoring workspaces was the proper workflow in
 R. If you use R interactively (e.g., not by running scripts), how else
 would you store your data, intermediary results, etc., while working
 on a project? Am I missing something?

 You don't _store_ intermediate results - you recreate them from your
 script.  If they are time consuming, then you can use readRDS/saveRDS
 to cache the results. If you don't start with a clean workspace
 frequently, it's difficult to tell whether or not your code is
 reproducible.

 Hadley

 --
 Chief Scientist, RStudio
 http://had.co.nz/

Not to mention http://www.mail-archive.com/r-help@r-project.org/msg196997.html

John Kane
Kingston ON Canada

GET FREE SMILEYS FOR YOUR IM  EMAIL - Learn more at 
http://www.inbox.com/smileys
Works with AIM®, MSN® Messenger, Yahoo!® Messenger, ICQ®, Google Talk™ and most 
webmails

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Need to replicate Boltzman Signmodial Curve fit from Graph Pad

2013-04-24 Thread David Carlson

Sorry. I left out a line:

 pHvals - seq(min(dta$pH), max(dta$pH), length.out=100)

Should be inserted just before the lines() function.

-
David L Carlson
Associate Professor of Anthropology
Texas AM University
College Station, TX 77840-4352

From: J J [mailto:rnoob...@gmail.com] 
Sent: Tuesday, April 23, 2013 6:49 PM
To: David Carlson
Subject: Re: [R] Need to replicate Boltzman Signmodial Curve fit from Graph
Pad

Thank you David! That looks really close, I believe Graphpad gave a V50 of
 7.244.

One thing I missed on your example was where you obtained 'pHvals' from?


On Tue, Apr 23, 2013 at 5:17 PM, David Carlson dcarl...@tamu.edu wrote:
You may want to compare R's four point logistic with the results you get
using Graphpad. For these data the Bottom parameter is not significantly
different from zero (so the three point sigmoid might be adequate), but as
Bert points out, especially with only 8 data points, it is probably
overfitting the model.

 dta - read.table(text=pH counts
+ 3.8 968
+ 5.0 1347
+ 5.8 2867
+ 6.6 9203
+ 7.0 15817
+ 7.4 20297
+ 8.2 31916
+ 9.2 35756,
+ header=TRUE)
 Sig.nls - nls(counts~SSfpl(pH, Bottom, Top, V50, Slope), dta)
 summary(Sig.nls)

Formula: counts ~ SSfpl(pH, Bottom, Top, V50, Slope)

Parameters:
          Estimate  Std. Error t value     Pr(|t|)
Bottom   718.76871   579.25327   1.241     0.282463
Top    36983.98871  1008.31727  36.679 0.032986 ***
V50        7.24930     0.05054 143.436 0.000142 ***
Slope      0.55582     0.05023  11.065     0.000379 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 767.1 on 4 degrees of freedom

Number of iterations to convergence: 0
Achieved convergence tolerance: 0.03808
 options(scipen=5)
 confint(Sig.nls)
Waiting for profiling to be done...
                2.5%         97.5%
Bottom  -963.4106068  2243.7154750
Top    34499.9962176 40150.4114887
V50        7.1173103     7.4033391
Slope      0.4382893     0.7114438
 coef(Sig.nls)
       Bottom           Top           V50         Slope
  718.7687104 36983.9887074     7.2493032     0.5558236
 plot(dta, pch=16)
 lines(pHvals, predict(Sig.nls, data.frame(pH=pHvals)), col=red)

-
David L Carlson
Associate Professor of Anthropology
Texas AM University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Bert Gunter
Sent: Tuesday, April 23, 2013 12:04 PM
To: John Kane
Cc: r-help@r-project.org; J J
Subject: Re: [R] Need to replicate Boltzman Signmodial Curve fit from Graph
Pad

I think you'll find that this is just an alternate parameterization of a 4 P
logistic.
... and in any case there is no statistically detectable difference between
the two.

That is, this is just basically a Graphpad marketing ploy. Fit such data
with anyone's logistic function ... and that is probably overfitting in many
cases, anyway.

Cheers,
Bert

On Tue, Apr 23, 2013 at 9:54 AM, John Kane jrkrid...@inbox.com wrote:
 No idea of the area but does this link help?
 http://bioinformatics.oxfordjournals.org/content/24/13/1549.full

 John Kane
 Kingston ON Canada


 -Original Message-
 From: rnoob...@gmail.com
 Sent: Tue, 23 Apr 2013 10:55:59 -0400
 To: r-help@r-project.org
 Subject: [R] Need to replicate Boltzman Signmodial Curve fit from
 Graph Pad

 Hello useRs (please don't kill me),

 I've fairly new to R having only a few months of playing around with R.
 What little I've learned has been extremely useful.

 If someone could point me as to how to replicate the Boltzman
 Sigmodial curve fit as provided by Graphpad software I'd be eternally
grateful.

 Where  we currently use Graphpad for only this one function,its seems
 highly inefficient for a $100 piece of software to be used for only
 one function (which isn't nearly as bad as the company's pandemic
 reliance on Excel for nearly everything else).

 so anyway, what I'd normally do is take a set of data like this:

  pH counts  3.8 968  5.0 1347  5.8 2867  6.6 9203  7.0 15817  7.4
 20297
 8.2
 31916  9.2 35756
 then fit this to a Boltzman sigmodial in Graphpad. Graphpad spits out
 a much longer set of vectors and also information about v50 and
 confidence intervals etc (if anyone is familiar with that software).
 This is what i'd like to replicate.

 It might be that I just don't know what i'm doing,so feel free to
 call me an idiot but any help is greatly appreciated!

 -JJ

       [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop!

Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’


On 13-04-24 11:08 AM, Liviu Andronic wrote:

Dear Duncan,


On Wed, Apr 24, 2013 at 3:03 PM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:

A better approach is to *never* save and load workspaces unless you know
exactly what is in them.  Always reply no to the question about saving
your workspace (or set that as the default).  If you accidentally end up
with a workspace being loaded, delete it.


I must admit that I'm a bit surprised by this. I was always under the
impression that saving/restoring workspaces was the proper workflow in
R. If you use R interactively (e.g., not by running scripts), how else
would you store your data, intermediary results, etc., while working
on a project? Am I missing something?


I think Michael and Hadley have already given good advice.  I'll just 
add one thing:  If you use R interactively, you should still be running 
scripts, just running them within an interactive session.  That way you 
have a record of what you did, and can back up (by starting over and 
running everything up to a known good spot) to make corrections.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Trouble Computing Type III SS in a Cox Regression using drop1 and Anova

2013-04-24 Thread Paul Miller

Hello All,

Am having some trouble computing Type III SS in a Cox Regression using either 
drop1 or Anova from the car package. Am hoping that people will take a look to 
see if they can tell what's going on.

Here is my R code:

cox3grp - subset(survData,
Treatment %in% c(DC, DA, DO),
c(PTNO, Treatment, PFS_CENSORED, PFS_MONTHS, AGE, PS2))
cox3grp - droplevels(cox3grp)
str(cox3grp)

coxCV - coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data=cox3grp, 
method = efron)
coxCV

drop1(coxCV, test=Chisq)

require(car)
Anova(coxCV, type=III) 

And here are my results:

cox3grp - subset(survData,
+   Treatment %in% c(DC, DA, DO),
+   c(PTNO, Treatment, PFS_CENSORED, PFS_MONTHS, AGE, 
PS2))
 cox3grp - droplevels(cox3grp)
 str(cox3grp)
'data.frame':   227 obs. of  6 variables:
 $ PTNO: int  1195997 104625 106646 1277507 220506 525343 789119 817160 
824224 82632 ...
 $ Treatment   : Factor w/ 3 levels DC,DA,DO: 1 1 1 1 1 1 1 1 1 1 ...
 $ PFS_CENSORED: int  1 1 1 0 1 1 1 1 0 1 ...
 $ PFS_MONTHS  : num  1.12 8.16 6.08 1.35 9.54 ...
 $ AGE : num  72 71 80 65 72 60 63 61 71 70 ...
 $ PS2 : Ord.factor w/ 2 levels YesNo: 2 2 2 2 2 2 2 2 2 2 ...
 
 coxCV - coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data=cox3grp, 
 method = efron)
 coxCV
Call:
coxph(formula = Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, 
data = cox3grp, method = efron)


  coef exp(coef) se(coef)  z p
AGE0.00492 1.005  0.00789  0.624 0.530
PS2.L -0.34523 0.708  0.14315 -2.412 0.016

Likelihood ratio test=5.66  on 2 df, p=0.0591  n= 227, number of events= 198 
 
 drop1(coxCV, test=Chisq)
Single term deletions

Model:
Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2
   DfAICLRT Pr(Chi)  
none1755.2  
AGE 1 1753.6 0.3915  0.53151  
PS2 1 1758.4 5.2364  0.02212 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
 
 require(car)
 Anova(coxCV,  type=III)
Analysis of Deviance Table (Type III tests)
LR Chisq Df Pr(Chisq)  
AGE   0.3915  10.53151  
PS2   5.2364  10.02212 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
 

Both drop1 and Anova give me a different p-value than I get from coxph for both 
my two-level ps2 variable and for age. This is not what I would expect based on 
experience using SAS to conduct similar analyses. Indeed SAS consistently 
produces the same p-values. Namely the ones I get from coxph. 

My sense is that I'm probably misusing R in some way but I'm not sure what I'm 
likely to be doing wrong. SAS prodcues Wald Chi-Square results for its type III 
tests. Maybe that has something to do with it. Ideally, I'd like to get type 
III values that match those from coxph. If anyone could help me understand 
better, that would be greatly appreciated.

Thanks,

Paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

Dear users of R
I have a subroutine in Fortran95, compiled to a DLL with gfortran in Cygwin
4.5.3.
The subroutine is:
subroutine MyPBP( S, p, N )
! Expose subroutine rtest to users of this DLL
!DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: mypbp_ ::mypbp
! This function computes the Poisson-Binomial distribution
! of size N using p
double precision, intent(inout) :: S(N+1)
double precision, intent(in) :: p(N)
integer, intent(in) :: N
double precision :: X(N+1)
integer i, j
!X=0
!S=0
X(1) = 1 - p(1)
X(2) = p(1)
do i = 2, N
S(1) = X(1)*(1-p(i))
do j = 2,i
S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
end do
S(i+1) = X(i)*p(i)
X = S
if (i == N) then
S = X
end if
end do
end subroutine MyPBP
and it is saved into Mango.f95
I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
gfortran-4 -shared -o Mango.dll Mango.o
I am on a Windows machine running Windows 7 with Intel i7.
I load the dll in a 32-bit R by dyn.load(Mango.dll). Using getLoadedDLLs
I can see the DLL. However, is.loaded(Mango.dll) = FALSE. In addition, R
stop responding when I try .Fortran(MyPBP, as.numeric(S), as.numeric(p),
as.integer(N)),
where N-5, S-array(0,N+1) and p- c(0.1, 0.2, 0.5, 0.8, 0.9).

What am I doing wrong?
Any ideas, thoughts and/or comments are highly appreciated.

Jens

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] RDA permutest envfit

2013-04-24 Thread Rémi Lesmerises

Dear all,

I did a RDA and when I looked to the signification of the test with permutest, 
the output was non-significant. But when I used the envfit function, some of 
theÂ vectors are significant. All the test's conditions are respected. What it 
means? Is it an error in the script?


Commands and output:

 permutest(rda.ind, perm=999, first=TRUE)

Permutation test for rdaÂ 

Call: rda(formula = x ~ ARH_frs + CON_frs + PRP_frs + RUI_frs + VAM_frs, data = 
z)
Permutation test for first constrained eigenvalue
Pseudo-F: Â  Â  Â  Â 4.093568 (with 1, 10 Degrees of Freedom)
Significance: Â  Â 0.413Â 
Based on 999 permutations under reduced model.

 fit - envfit(rda.ind, z, perm = 999, display = lc)
 fit

***VECTORS

Â  Â  Â  Â  Â  Â  Â RDA1 Â  Â  Â RDA2 Â  Â  r2 Pr(r) Â  Â 
VAM_frs Â 0.145281 -0.989390 0.2388 Â 0.147 Â  Â 

ARH_frs -0.876494 -0.481413 0.6127 Â 0.002 **Â 
CON_frs Â 0.904278 Â 0.426944 0.4846 Â 0.013 * Â 
PRP_frs -0.997634 Â 0.068755 0.9433 Â 0.001 ***
RUI_frs -0.648512 -0.761204 0.6243 Â 0.004 **Â 
---
Signif. codes: Â 0 â***â 0.001 â**â 0.01 â*â 0.05 â.â 0.1 â 
â 1Â 
P values based on 999 permutations.

Â 
RÃ©mi Lesmerises, biol. M.Sc.,
Candidat Ph.D. en Biologie
UniversitÃ© du QuÃ©bec Ã  Rimouski
remilesmeri...@yahoo.ca

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Sum up column values according to row id

2013-04-24 Thread Matteo Mura

Dear All,

here a problem I think many of you can solve in few minutes.

I have a dataframe which contains values of plot id, diameters, heigths and
basal area of trees, thus columns names are: id | dbh | h | g

head(ipso, n=10)id dbh h  g
1  FPE0164  36 13.62 0.10178760
2  FPE0164  31 12.70 0.07547676
21 FPE1127  57 18.85 0.25517586
13 FPE1127  39 15.54 0.11945906
12 FPE1127  34 14.78 0.09079203
6  FPE1127  32 15.12 0.08042477
5  FPE1127  28 14.13 0.06157522
15 FPE1127  27 13.50 0.05725553
19 FPE1127  25 13.28 0.04908739
11 FPE1127  19 11.54 0.02835287

from here I need to calculate the mean of the six greater g_ith for
each id_ith. The clauses are that:

if length(id) =6

do the mean of the first six greaters g


else
do the mean of all the g_ith in the id_ith (in head print above e.g.
for the id==FPE0164 do the mean of just these two values of g).

The g are already ordered by id ascending and g descending using:

ipso - ipso[with(ipso, order(ipso$id, -ipso$g)), ] # Order for id
ascending and g descending

I tried a lot of for loops and tapply() without results.

Can anyone help me to solve this?

Thanks for your attention

Best whishes

Matteo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R Interactive Mode

2013-04-24 Thread Hrachya Astsatryan

Dear all,

We are doing some research about the time series analysis of NDVI, and 
we found the NDVITS package which is a very great tool.
Unfortunately when we run it, after  TimeSeriesAnalysis it asks to enter 
Village or Country.

library(ndvits, lib.loc=/home/vahe/R/i686-pc-linux-gnu-library/2.15)
ndvidirectory=paste(system.file(extdata/VITO_Mzimba,
 package=ndvits), /, sep=)
region=Mzimba
Ystart=2004
Yend=2006
shape=SLP_Mzimba
shapedir=paste(system.file(extdata/shape, package=ndvits),
/, sep=)
outfile = mzimbaTS2.txt
outfile2 = MzimbaTS2.pdf
outfiel3 = my.pdf
signal = TimeSeriesAnalysis(shape, shapedir, ndvidirectory, region, 
Ystart, Yend, outfile, outfile2)


How it is possible to call the package by default indicating Village 
option /we don't want to enter the parameter and don't want to change 
anything in the code which is quite difficult/.

Thank you in advance,
Hrach
-- 
Hrachya Astsatryan
Head of HPC Laboratory,
Institute for Informatics and Automation Problems,
National Academy of Sciences of the Republic of Armenia
1, P. Sevak str., Yerevan 0014, Armenia
t: 374 10 284780
f: 374 10 285812
e: hr...@sci.am
skype: tighra

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] about the loglm and glm---Re:Re: Regression on stratified count data

2013-04-24 Thread meng

Hi,Achim:


Can all the analysis you mentioned via loglm be performed via 
glm(...,family=poisson)?


Many thanks.










At 2013-04-24 19:37:10,Achim Zeileis achim.zeil...@uibk.ac.at wrote:
On Wed, 24 Apr 2013, meng wrote:

 Hi all:
 For stratified count data,how to perform regression analysis?

 My data:
 age case oc count
 1   1 121
 1   1 226
 1   2 117
 1   2 259
 2   1 118
 2   1 288
 2   2 1 7
 2   2 2   95

 age:
 1:40y
 2:40y

 case:
 1:patient
 2:health

 oc:
 1:use drug
 2:not use drug

 My purpose:
 Anaysis whether case and oc are correlated, and age is a stratified variable.

 My solution:
 1,Mantel-Haenszel test by using function mantelhaen.test
 2,loglinear regression by using function 
 glm(count~case*oc,family=poisson).But I don't know how to handle variable 
 age,which is the stratified variable.

Instead of using glm(family = poisson) it is also convenient to use 
loglm() from package MASS for the associated convenience table.

The code below shows how to set up the contingency table, visualize it 
using package vcd, and then fit two models using loglm. The models 
considered are conditional independence of case and drug given age and the 
no three-way interaction already suggested by Peter. Both models are 
also accompanied by visualizations of the residuals.

## contingency table with nice labels
tab - expand.grid(drug = 1:2, case = 1:2, age = 1:2)
tab$count - c(21, 26, 17, 59, 18, 88, 7, 95)
tab$age - factor(tab$age, levels = 1:2, labels = c(40, 40))
tab$case - factor(tab$case, levels = 1:2, labels = c(patient, 
healthy))
tab$drug - factor(tab$drug, levels = 1:2, labels = c(yes, no))
tab - xtabs(count ~ age + drug + case, data = tab)

## visualize case explained by drug given age
library(vcd)
mosaic(case ~ drug | age, data = tab,
   split_vertical = c(TRUE, TRUE, FALSE))

## test wheter drug and case are independent given age
m1 - loglm(~ age/(drug + case), data = tab)
m1

## visualize corresponding residuals from independence model
mosaic(case ~ drug | age, data = tab,
   split_vertical = c(TRUE, TRUE, FALSE),
   residuals_type = deviance,
   gp = shading_hcl, gp_args = list(interpolate = 1.2))
mosaic(case ~ drug | age, data = tab,
   split_vertical = c(TRUE, TRUE, FALSE),
   residuals_type = pearson,
   gp = shading_hcl, gp_args = list(interpolate = 1.2))

## test whether there is no three-way interaction
## (i.e., dependence of case on drug is the same for both age groups)
m2 - loglm(~ (age + drug + case)^2, data = tab)
m2

## visualization (with default pearson residuals)
mosaic(case ~ drug | age, data = tab,
   expected = ~ (age + drug + case)^2,
   split_vertical = c(TRUE, TRUE, FALSE),
   gp = shading_hcl, gp_args = list(interpolate = 1.2))


 Many thanks for your help.

 My best.
  [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matching observations and ranking

Hi,
May be this helps:
As you wanted to match only from row3 onwards to row2, the corresponding values 
on row1 and row2 were set to NA.
dat1- read.table(text=
  S.No AB001A AB0002A AB362
   P1   -/-    C/C   A/A   
    P2   C/C    C/C   A/A   
    3   C/C    C/C   A/A   
    4   C/C    C/C   A/A   
    5   C/C    C/C   A/A   
    6   C/C    C/C   A/A   
    7   C/C    C/C   A/A   
    8   -/-    -/-   -/-   
    9   C/C    C/C   A/A   
    10  C/C    C/C   A/A   
    11  -/-    C/C   A/A   
    12  C/C    C/C   A/A   
    13  C/C    C/C   A/A   
    14  C/C    C/C   A/A   
    15  C/C    -/-   A/A   
    16   -/-    C/C   A/A   
    17   A/A    A/C   A/A   
    18  C/A    A/A   A/A
,sep=,header=TRUE,stringsAsFactors=FALSE)
dat2-cbind(dat1,(1*mapply(==,dat1[,-1],dat1[2,-1])))
names(dat2)[duplicated(names(dat2))]- 
paste0(names(dat2)[duplicated(names(dat2))],_1)
library(plyr)
 dat3-mutate(dat2,SUM=rowSums(cbind(AB001A_1,AB0002A_1,AB362_1)), 
MATCH=(SUM/3)*100)
 dat3[1:2,5:9]-NA
res-mutate(dat3,RANK=rank(MATCH,ties.method=min))
 head(res)
#  S.No AB001A AB0002A AB362 AB001A_1 AB0002A_1 AB362_1 SUM MATCH RANK
#1   P1    -/- C/C   A/A   NA    NA  NA  NA    NA   17
#2   P2    C/C C/C   A/A   NA    NA  NA  NA    NA   18
#3    3    C/C C/C   A/A    1 1   1   3   100    7
#4    4    C/C C/C   A/A    1 1   1   3   100    7
#5    5    C/C C/C   A/A    1 1   1   3   100    7
#6    6    C/C C/C   A/A    1 1   1   3   100    7
A.K.


Hi Arun, 
Thank you very much for your help in solving my problem, 
S. No   AB001A  AB0002A AB362   AB001A    AB0002A     AB362   SUM %Match  Rank 
    P1   -/-        C/C   A/A                         
   P 2   C/C        C/C   A/A                         
    3   C/C        C/C   A/A                         
    4   C/C        C/C   A/A                         
    5   C/C        C/C   A/A                         
    6   C/C        C/C   A/A                         
    7   C/C        C/C   A/A                         
    8   -/-        -/-   -/-                         
    9   C/C        C/C   A/A                         
   10  C/C        C/C   A/A                         
    11  -/-        C/C   A/A                         
    12  C/C        C/C   A/A                         
    13  C/C        C/C   A/A                         
    14  C/C        C/C   A/A                         
    16  C/C        -/-   A/A                         
Actually i want to match observation from 3 to 16 with the value in 
p2 (i.e 3 with p2, 4 with p2, 5 with p2 etc), if they match i would like
 to give value 1 and store it in corresponding dummy variable i.e. 
AB001A and i would like to do samething for remaining vars too and 
storing in their dummy vars. Finally i want make sum of all the matched 
(i.e. 1 score) in each row and calculate percentage of match and then 
rank. This what i want, sorry for not expressing my problem exactly in 
understandable way. 




Hi to all bloggers, 
 my data looks like this, 

S. No   AB001A  AB0002A AB362   VAR1    VAR2    VAR3    SUM %Match  Rank 
   1   -/-        C/C   A/A                         
    2   C/C        C/C   A/A                         
    3   C/C        C/C   A/A                         
    4   C/C        C/C   A/A                         
    5   C/C        C/C   A/A                         
    6   C/C        C/C   A/A                         
    7   C/C        C/C   A/A                         
    8   -/-        -/-   -/-                         
    9   C/C        C/C   A/A                         
    10  C/C        C/C   A/A                         
    11  -/-        C/C   A/A                         
    12  C/C        C/C   A/A                         
    13  C/C        C/C   A/A                         
    14  C/C        C/C   A/A                         
    16  C/C        -/-   A/A                         
    17   -/-        C/C   A/A                         
    18   C/C        C/C   A/A                         
    19  C/C        C/C   A/A                         
I want to match obs 3 with obs 2 if it exactly matched then score 
will be 1 else 0, that will be stored in var1 for AB001a, in var2 for 
ab0002a and in var3 for ab362 and i want to calculate sum of all the 1's
and observation match percent and their rank (top ten matchers), I did 
this successfully in excel but it took me lot of time, i used if 
condition in excel like (=if(A3=A$2,1,0) and then i dragged among all 
obs and i did sum of all obs, their

[R] R cannot find the path on my mac

2013-04-24 Thread Gitte Brinch Andersen

I am really sorry for this probably quite simple question.
I am new to R, and I am running a pipeline that has already been made. All I
have to do is give the paths for different folders, where the pipeline can find
the files with my data.

But every time I try to run the pipeline it returns with the message, that it
cannot find the file. And I really don't know why. I have found the path by
going to the folder and finding the info about the folder, where the location
of this is stated. I copy-paste this into the source command.

The path for the folder is:

When found with the info about the folder:
/Users/gban/Desktop/Methylation analysis/MethylationDataAnalysis

But I think it should be:
/Users/gban/Desktop/Methylation
analysis/MethylationDataAnalysis/1.Normalization_raw_data/
as it is the folder 1.Normalization_raw_data where my data are in.

I have tried both paths but it returns with this message every time:

source('/Users/gban/Desktop/Methylation
analysis/MethylationDataAnalysis/1.Normalization_raw_data/pipelineIlluminaMethylation.main.R')
Fejl i file(filename, r, encoding = encoding) :
cannot open the connection
In addition: Advarselsbesked:
In file(filename, r, encoding = encoding) :
cannot open file '/Users/gban/Desktop/Methylation
analysis/MethylationDataAnalysis/1.
450K_pipeline_Nov2012release./SRC/loadMethylumi2.R': No such file or directory

Do you have any idea why?
I know this is probably more a question of how to type the path on a mac, than
it is a question of how R works.

But I am really frustrated about this, and thought it might be possible for you
to help?

Thank you in advance, and again sorry for the quite stupid question.

Best

Gitte Brinch Andersen

E-mail: gitt...@hum-gen.au.dk

[R] Regression and FMMs with flexmix

2013-04-24 Thread Robin Tviet





I am repeating this because it seems that some people think it is important to 
reveal your identity I don;t understand why this is so important.   Hopefuly 
now this list will be helpful.

Could someone please assist with this


I am trying to understand how to use the flexmix package, I have read the 
Leisch paper but am very unclear what is needed for the M-step driver.  I am 
just fitting a simple linear regression model.  The documentation is far from 
clear what the FLXmclust function does, but, it in principle could do all I 
need, however, I do not get sentible results as if I try the following the 
result is poor:

x-c()
for(i in 
0:99){x$y[2*i]=(0+i);x$x[2*i]=i;x$x[2*i+1]=i;x$y[2*i+1]=i+1000;x$g[2*i]=1;x$g[2*i+1]=2}
m1-flexmix(y~x  ,data=x,k=2)
table(x$g,m1@cluster)

1  2
1 25 74
2 67 33

It all depends on the randomised starting values.  So I think I need a better 
driver, but, I cannot find a spec for what I have to do in the driver.

Where is FLXmclust documented?  can anyone assist?
 


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matching observations and ranking

Just to add:
If your original dataset have only few columns, then you can try this too:
res1-within(mutate(dat1,AB001A_1=1*(AB001A==AB001A[2]),AB0002A_1=1*(AB0002A==AB0002A[2]),AB362_1=1*(AB362==AB362[2]),SUM=rowSums(cbind(AB001A_1,AB0002A_1,AB362_1)),MATCH=(SUM/3)*100),{MATCH[1:2]-NA;RANK=rank(MATCH,ties.method=min);SUM[1:2]-NA;AB001A_1[1:2]-NA;AB0002A_1[1:2]-NA;AB362_1[1:2]-NA})
identical(res,res1)
#[1] TRUE
A.K.





- Original Message -
From: arun smartpink...@yahoo.com
To: R help r-help@r-project.org
Cc: 
Sent: Wednesday, April 24, 2013 10:09 AM
Subject: Re: matching observations and ranking 

Hi,
May be this helps:
As you wanted to match only from row3 onwards to row2, the corresponding values 
on row1 and row2 were set to NA.
dat1- read.table(text=
  S.No AB001A AB0002A AB362
   P1   -/-    C/C   A/A   
    P2   C/C    C/C   A/A   
    3   C/C    C/C   A/A   
    4   C/C    C/C   A/A   
    5   C/C    C/C   A/A   
    6   C/C    C/C   A/A   
    7   C/C    C/C   A/A   
    8   -/-    -/-   -/-   
    9   C/C    C/C   A/A   
    10  C/C    C/C   A/A   
    11  -/-    C/C   A/A   
    12  C/C    C/C   A/A   
    13  C/C    C/C   A/A   
    14  C/C    C/C   A/A   
    15  C/C    -/-   A/A   
    16   -/-    C/C   A/A   
    17   A/A    A/C   A/A   
    18  C/A    A/A   A/A
,sep=,header=TRUE,stringsAsFactors=FALSE)
dat2-cbind(dat1,(1*mapply(==,dat1[,-1],dat1[2,-1])))
names(dat2)[duplicated(names(dat2))]- 
paste0(names(dat2)[duplicated(names(dat2))],_1)
library(plyr)
 dat3-mutate(dat2,SUM=rowSums(cbind(AB001A_1,AB0002A_1,AB362_1)), 
MATCH=(SUM/3)*100)
 dat3[1:2,5:9]-NA
res-mutate(dat3,RANK=rank(MATCH,ties.method=min))
 head(res)
#  S.No AB001A AB0002A AB362 AB001A_1 AB0002A_1 AB362_1 SUM MATCH RANK
#1   P1    -/- C/C   A/A   NA    NA  NA  NA    NA   17
#2   P2    C/C C/C   A/A   NA    NA  NA  NA    NA   18
#3    3    C/C C/C   A/A    1 1   1   3   100    7
#4    4    C/C C/C   A/A    1 1   1   3   100    7
#5    5    C/C C/C   A/A    1 1   1   3   100    7
#6    6    C/C C/C   A/A    1 1   1   3   100    7
A.K.


Hi Arun, 
Thank you very much for your help in solving my problem, 
S. No   AB001A  AB0002A AB362   AB001A    AB0002A     AB362   SUM %Match  Rank 
    P1   -/-        C/C   A/A                         
   P 2   C/C        C/C   A/A                         
    3   C/C        C/C   A/A                         
    4   C/C        C/C   A/A                         
    5   C/C        C/C   A/A                         
    6   C/C        C/C   A/A                         
    7   C/C        C/C   A/A                         
    8   -/-        -/-   -/-                         
    9   C/C        C/C   A/A                         
   10  C/C        C/C   A/A                         
    11  -/-        C/C   A/A                         
    12  C/C        C/C   A/A                         
    13  C/C        C/C   A/A                         
    14  C/C        C/C   A/A                         
    16  C/C        -/-   A/A                         
Actually i want to match observation from 3 to 16 with the value in 
p2 (i.e 3 with p2, 4 with p2, 5 with p2 etc), if they match i would like
to give value 1 and store it in corresponding dummy variable i.e. 
AB001A and i would like to do samething for remaining vars too and 
storing in their dummy vars. Finally i want make sum of all the matched 
(i.e. 1 score) in each row and calculate percentage of match and then 
rank. This what i want, sorry for not expressing my problem exactly in 
understandable way. 




Hi to all bloggers, 
 my data looks like this, 

S. No   AB001A  AB0002A AB362   VAR1    VAR2    VAR3    SUM %Match  Rank 
   1   -/-        C/C   A/A                         
    2   C/C        C/C   A/A                         
    3   C/C        C/C   A/A                         
    4   C/C        C/C   A/A                         
    5   C/C        C/C   A/A                         
    6   C/C        C/C   A/A                         
    7   C/C        C/C   A/A                         
    8   -/-        -/-   -/-                         
    9   C/C        C/C   A/A                         
    10  C/C        C/C   A/A                         
    11  -/-        C/C   A/A                         
    12  C/C        C/C   A/A                         
    13  C/C        C/C   A/A                         
    14  C/C        C/C   A/A                         
    16  C/C        -/-   A/A                         
    17

[R] puzzles of Finance data programming with R

2013-04-24 Thread chenzi14

Hi
This is Candice from Newcastle University.
I am now coming across some problems with the programming of R.
It is like this.
I am asked to run three models based on a sample of CPI:
Random walk
Recursive AR(4)
Rolling AR(4)
1.Based on the papers I find, I think the random walk may be similar to an 
AR(1) model (if with drift). But I am not sure whether this is right?
2.For the recursive model, I found a command named filter.
 m4=filter(Z,filter=c(c1,c2,c3,c4),method=recursive,side=1)
However, since it is not a simulating process, I cannot define the coefficients.
But this type of command needs specific coefs.
And I also try another code like 
m=filter(Z,rep(1,5),method=recursive,side=1). 
However, this command is applied to MA models.By doing rep(1,5), I think this 
means I define all the coefs to be 1.
Additionally, I tried the ?fiter.
But there are not that relevant examples.
3.For the rolling AR(4) model.
In fact, I have done a rollapply command for 12 years rolling windows for 
previous questions.
I think the methodology for the merge command should be used. Since rolling is 
a bit like a process of merging and moving on. 
But I do not know, in order to run a rolling AR(4) model, which command should 
I pick?


PS. If any additional packages like the TSA, need to be installed, please tell 
me. 


Thank you very much!
I am really looking forward to hear from you.


Best Regards
Candice




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] fonts not rendering during plot

2013-04-24 Thread George Dietz

Hi,

I am having a problem where sometimes fonts for text in plots don't get 
rendered-the text just shows up as boxes. If I call R from a certain directory 
the problem goes away, otherwise the fonts don't render (but plots get made). 
In both cases, my PANGO_RC and FONTCONFIG_FILE do not change…

Thank you,
-George
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Residuals for fracdiff

2013-04-24 Thread fmesserGmail

Hi,

 

I am using the fracdiff package to estimate the parameters of an
ARFIMA(1,d,1) model. I would also like to get the residuals of the series. I
have seen another post about this (below). However, being still quite at the
beginner level in terms of R, I did not quite understand how this worked. I
also read through the fracdiff package manual with no success to find any
help with the residuals. I would be very grateful for any help. Thank you!

 

Best regards,

François Messer

HEC, Faculty of Business and Economics

University of Lausanne

 

Message-id: 3c3d6095.5eb01...@lmttrading.com
 
 

 Date: Wed, 9 Jan 2002 07:57:31 +
 From: Susana Barbosa susanabarb...@novalis.fc.up.pt
mailto:susanabarb...@novalis.fc.up.pt?Subject=Re:%20[R]%20How%20to%20obtain
%20the%20series%20of%20residuals%20from%20fracdiff 
 Subject: [R] How to obtain the series of residuals from fracdiff

 Hi

 I'm using fracdiff package to estimate the parameters of a
 fractionally-differenced ARIMA (p,d,q) model, and it works fine, but I
wanted
 to have also the filtered series and the series of residuals.
 I understand these are calculated in the subroutine fdfilt, in the program
 fdcore.f, but I can't manage to get them out.

 Any suggestion would be much appreciated

 Thanks

 Susana Barbosa

Hi Susana

For fractional differencing you can use, e.g., something like

fracdiff - function (x, d, N = 100)
{
n - 0:N
w - gamma(-d+n)/(gamma(-d)*gamma(n+1))
y - filter(x, w, sides=1)
return (y)
}

Adrian

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] string size limits in RCurl

2013-04-24 Thread Elmore, Ryan

Hi All,

I am running into what appears to be character size limit in a JSON string when 
trying retrieve data from either `curlPerform()` or `getURL()`.  Here is 
non-reproducible code [1], but it should shed some light on the problem.

# Note that .base.url is the basic url for the API, q is a query, user
#  is specified, etc.
session = getCurlHandle()
curl.opts - list(userpwd = paste(user, :, key, sep = ),
  httpheader = Content-Type: application/json)
request - paste(.base.url, q, sep = )
txt - getURL(url = request, curl = session, .opts = curl.opts,
  write = basicTextGatherer())

or

r = dynCurlReader()
curlPerform(url = request, writefunction = r$update, curl = session,
.opts = curl.opts)

My guess is that the `update` or `value` functions in the `basicTextGather` or 
`dynCurlReader` text handler objects are having trouble with the large strings. 
 In this example, `r$value()` will return a truncated string that is 
approximately 2 MB.  The code given above will work fine for queries  2 MB.

Note that I can easily do the following from the command line (or using 
`system()` in R), but writing to disc seems like a waste if I am doing the 
subsequent analysis in R.

curl -v --header Content-Type: application/json --user 
username:register:passwd 
https://base.url.for.api/getdata/select+*+from+sometable  stream.json

where `file.json` is a roughly 14MB json string. I can read the string into R 
using either

con - file(paste(.project.path, data/stream.json, sep = ), r)
string - readLines(con)

or directly to list as

tmp - fromJSON(file = paste(.project.path, data/stream.json, sep = ))

Any thoughts are very much appreciated.  Note that I posted this same 
question/comment to StackOverflow and will happily provide any helpful 
suggestions to that list as well.

Ryan

[1] - Sorry for not providing reproducible code, but I'm dealing with a govt 
firewall.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3


On 13-04-24 1:36 PM, Jens Olofsson wrote:

Dear users of R
I have a subroutine in Fortran95, compiled to a DLL with gfortran in Cygwin
4.5.3.


We don't support Cygwin.  You should use the gfortran in Rtools, and get 
R to set the command line options for you, either by putting the code in 
a package, or by using R CMD SHLIB Mango.f95.


Duncan Murdoch


The subroutine is:
subroutine MyPBP( S, p, N )
 ! Expose subroutine rtest to users of this DLL
 !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: mypbp_ ::mypbp
 ! This function computes the Poisson-Binomial distribution
 ! of size N using p
 double precision, intent(inout) :: S(N+1)
 double precision, intent(in) :: p(N)
 integer, intent(in) :: N
 double precision :: X(N+1)
 integer i, j
 !X=0
 !S=0
 X(1) = 1 - p(1)
 X(2) = p(1)
 do i = 2, N
 S(1) = X(1)*(1-p(i))
 do j = 2,i
 S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
 end do
 S(i+1) = X(i)*p(i)
 X = S
 if (i == N) then
 S = X
 end if
 end do
end subroutine MyPBP
and it is saved into Mango.f95
I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
gfortran-4 -shared -o Mango.dll Mango.o
I am on a Windows machine running Windows 7 with Intel i7.
I load the dll in a 32-bit R by dyn.load(Mango.dll). Using getLoadedDLLs
I can see the DLL. However, is.loaded(Mango.dll) = FALSE. In addition, R
stop responding when I try .Fortran(MyPBP, as.numeric(S), as.numeric(p),
as.integer(N)),
where N-5, S-array(0,N+1) and p- c(0.1, 0.2, 0.5, 0.8, 0.9).

What am I doing wrong?
Any ideas, thoughts and/or comments are highly appreciated.

Jens

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] substitute rename.vars {gdata}

2013-04-24 Thread Knut Krueger


is there a equivalent to rename.vars {gdata}?

As I do not use read.xls i am not using library gdata for a package 
except this function


Knut

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

Dear Duncan,
I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob and
also remember I am on Windows. How should I use R CMD SHLIB as if I write
that at the prompt I get the error: unexpected symbol in R CMD. I have
mango.f95 in the working directory.
//Jens



On 24 April 2013 19:46, Duncan Murdoch murdoch.dun...@gmail.com wrote:

 On 13-04-24 1:36 PM, Jens Olofsson wrote:

 Dear users of R
 I have a subroutine in Fortran95, compiled to a DLL with gfortran in
 Cygwin
 4.5.3.


 We don't support Cygwin.  You should use the gfortran in Rtools, and get R
 to set the command line options for you, either by putting the code in a
 package, or by using R CMD SHLIB Mango.f95.

 Duncan Murdoch

  The subroutine is:
 subroutine MyPBP( S, p, N )
  ! Expose subroutine rtest to users of this DLL
  !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: mypbp_ ::mypbp
  ! This function computes the Poisson-Binomial distribution
  ! of size N using p
  double precision, intent(inout) :: S(N+1)
  double precision, intent(in) :: p(N)
  integer, intent(in) :: N
  double precision :: X(N+1)
  integer i, j
  !X=0
  !S=0
  X(1) = 1 - p(1)
  X(2) = p(1)
  do i = 2, N
  S(1) = X(1)*(1-p(i))
  do j = 2,i
  S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
  end do
  S(i+1) = X(i)*p(i)
  X = S
  if (i == N) then
  S = X
  end if
  end do
 end subroutine MyPBP
 and it is saved into Mango.f95
 I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
 gfortran-4 -shared -o Mango.dll Mango.o
 I am on a Windows machine running Windows 7 with Intel i7.
 I load the dll in a 32-bit R by dyn.load(Mango.dll). Using getLoadedDLLs
 I can see the DLL. However, is.loaded(Mango.dll) = FALSE. In addition, R
 stop responding when I try .Fortran(MyPBP, as.numeric(S), as.numeric(p),
 as.integer(N)),
 where N-5, S-array(0,N+1) and p- c(0.1, 0.2, 0.5, 0.8, 0.9).

 What am I doing wrong?
 Any ideas, thoughts and/or comments are highly appreciated.

 Jens

 [[alternative HTML version deleted]]

 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] the joy of spreadsheets (off-topic)

2013-04-24 Thread Thomas Adams

One might wonder if the Excel error was indeed THAT or perhaps a way to
get the desired results, give the other issues in their analysis?

On Wed, Apr 24, 2013 at 11:58 AM, peter dalgaard pda...@gmail.com wrote:

In case you haven't noticed, this is making the rounds in the media,
including a handful of references to R. See e.g.

http://news.slashdot.org/story/13/04/17/0215211/excel-error-contributes-to-problems-with-austerity-study

I suppose we can't fortune()'ify anonymous quotes, but I kind of like this
exchange:

Bacon Bits: SPSS and R are very good at statistical analysis. Quantrix,
MapleSoft, IBM Algorithmics, and other software is for financial data
modeling. None of those is particularly appropriate for sharing data in a
useful format with peers. Excel is.

Hatta: R is extremely appropriate for sharing data in a useful format
with peers. It's completely free for one. But more importantly, it saves
every single step of your analysis. Send someone an Excel file, and who
knows what they've done to the data. Send someone your R project directory
and they can see exactly what you did.

The problem with sending R files to your peers isn't that the R files
aren't useful. It's that your peers aren't.

On Apr 16, 2013, at 19:25 , Sarah Goslee wrote:

Given that we occasionally run into problems with comparing Excel
results to R results, and other spreadsheet-induced errors, I thought
this might be of interest.

http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems

The punchline:

Ouch.

(Note: I know nothing about the site, the author of the article, or
the study in question. I was pointed to it by someone else. But if
true: highly problematic.)

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

[R] Distance matrices Combinations

2013-04-24 Thread eliza botto

Dear UseRs,
MY PROBLEM IS A SMALL PIECE OF A REAL BIG AND A COMPLICATED PROBLEM. IF I 
DELIBERATE IN A VERY SIMPLE WAY THEN ALL I 
WANT IS TO PUT ALL THE POSSIBLE COMBINATIONS OF 75 DISTANCE MATRICES (BY TAKING 
4 MATRICES, MORE COMMONLY 75C4), in the following equation.


t-as.matrix((MAT1)^2+(MAT2)^2+(MAT3)^2+(MAT4)^2+,upper=T,diag=T))

Then 1215450 values of t(one for each combination) should one by one be 
inculcated in the following loop(to calculate the o value) and in the end 
want those 10 combinations of distance matrices which have lowest o values.

e - vector(list, 124)

w-sqrt(t)

mat1-w

for (i in 1:124){

r-matrix(sort(mat1[i,],index.return=TRUE)$ix,ncol=1)

u-r[c(2,3,4,5,6),1]

mata-m[,c(u)]  ##(shifted)

amata-apply(mata,1,mean)

amata-data.frame(amata)

aavg-as.matrix(amata, ncol=1)

b-aavg

e[[i]]-print(sum(abs(b-m[,i])))

}

x-do.call(rbind,e)

Y-x

z - apply(Y,2,sd)

o-mean(Y)

Does my question make any scene?
Thanks in advance
Elisa 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sum up column values according to row id

2013-04-24 Thread David Carlson

Something like this?

mean6 - function(x) {
if (length(x)  6) {
  mn - mean(x)
  } else {
mn - mean(x[1:6])
}
return(mn)
}

aggregate(g~id, ipso, mean6)

-
David L Carlson
Associate Professor of Anthropology
Texas AM University
College Station, TX 77840-4352





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Matteo Mura
Sent: Wednesday, April 24, 2013 7:57 AM
To: r-help@r-project.org
Subject: [R] Sum up column values according to row id

Dear All,

here a problem I think many of you can solve in few minutes.

I have a dataframe which contains values of plot id, diameters, heigths and
basal area of trees, thus columns names are: id | dbh | h | g

head(ipso, n=10)id dbh h  g
1  FPE0164  36 13.62 0.10178760
2  FPE0164  31 12.70 0.07547676
21 FPE1127  57 18.85 0.25517586
13 FPE1127  39 15.54 0.11945906
12 FPE1127  34 14.78 0.09079203
6  FPE1127  32 15.12 0.08042477
5  FPE1127  28 14.13 0.06157522
15 FPE1127  27 13.50 0.05725553
19 FPE1127  25 13.28 0.04908739
11 FPE1127  19 11.54 0.02835287

from here I need to calculate the mean of the six greater g_ith for each
id_ith. The clauses are that:

if length(id) =6

do the mean of the first six greaters g


else
do the mean of all the g_ith in the id_ith (in head print above e.g.
for the id==FPE0164 do the mean of just these two values of g).

The g are already ordered by id ascending and g descending using:

ipso - ipso[with(ipso, order(ipso$id, -ipso$g)), ] # Order for id ascending
and g descending

I tried a lot of for loops and tapply() without results.

Can anyone help me to solve this?

Thanks for your attention

Best whishes

Matteo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3


On 13-04-24 1:51 PM, Jens Olofsson wrote:

Dear Duncan,
I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob and
also remember I am on Windows. How should I use R CMD SHLIB as if I write
that at the prompt I get the error: unexpected symbol in R CMD. I have
mango.f95 in the working directory.



That's a command-line command, not something done with R.  You can use 
it from your bash shell if you have R and the Rtools directories on your 
path, or from the Windows CMD shell.


BTW, my comment wasn't trying to tell you to go to a Cygwin forum, it 
was telling you that Cygwin's gfortran is unsupported.  You need to use 
the MinGW-64 one that we distribute if you want us to be able to help.


Duncan Murdoch


//Jens



On 24 April 2013 19:46, Duncan Murdoch murdoch.dun...@gmail.com wrote:


On 13-04-24 1:36 PM, Jens Olofsson wrote:


Dear users of R
I have a subroutine in Fortran95, compiled to a DLL with gfortran in
Cygwin
4.5.3.



We don't support Cygwin.  You should use the gfortran in Rtools, and get R
to set the command line options for you, either by putting the code in a
package, or by using R CMD SHLIB Mango.f95.

Duncan Murdoch

  The subroutine is:

subroutine MyPBP( S, p, N )
  ! Expose subroutine rtest to users of this DLL
  !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: mypbp_ ::mypbp
  ! This function computes the Poisson-Binomial distribution
  ! of size N using p
  double precision, intent(inout) :: S(N+1)
  double precision, intent(in) :: p(N)
  integer, intent(in) :: N
  double precision :: X(N+1)
  integer i, j
  !X=0
  !S=0
  X(1) = 1 - p(1)
  X(2) = p(1)
  do i = 2, N
  S(1) = X(1)*(1-p(i))
  do j = 2,i
  S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
  end do
  S(i+1) = X(i)*p(i)
  X = S
  if (i == N) then
  S = X
  end if
  end do
end subroutine MyPBP
and it is saved into Mango.f95
I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
gfortran-4 -shared -o Mango.dll Mango.o
I am on a Windows machine running Windows 7 with Intel i7.
I load the dll in a 32-bit R by dyn.load(Mango.dll). Using getLoadedDLLs
I can see the DLL. However, is.loaded(Mango.dll) = FALSE. In addition, R
stop responding when I try .Fortran(MyPBP, as.numeric(S), as.numeric(p),
as.integer(N)),
where N-5, S-array(0,N+1) and p- c(0.1, 0.2, 0.5, 0.8, 0.9).

What am I doing wrong?
Any ideas, thoughts and/or comments are highly appreciated.

Jens

 [[alternative HTML version deleted]]

__**
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/**
posting-guide.html http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.








__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] help with logkda.pari (in package untb)

2013-04-24 Thread Chase, Jennifer

Hello-
I'm looking for help using the function logkda.pari. logkda.pari needs to
access the program pari/gp (which of course I've installed), but I don't
understand what commands I should use so that it recognizes that it is
installed. Any help with this would be much appreciated.

Cheers
Jen Chase

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Trouble Computing Type III SS in a Cox Regression using drop1 and Anova

2013-04-24 Thread John Fox

Dear Paul,

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Paul Miller
 Sent: Wednesday, April 24, 2013 1:18 PM
 To: r-help@r-project.org
 Subject: [R] Trouble Computing Type III SS in a Cox Regression using
 drop1 and Anova

 Hello All,

 Am having some trouble computing Type III SS in a Cox Regression using
 either drop1 or Anova from the car package. Am hoping that people will
 take a look to see if they can tell what's going on.

 Here is my R code:

. . .

 Both drop1 and Anova give me a different p-value than I get from coxph
 for both my two-level ps2 variable and for age. This is not what I
 would expect based on experience using SAS to conduct similar analyses.
 Indeed SAS consistently produces the same p-values. Namely the ones I
 get from coxph.

 My sense is that I'm probably misusing R in some way but I'm not sure
 what I'm likely to be doing wrong. SAS prodcues Wald Chi-Square results
 for its type III tests. Maybe that has something to do with it.
 Ideally, I'd like to get type III values that match those from coxph.
 If anyone could help me understand better, that would be greatly
 appreciated.

You've answered your own question: The summary() output gives you Wald
tests, and drop1() and Anova() give you LR tests. From ?Anova:

test.statistic: ... for a Cox model, whether to calculate LR
(partial-likelihood ratio) or Wald tests; ...

Thus, if you want the Wald test, you can ask for it (though it escapes me
why you prefer it to the LR test).

I hope this helps,
 John

---
John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sum up column values according to row id

2013-04-24 Thread Rui Barradas


Hello,

Note also that in the with(9 instruction there's some redundancy, th OP 
could simply do


with(ipso, order(id, -g))


Hope this helps,

Rui Barradas

Em 24-04-2013 19:10, David Carlson escreveu:

Something like this?

mean6 - function(x) {
 if (length(x)  6) {
   mn - mean(x)
   } else {
mn - mean(x[1:6])
}
return(mn)
}

aggregate(g~id, ipso, mean6)

-
David L Carlson
Associate Professor of Anthropology
Texas AM University
College Station, TX 77840-4352





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Matteo Mura
Sent: Wednesday, April 24, 2013 7:57 AM
To: r-help@r-project.org
Subject: [R] Sum up column values according to row id

Dear All,

here a problem I think many of you can solve in few minutes.

I have a dataframe which contains values of plot id, diameters, heigths and
basal area of trees, thus columns names are: id | dbh | h | g

head(ipso, n=10)id dbh h  g
1  FPE0164  36 13.62 0.10178760
2  FPE0164  31 12.70 0.07547676
21 FPE1127  57 18.85 0.25517586
13 FPE1127  39 15.54 0.11945906
12 FPE1127  34 14.78 0.09079203
6  FPE1127  32 15.12 0.08042477
5  FPE1127  28 14.13 0.06157522
15 FPE1127  27 13.50 0.05725553
19 FPE1127  25 13.28 0.04908739
11 FPE1127  19 11.54 0.02835287

from here I need to calculate the mean of the six greater g_ith for each
id_ith. The clauses are that:

if length(id) =6

do the mean of the first six greaters g


else
do the mean of all the g_ith in the id_ith (in head print above e.g.
for the id==FPE0164 do the mean of just these two values of g).

The g are already ordered by id ascending and g descending using:

ipso - ipso[with(ipso, order(ipso$id, -ipso$g)), ] # Order for id ascending
and g descending

I tried a lot of for loops and tapply() without results.

Can anyone help me to solve this?

Thanks for your attention

Best whishes

Matteo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

2013-04-24 Thread R. Michael Weylandt michael.weyla...@gmail.com

Ok. I apologise for not understanding. So, I have installed R-tools. It
changed my PATH-variable. I didn't installed Cygwin dlls as stated by
http://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset.
Instead my PATH-variable contains the path to the Cygwin dlls AFTER the
path to R... So, I started RTerm (32-bit) and tried  R CMD SHLIB mango.f95
and got the same error as earlier Error: unexpected symbol in R CMD.
The same goes for RTerm (64-bit). Can you pls advice me on how to proceed?
Sincerely Jens


On 24 April 2013 20:08, Duncan Murdoch murdoch.dun...@gmail.com wrote:

 On 13-04-24 1:51 PM, Jens Olofsson wrote:

 Dear Duncan,
 I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob
 and
 also remember I am on Windows. How should I use R CMD SHLIB as if I write
 that at the prompt I get the error: unexpected symbol in R CMD. I have
 mango.f95 in the working directory.



 That's a command-line command, not something done with R.  You can use it
 from your bash shell if you have R and the Rtools directories on your path,
 or from the Windows CMD shell.

 BTW, my comment wasn't trying to tell you to go to a Cygwin forum, it was
 telling you that Cygwin's gfortran is unsupported.  You need to use the
 MinGW-64 one that we distribute if you want us to be able to help.

 Duncan Murdoch

  //Jens



 On 24 April 2013 19:46, Duncan Murdoch murdoch.dun...@gmail.com wrote:

  On 13-04-24 1:36 PM, Jens Olofsson wrote:

  Dear users of R
 I have a subroutine in Fortran95, compiled to a DLL with gfortran in
 Cygwin
 4.5.3.


 We don't support Cygwin.  You should use the gfortran in Rtools, and get
 R
 to set the command line options for you, either by putting the code in a
 package, or by using R CMD SHLIB Mango.f95.

 Duncan Murdoch

   The subroutine is:

 subroutine MyPBP( S, p, N )
   ! Expose subroutine rtest to users of this DLL
   !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: mypbp_ ::mypbp
   ! This function computes the Poisson-Binomial distribution
   ! of size N using p
   double precision, intent(inout) :: S(N+1)
   double precision, intent(in) :: p(N)
   integer, intent(in) :: N
   double precision :: X(N+1)
   integer i, j
   !X=0
   !S=0
   X(1) = 1 - p(1)
   X(2) = p(1)
   do i = 2, N
   S(1) = X(1)*(1-p(i))
   do j = 2,i
   S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
   end do
   S(i+1) = X(i)*p(i)
   X = S
   if (i == N) then
   S = X
   end if
   end do
 end subroutine MyPBP
 and it is saved into Mango.f95
 I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
 gfortran-4 -shared -o Mango.dll Mango.o
 I am on a Windows machine running Windows 7 with Intel i7.
 I load the dll in a 32-bit R by dyn.load(Mango.dll). Using
 getLoadedDLLs
 I can see the DLL. However, is.loaded(Mango.dll) = FALSE. In
 addition, R
 stop responding when I try .Fortran(MyPBP, as.numeric(S),
 as.numeric(p),
 as.integer(N)),
 where N-5, S-array(0,N+1) and p- c(0.1, 0.2, 0.5, 0.8, 0.9).

 What am I doing wrong?
 Any ideas, thoughts and/or comments are highly appreciated.

 Jens

  [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-helphttps://stat.ethz.ch/mailman/**listinfo/r-help
 https://stat.**ethz.ch/mailman/listinfo/r-**helphttps://stat.ethz.ch/mailman/listinfo/r-help
 
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html 
 http://www.R-project.org/**posting-guide.htmlhttp://www.R-project.org/posting-guide.html
 

 and provide commented, minimal, self-contained, reproducible code.







[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] the joy of spreadsheets (off-topic)

2013-04-24 Thread peter dalgaard

On Apr 24, 2013, at 20:01 , Thomas Adams wrote:

One might wonder if the Excel error was indeed THAT or perhaps a way to get
the desired results, give the other issues in their analysis?

I think I'd reserve that suspicion for what they did with the NZ data:

Growth for 1946-49: 7.7, 11.9, −9.9, and 10.8
--1951: -7.6

Those were the 5 years with Debt/GDP 90%. Obviously, the economy was going up
and down like a yoyo. So they retain only the last value, miscode it as -7.9,
and give that one year the same weight as decades of positive growth in other
countries...

On Wed, Apr 24, 2013 at 11:58 AM, peter dalgaard pda...@gmail.com wrote:
In case you haven't noticed, this is making the rounds in the media,
including a handful of references to R. See e.g.

http://news.slashdot.org/story/13/04/17/0215211/excel-error-contributes-to-problems-with-austerity-study

I suppose we can't fortune()'ify anonymous quotes, but I kind of like this
exchange:

Bacon Bits: SPSS and R are very good at statistical analysis. Quantrix,
MapleSoft, IBM Algorithmics, and other software is for financial data
modeling. None of those is particularly appropriate for sharing data in a
useful format with peers. Excel is.

The problem with sending R files to your peers isn't that the R files aren't
useful. It's that your peers aren't.

On Apr 16, 2013, at 19:25 , Sarah Goslee wrote:

Given that we occasionally run into problems with comparing Excel
results to R results, and other spreadsheet-induced errors, I thought
this might be of interest.

http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems

The punchline:

Ouch.

(Note: I know nothing about the site, the author of the article, or
the study in question. I was pointed to it by someone else. But if
true: highly problematic.)

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com

Re: [R] R cannot find the path on my mac

2013-04-24 Thread MacQueen, Don

The why is that you haven't got the path correct yet.

Assuming you are running R.app or R64.app (running the R Console),
one way to find the correct path is to type

  file.choose()

Navigate to your file and choose it. It will then tell you the correct
path.

You can also do things like
  source(  file.path(  'path.to.folder', 'filename' ) )
if you want to keep the folder name and the filename separate.

You could also open Terminal (in the Utilities folder) and type
  find . -name filename
and it should tell you the path relative to your home folder. This assumes
that the file is somewhere below
  /Users/gban

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 4/24/13 8:46 AM, Gitte Brinch Andersen gitt...@hum-gen.au.dk wrote:

Hi

I am really sorry for this probably quite simple question.
I am new to R, and I am running a pipeline that has already been made.
All I have to do is give the paths for different folders, where the
pipeline can find the files with my data.

But every time I try to run the pipeline it returns with the message,
that it cannot find the file. And I really don't know why. I have found
the path by going to the folder and finding the info about the folder,
where the location of this is stated. I copy-paste this into the source
command.

The path for the folder is:

When found with the info about the folder:
/Users/gban/Desktop/Methylation analysis/MethylationDataAnalysis

But I think it should be:
/Users/gban/Desktop/Methylation
analysis/MethylationDataAnalysis/1.Normalization_raw_data/
as it is the folder 1.Normalization_raw_data where my data are in.

I have tried both paths but it returns with this message every time:

 source('/Users/gban/Desktop/Methylation
analysis/MethylationDataAnalysis/1.Normalization_raw_data/pipelineIllumin
aMethylation.main.R')
Fejl i file(filename, r, encoding = encoding) :
  cannot open the connection
In addition: Advarselsbesked:
In file(filename, r, encoding = encoding) :
  cannot open file '/Users/gban/Desktop/Methylation
analysis/MethylationDataAnalysis/1.
450K_pipeline_Nov2012release./SRC/loadMethylumi2.R': No such file or
directory


Do you have any idea why?
I know this is probably more a question of how to type the path on a mac,
than it is a question of how R works.

But I am really frustrated about this, and thought it might be possible
for you to help?

Thank you in advance, and again sorry for the quite stupid question.

Best

Gitte Brinch Andersen

E-mail: gitt...@hum-gen.au.dk

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3



On Apr 24, 2013, at 7:46 PM, Jens Olofsson jens.olofs...@gmail.com wrote:

 Ok. I apologise for not understanding. So, I have installed R-tools. It
 changed my PATH-variable. I didn't installed Cygwin dlls as stated by
 http://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset.
 Instead my PATH-variable contains the path to the Cygwin dlls AFTER the
 path to R... So, I started RTerm (32-bit) and tried  R CMD SHLIB mango.f95

Repeating what Duncan said -- R CMD SHLIB is to be done outside of R, not 
within R. 

Some recent discussion also suggests its perhaps easier to do this with RStudio 
+ devtools as part of a package than as a standalone shared object. 

MW


 and got the same error as earlier Error: unexpected symbol in R CMD.
 The same goes for RTerm (64-bit). Can you pls advice me on how to proceed?
 Sincerely Jens
 
 
 On 24 April 2013 20:08, Duncan Murdoch murdoch.dun...@gmail.com wrote:
 
 On 13-04-24 1:51 PM, Jens Olofsson wrote:
 
 Dear Duncan,
 I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob
 and
 also remember I am on Windows. How should I use R CMD SHLIB as if I write
 that at the prompt I get the error: unexpected symbol in R CMD. I have
 mango.f95 in the working directory.
 
 
 
 That's a command-line command, not something done with R.  You can use it
 from your bash shell if you have R and the Rtools directories on your path,
 or from the Windows CMD shell.
 
 BTW, my comment wasn't trying to tell you to go to a Cygwin forum, it was
 telling you that Cygwin's gfortran is unsupported.  You need to use the
 MinGW-64 one that we distribute if you want us to be able to help.
 
 Duncan Murdoch
 
 //Jens
 
 
 
 On 24 April 2013 19:46, Duncan Murdoch murdoch.dun...@gmail.com wrote:
 
 On 13-04-24 1:36 PM, Jens Olofsson wrote:
 
 Dear users of R
 I have a subroutine in Fortran95, compiled to a DLL with gfortran in
 Cygwin
 4.5.3.
 
 
 We don't support Cygwin.  You should use the gfortran in Rtools, and get
 R
 to set the command line options for you, either by putting the code in a
 package, or by using R CMD SHLIB Mango.f95.
 
 Duncan Murdoch
 
  The subroutine is:
 
 subroutine MyPBP( S, p, N )
  ! Expose subroutine rtest to users of this DLL
  !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: mypbp_ ::mypbp
  ! This function computes the Poisson-Binomial distribution
  ! of size N using p
  double precision, intent(inout) :: S(N+1)
  double precision, intent(in) :: p(N)
  integer, intent(in) :: N
  double precision :: X(N+1)
  integer i, j
  !X=0
  !S=0
  X(1) = 1 - p(1)
  X(2) = p(1)
  do i = 2, N
  S(1) = X(1)*(1-p(i))
  do j = 2,i
  S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
  end do
  S(i+1) = X(i)*p(i)
  X = S
  if (i == N) then
  S = X
  end if
  end do
 end subroutine MyPBP
 and it is saved into Mango.f95
 I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
 gfortran-4 -shared -o Mango.dll Mango.o
 I am on a Windows machine running Windows 7 with Intel i7.
 I load the dll in a 32-bit R by dyn.load(Mango.dll). Using
 getLoadedDLLs
 I can see the DLL. However, is.loaded(Mango.dll) = FALSE. In
 addition, R
 stop responding when I try .Fortran(MyPBP, as.numeric(S),
 as.numeric(p),
 as.integer(N)),
 where N-5, S-array(0,N+1) and p- c(0.1, 0.2, 0.5, 0.8, 0.9).
 
 What am I doing wrong?
 Any ideas, thoughts and/or comments are highly appreciated.
 
 Jens
 
 [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-helphttps://stat.ethz.ch/mailman/**listinfo/r-help
 https://stat.**ethz.ch/mailman/listinfo/r-**helphttps://stat.ethz.ch/mailman/listinfo/r-help
 
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html 
 http://www.R-project.org/**posting-guide.htmlhttp://www.R-project.org/posting-guide.html
 
 
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 
 
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] mgcv: how select significant predictor vars when using gam(...select=TRUE) using automatic optimization

2013-04-24 Thread Juliet Hannah

Hi Jan and Simon,

If possible, could you attach the diagnostic plots. I would be curious to
see them.

Thanks,

Juliet


On Fri, Apr 19, 2013 at 4:39 AM, jholstei jan.holst...@awi.de wrote:

 Simon,

 that was very instructivevery special thanks to you.
 I already noticed that the model was bad, but it was not clear to me that
 transformation of predictors to, say a more centered distribution is
 helpful here.
 And thanks for pointing out Tweedie, I noticed that the error structure is
 far from normal and more like gamma or poisson, but Gamma made things worse.

 Best regards,
 Jan





 Am 18 Apr 2013 um 17:25 schrieb Simon Wood:

  Jan,
 
  Thanks for the data (off list). The p-value computations are based on
 the approximation that things are approximately normal on the linear
 predictor scale, but actually they are no where close to normal in this
 case, which is why the p-values look inconsistent. The reason that the
 approximate normality assumption doesn't hold is that the model is quite a
 poor fit. If you take a look at gam.check(fit) you'll see that the constant
 variance assumption of quasi(link=log) is violated quite badly, and the
 residual distribution is really quite odd (plot residuals against fitted as
 well). Also see plot(fit,pages=1,scale=0) - it shows ballooning confidence
 intervals and smooth estimates that are so low in places that they might as
 well be minus infinity (given log link) - clearly something is wrong with
 this model!
 
  I would be inclined to reset all the 0's to 0 (rather than 0.01), and
 then to try Tweedie(p=1.5,link=log) as the family. Also the predictor
 variables are very skewed which is giving leverage problems, so I would
 transform them to give less skew. e.g. Something like
 
  fit-gam(target~s(log(mgs))+s(I(gsd^.5))+s(I(mud^.25))+s(log(ssCmax)),
  family=Tweedie(p=1.6,link=log),data=df,method=REML)
 
  gives a model that is closer to being reasonable (p-values are then
 consistent between select=TRUE and FALSE).
 
  best,
  Simon
 
  On 18/04/13 14:24, Simon Wood wrote:
  Jan,
 
  Thanks for this. Is there any chance that you could send me the data off
  list and I'll try to figure out what is happening? (Under the
  understanding that I'll only use the data for investigating this issue,
  of course).
 
  best,
  Simon
 
  on 18/04/13 11:11, Jan Holstein wrote:
  Simon,
 
  thanks for the reply,  I guess I'm pretty much up to date using
   mgcv 1.7-22.
  Upgrading to R 3.0.0 also didn't do any change.
 
  Unfortunately using method=REML does not make any difference:
 
  ### first with select=FALSE
  fit-gam(target
 
 ~s(mgs)+s(gsd)+s(mud)+s(ssCmax),family=quasi(link=log),data=wspe1,method=REML,select=F)
 
  summary(fit)
 
  Family: quasi
  Link function: log
  Formula:
  target ~ s(mgs) + s(gsd) + s(mud) + s(ssCmax)
  Parametric coefficients:
  Estimate Std. Error t value Pr(|t|)
  (Intercept)   -4.724  7.462  -0.6330.527
  Approximate significance of smooth terms:
  edf Ref.df  F p-value
  s(mgs)3.118  3.492  0.099   0.974
  s(gsd)6.377  7.044 15.596  2e-16 ***
  s(mud)8.837  8.971 18.832  2e-16 ***
  s(ssCmax) 3.886  4.051  2.342   0.052 .
  ---
  Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1   1
  R-sq.(adj) =  0.403   Deviance explained = 40.6%
  REML score =  33186  Scale est. = 8.7812e+05  n = 4511
 
 
 
 
 
   Then using select=T
 
  fit2-gam(target
 
 ~s(mgs)+s(gsd)+s(mud)+s(ssCmax),family=quasi(link=log),data=wspe1,method=REML,select=TRUE)
 
  summary(fit2)
  Family: quasi
  Link function: log
  Formula:
  target ~ s(mgs) + s(gsd) + s(mud) + s(ssCmax)
  Parametric coefficients:
  Estimate Std. Error t value Pr(|t|)
  (Intercept)   -6.406  5.239  -1.2230.222
  Approximate significance of smooth terms:
  edf Ref.df F p-value
  s(mgs)2.844  8 25.43  2e-16 ***
  s(gsd)6.071  9 14.50  2e-16 ***
  s(mud)6.875  8 21.79  2e-16 ***
  s(ssCmax) 3.787  8 18.42  2e-16 ***
  ---
  Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1   1
  R-sq.(adj) =0.4   Deviance explained = 40.1%
  REML score =  33203  Scale est. = 8.8359e+05  n = 4511
 
 
 
 
 
 
 
  I played around with other families/link functions with no success
  regarding
  the select behaviour.
 
  Well, look at the structure of my data:
  http://r.789695.n4.nabble.com/file/n4664586/screen-capture-1.png
 
  All possible predictor variables in principle look like this, and taken
  alone, each and every is significant according to p-value (but not all
  can
  at the same time).
  In theory, the target variable should be a hypersurface in 11dim space
  with
  lots of noise, but interaction of more than 2 vars gets costly (not to
  think
  of 11) and often enough (also without interaction) the solution does
 not
  converge at minimal step size. If it does, results are usually not as
  good
  as without interaction.
 
  Any comment/advice on model setup is

Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

Hello again. 

I may be a bit slow, but I'm learning. :-)
Installed Rtools, added R to the path. Used CMD, moved to the folder of the 
source and wrote R CMD SHLIB Mango.f95 and voila I got a dll I loaded in R and 
I could call my current subroutines as intended. 

I am very grateful for ur help and ur patience. 

Sincerely, Jens

24 apr 2013 kl. 21:33 skrev R. Michael Weylandt michael.weyla...@gmail.com 
michael.weyla...@gmail.com:

 
 
 On Apr 24, 2013, at 7:46 PM, Jens Olofsson jens.olofs...@gmail.com wrote:
 
 Ok. I apologise for not understanding. So, I have installed R-tools. It
 changed my PATH-variable. I didn't installed Cygwin dlls as stated by
 http://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset.
 Instead my PATH-variable contains the path to the Cygwin dlls AFTER the
 path to R... So, I started RTerm (32-bit) and tried  R CMD SHLIB mango.f95
 
 Repeating what Duncan said -- R CMD SHLIB is to be done outside of R, not 
 within R. 
 
 Some recent discussion also suggests its perhaps easier to do this with 
 RStudio + devtools as part of a package than as a standalone shared object. 
 
 MW
 
 
 and got the same error as earlier Error: unexpected symbol in R CMD.
 The same goes for RTerm (64-bit). Can you pls advice me on how to proceed?
 Sincerely Jens
 
 
 On 24 April 2013 20:08, Duncan Murdoch murdoch.dun...@gmail.com wrote:
 
 On 13-04-24 1:51 PM, Jens Olofsson wrote:
 
 Dear Duncan,
 I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob
 and
 also remember I am on Windows. How should I use R CMD SHLIB as if I write
 that at the prompt I get the error: unexpected symbol in R CMD. I have
 mango.f95 in the working directory.
 
 
 That's a command-line command, not something done with R.  You can use it
 from your bash shell if you have R and the Rtools directories on your path,
 or from the Windows CMD shell.
 
 BTW, my comment wasn't trying to tell you to go to a Cygwin forum, it was
 telling you that Cygwin's gfortran is unsupported.  You need to use the
 MinGW-64 one that we distribute if you want us to be able to help.
 
 Duncan Murdoch
 
 //Jens
 
 
 
 On 24 April 2013 19:46, Duncan Murdoch murdoch.dun...@gmail.com wrote:
 
 On 13-04-24 1:36 PM, Jens Olofsson wrote:
 
 Dear users of R
 I have a subroutine in Fortran95, compiled to a DLL with gfortran in
 Cygwin
 4.5.3.
 We don't support Cygwin.  You should use the gfortran in Rtools, and get
 R
 to set the command line options for you, either by putting the code in a
 package, or by using R CMD SHLIB Mango.f95.
 
 Duncan Murdoch
 
 The subroutine is:
 
 subroutine MyPBP( S, p, N )
 ! Expose subroutine rtest to users of this DLL
 !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: mypbp_ ::mypbp
 ! This function computes the Poisson-Binomial distribution
 ! of size N using p
 double precision, intent(inout) :: S(N+1)
 double precision, intent(in) :: p(N)
 integer, intent(in) :: N
 double precision :: X(N+1)
 integer i, j
 !X=0
 !S=0
 X(1) = 1 - p(1)
 X(2) = p(1)
 do i = 2, N
 S(1) = X(1)*(1-p(i))
 do j = 2,i
 S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
 end do
 S(i+1) = X(i)*p(i)
 X = S
 if (i == N) then
 S = X
 end if
 end do
 end subroutine MyPBP
 and it is saved into Mango.f95
 I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
 gfortran-4 -shared -o Mango.dll Mango.o
 I am on a Windows machine running Windows 7 with Intel i7.
 I load the dll in a 32-bit R by dyn.load(Mango.dll). Using
 getLoadedDLLs
 I can see the DLL. However, is.loaded(Mango.dll) = FALSE. In
 addition, R
 stop responding when I try .Fortran(MyPBP, as.numeric(S),
 as.numeric(p),
 as.integer(N)),
 where N-5, S-array(0,N+1) and p- c(0.1, 0.2, 0.5, 0.8, 0.9).
 
 What am I doing wrong?
 Any ideas, thoughts and/or comments are highly appreciated.
 
 Jens
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-helphttps://stat.ethz.ch/mailman/**listinfo/r-help
 https://stat.**ethz.ch/mailman/listinfo/r-**helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html 
 http://www.R-project.org/**posting-guide.htmlhttp://www.R-project.org/posting-guide.html
 
 and provide commented, minimal, self-contained, reproducible code.
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

Re: [R] RDA permutest envfit

2013-04-24 Thread Gavin Simpson

On Wed, 2013-04-24 at 10:38 -0700, Rémi Lesmerises wrote:
 Dear all,
 
 I did a RDA and when I looked to the signification of the test with
 permutest, the output was non-significant. But when I used the envfit
 function, some of the vectors are significant. All the test's
 conditions are respected. What it means? Is it an error in the script?

The two functions are testing two fundamentally different things.
`permutest` (as you invoked it --- `first = TRUE`) is testing the first
axis of the RDA. That axis is a linear combination of the constraints.

The `envfit` procedure is testing the individual correlations between
the 2-d configuration of samples in the RDA space and the direction of
maximal variance of the environmental data. Each variable is considered
separately.

I can easily imagine a case, where the variance explained on the first
axis is not significant but variance over 2 axes is significant, as one
where the vectors do not point solely along the first RDA axis but also
point along the 2nd axis. By looking only at their contributions to the
first axis and summing them you don't explain a whole lot, but when you
look at the directions in 2D space each variable explains a significant
amount of variance.

HTH

G

 Commands and output:
 
  permutest(rda.ind, perm=999, first=TRUE)
 
 Permutation test for rda 
 
 Call: rda(formula = x ~ ARH_frs + CON_frs + PRP_frs + RUI_frs +
 VAM_frs, data = z)
 Permutation test for first constrained eigenvalue
 Pseudo-F:4.093568 (with 1, 10 Degrees of Freedom)
 Significance:0.413 
 Based on 999 permutations under reduced model.
 
  fit - envfit(rda.ind, z, perm = 999, display = lc)
  fit
 
 ***VECTORS
 
  RDA1  RDA2 r2 Pr(r)
 VAM_frs  0.145281 -0.989390 0.2388  0.147
 
 ARH_frs -0.876494 -0.481413 0.6127  0.002 ** 
 CON_frs  0.904278  0.426944 0.4846  0.013 *  
 PRP_frs -0.997634  0.068755 0.9433  0.001 ***
 RUI_frs -0.648512 -0.761204 0.6243  0.004 ** 
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
 P values based on 999 permutations.
 
  
 Rémi Lesmerises, biol. M.Sc.,
 Candidat Ph.D. en Biologie
 Université du Québec à Rimouski
 remilesmeri...@yahoo.ca
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3


On 13-04-24 2:46 PM, Jens Olofsson wrote:

Ok. I apologise for not understanding. So, I have installed R-tools. It
changed my PATH-variable. I didn't installed Cygwin dlls as stated by
http://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset.
Instead my PATH-variable contains the path to the Cygwin dlls AFTER the
path to R... So, I started RTerm (32-bit) and tried  R CMD SHLIB mango.f95
and got the same error as earlier Error: unexpected symbol in R CMD.
The same goes for RTerm (64-bit). Can you pls advice me on how to proceed?


See my first paragraph below.

Duncan Murdoch


Sincerely Jens


On 24 April 2013 20:08, Duncan Murdoch murdoch.dun...@gmail.com wrote:


On 13-04-24 1:51 PM, Jens Olofsson wrote:


Dear Duncan,
I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob
and
also remember I am on Windows. How should I use R CMD SHLIB as if I write
that at the prompt I get the error: unexpected symbol in R CMD. I have
mango.f95 in the working directory.




That's a command-line command, not something done with R.  You can use it
from your bash shell if you have R and the Rtools directories on your path,
or from the Windows CMD shell.

BTW, my comment wasn't trying to tell you to go to a Cygwin forum, it was
telling you that Cygwin's gfortran is unsupported.  You need to use the
MinGW-64 one that we distribute if you want us to be able to help.

Duncan Murdoch

  //Jens




On 24 April 2013 19:46, Duncan Murdoch murdoch.dun...@gmail.com wrote:

  On 13-04-24 1:36 PM, Jens Olofsson wrote:


  Dear users of R

I have a subroutine in Fortran95, compiled to a DLL with gfortran in
Cygwin
4.5.3.



We don't support Cygwin.  You should use the gfortran in Rtools, and get
R
to set the command line options for you, either by putting the code in a
package, or by using R CMD SHLIB Mango.f95.

Duncan Murdoch

   The subroutine is:


subroutine MyPBP( S, p, N )
   ! Expose subroutine rtest to users of this DLL
   !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: mypbp_ ::mypbp
   ! This function computes the Poisson-Binomial distribution
   ! of size N using p
   double precision, intent(inout) :: S(N+1)
   double precision, intent(in) :: p(N)
   integer, intent(in) :: N
   double precision :: X(N+1)
   integer i, j
   !X=0
   !S=0
   X(1) = 1 - p(1)
   X(2) = p(1)
   do i = 2, N
   S(1) = X(1)*(1-p(i))
   do j = 2,i
   S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
   end do
   S(i+1) = X(i)*p(i)
   X = S
   if (i == N) then
   S = X
   end if
   end do
end subroutine MyPBP
and it is saved into Mango.f95
I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
gfortran-4 -shared -o Mango.dll Mango.o
I am on a Windows machine running Windows 7 with Intel i7.
I load the dll in a 32-bit R by dyn.load(Mango.dll). Using
getLoadedDLLs
I can see the DLL. However, is.loaded(Mango.dll) = FALSE. In
addition, R
stop responding when I try .Fortran(MyPBP, as.numeric(S),
as.numeric(p),
as.integer(N)),
where N-5, S-array(0,N+1) and p- c(0.1, 0.2, 0.5, 0.8, 0.9).

What am I doing wrong?
Any ideas, thoughts and/or comments are highly appreciated.

Jens

  [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-helphttps://stat.ethz.ch/mailman/**listinfo/r-help
https://stat.**ethz.ch/mailman/listinfo/r-**helphttps://stat.ethz.ch/mailman/listinfo/r-help



PLEASE do read the posting guide http://www.R-project.org/**
posting-guide.html 
http://www.R-project.org/**posting-guide.htmlhttp://www.R-project.org/posting-guide.html




and provide commented, minimal, self-contained, reproducible code.













__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Tables package - remove NAs and NaN

2013-04-24 Thread Santosh

Dear Rxperts,
Sorry if I am posting a really really dumb request.. I am new to subversion
and am trying to use subversion to download the tables package as suggested
by Duncan. I installed subversion client(from collabnet) and tried to
access tables package using the command below.

svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/

I get the following error message:
C:\Users\santosh\tempsvn checkout svn://
scm.r-forge.r-project.org/svnroot/tables/
svn: E730060: Unable to connect to a repository at URL
'svn://scm.r-forge.r-proj ect.org/svnroot/tables'
svn: E730060: Can't connect to host 'scm.r-forge.r-project.org': A
connection at tempt failed because the connected party did not properly
respond after a period  of time, or established connection failed because
connected host has failed to
respond.

Is there anything additional I need to do with Subversion or with the
commands?


Regards,
Santosh

On Tue, Apr 23, 2013 at 5:13 AM, Duncan Murdoch murdoch.dun...@gmail.comwrote:

 On 13-04-23 6:31 AM, Duncan Murdoch wrote:

 On 13-04-22 10:40 PM, David Winsemius wrote:


 On Apr 22, 2013, at 5:49 PM, Santosh wrote:

  Dear Rxperts,
 q - data.frame(p=rep(c(A,B),**each=10,len=30),
 a=rep(c(1,2,3),each=10),id=**seq(30),
 b=round(runif(30,10,20)),
 c=round(runif(30,40,70)))
 The operation below...
 tabular(((p=factor(p))*(a=**factor(a))+1) ~ (N = 1) + (b + c)*
 (mean+sd),data=q)
 yields some rows of NAs and NaN as shown below

   b   c
 p a   N  mean  sdmean  sd
 A 1   10 16.30 2.497 52.30  9.358
 20   NaNNA   NaN NA
 3   10 15.60 2.716 60.30  8.001
 B 10   NaNNA   NaN NA
 2   10 15.40 2.366 57.70 10.414
 30   NaNNA   NaN NA
 All 30 15.77 2.473 56.77  9.601

 How do I remove the rows having N=0 ?
 I would like the resulting table look like..
   b   c
 p a   N  mean  sdmean  sd
 A 1   10 16.30 2.497 52.30  9.358
   3   10 15.60 2.716 60.30  8.001
 B  2   10 15.40 2.366 57.70 10.414
 All 30 15.77 2.473 56.77  9.601


 Here's a bit of a hack:

 tabular( (`p a`=interaction(p,a, drop=TRUE, sep= )) ~ (N = 1) + (b +
 c)*
   (mean+sd),data=q)

   b   c
p a N  mean sd mean sd
A 1 10 12.8 0.7888 52.1 8.020
B 2 10 16.3 3.0569 54.9 8.711
A 3 10 14.6 3.7771 56.5 6.980

 I have been rather hoping that Duncan Murdoch would have noticed the
 earlier thread, but maybe he can comment on whether there is a more direct
 route/


 This isn't something that the package is designed to handle:  if you say
 p*a, it wants all combinations of p and a.

 If I wanted a table like that, I'd use a different hack.  One
 possibility is to create that interaction column, but display it as just
 the initial letter, labelled p, and then add another column to contain
 the a values as data.  It would be tricky to get the formatting right.

 Another possibility is to generate the whole table with the N=0 rows,
 and then post-process it to remove those rows, and adjust the row labels
 appropriately.  This approach probably gives the nicer result, but the
 post-processing is quite messy:  you need to delete some rows from the
 table, from its rowLabels attribute, and from the justification
 attributes of both the table and its rowLabels.  (I should add a [
 method to the package to hide this messiness.)


 I've done this now, in version 0.7.54 on R-forge.  To leave out the rows
 with N=0, you can select a subset of the table where N (the first column)
 is non-zero:

 tab - tabular(((p=factor(p))*(a=**factor(a))+1) ~ (N = 1) + (b +
 c)*(mean+sd),data=q)

 tab[ tab[,1]  0, ]

 and it produces this:


  b   c
  p a   N  mean  sdmean sd
  A 1   10 16.20 3.458 56.3 10.155
3   10 13.60 2.119 58.1  8.075
  B 2   10 14.40 2.547 51.2  9.438
All 30 14.73 2.888 55.2  9.419

 Indexing of tables isn't as general as indexing of matrices, but most of
 the simple forms should work.  I haven't tested yet, but I expect this will
 be fine in LaTeX or HTML (also new, not on CRAN yet) output as well.

 Duncan Murdoch


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pglm package: fitted values and residuals

2013-04-24 Thread Paul Johnson

On Wed, Apr 24, 2013 at 3:11 AM,  alfonso.carf...@uniparthenope.it wrote:
 I'm using the package pglm and I'have estimated a random probit model.
 I need to save in a vector the fitted values and the residuals of the model
 but I can not do it.

 I tried with the command fitted.values using the following procedure without
 results:

This is one of those ask the pglm authors questions. You should take
it up with the authors of the package.  There is a specialized email
list R-sig-mixed where you will find more people working on this exact
same thing.

pglm looks like fun to me, but it is not quite done, so far as I can
tell. Well, the authors have not gone the extra step to make their
regression objects behave like other R regression objects.   In case
you need alternative software, ask in R-sig-mixed. You'll learn that
most of these can be estimated with other packages. But I really like
the homogeneous user interface that is spelled out in pglm, and I
expect my students will run into the same questions that you have..

I just downloaded their source code, you probably ought to do that so
you can understand what they are doing.   They provide the fitting
functions, but they do not do any of the other work necessary to make
these functions fit together with the R class framework.  There are no
methods for predict, anova, and so forth.

I'm in their R folder looking for implementations:

pauljohn@pols110:/tmp/pglm/R$ grep summary *
pauljohn@pols110:/tmp/pglm/R$ grep predict *
pauljohn@pols110:/tmp/pglm/R$ grep class *
pauljohn@pols110:/tmp/pglm/R$ grep fitted *
pglm.R:  # glm models can be fitted

Run

 example(pglm)

what can we do after that?

 plot(anb)
Error in xy.coords(x, y, xlabel, ylabel, log) :
  'x' is a list, but does not have components 'x' and 'y'

## Nothing.
## We do get a regression summary object, that's better than some
packages provide:

 anbsum - summary(anb)

## And a coefficient table

 coef(anbsum)
 Estimate  Std. error   t value  Pr( t)
(Intercept) -6.933764e-01 0.061391429 -11.294351205 1.399336e-29
wage 1.517009e-02 0.006375966   2.379261231 1.734738e-02
exper1.314229e-03 0.007400129   0.177595444 8.590407e-01
ruralyes-8.594328e-05 0.051334716  -0.001674175 9.986642e-01

 model.matrix(anb)
Error in terms.default(object) : no terms component nor attribute
 anova(anb)
Error in UseMethod(anova) :
  no applicable method for 'anova' applied to an object of class
c('maxLik', 'maxim')
 predict(anb)
Error in UseMethod(predict) :
  no applicable method for 'predict' applied to an object of class
c('maxLik', 'maxim')

So, if you want those features with these models, you'll have to get
busy and do a lot of coding!

While working on regression support lately, I've reached the
conclusion that if an R package that claims to do regression but
does not provide methods for summary, predict, anova, nobs, fitted,
logLik, AIC, and so forth, then it is not done yet. Otherwise, users
like you who expect to be able to run methods like fitted or such have
a bad experience, as you are having now.

Maybe somebody reading this will remind us where the common list of R
regression methods is listed. I know for sure I've seen a document
about these things, but I'm baffled now trying to find it. But I'm
sure there is one.


pj

 library(pglm)

 m1_S-pglm(Feed ~ Cons_PC_1 + imp_gen_1 + LGDP_PC_1 + lnEI_1 +
 SH_Ren_1,data,family=binomial(probit),model=random,method=bfgs,index=c(Year,IDCountry))

 m1_S$fitted.values
 residuals(m1)


 Can someone help me about it?

 Thanks


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Tables package - remove NAs and NaN


On 13-04-24 3:23 PM, Santosh wrote:

Dear Rxperts,
Sorry if I am posting a really really dumb request.. I am new to subversion
and am trying to use subversion to download the tables package as suggested
by Duncan. I installed subversion client(from collabnet) and tried to
access tables package using the command below.

svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/

I get the following error message:
C:\Users\santosh\tempsvn checkout svn://
scm.r-forge.r-project.org/svnroot/tables/
svn: E730060: Unable to connect to a repository at URL
'svn://scm.r-forge.r-proj ect.org/svnroot/tables'
svn: E730060: Can't connect to host 'scm.r-forge.r-project.org': A
connection at tempt failed because the connected party did not properly
respond after a period  of time, or established connection failed because
connected host has failed to
respond.

Is there anything additional I need to do with Subversion or with the
commands?




The spacing looks funny there:  You should have no blanks in the path.

Maybe you didn't, it's only the email.  In that case, R-forge is 
probably just responding very slowly.


You could try this path instead:

svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/pkg/tables

This is the subdirectory that contains everything in the package.

And you should be able to install the package directly from R-forge by 
setting your repository to http://R-forge.r-project.org, but it is very 
slow on updates, so it hasn't got to the current version yet.


Duncan Murdoch




Regards,
Santosh

On Tue, Apr 23, 2013 at 5:13 AM, Duncan Murdoch murdoch.dun...@gmail.comwrote:


On 13-04-23 6:31 AM, Duncan Murdoch wrote:


On 13-04-22 10:40 PM, David Winsemius wrote:



On Apr 22, 2013, at 5:49 PM, Santosh wrote:

  Dear Rxperts,

q - data.frame(p=rep(c(A,B),**each=10,len=30),
a=rep(c(1,2,3),each=10),id=**seq(30),
b=round(runif(30,10,20)),
c=round(runif(30,40,70)))
The operation below...
tabular(((p=factor(p))*(a=**factor(a))+1) ~ (N = 1) + (b + c)*
(mean+sd),data=q)
yields some rows of NAs and NaN as shown below

   b   c
p a   N  mean  sdmean  sd
A 1   10 16.30 2.497 52.30  9.358
 20   NaNNA   NaN NA
 3   10 15.60 2.716 60.30  8.001
B 10   NaNNA   NaN NA
 2   10 15.40 2.366 57.70 10.414
 30   NaNNA   NaN NA
 All 30 15.77 2.473 56.77  9.601

How do I remove the rows having N=0 ?
I would like the resulting table look like..
   b   c
p a   N  mean  sdmean  sd
A 1   10 16.30 2.497 52.30  9.358
   3   10 15.60 2.716 60.30  8.001
B  2   10 15.40 2.366 57.70 10.414
 All 30 15.77 2.473 56.77  9.601



Here's a bit of a hack:

tabular( (`p a`=interaction(p,a, drop=TRUE, sep= )) ~ (N = 1) + (b +
c)*
   (mean+sd),data=q)

   b   c
p a N  mean sd mean sd
A 1 10 12.8 0.7888 52.1 8.020
B 2 10 16.3 3.0569 54.9 8.711
A 3 10 14.6 3.7771 56.5 6.980

I have been rather hoping that Duncan Murdoch would have noticed the
earlier thread, but maybe he can comment on whether there is a more direct
route/



This isn't something that the package is designed to handle:  if you say
p*a, it wants all combinations of p and a.

If I wanted a table like that, I'd use a different hack.  One
possibility is to create that interaction column, but display it as just
the initial letter, labelled p, and then add another column to contain
the a values as data.  It would be tricky to get the formatting right.

Another possibility is to generate the whole table with the N=0 rows,
and then post-process it to remove those rows, and adjust the row labels
appropriately.  This approach probably gives the nicer result, but the
post-processing is quite messy:  you need to delete some rows from the
table, from its rowLabels attribute, and from the justification
attributes of both the table and its rowLabels.  (I should add a [
method to the package to hide this messiness.)



I've done this now, in version 0.7.54 on R-forge.  To leave out the rows
with N=0, you can select a subset of the table where N (the first column)
is non-zero:

tab - tabular(((p=factor(p))*(a=**factor(a))+1) ~ (N = 1) + (b +
c)*(mean+sd),data=q)

tab[ tab[,1]  0, ]

and it produces this:


  b   c
  p a   N  mean  sdmean sd
  A 1   10 16.20 3.458 56.3 10.155
3   10 13.60 2.119 58.1  8.075
  B 2   10 14.40 2.547 51.2  9.438
All 30 14.73 2.888 55.2  9.419

Indexing of tables isn't as general as indexing of matrices, but most of
the simple forms should work.  I haven't tested yet, but I expect this will
be fine in LaTeX or HTML (also new, not on CRAN yet) output as well.

Duncan Murdoch





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] RDA permutest envfit

2013-04-24 Thread Rémi Lesmerises

Thank you!

A last question: Is it still the same explanation if I remove the condition 
first=TRUE (and then testing for all axis) and permutest gives the same 
result?
Â 
RÃ©mi Lesmerises, biol. M.Sc.,
Candidat Ph.D. en Biologie
UniversitÃ© du QuÃ©bec Ã  Rimouski
remilesmeri...@yahoo.ca



On Wed, 2013-04-24 at 10:38 -0700, RÃ©mi Lesmerises wrote:
 Dear all,
 
 I did a RDA and when I looked to the signification of the test with
 permutest, the output was non-significant. But when I used the envfit
 function, some of the vectors are significant. All the test's
 conditions are respected. What it means? Is it an error in the script?

The two functions are testing two fundamentally different things.
`permutest` (as you invoked it --- `first = TRUE`) is testing the first
axis of the RDA. That axis is a linear combination of the constraints.

The `envfit` procedure is testing the individual correlations between
the 2-d configuration of samples in the RDA space and the direction of
maximal variance of the environmental data. Each variable is considered
separately.

I can easily imagine a case, where the variance explained on the first
axis is not significant but variance over 2 axes is significant, as one
where the vectors do not point solely along the first RDA axis but also
point along the 2nd axis. By looking only at their contributions to the
first axis and summing them you don't explain a whole lot, but when you
look at the directions in 2D space each variable explains a significant
amount of variance.

HTH

G

 Commands and output:
 
  permutest(rda.ind, perm=999, first=TRUE)
 
 Permutation test for rda 
 
 Call: rda(formula = x ~ ARH_frs + CON_frs + PRP_frs + RUI_frs +
 VAM_frs, data = z)
 Permutation test for first constrained eigenvalue
 Pseudo-F:Â  Â  Â  Â  4.093568 (with 1, 10 Degrees of Freedom)
 Significance:Â  Â  0.413 
 Based on 999 permutations under reduced model.
 
  fit - envfit(rda.ind, z, perm = 999, display = lc)
  fit
 
 ***VECTORS
 
Â  Â  Â  Â  Â  Â  Â  RDA1Â  Â  Â  RDA2Â  Â   r2 Pr(r)Â  Â  
 VAM_frsÂ  0.145281 -0.989390 0.2388Â  0.147Â  Â  
 
 ARH_frs -0.876494 -0.481413 0.6127Â  0.002 ** 
 CON_frsÂ  0.904278Â  0.426944 0.4846Â  0.013 *Â  
 PRP_frs -0.997634Â  0.068755 0.9433Â  0.001 ***
 RUI_frs -0.648512 -0.761204 0.6243Â  0.004 ** 
 ---
 Signif. codes:Â  0 â***â 0.001 â**â 0.01 â*â 0.05 â.â 0.1 â 
 â 1 
 P values based on 999 permutations.
 
Â  
 RÃ©mi Lesmerises, biol. M.Sc.,
 Candidat Ph.D. en Biologie
 UniversitÃ© du QuÃ©bec Ã  Rimouski
 remilesmeri...@yahoo.ca
 
 Â Â Â  [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin SimpsonÂ  Â  Â  Â  Â  Â   [t] +44 (0)20 7679 0522
ECRC, UCL Geography,Â  Â  Â  Â  Â  [f] +44 (0)20 7679 0565
Pearson Building,Â  Â  Â  Â  Â  Â   [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, LondonÂ  Â  Â  Â  Â  [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.Â  Â  Â  Â  Â  Â  Â  Â   [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’


On 13-04-24 10:57 AM, Duncan Murdoch wrote:

On 13-04-24 10:12 AM, Martin Morgan wrote:

On 04/24/2013 06:03 AM, Duncan Murdoch wrote:

On 13-04-24 5:46 AM, Liviu Andronic wrote:

Dear all,
I've bumped into the: Error in loadNamespace(name) : there is no
package called ‘R.utils’ error. I've already read a bit on this (
http://www.cybaea.net/Blogs/Data/A-warning-on-the-R-save-format.html )
but I have a follow-up question.

Given a workspace that automatically loads a package that I don't
really need/want (e.g. ‘R.utils’), how do I identify which object
requires this package to load? I would like to avoid loading ‘R.utils’
every time I open an R session.


That's not easy, because the code in R that triggers that error has no idea of
the name of the object it is loading.


Maybe traceback() can provide some hints? I did, more or less arbitrarily

library(rms)
a = list(fun=ie.setup)
save(a, file=/tmp/a.rda)
remove.packages(rms)

and then in a new session

load(/tmp/a.rda)
Error in loadNamespace(name) : there is no package called 'rms'
traceback()
7: stop(e)
6: value[[3L]](cond)
5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
4: tryCatchList(expr, classes, parentenv, handlers)
3: tryCatch(loadNamespace(name), error = function(e) stop(e))
2: getNamespace(c(rms, 3.6-3))
1: load(/tmp/a.rda)

with the line numbered 2 giving me the necessary hint.


That tells you that some object needs the rms package, but I don't see
how you would know it is a that is the problem.  We already knew that
rms was needed from the error message.

What I've done sometimes in debugging is to change that error to a
warning in the getNamespace() function, and add some tracing code to the
serialization code to print the names of objects as they are loaded.
(This goes in ReadItem in src/main/serialize.c.)

I wouldn't expect Liviu to make those changes, but perhaps a verbose
option could be added to load(), so that it could be available to users.



I have added this in R-devel.  The format of the printed output may well 
change before this is ever released, but it should be enough to identify 
the bad item already.


You'll need a build of R-devel from r62658 or newer to see this.  Then

load(/tmp/a.rda, verbose=TRUE)

will print the names of objects as they are read (the names are read 
after the attributes and before the value).  If you want to see reams of 
mostly useless information, you can try verbose=n (for some number n=2 
or more); this prints names and component numbers to a greater depth.


Duncan Murdoch



Duncan Murdoch



Martin




You could try a binary search to find out, but it will be tedious:
1. Install R.utils.
2. Load the workspace successfully.
3. Delete half the objects, and save it.
4. Uninstall R.utils, and see if you can load the workspace.

At this point you'll know if there's an object needing R.utils still left or
not, and you can repeat the steps until you find a single object that causes the
problem.  (But it might not be the only one, so deleting it from the original
workspace might not solve your problem.)

A better approach is to *never* save and load workspaces unless you know exactly
what is in them.  Always reply no to the question about saving your workspace
(or set that as the default).  If you accidentally end up with a workspace being
loaded, delete it.

Duncan Murdoch



Regards,
Liviu



sessionInfo()

R version 2.15.3 (2013-03-01)
Platform: i686-pc-linux-gnu (32-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  utils stats
methods   base

other attached packages:
[1] R.utils_1.23.2R.oo_1.13.0   R.methodsS3_1.4.2
tables_0.7.57 reshape2_1.2.2
[6] car_2.0-15nnet_7.3-6MASS_7.3-23
Hmisc_3.10-1  survival_2.37-2
[11] foreign_0.8-53

loaded via a namespace (and not attached):
[1] cluster_1.14.3   grid_2.15.3  lattice_0.20-13  plyr_1.8
rstudio_0.97.312
[6] stringr_0.6.2tools_2.15.3




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.







__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] RDA permutest envfit

2013-04-24 Thread Gavin Simpson

On Wed, 2013-04-24 at 13:13 -0700, Rémi Lesmerises wrote:
 Thank you!
 
 A last question: Is it still the same explanation if I remove the
 condition first=TRUE (and then testing for all axis) and permutest
 gives the same result?

No; again `permutest(, first = FALSE)` and `envfit()` are testing
different models/fits but how they are different is not the same as with
your previous question.

`permutest(, first = FALSE)` is testing whether the variance
explained by the k environmental variable (k = 4 I believe in your case)
is sufficiently large as to be unusual when compared to the variance
explained under a null model (achieved by permuting residuals,
propagating those residuals plus fitted values to give new data and
refitting the model on those new data). `envfit()` is doing what it
always did; assessing whether the correlation between the vector and the
2-d configuration is unusually large.

In the `permutest(, first = FALSE)` case, you have a model with 4df.
In the `envfit()` case you have 4 models each with 1 degree of freedom.

The two methods also differ in terms of the role played by the
environmental data. In the `permutest` case, the environmental data are
the predictors. In the `envfit` case the env data are the responses and
the axis scores in the 2 dimensions chosen are the predictors.

HTH

G

 Rémi Lesmerises, biol. M.Sc.,
 Candidat Ph.D. en Biologie
 Université du Québec à Rimouski
 remilesmeri...@yahoo.ca
 
 
 
 On Wed, 2013-04-24 at 10:38 -0700, Rémi Lesmerises wrote:
  Dear all,
  
  I did a RDA and when I looked to the signification of the test with
  permutest, the output was non-significant. But when I used the envfit
  function, some of the vectors are significant. All the test's
  conditions are respected. What it means? Is it an error in the script?
 
 The two functions are testing two fundamentally different things.
 `permutest` (as you invoked it --- `first = TRUE`) is testing the first
 axis of the RDA. That axis is a linear combination of the constraints.
 
 The `envfit` procedure is testing the individual correlations between
 the 2-d configuration of samples in the RDA space and the direction of
 maximal variance of the environmental data. Each variable is considered
 separately.
 
 I can easily imagine a case, where the variance explained on the first
 axis is not significant but variance over 2 axes is significant, as one
 where the vectors do not point solely along the first RDA axis but also
 point along the 2nd axis. By looking only at their contributions to the
 first axis and summing them you don't explain a whole lot, but when you
 look at the directions in 2D space each variable explains a significant
 amount of variance.
 
 HTH
 
 G
 
  Commands and output:
  
   permutest(rda.ind, perm=999, first=TRUE)
  
  Permutation test for rda 
  
  Call: rda(formula = x ~ ARH_frs + CON_frs + PRP_frs + RUI_frs +
  VAM_frs, data = z)
  Permutation test for first constrained eigenvalue
  Pseudo-F:4.093568 (with 1, 10 Degrees of Freedom)
  Significance:0.413 
  Based on 999 permutations under reduced model.
  
   fit - envfit(rda.ind, z, perm = 999, display = lc)
   fit
  
  ***VECTORS
  
   RDA1  RDA2 r2 Pr(r)
  VAM_frs  0.145281 -0.989390 0.2388  0.147
  
  ARH_frs -0.876494 -0.481413 0.6127  0.002 ** 
  CON_frs  0.904278  0.426944 0.4846  0.013 *  
  PRP_frs -0.997634  0.068755 0.9433  0.001 ***
  RUI_frs -0.648512 -0.761204 0.6243  0.004 ** 
  ---
  Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
  P values based on 999 permutations.
  
   
  Rémi Lesmerises, biol. M.Sc.,
  Candidat Ph.D. en Biologie
  Université du Québec à Rimouski
  remilesmeri...@yahoo.ca
  
  [[alternative HTML version deleted]]
  
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pglm package: fitted values and residuals

2013-04-24 Thread Achim Zeileis


On Wed, 24 Apr 2013, Paul Johnson wrote:


On Wed, Apr 24, 2013 at 3:11 AM,  alfonso.carf...@uniparthenope.it wrote:

I'm using the package pglm and I'have estimated a random probit 
model. I need to save in a vector the fitted values and the residuals 
of the model but I can not do it.


I tried with the command fitted.values using the following procedure 
without results:


This is one of those ask the pglm authors questions. You should take 
it up with the authors of the package.  There is a specialized email 
list R-sig-mixed where you will find more people working on this exact 
same thing.


pglm looks like fun to me, but it is not quite done, so far as I can 
tell. Well, the authors have not gone the extra step to make their 
regression objects behave like other R regression objects.  In case you 
need alternative software, ask in R-sig-mixed. You'll learn that most of 
these can be estimated with other packages. But I really like the 
homogeneous user interface that is spelled out in pglm, and I expect my 
students will run into the same questions that you have..


I just downloaded their source code, you probably ought to do that so
you can understand what they are doing.   They provide the fitting
functions, but they do not do any of the other work necessary to make
these functions fit together with the R class framework.  There are no
methods for predict, anova, and so forth.


This is only partially true. In fact, pglm employs the framework 
provided by the maxLik (by Ott Toomet and Arne Henningsen) and hence it 
inherits some of the methods that maxLik provides for all of its fitted 
model objects. So there is summary(), coef(), vcov(), AIC() work and you 
can leverage tools like coeftest() from lmtest, linearHypothesis() from 
car or sandwich() from sandwich do work.


But it is certainly true that even more features would be desirable and I 
think that Yves always planned on enhancing pglm at some point. (After 
all it's still the initial version 0.1-0 on CRAN...)



I'm in their R folder looking for implementations:

pauljohn@pols110:/tmp/pglm/R$ grep summary *
pauljohn@pols110:/tmp/pglm/R$ grep predict *
pauljohn@pols110:/tmp/pglm/R$ grep class *
pauljohn@pols110:/tmp/pglm/R$ grep fitted *
pglm.R:  # glm models can be fitted

Run


example(pglm)


what can we do after that?


plot(anb)

Error in xy.coords(x, y, xlabel, ylabel, log) :
 'x' is a list, but does not have components 'x' and 'y'

## Nothing.
## We do get a regression summary object, that's better than some
packages provide:


anbsum - summary(anb)


## And a coefficient table


coef(anbsum)

Estimate  Std. error   t value  Pr( t)
(Intercept) -6.933764e-01 0.061391429 -11.294351205 1.399336e-29
wage 1.517009e-02 0.006375966   2.379261231 1.734738e-02
exper1.314229e-03 0.007400129   0.177595444 8.590407e-01
ruralyes-8.594328e-05 0.051334716  -0.001674175 9.986642e-01


model.matrix(anb)

Error in terms.default(object) : no terms component nor attribute

anova(anb)

Error in UseMethod(anova) :
 no applicable method for 'anova' applied to an object of class
c('maxLik', 'maxim')

predict(anb)

Error in UseMethod(predict) :
 no applicable method for 'predict' applied to an object of class
c('maxLik', 'maxim')

So, if you want those features with these models, you'll have to get
busy and do a lot of coding!

While working on regression support lately, I've reached the conclusion 
that if an R package that claims to do regression but does not provide 
methods for summary, predict, anova, nobs, fitted, logLik, AIC, and so 
forth, then it is not done yet. Otherwise, users like you who expect to 
be able to run methods like fitted or such have a bad experience, as you 
are having now.


Maybe somebody reading this will remind us where the common list of R
regression methods is listed. I know for sure I've seen a document
about these things, but I'm baffled now trying to find it. But I'm
sure there is one.


I'm sure that there are many. One of my attempts to write up a list is in 
Table 1 of vignette(betareg, package = betareg).


Personally, I don't write anova() methods for my model objects because I 
can leverage lrtest() and waldtest() from lmtest and linearHypothesis() 
and deltaMethod() from car as long as certain standard methods are 
available, including coef(), vcov(), logLik(), etc.


Similarly, an AIC() method is typically not needed as long as logLik() is 
available. And BIC() works if nobs() is available in addition.


Best,
Z



pj


library(pglm)

m1_S-pglm(Feed ~ Cons_PC_1 + imp_gen_1 + LGDP_PC_1 + lnEI_1 +
SH_Ren_1,data,family=binomial(probit),model=random,method=bfgs,index=c(Year,IDCountry))

m1_S$fitted.values
residuals(m1)


Can someone help me about it?

Thanks



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and

[R] getting started in parallel computing on a windows OS

2013-04-24 Thread Benjamin Caldwell

Dear R help,

I've what I think is a fairly simple parallel problem, and am getting
bogged down in documentation and packages for much more complex situations.

I have a big matrix  (30^5,5]. I have a function that will act on each row
of that matrix sequentially and output the 'best' result from the whole
matrix (it compares the result from each row to the last and keeps the
'better' result). I would like to divide that first large matrix into
chunks equal to the number of cores I have available to me, and work
through each chunk, then output the results from each chunk.

I'm really having trouble making head or tail of how to do this on a
windows machine - lots of different false starts on several different
packages now. Basically, I have the function, and I can of course easily
divide the matrix into chunks. I just need a way to process each chunk
in parallel (other than opening new R sessions for each core manually).

Any help much appreciated - after two days of trying to get this to work
I'm pretty burnt out.

Thanks

*Ben Caldwell*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Tables package - remove NAs and NaN

2013-04-24 Thread Santosh

yes, With Duncan's and Liviu's help,  I was able to remove those NAs and
NaNs from the tabular summary.

svn .. thing has not worked for me yet.. would try this later..

Thanks so much!
Regards,
Santosh



On Wed, Apr 24, 2013 at 1:46 PM, Santosh santosh2...@gmail.com wrote:

 Thanks so much.. I will try it out.
 Regards,
 Santosh


 On Wed, Apr 24, 2013 at 1:39 PM, Duncan Murdoch 
 murdoch.dun...@gmail.comwrote:

 On 13-04-24 4:29 PM, Santosh wrote:

 Dear Duncan,

 Thanks for your email. For some reason svn does not seem to work on my
 machine.. not sure if it is due to firewall or due to my company's IS
 environment settings. I could, however, install the package after
 pointing
 the repos to r-forge site as you had suggested. But, yes, the versions
 installed were 0.7.47 and 0.7.51 and therefore is not the latest one.

 Could you, please, be able to send it to me as a zipped attachment that I
 can use in both 32-bit and 64-bit Windows?


 You can download it from

 http://www.stats.uwo.ca/**faculty/murdoch/temp/tables_0.**7.57.tar.gzhttp://www.stats.uwo.ca/faculty/murdoch/temp/tables_0.7.57.tar.gz

 for the source, or

 http://www.stats.uwo.ca/**faculty/murdoch/temp/tables_0.**7.57.ziphttp://www.stats.uwo.ca/faculty/murdoch/temp/tables_0.7.57.zip

 for the binary install.  I think the latter should work in both 32 and 64
 bit Windows, but I haven't tested it.

 Duncan Murdoch


 Thanks so much,
 Santosh


 On Wed, Apr 24, 2013 at 1:02 PM, Duncan Murdoch 
 murdoch.dun...@gmail.com**wrote:

  On 13-04-24 3:23 PM, Santosh wrote:

  Dear Rxperts,

 Sorry if I am posting a really really dumb request.. I am new to
 subversion
 and am trying to use subversion to download the tables package as
 suggested
 by Duncan. I installed subversion client(from collabnet) and tried to
 access tables package using the command below.

 svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/
 http://**scm.r-forge.r-project.org/**svnroot/tables/http://scm.r-forge.r-project.org/svnroot/tables/
 


 I get the following error message:
 C:\Users\santosh\tempsvn checkout svn://
 scm.r-forge.r-project.org/svnroot/tables/http://scm.r-forge.r-project.org/**svnroot/tables/
 http://scm.r-**forge.r-project.org/svnroot/**tables/http://scm.r-forge.r-project.org/svnroot/tables/
 

 svn: E730060: Unable to connect to a repository at URL
 'svn://scm.r-forge.r-proj ect.org/svnroot/tables'
 svn: E730060: Can't connect to host 'scm.r-forge.r-project.org': A
 connection at tempt failed because the connected party did not properly
 respond after a period  of time, or established connection failed
 because
 connected host has failed to
 respond.

 Is there anything additional I need to do with Subversion or with the
 commands?



 The spacing looks funny there:  You should have no blanks in the path.

 Maybe you didn't, it's only the email.  In that case, R-forge is
 probably
 just responding very slowly.

 You could try this path instead:

 svn checkout svn://scm.r-forge.r-project.
 org/svnroot/tables/pkg/tables**http://scm.r-forge.r-project.**
 org/svnroot/tables/pkg/tableshttp://scm.r-forge.r-project.org/svnroot/tables/pkg/tables
 


 This is the subdirectory that contains everything in the package.

 And you should be able to install the package directly from R-forge by
 setting your repository to http://R-forge.r-project.org, but it is very
 slow on updates, so it hasn't got to the current version yet.

 Duncan Murdoch



  Regards,
 Santosh

 On Tue, Apr 23, 2013 at 5:13 AM, Duncan Murdoch 
 murdoch.dun...@gmail.com

 **wrote:


   On 13-04-23 6:31 AM, Duncan Murdoch wrote:


   On 13-04-22 10:40 PM, David Winsemius wrote:



  On Apr 22, 2013, at 5:49 PM, Santosh wrote:

Dear Rxperts,

  q - data.frame(p=rep(c(A,B),**each=10,len=30),
 a=rep(c(1,2,3),each=10),id=**seq(30),


 b=round(runif(30,10,20)),
 c=round(runif(30,40,70)))
 The operation below...
 tabular(((p=factor(p))*(a=**factor(a))+1) ~ (N = 1) + (b + c)*


 (mean+sd),data=q)
 yields some rows of NAs and NaN as shown below

 b   c
 p a   N  mean  sdmean  sd
 A 1   10 16.30 2.497 52.30  9.358
   20   NaNNA   NaN NA
   3   10 15.60 2.716 60.30  8.001
 B 10   NaNNA   NaN NA
   2   10 15.40 2.366 57.70 10.414
   30   NaNNA   NaN NA
   All 30 15.77 2.473 56.77  9.601

 How do I remove the rows having N=0 ?
 I would like the resulting table look like..
 b   c
 p a   N  mean  sdmean  sd
 A 1   10 16.30 2.497 52.30  9.358
 3   10 15.60 2.716 60.30  8.001
 B  2   10 15.40 2.366 57.70 10.414
   All 30 15.77 2.473 56.77  9.601


  Here's a bit of a hack:

 tabular( (`p a`=interaction(p,a, drop=TRUE, sep= )) ~ (N = 1) +
 (b +
 c)*
 (mean+sd),data=q)

 b   c
  p a N  mean sd mean sd
  A 1 10 12.8 0.7888 52.1 8.020
  B 2 10 16.3 3.0569 54.9 8.711
  A 3 10 14.6 3.7771 56.5 6.980

 I

Re: [R] Trouble Computing Type III SS in a Cox Regression

2013-04-24 Thread Terry Therneau

I should hope that there is trouble, since type III is an undefined concept for a Cox 
model.  Since SAS Inc fostered the cult of type III they have recently added it as an 
option for phreg, but I am not able to find any hints in the phreg documentation of what 
exactly they are doing when you invoke it.  If you can unearth this information, then I 
will be happy to tell you whether

   a. using the test (whatever it is) makes any sense at all for your data set
   b. if a is true, how to get it out of R

I use the word cult on purpose -- an entire generation of users who believe in the 
efficacy of this incantation without having any idea what it actually does.  In many 
particular instances the SAS type III corresponds to a survey sampling question, i.e., 
reweight the data so that it is balanced wrt factor A and then test factor B in the new 
sample.  The three biggest problems with type III are that
1: the particular test has been hyped as better when in fact it sometimes is sensible 
and sometimes not, 2: SAS implemented it as a computational algorithm which unfortunately 
often works even when the underlying rationale does not hold and
3: they explain it using a notation that completely obscures the actual question.  This 
last leads to the nonsense phrase test for main effects in the presence of interactions.


There is a survey reweighted approach for Cox models, very closely related to the work 
on causal inference (marginal structural models), but I'd bet dollars to donuts that 
this is not what SAS is doing.


(Per 2 -- type III was a particular order of operations of the sweep algorithm for linear 
models, and for backwards compatability that remains the core definition even as 
computational algorthims have left sweep behind.  But Cox models can't be computed using 
the sweep algorithm).


Terry Therneau


On 04/24/2013 12:41 PM, r-help-requ...@r-project.org wrote:

Hello All,

Am having some trouble computing Type III SS in a Cox Regression using either 
drop1 or Anova from the car package. Am hoping that people will take a look to 
see if they can tell what's going on.

Here is my R code:

cox3grp- subset(survData,
Treatment %in% c(DC, DA, DO),
c(PTNO, Treatment, PFS_CENSORED, PFS_MONTHS, AGE, PS2))
cox3grp- droplevels(cox3grp)
str(cox3grp)

coxCV- coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data=cox3grp, method = 
efron)
coxCV

drop1(coxCV, test=Chisq)

require(car)
Anova(coxCV, type=III)

And here are my results:

cox3grp- subset(survData,
+   Treatment %in% c(DC, DA, DO),
+   c(PTNO, Treatment, PFS_CENSORED, PFS_MONTHS, AGE, 
PS2))

  cox3grp- droplevels(cox3grp)
  str(cox3grp)

'data.frame':   227 obs. of  6 variables:
  $ PTNO: int  1195997 104625 106646 1277507 220506 525343 789119 
817160 824224 82632 ...
  $ Treatment   : Factor w/ 3 levels DC,DA,DO: 1 1 1 1 1 1 1 1 1 1 ...
  $ PFS_CENSORED: int  1 1 1 0 1 1 1 1 0 1 ...
  $ PFS_MONTHS  : num  1.12 8.16 6.08 1.35 9.54 ...
  $ AGE : num  72 71 80 65 72 60 63 61 71 70 ...
  $ PS2 : Ord.factor w/ 2 levels YesNo: 2 2 2 2 2 2 2 2 2 2 ...
  
  coxCV- coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data=cox3grp, method = efron)

  coxCV

Call:
coxph(formula = Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2,
 data = cox3grp, method = efron)


   coef exp(coef) se(coef)  z p
AGE0.00492 1.005  0.00789  0.624 0.530
PS2.L -0.34523 0.708  0.14315 -2.412 0.016

Likelihood ratio test=5.66  on 2 df, p=0.0591  n= 227, number of events= 198
  
  drop1(coxCV, test=Chisq)

Single term deletions

Model:
Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2
DfAICLRT Pr(Chi)
none 1755.2
AGE 1 1753.6 0.3915  0.53151
PS2 1 1758.4 5.2364  0.02212 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  require(car)

  Anova(coxCV,  type=III)

Analysis of Deviance Table (Type III tests)
 LR Chisq Df Pr(Chisq)
AGE   0.3915  10.53151
PS2   5.2364  10.02212 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  

Both drop1 and Anova give me a different p-value than I get from coxph for both 
my two-level ps2 variable and for age. This is not what I would expect based on 
experience using SAS to conduct similar analyses. Indeed SAS consistently 
produces the same p-values. Namely the ones I get from coxph.

My sense is that I'm probably misusing R in some way but I'm not sure what I'm 
likely to be doing wrong. SAS prodcues Wald Chi-Square results for its type III 
tests. Maybe that has something to do with it. Ideally, I'd like to get type 
III values that match those from coxph. If anyone could help me understand 
better, that would be greatly appreciated.

Thanks,

Paul


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide

[R] Floating point precision causing undesireable behaviour when printing as.POSIXlt times with microseconds?

2013-04-24 Thread O'Hanlon, Simon J

Dear list,
When using as.POSIXlt with times measured down to microseconds the default 
format.POSIXlt seems to cause some possibly undesirable behaviour:

According to the code in format.POSIXlt the maximum accuracy of printing 
fractional seconds is 1 microsecond, but if I do;

options( digits.secs = 6 )
as.POSIXlt( 1.02 , tz=, origin=1970-01-01)
as.POSIXlt( 1.98 , tz=, origin=1970-01-01)
as.POSIXlt( 1.99 , tz=, origin=1970-01-01)

I return respectively:
[1] 1970-01-01 01:00:01.02 BST
[1] 1970-01-01 01:00:01.98 BST
[1] 1970-01-01 01:00:01 BST

If options( digits.secs = 6 ) should I not expect to be able to print 1.99 
seconds? This seems to be caused by the following code fragment in 
format.POSIXlt:

np - getOption(digits.secs)
if (is.null(np))
np - 0L
else np - min(6L, np)
if (np = 1L)
for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs,
i))  1e-06)) {
np - i
break
}

Specifically
for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs,
i))  1e-06))
Which in the case of 1.99 seconds will give:

options( scipen = 10 )
np - 6
sapply( seq_len(np) - 1L , function(x) abs(1.99 - round(1.99, x)) )

 [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.01 0.01 0.01 0.01 0.01 0.01

The logical test all( ...  1e-06)  should evaluate to FALSE but due to 
floating point precision it evaluates TRUE:

sprintf( %.20f , abs(1. 99  - round(1. 99,5)))
[1] 0.00991773

If instead of:

for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs,
i))  1e-06))

in format.POSIXlt we had a comparison value that was half the minimum increment:

for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs,
i))  5e-07))

This behaviour disappears:

mod.format.POSIXlt( as.POSIXlt( 1.99 , tz=, origin=1970-01-01) )
[1] 1970-01-01 01:00:01.99

But I am unsure if the original behaviour is what I should expect given the 
documentation (I have read it and I can't see a reason to expect 1.99 to 
round down to 1). And also if changing the formatting function would have other 
undesirable consequences?

My sessionInfo():
R version 3.0.0 (2013-04-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base



Thank you,


Simon



Simon O'Hanlon
Postgraduate Researcher

Helminth Ecology Research Group
Department of Infectious Disease Epidemiology
Imperial College London
St. Mary's Hospital, Norfolk Place,
London, W2 1PG, UK

Office: +44 (0) 20 759 43229



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sum up column values according to row id

HI
Not sure whether this helps:
ipso- read.table(text=
   id dbh h  g
1  FPE0164  36 13.62 0.10178760
2  FPE0164  31 12.70 0.07547676
21 FPE1127  57 18.85 0.25517586
13 FPE1127  39 15.54 0.11945906
12 FPE1127  34 14.78 0.09079203
6  FPE1127  32 15.12 0.08042477
5  FPE1127  28 14.13 0.06157522
15 FPE1127  27 13.50 0.05725553
19 FPE1127  25 13.28 0.04908739
11 FPE1127  19 11.54 0.02835287 
,sep=,header=TRUE,stringsAsFactors=FALSE)

library(plyr)
ddply(ipso,.(id),summarize,g=mean(head(g,6)))
 #  id  g
#1 FPE0164 0.08863218
#2 FPE1127 0.11078041


aggregate(g~id,ipso,function(x) mean(head(x,6)))
#   id  g
#1 FPE0164 0.08863218
#2 FPE1127 0.11078041



Dear All, 

here a problem I think many of you can solve in few minutes. 

I have a dataframe which contains values of plot id, diameters, heigths and 
basal area of trees, thus columns names are: id | dbh | h | g 

head(ipso, n=10)        id dbh     h          g 
1  FPE0164  36 13.62 0.10178760 
2  FPE0164  31 12.70 0.07547676 
21 FPE1127  57 18.85 0.25517586 
13 FPE1127  39 15.54 0.11945906 
12 FPE1127  34 14.78 0.09079203 
6  FPE1127  32 15.12 0.08042477 
5  FPE1127  28 14.13 0.06157522 
15 FPE1127  27 13.50 0.05725553 
19 FPE1127  25 13.28 0.04908739 
11 FPE1127  19 11.54 0.02835287 

from here I need to calculate the mean of the six greater g_ith for 
each id_ith. The clauses are that: 

if length(id) =6 

do the mean of the first six greaters g 


else 
do the mean of all the g_ith in the id_ith (in head print above e.g. 
for the id==FPE0164 do the mean of just these two values of g). 

The g are already ordered by id ascending and g descending using: 

ipso - ipso[with(ipso, order(ipso$id, -ipso$g)), ] # Order for id 
ascending and g descending 

I tried a lot of for loops and tapply() without results. 

Can anyone help me to solve this? 

Thanks for your attention 

Best whishes 

Matteo 


__ 
[hidden email] mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] getting started in parallel computing on a windows OS

2013-04-24 Thread Martin Morgan


On 04/24/2013 02:50 PM, Benjamin Caldwell wrote:

Dear R help,

I've what I think is a fairly simple parallel problem, and am getting
bogged down in documentation and packages for much more complex situations.

I have a big matrix  (30^5,5]. I have a function that will act on each row
of that matrix sequentially and output the 'best' result from the whole
matrix (it compares the result from each row to the last and keeps the
'better' result). I would like to divide that first large matrix into
chunks equal to the number of cores I have available to me, and work
through each chunk, then output the results from each chunk.

I'm really having trouble making head or tail of how to do this on a
windows machine - lots of different false starts on several different
packages now. Basically, I have the function, and I can of course easily
divide the matrix into chunks. I just need a way to process each chunk
in parallel (other than opening new R sessions for each core manually).

Any help much appreciated - after two days of trying to get this to work
I'm pretty burnt out.


Hi Ben -- in your code from this morning you had a function

fitting - function(ndx.grd=two,dt.grd=one,ind.vr='ind',rsp.vr='res') {
## ... setup
for(i in 1:length(ndx.grd[,1])){
## ... do work
}
## ... collate results
}

that you're trying to run in parallel. Obviously the ## ... represent lines I've 
removed. When you say something like


y - foreach(icount(length(two))) %dopar% fitting()

its saying that you want to run fitting() length(two) times. So you're actually 
doing the same thing length(two) times, whereas you really want to divide the 
work thats inside fitting() into chunks, and do those on separate cores!


Conceptually what you'd like to do is

fit_one - function(idx, ndx.grd, dt.grd, ind.vr, rsp.vr) {
## ... do work on row idx _ONLY_
}

and then evaluate with

## ... setup
y -
  foreach (idx = icount(nrow(two)) %dopar% one_fit(idx, two, one, ind, res)
## ... collate

so that fit_one fits just one of your combinations. foreach will worry about 
distributing the work. Make sure that fit_one works first, before trying to run 
this in parallel; your use of try(), trying to fit different data types 
(character, integer, numeric) into a matrix rather than data.frame, and the type 
coercions all indicate that you're fighting with R rather than working with it.


Hope that helps,

Martin



Thanks

*Ben Caldwell*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question

2013-04-24 Thread R. Michael Weylandt

 On Tue, Apr 23, 2013 at 2:54 PM, nafiseh hagiaghamohammadi
 n_hajiaghamohammadi2...@yahoo.com wrote:
 Hi

 I fit one  linear quantile regression with package quantreg and I want to
 khow this model is good or not.Is there method for checking it?
 Thanks your advice

   I ask this question because  there is 2 models,f0 and f1 in  (R1 - 1 -
 f1$rho/f0$rho ),
 is it true?

   but I fit 1 model and I want to check goodness of fit for 1 model .



 Please keep your responses on list so you can get a quick reply even
when I'm otherwise busy.

I think you could -- for a rough and ready comparison -- compare
against a constant (empirical quantile) model (not unlike how basic
OLS models compare against the constant mean predictor)  but someone
else might know if there's any subtleties about quantile regression
that should be noted here.

MW

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fonts not rendering during plot

2013-04-24 Thread David Winsemius


On Apr 24, 2013, at 7:29 AM, George Dietz wrote:

 Hi,
 
 I am having a problem where sometimes fonts for text in plots don't get 
 rendered-the text just shows up as boxes. If I call R from a certain 
 directory the problem goes away, otherwise the fonts don't render (but plots 
 get made). In both cases, my PANGO_RC and FONTCONFIG_FILE do not change…

Offering the OS details is more courteous than omitting them. (And there are 
several other courtesies expected of you listed in the Fine Posting Guide.) If 
this happens to be on a Mac, it may mean your screen fonts are corrupt. You can 
see their names with: names(quartzFonts()).  Using Fontbook.app will sometimes 
reveal that there are two copies of one of the screen graphics fonts and that 
one of them displays as blanks. These should be deleted with Fontbook.app.

-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] about the loglm and glm---Re:Re: Regression on stratified count data

2013-04-24 Thread meng

Understand.
Many thanks for your detailed explaination.





 



At 2013-04-24 22:22:55,Achim Zeileis achim.zeil...@uibk.ac.at wrote:
On Wed, 24 Apr 2013, meng wrote:

 Hi,Achim:
 Can all the analysis you mentioned via loglm be performed via
 glm(...,family=poisson)?

Yes.

## transform table back to data.frame
df - as.data.frame(tab)

## fit models: conditional independence, no-three way interaction,
## and saturated
g1 - glm(Freq ~ age/(drug + case), data = df, family = poisson)
g2 - glm(Freq ~ (age + drug + case)^2, data = df, family = poisson)
g3 - glm(Freq ~ age * drug * case, data = df, family = poisson)

## likelihood-ratio tests (against saturated)
anova(g1, g3, test = Chisq)
anova(g2, g3, test = Chisq)

## compare fitted frequencies (which are essentially equal)
all.equal(as.numeric(fitted(g1)),
   as.data.frame(as.table(fitted(m1)))$Freq)
all.equal(as.numeric(fitted(g2)),
   as.data.frame(as.table(fitted(m2)))$Freq)

The difference is mainly that loglm() has a specialized user interface and 
that it uses a different optimizer (iterative proportional fitting rather 
than iterative reweighted least squares).

Best,
Z

 Many thanks.
 
 
 
 
 
 
 
 At 2013-04-24 19:37:10,Achim Zeileis achim.zeil...@uibk.ac.at wrote:
 On Wed, 24 Apr 2013, meng wrote:
 
  Hi all:
  For stratified count data,how to perform regression analysis?
 
  My data:
  age case oc count
  1   1 121
  1   1 226
  1   2 117
  1   2 259
  2   1 118
  2   1 288
  2   2 1 7
  2   2 2   95
 
  age:
  1:40y
  2:40y
 
  case:
  1:patient
  2:health
 
  oc:
  1:use drug
  2:not use drug
 
  My purpose:
  Anaysis whether case and oc are correlated, and age is a stratified varia
 ble.
 
  My solution:
  1,Mantel-Haenszel test by using function mantelhaen.test
  2,loglinear regression by using function glm(count~case*oc,family=poisson
 ).But I don't know how to handle variable age,which is the stratified vari
 able.
 
 Instead of using glm(family = poisson) it is also convenient to use 
 loglm() from package MASS for the associated convenience table.
 
 The code below shows how to set up the contingency table, visualize it 
 using package vcd, and then fit two models using loglm. The models 
 considered are conditional independence of case and drug given age and the 
 no three-way interaction already suggested by Peter. Both models are 
 also accompanied by visualizations of the residuals.
 
 ## contingency table with nice labels
 tab - expand.grid(drug = 1:2, case = 1:2, age = 1:2)
 tab$count - c(21, 26, 17, 59, 18, 88, 7, 95)
 tab$age - factor(tab$age, levels = 1:2, labels = c(40, 40))
 tab$case - factor(tab$case, levels = 1:2, labels = c(patient, 
 healthy))
 tab$drug - factor(tab$drug, levels = 1:2, labels = c(yes, no))
 tab - xtabs(count ~ age + drug + case, data = tab)
 
 ## visualize case explained by drug given age
 library(vcd)
 mosaic(case ~ drug | age, data = tab,
split_vertical = c(TRUE, TRUE, FALSE))
 
 ## test wheter drug and case are independent given age
 m1 - loglm(~ age/(drug + case), data = tab)
 m1
 
 ## visualize corresponding residuals from independence model
 mosaic(case ~ drug | age, data = tab,
split_vertical = c(TRUE, TRUE, FALSE),
residuals_type = deviance,
gp = shading_hcl, gp_args = list(interpolate = 1.2))
 mosaic(case ~ drug | age, data = tab,
split_vertical = c(TRUE, TRUE, FALSE),
residuals_type = pearson,
gp = shading_hcl, gp_args = list(interpolate = 1.2))
 
 ## test whether there is no three-way interaction
 ## (i.e., dependence of case on drug is the same for both age groups)
 m2 - loglm(~ (age + drug + case)^2, data = tab)
 m2
 
 ## visualization (with default pearson residuals)
 mosaic(case ~ drug | age, data = tab,
expected = ~ (age + drug + case)^2,
split_vertical = c(TRUE, TRUE, FALSE),
gp = shading_hcl, gp_args = list(interpolate = 1.2))
 
 
  Many thanks for your help.
 
  My best.
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.h
 tml
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help me make faster R code for Kennard-Stone algorithm [My code is so slow from Matlab]

2013-04-24 Thread Kevin Hao

Hi all,

Can you help me change my Kennard-Stone algorithm to faster one?

[The original code can run fast in matlab, but when I change matlab code to
R code, it is so slow.]

Since my code so crude and too many loops (changed from matlab code), it is
too slow. I hope that you can help to improve the performance.

Thanks.

kevin


#
##
# Kennard-Stone algorithm for selection of samples
ksFun = function(x, N) {
# Initial the vector of minimum distance
dminmax = sample(0, N, replace = TRUE) # Default: FALSE
M = nrow(x)
samples = 1:M
# Initializes the matrix of distances
D = matrix(0, nrow = M, ncol = M)
for (i in 1:(M - 1)) {
xa = x[i, ]
for (j in (i + 1):M) {
xb = x[j, ]
# D: Upper Trianglar Matrix
# D[i, j] = Euclidean distance between object i and
j (j  i)
D[i, j] = max(svd(xa - xb)$d) # the largest
singular value
}
}

cat(for (i in 1:(M - 1)) done! \n)

maxD = apply(D, 2, max)
index_row = apply(D, 2, function(x) which(x == max(x))[1])

dummy = max(maxD)
index_column = which(maxD == dummy)
m = vector()
m[1] = index_row[index_column]
m[2] = index_column

dminmax[2] = D[m[1], m[2]]

for (i in 3:N) {
pool = setdiff(samples, m)
dmin = sample(0, M-i+1, replace = TRUE)
for (j in 1:(M-i+1)) {
indexa = pool[j]
d = sample(0, i-i, replace = TRUE)
for ( k in 1:(i-1)) {
indexb = m[k]
if (indexa  indexb) {
d[k] = D[indexa, indexb]
} else {
d[k] = D[indexb, indexa]
}
}
dmin[j] = min(d)
}
#cat(for (j in 1:(M -i+1)) done! \n)
dminmax[i] = max(dmin)
index = which(dmin == max(dmin))
m[i] = pool[index]
}
cat(for (i in 3:N) done! \n)
res = list(m = m, dminmax = dminmax)
return(res)
}

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question

2013-04-24 Thread Roger Koenker

good usually means good relative to something else, in this case the comparison 
seems, as Michael
has already said, f0 - rq(y ~ 1, tau = ?) and then one can compute the R1 
version that I originally
suggested.  But since there is still no explicit way to evaluate this, it is 
all a bit pointless.

Roger Koenker
rkoen...@illinois.edu




On Apr 24, 2013, at 6:37 PM, R. Michael Weylandt wrote:

 On Tue, Apr 23, 2013 at 2:54 PM, nafiseh hagiaghamohammadi
 n_hajiaghamohammadi2...@yahoo.com wrote:
 Hi
 
 I fit one  linear quantile regression with package quantreg and I want to
 khow this model is good or not.Is there method for checking it?
 Thanks your advice
 
  I ask this question because  there is 2 models,f0 and f1 in  (R1 - 1 -
 f1$rho/f0$rho ),
 is it true?
 
  but I fit 1 model and I want to check goodness of fit for 1 model .
 
 
 
 Please keep your responses on list so you can get a quick reply even
 when I'm otherwise busy.
 
 I think you could -- for a rough and ready comparison -- compare
 against a constant (empirical quantile) model (not unlike how basic
 OLS models compare against the constant mean predictor)  but someone
 else might know if there's any subtleties about quantile regression
 that should be noted here.
 
 MW
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Stochastic Frontier: Finding the optimal scale/scale efficiency by frontier package

2013-04-24 Thread jpm miao

Hi,

   I am trying to find out the scale efficiency and optimal scale of banks
by stochastic frontier analysis given the panel data of bank. I am free to
choose any model of stochastic frontier analysis.

   The only approach I know to work with R is to estimate a translog
production function by sfa or other related function in frontier package,
and then use the Ray 1998 formula to find the scale efficiency. However, as
the textbook Coelli et al 2005 point out that the concavity may not be
satisfied,  one needs to impose the nonpositive definiteness condition so
that the scale efficiency 1. How can I do it with frontier package?

   Is there any other SFA model/function in R recommended to find out the
scale efficiency and optimal scale?

   Thanks,

Miao

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] the joy of spreadsheets (off-topic)

2013-04-24 Thread Ross Boylan


On 4/17/2013 5:18 AM, Kevin Wright wrote:

On Tue, Apr 16, 2013 at 4:33 PM, Jim Lemon j...@bitwrit.com.au wrote:


On 04/17/2013 03:25 AM, Sarah Goslee wrote:

The final point does relate to Excel and any application that hides what is

going on to the casual observer. I will treasure this URL to give to anyone
who chastises my moaning when I have to perform some task in Excel. It is
not an error in the application (although these certainly exist) but a
salutory caution to those who think that if a reasonable looking number
appears in a cell, it must be the correct answer. I have found not one, but
two such errors in the simple calculation of a birthday age from the date
of birth and date of death.

Jim


So there (maybe) was a bug in Excel.  Maybe hidden from the casual
observer.  And since Excel is not R, and we are R snobs, Excel is evil,
right?  But, wait.  Is it easier for a casual observer to detect a flaw
in the formula in Excel, or to find an incorrect array index in an R
script?
If the person knows R, or can fake it, I think it is easier.  You have 
to hunt around an Excel spreadsheet to see what the formulae are,
and the cell references usually have no inherent meaning.  Further, one 
of the errors they made, not including all the data in a range, is very 
easy to make in excel but would be very hard to make in R.


As others have noted, the problem was not a bug in Excel the program 
(unless you consider the design a bug) but a bug induced by the use of 
Excel.


I doubt the exclusion of the range was deliberate, although the other 
errors seem to have been.  However, it is likely that if the result had 
not been to their liking the original authors would have rechecked their 
work and discovered the problem.  One of the errors, equal weighting 
of countries regardless of how many years they spent in a given state, 
is arguably a judgement call. Selective exclusion and inclusion of data 
is also a judgement call, but that strikes me as less defensible.


Someone wrote that the overall finding of a negative relation between 
debt and growth is intact.  First of all, the headline summary was that 
if debt/GDP  90% you fall off a cliff.  That is not intact; it is 
false.  The remaining relation is quite weak.  And the substantive 
conclusion that high debt *causes* weaker growth is a complete reading 
into a correlational finding.  It is pretty hard to sort out causal 
ordering, but some evidence suggests it is more the reverse: 
http://krugman.blogs.nytimes.com/2013/04/18/correlation-causality-and-casuistry/. 
See Krugman and Delongs blogs generally for gleeful commentary, or the 
original critique in 
http://www.peri.umass.edu/236/hash/31e2ff374b6377b2ddec04deaa6388b1/publication/566/.


At any rate, a policy-relevant conclusion would need to be based on a 
much more careful analysis than was done, careful not only in the 
mechanics but in using methods that at least attempted to sort out the 
causal relations.


The irony is that the substantively most trivial mistake is also the 
most clearly an error, while the more important issues are at least a 
little less clear-cut.


Ross

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to make a raster image in R

2013-04-24 Thread Kristi Glover

Hi R-User



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] creating raster image in R

2013-04-24 Thread Kristi Glover

Hi R-User



I was trying to make a raster map with WGS84 projection 
in R, but I could not make it. I found one data set in Google that data 
is almost the same format as of mine. I wanted to make a raster map of 
temperature with 1 degree spatial resolution for the global scale. 
I
 could make it in the GIS software but I do have many variables (to be 
many raster images) and ultimately I am importing them into R for 
further analysis. Therefore, I wanted to make them in R, if possible. 

But, I am wondering how I can create a raster map in R.
(I have provided link of data for your references, this is an example of data 
set).

I am really appropriating for your help.

#--
I downloaded the following packages 

#create a raster map from scratch
install.packages(raster, dependencies=TRUE)
library(raster)  # raster data
install.packages(rgdal, dependencies=TRUE)
library(rgdal)  # input/output, projections
install.packages(rgeos, dependencies=TRUE)
library(rgeos)  # geometry ops
install.packages(spdep, dependencies=TRUE)
library(spdep)  # spatial dependence
install.packages(pastecs, dependencies=TRUE)
library(pastecs)

pts-read.table.url(https://www.betydb.org//miscanthusyield.csv;, header=T, 
sep=,)
proj4string(pts)=- CRS(+proj=longlat +datum=WGS84)
after that, what I am supposed to do for creating a raster map? any suggestions?
#---

Cheers,
Kristi
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Distance matrices Combinations