date:20111108

Dear all,
I have several data sets and I am doing some calculations over time.

For that, every data set is split into chunks based on the time stamps,
so one data set has around 10 timestamps.

There is also the case where a data set has fewer than 10 timestamps.

In my code I was doing the following

lapply(Datasource,analysis_for_one_data_source)

lapply(TimeFrames,analysis_for_one_data_source_and_one_time_frame)

As you can imagine, there are times when the second lapply will fail.


analysis_for_one_data_source_and_one_time_frame <- function(DataSource,
TimeFrame) {
    return(do_analysis(DataSource, TimeFrame))
}

As you can understand, this will return an error for a given Datasource that 
does not have a timestamp. I was wondering whether I can ask R to handle 
the error by continuing to the next one. So the lapply call that hits an error 
could, for example, store the error message in the list it returns and continue 
to the next element of the lapply list.

Would that be possible in R?
B.R
Alex


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Reading a specific column of a csv file in a loop

Have you considered reading in the data once, creating an object for each 
column, and then saving (?save) each to disk? That way you incur the expense of 
the read only once and then have quick access (?load) to the columns as you need them.

You could also use a database for this.
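
As a hedged illustration for the question quoted below: in read.table and read.csv, colClasses skips a column when its entry is the string "NULL" (not zero), and NA lets R guess the type. The sketch uses a small temporary file; the names tmp and read_one_column are hypothetical and not from the original post.

```r
# Build a small example file (10 columns) in a temporary location.
tmp <- tempfile(fileext = ".csv")
M1 <- matrix(1:30, nrow = 3, ncol = 10, dimnames = list(NULL, letters[1:10]))
write.csv(M1, tmp, row.names = FALSE)

# Read only column i: "NULL" skips a column, NA lets R guess its type.
read_one_column <- function(file, i, n_cols) {
  classes <- rep("NULL", n_cols)
  classes[i] <- NA
  read.csv(file, colClasses = classes)
}

d <- c(1, 4, 7, 8)
cols <- lapply(d, function(i) read_one_column(tmp, i, n_cols = 10))
names(cols[[2]])  # name of the 4th column in the file
```

Note that the file is still scanned on every call, so for many columns the save/load approach above may well be faster.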

On Nov 8, 2011, at 5:04, Sergio René Araujo Enciso araujo.enc...@gmail.com 
wrote:

 Dear all:
 
 I have two large files with 2000 columns each. For each file I am
 running a loop to extract the ith column of each file and create
 a data frame with both ith columns in order to perform further
 analysis. I am not extracting all the columns but only certain ones,
 which I indicate in a vector called d.
 
 See an example of my code below.
 
 ### generate an example for the CSV files, the original files contain
 ### more than 2000 columns, here for the sake of simplicity they have only
 ### 10 columns
 M1 <- matrix(rnorm(1000), nrow=100, ncol=10,
 dimnames=list(seq(1:100),letters[1:10]))
 M2 <- matrix(rnorm(1000), nrow=100, ncol=10,
 dimnames=list(seq(1:100),letters[1:10]))
 write.table(M1, file="M1.csv", sep=",")
 write.table(M2, file="M2.csv", sep=",")
 
 ### the vector containing the i elements to be read
 d <- c(1,4,7,8)
 ### sep="," must match the separator used by write.table above
 P1 <- read.table("M1.csv", header=TRUE, sep=",")
 P2 <- read.table("M2.csv", header=TRUE, sep=",")
 for (i in d) {
 M <- data.frame(P1[i],P2[i])
 rm(list=setdiff(ls(),d))
 }
 
 As the files are quite large, I want to call read.table within
 the loop so that it only reads the ith column. I know that there is
 the option colClasses, for which I have to create a vector with zeros
 for all the columns I do not want to load. Nonetheless, I have no idea
 how to make this vector change in the loop, so that the only element
 without a zero is the ith element following the vector d. Any ideas
 how to do this? Or is there any other approach to load only a
 specific column?
 
 best regards,
 
 Sergio René
 

Re: [R] skip on error

?try
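
A minimal sketch of what ?try points to, assuming a stand-in do_analysis() that fails when a time frame is missing (the function body, the data source name and the time_frames vector are hypothetical). Wrapping the call in try() lets lapply carry on and keeps the error in the result list:

```r
# Hypothetical stand-in for the real analysis: fails on a missing time frame.
do_analysis <- function(DataSource, TimeFrame) {
  if (is.na(TimeFrame)) stop("no timestamp for ", DataSource)
  paste(DataSource, TimeFrame)
}

time_frames <- c(1, NA, 3)

# try() returns an object of class "try-error" instead of stopping the loop.
results <- lapply(time_frames, function(tf) {
  try(do_analysis("source_A", tf), silent = TRUE)
})

# Flag the failed elements and pull out their error messages.
failed <- vapply(results, inherits, logical(1), what = "try-error")
messages <- vapply(results[failed],
                   function(res) conditionMessage(attr(res, "condition")),
                   character(1))
```

tryCatch() with an error handler is an alternative if you would rather store the message directly in place of the result.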


On Nov 8, 2011, at 5:49, Alaios ala...@yahoo.com wrote:

 Dear all,
 I have a different data sets and I am doing some calculations over time,
 
 For that every data set is split into junks based on the time stamps
 so one data set has like 10 timestamps.
 
 There is also the case that one data set has less than 10 timestamps.
 
 In my code I was doing the following
 
 lapply(Datasource,analysis_for_one_data_source)
 
 lapply(TimeFrames,analysis_for_one_data_source_and_one_time_frame)
 
 as you can imagine there are times where the second lapply will explode 
 
 
 analysis_for_one_data_source_and_one_time_frame <- function(DataSource,
 TimeFrame) {
 return(do_analysis(DataSource, TimeFrame))
 }
 
 as you can understand this will return an error for a given Datasource that 
 does not have a timestamp. I was looking though if I can ask from R to handle 
 the error by continuing to the next one. So the lapply that returns and error 
 can for example to store the error message to the list it returns and 
 continue to the next element of the lapply list
 
 Would that be possible in R?
 B.R
 Alex
 

[R] splitting by the last occurance of a dot

2011-11-08 Thread Ashim Kapoor

Dear R-helpers,

I want to split the following vector into 2 vectors by the last occurrence
of a dot (".").

> dput(rownames(sensext))
c("pat", "cash_bank_bal", "invest_abroad", "pat.1", "cash_bank_bal.1",
"invest_abroad.1", "pat.2", "cash_bank_bal.2", "invest_abroad.2",
"pat.3", "cash_bank_bal.3", "invest_abroad.3", "pat.4", "cash_bank_bal.4",
"invest_abroad.4", "Market.Capitalisation", "Market.Capitalisation.1",
"Market.Capitalisation.2", "Market.Capitalisation.3",
"Market.Capitalisation.4"
)

My attempt:
I tried strsplit(rownames(sensext),"\\.") but that sometimes splits it into 3 parts,
the logic of which I can see, since there are sometimes 2 dots.

Can someone tell me how to split this ?

Many thanks,
Ashim


Re: [R] help to ... import the data from Excel

2011-11-08 Thread writinghealth

Sarah, it seems your current directory is not where your LTS.xls file is
located.
What I would do is the following:
always use the xlsReadWrite functions with the current directory set to where
your Excel file is located.
Also, just read the ENTIRE Excel sheet, rather than referencing columns.
This is the simplest way and it has ALWAYS worked for me.
e.g. STI <- read.xls('STI_CASES.xls', sheet=1)
Then I have my data; I still have to work a bit to get what I want (e.g. cases
for a certain year, etc.), but that is doable.



Re: [R] R-bash beginner

2011-11-08 Thread Matteo Giani


On 11/07/2011 05:17 PM, David A. wrote:

Hi,

I am trying to run some R commands into my bash scripts and want to use shell variables 
in the R commands and store the output of R objects into shell variables for further 
usage in downstream analyses. So far I have managed the first, but how to get values out 
of R script? I am using  here documents (as a starter, maybe something else 
is simpler or better; suggestions greatly appreciated).


hi,

maybe this can be helpful:

http://rwiki.sciviews.org/doku.php?id=tips:scriptingr


I personally didn't manage to successfully write a 'fully functional' 
script, though, because of quoting troubles and so on.


best regards,

--
Matteo Giani


Re: [R] Estimate of intercept in loglinear model

2011-11-08 Thread Colin Aitken

Sorry about that.  However I have solved the problem by declaring the 
explanatory variables as factors.


An unresolved problem is: what does R do when the explanatory variables 
are not declared as factors, such that it obtains a different value for the 
intercept but the correct fitted values?


A description of the data and the R code and output is attached for 
anyone interested.


Best wishes,

Colin Aitken

---

David Winsemius wrote:


On Nov 7, 2011, at 12:59 PM, Colin Aitken wrote:


How does R estimate the intercept term \alpha in a loglinear
model with Poisson model and log link for a contingency table of counts?

(E.g., for a 2-by-2 table \{n_{ij}\} with \log(\mu) = \alpha + \beta_{i} 
+ \gamma_{j})


I fitted such a model and checked the calculations by hand. I  agreed 
with the main effect terms but not the intercept. Interestingly,  I 
agreed with the fitted value provided by R for the first cell {11} in 
the table.


If my estimate of intercept = \hat{\alpha}, my estimate of the fitted 
value for the first cell = exp(\hat{\alpha}) but R seems to be doing 
something else for the estimate of the intercept.


However if I check the  R $fitted_value for n_{11} it agrees with my 
exp(\hat{\alpha}).


I would expect that with the corner-point parametrization, the 
estimates for a 2 x 2 table would correspond to expected frequencies 
exp(\alpha), exp(\alpha + \beta), exp(\alpha + \gamma), exp(\alpha + 
\beta + \gamma). The MLE of \alpha appears to be log(n_{.1} * 
n_{1.}/n_{..}), but this is not equal to the intercept given by R in 
the example I tried.


With thanks in anticipation,

Colin Aitken


--
Professor Colin Aitken,
Professor of Forensic Statistics,


Do you suppose you could provide a data-corpse for us to dissect?

Noting the tag line for every posting 

and provide commented, minimal, self-contained, reproducible code.




--
Professor Colin Aitken,
Professor of Forensic Statistics,
School of Mathematics, King’s Buildings, University of Edinburgh,
Mayfield Road, Edinburgh, EH9 3JZ.

Tel:0131 650 4877
E-mail:  c.g.g.ait...@ed.ac.uk
Fax :  0131 650 6553
http://www.maths.ed.ac.uk/~cgga


The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


Re: [R] iplots problem

2011-11-08 Thread agent dunham

I also have the same problem. Can you send me the link to download a proper 
Sun Java application (I have Microsoft Windows XP Professional V.2002)? 
Anyway, this is what I tried: 

> library(JGR)
Loading required package: JavaGD
Loading required package: iplots
Error : .onLoad failed in loadNamespace() for 'iplots', details:
  call: .jnew("org/rosuda/iplots/Framework")
  error: java.awt.HeadlessException
Error: package ‘iplots’ could not be loaded
> install.packages("JavaGD", dependencies= TRUE)
Warning: package ‘JavaGD’ is in use and will not be installed
> install.packages("iplots", dependencies= TRUE)
trying URL
'http://cran.es.r-project.org/bin/windows/contrib/2.14/iplots_1.1-4.zip'
Content type 'application/zip' length 448623 bytes (438 Kb)
opened URL
downloaded 438 Kb

package ‘iplots’ successfully unpacked and MD5 sums checked
Warning: cannot remove prior installation of package ‘iplots’

The downloaded packages are in
C:\Documents and Settings\cristina.pascual\Configuración
local\Temp\RtmpGuRMKp\downloaded_packages
> library(iplots)
Error in library(iplots) : there is no package called ‘iplots’
> library(JGR)
Error: package ‘iplots’ required by ‘JGR’ could not be found

u...@host.com


Re: [R] Aggregate or extract function ?

2011-11-08 Thread Celine

Thanks for your help. I still have a small problem with this function:
my two data frames do not have the same number of rows, so when I
try to apply the data.frame function I get an error saying that the
numbers of rows differ.

Error in data.frame(train, test[rowz, ]) : 
  arguments imply differing number of rows: 50327, 66592
Do you know how I could solve this problem?

Thanks,

Céline



Re: [R] variance explained by each predictor in GAM

2011-11-08 Thread huidongtian

Dear Prof. Wood,
I have read your method for extracting the variance explained by each
predictor in several places. My question is: using the method you
suggested, the sum of the deviance explained by all terms is not equal to
the deviance explained by the full model. Could you tell me what causes
this? 

  library(mgcv)  ## provides gam()
  set.seed(0)
  n <- 400
  x1 <- runif(n, 0, 1)
  ## to see problem with not fixing smoothing parameters
  ## remove the `##' from the next line, and the `sp'
  ## arguments from the `gam' calls generating b1 and b2. 
  x2 <- runif(n, 0, 1) ## *.1 + x1 
  f1 <- function(x) exp(2 * x)
  f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
  f <- f1(x1) + f2(x2)
  e <- rnorm(n, 0, 2)
  y <- f + e
  ## fit full and reduced models...
  b <- gam(y~s(x1)+s(x2))
  b1 <- gam(y~s(x1),sp=b$sp[1])
  b2 <- gam(y~s(x2),sp=b$sp[2])
  b0 <- gam(y~1)
  ## calculate proportions deviance explained...
  dev.1 <- (deviance(b1)-deviance(b))/deviance(b0) ## prop explained by s(x2)
  dev.2 <- (deviance(b2)-deviance(b))/deviance(b0) ## prop explained by s(x1)
 
  dev.1 + dev.2
[1] 0.6974949
  summary(b)$dev.expl
[1] 0.7298136

I checked the two models (b1 & b2) and found that their coefficients
differ from those of model b, so I feel that could be the problem.

I hope to hear your comments.

Huidong Tian






Re: [R] Selecting 3 different hours in a day

Attach another column to your data frame with the hour of the day and then use
'subset' to collect the three hours of interest.
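
A minimal sketch of that suggestion, assuming the data frame is called mv (as in the question quoted below) and has a POSIXct column date_stamp; the example data and the value column are made up for illustration:

```r
# Two days of hourly example data (hours 0-23 for each day).
stamps <- seq(as.POSIXct("2011-06-01 00:00", tz = "UTC"),
              by = "hour", length.out = 48)
mv <- data.frame(date_stamp = stamps, value = seq_along(stamps))

# Add the hour of the day as its own column ...
mv$hour <- as.integer(format(mv$date_stamp, "%H"))

# ... and keep only 9am, 12pm and 3pm; no loop is needed.
picked <- subset(mv, hour %in% c(9, 12, 15))
```

If the time stamps are stored as character strings, they would first need converting with as.POSIXct() and a suitable format.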


On Nov 8, 2011, at 1:12, jck13 jennake...@hotmail.com wrote:

Hello,

I have a csv with 5 months of hourly data for 4 years. I would like to get
9am, 12pm and 3pm from each day and create a subset or a new data frame
that I can analyze. The times run from hour 0 to 23 for each day.
I am not sure how to create a loop which will take out each of these hours
and create a subset.

I was thinking of doing a for loop using the row number since:
9am= row 10
12pm= row 13
3pm= row 16

trying to loop through to extract these 3 times each day.

n=length(date_stamp)

for (i in i:n) {
m= 10
i= 1
new1= mv[m,]
i= i+1
m= m+3
##m+18 at row 16?
}

I need some help creating a loop through this! Thank you!


Re: [R] How much time a process need?

It could be that the process was waiting for the user to select a file from a 
list, or for some other input, before proceeding.  One would have to see what the 
overall performance of the system was at that point.  It could also have been 
that the process was low on physical memory and there was a lot of paging going 
on.
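
As a small illustration of how waiting shows up in the timings, the sketch below uses Sys.sleep() as a stand-in for idle time (waiting for input, paging and so on); this is only an assumption about what happened in the original run. Waiting adds to "elapsed" but hardly to "user":

```r
# Sys.sleep() waits without using the CPU.
timing <- system.time(Sys.sleep(0.2))

timing[["user.self"]]  # CPU time spent in R code, close to zero here
timing[["elapsed"]]    # wall-clock time, roughly 0.2 seconds

# All fields are in seconds; convert as needed.
elapsed_hours <- timing[["elapsed"]] / 3600
```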


On Nov 7, 2011, at 21:50, Rolf Turner rolf.tur...@xtra.co.nz wrote:

 On 08/11/11 02:40, Joshua Wiley wrote:
 On Mon, Nov 7, 2011 at 5:32 AM, Alaiosala...@yahoo.com  wrote:
 So I just need to get the
 
user  system elapsed
   0.460   0.048  67.366
 
 
 user value and convert the seconds to days and then to hours ? Right?
 
 What about this elapsed field?
 It's all in seconds.  Convert whatever fields you want.
 
 That being said, doesn't having an elapsed time of over 67 seconds,
 when the actual calculation takes less than half a second, indicate
 that something weird is going on?  At the very least the R calculations
 are fighting for resources with some other very greedy processes.
 
cheers,
 
Rolf Turner
 

Re: [R] splitting by the last occurance of a dot

2011-11-08 Thread Gabor Grothendieck

On Tue, Nov 8, 2011 at 6:06 AM, Ashim Kapoor ashimkap...@gmail.com wrote:
 Dear R-helpers,

 I want to split the following vector into 2 vectors by the last occurance
 of a .

 > dput(rownames(sensext))
 c("pat", "cash_bank_bal", "invest_abroad", "pat.1", "cash_bank_bal.1",
 "invest_abroad.1", "pat.2", "cash_bank_bal.2", "invest_abroad.2",
 "pat.3", "cash_bank_bal.3", "invest_abroad.3", "pat.4", "cash_bank_bal.4",
 "invest_abroad.4", "Market.Capitalisation", "Market.Capitalisation.1",
 "Market.Capitalisation.2", "Market.Capitalisation.3",
 "Market.Capitalisation.4"
 )

 My attempt :
 I tried strsplit(rownames(sensext),"\\.") but that splits it into 3 parts
 sometimes,the logic of which I can see,since there are 2 dots sometimes.

 Can someone tell me how to split this ?

Assuming we want to split off the number at the end, try this, which
splits on those dots that are followed by a digit:

strsplit(r, "\\.(?=\\d)", perl = TRUE)
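
A short usage sketch on a few of the row names from the question (r here is just a sample vector; base and suffix are illustrative names):

```r
r <- c("pat", "cash_bank_bal.1", "Market.Capitalisation",
       "Market.Capitalisation.4")

# Split only on a dot that is immediately followed by a digit.
parts <- strsplit(r, "\\.(?=\\d)", perl = TRUE)

# Collect the two pieces into two vectors.
base   <- vapply(parts, `[`, character(1), 1)
suffix <- vapply(parts, `[`, character(1), 2)  # NA where there is no numeric suffix
```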


-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] splitting by the last occurance of a dot

2011-11-08 Thread Ashim Kapoor

 Assuming we want to split off the number at the end try this which
 splits on those dots which are followed by a digit:

 strsplit(r, "\\.(?=\\d)", perl = TRUE)


Dear Gabor,

Thank you  very much. That works very well. I don't completely understand
it though. A few words on what the (?=\\d) is doing would be nice.

Regards,
Ashim


[R] ggplot2 reorder factors for faceting

2011-11-08 Thread Iain Gallagher



Dear List

I am trying to draw a heatmap using ggplot2. In this heatmap I have faceted my 
data by 'infection' of which I have four. These four infections break down into 
two types and I would like to reorder the 'infection' column of my data to 
reflect this. 

Toy example below:

library(ggplot2)

# test data for ggplot reordering
genes <- (rep (c(rep('a',4), rep('b',4), rep('c',4), rep('d',4), rep('e',4), 
rep('f',4)) ,4))
fcData <- rnorm(96)
times <- rep(rep(c(2,6,24,48),6),4)
infection <- c(rep('InfA', 24), rep('InfB', 24), rep('InfC', 24), rep('InfD', 
24))
infType <- c(rep('M', 24), rep('D',24), rep('M', 24), rep('D', 24))

# data is long format for ggplot2
plotData <- as.data.frame(cbind(genes, as.numeric(fcData), as.numeric(times), 
infection, infType))

hp2 <- ggplot(plotData, aes(factor(times), genes)) + geom_tile(aes(fill = 
scale(as.numeric(fcData)))) + facet_wrap(~infection, ncol=4)

# set scale
hp2 <- hp2 + scale_fill_gradient2(name=NULL, low="#0571B0", mid="#F7F7F7", 
high="#CA0020", midpoint=0, breaks=NULL, labels=NULL, limits=NULL, 
trans="identity") 

# set up text (size, colour etc etc)
hp2 <- hp2 + labs(x = "Time", y = "") + scale_y_discrete(expand = c(0, 0)) + 
opts(axis.ticks = theme_blank(), axis.text.x = theme_text(size = 10, angle = 
360, hjust = 0, colour = "grey25"), axis.text.y = theme_text(size=10, colour = 
'gray25'))

hp2 <- hp2 + theme_bw()

In the resulting plot I would like infections InfA and InfC plotted next to 
each other, and likewise InfB and InfD. I have a column in the data - 
infType - which I could use to reorder the infection column, but so far I have 
had no luck getting this to work.

Could someone give me a pointer to the best way to reorder the infection factor 
and accompanying data into the order I would like?
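
One possible approach, sketched here in base R only: facet panels follow the order of the factor levels, so setting the levels of infection explicitly should give the desired panel order. The shortened example vector is illustrative, not the full toy data above.

```r
infection <- c(rep("InfA", 2), rep("InfB", 2), rep("InfC", 2), rep("InfD", 2))

# Put the two 'M' infections first, then the two 'D' infections.
infection <- factor(infection, levels = c("InfA", "InfC", "InfB", "InfD"))

levels(infection)
```

Only the level order changes; the values themselves, and so the rows of the data frame, stay where they are.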

Best

iain

 sessionInfo()
R version 2.13.2 (2011-09-30)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.utf8   LC_NUMERIC=C 
 [3] LC_TIME=en_GB.utf8    LC_COLLATE=en_GB.utf8    
 [5] LC_MONETARY=C LC_MESSAGES=en_GB.utf8   
 [7] LC_PAPER=en_GB.utf8   LC_NAME=C    
 [9] LC_ADDRESS=C  LC_TELEPHONE=C   
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C  

attached base packages:
[1] grid  stats graphics  grDevices utils datasets  methods  
[8] base 

other attached packages:
[1] ggplot2_0.8.9 proto_0.3-9.2 reshape_0.8.4 plyr_1.6 

loaded via a namespace (and not attached):
[1] digest_0.5.0 tools_2.13.2



Re: [R] similar package in R like SKEW CALCULATOR?

See the replies given to you two days ago when you asked the same
question: 
http://r.789695.n4.nabble.com/similar-package-in-R-like-quot-SKEW-CALCULATOR-quot-td3993216.html

Feel free to follow up though with further questions,

Michael

On Tue, Nov 8, 2011 at 2:34 AM, Knut Krueger r...@knut-krueger.de wrote:

 Hi to all
 is there a similar package like the  SKEW CALCULATOR from
 Peter Nonacs (University of California - Department of Ecology and
 Evolutionary Biology)

 http://www.eeb.ucla.edu/Faculty/Nonacs/shareware.htm


 Kind Regards Knut


Re: [R] splitting by the last occurance of a dot

2011-11-08 Thread Gabor Grothendieck

On Tue, Nov 8, 2011 at 6:48 AM, Ashim Kapoor ashimkap...@gmail.com wrote:

 Assuming we want to split off the number at the end try this which
 splits on those dots which are followed by a digit:

 strsplit(r, "\\.(?=\\d)", perl = TRUE)


 Dear Gabor,

 Thank you  very much. That works very well. I don't completely understand it
 though. A few words on what the (?=\\d) is doing would be nice.


See the info on zero width lookahead assertions on the ?regex page.
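
In short, (?=\\d) checks that a digit follows the dot without making the digit part of the match, so the digit is not thrown away by the split. A small comparison (the input string is only an illustration):

```r
x <- "pat.12"

# Without lookahead the pattern matches ".1", so the "1" is consumed.
without_lookahead <- strsplit(x, "\\.\\d", perl = TRUE)[[1]]

# With lookahead only the dot is matched; the digits stay in the second piece.
with_lookahead <- strsplit(x, "\\.(?=\\d)", perl = TRUE)[[1]]
```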


-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] Unexpected behavior when browsing list objects by name, when the object name is a substring (prefix) of an existing object that has a valid value

You've stumbled on one of the reasons use of the $ operator is
discouraged in formal programming: it uses partial matching when given
names.

E.g.,

a = list(boy = 1:5, cat = 1:10, dog = 1:3)

a$d # Exists: partially matches "dog"

If you want to require exact matching (and it seems you do), use the
[[ operator.

a[["d"]] # NULL: no element is named exactly "d"
a[["dog"]] # Works
a[["alligator"]] <- 1:100  # Works
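
Applied to the retval list from the question quoted below, a sketch of the same idea looks like this (variable names follow that post):

```r
retval <- list()
retval$metadatatype <- "normal"

# '$' partially matches "metadatatype" because no element is named "metadata".
partial <- retval$metadata

# '[[' matches exactly, so the missing element comes back as NULL.
exact <- retval[["metadata"]]

# Building the matrix with '[[' gives the single row that was expected.
retval[["metadata"]] <- rbind(retval[["metadata"]], c(1, 0))
```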

Michael

On Tue, Nov 8, 2011 at 12:29 AM, Val Athaide vath...@gmail.com wrote:

 Hi,

 The program I am writing requires me to append named objects to my
 existing list objects dynamically.

 So my list object is retval. retval always has a metadatatabletype object

 createMetadata=TRUE
 retval <- list()
 retval$metadatatype =c("normal")
 retval$metadata=NULL

 Now, depending on certain logic, I create a metadata object

 if (createMetadata==TRUE) retval$metadata =rbind(retval$metadata,c(1,0) )
 The results are
 > retval$metadata
      [,1]     [,2]
 [1,] "normal" "normal"
 [2,] "1"      "0"


 What I expected to see is
  retval$metadata
      [,1]     [,2]
 [1,] 1      0



 I have been able to reproduce this problem only when the object
 retval$metadata is NULL and there is an existing object that has a valid
 value and the NULL object is a sub-string (prefix) of the existing object
 with a valid value.

 Also, retval$metadata takes on a value of "normal" even though it has been
 explicitly set to NULL

 Your assistance is appreciated.

 Thanks
 Vathaid






Re: [R] splitting by the last occurance of a dot

2011-11-08 Thread Ashim Kapoor

See the info on zero width lookahead assertions on the ?regex page.

Thank you again.

Best Regards,
Ashim


[R] Simple coordinate transformations

2011-11-08 Thread Matthew Young


Hi,

I need to do some simple coordinate transforms between Cartesian, 
cylindrical and spherical coordinates.  I can't find any built-in 
functions or packages to do this. Is this because they do not exist?  
Obviously I can write my own code, but I don't want to re-invent the wheel 
if I can avoid it.
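
For reference, a hand-written version is only a few lines. The sketch below is an assumption about the convention wanted: theta is the polar angle measured from the +z axis and phi the azimuth in the x-y plane. The function names are made up, not from any package.

```r
# Cartesian <-> spherical (r, theta = polar angle from +z, phi = azimuth).
cart2sph <- function(x, y, z) {
  r <- sqrt(x^2 + y^2 + z^2)
  c(r = r, theta = acos(z / r), phi = atan2(y, x))  # theta is NaN at the origin
}
sph2cart <- function(r, theta, phi) {
  c(x = r * sin(theta) * cos(phi),
    y = r * sin(theta) * sin(phi),
    z = r * cos(theta))
}

# Cartesian <-> cylindrical (rho, phi, z).
cart2cyl <- function(x, y, z) c(rho = sqrt(x^2 + y^2), phi = atan2(y, x), z = z)
cyl2cart <- function(rho, phi, z) c(x = rho * cos(phi), y = rho * sin(phi), z = z)

# Round trip as a quick sanity check.
s <- cart2sph(1, 1, 1)
back <- sph2cart(s[["r"]], s[["theta"]], s[["phi"]])
```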


Cheers,

Matt


[R] Building package problem

2011-11-08 Thread Eduardo M. A. M. Mendes

Dear R-users

I am trying to recompile a CRAN package on 32-bit Windows.   Rtools for 2.14 (that 
is the version I am running) and MiKTeX were successfully installed on my 
machine.

Problems:

a) hydroGOF is a CRAN package, but R CMD check does not work on it.

C:\Users\eduardo\Documents\R_tests2>R CMD check hydroGOF
* using log directory 'C:/Users/eduardo/Documents/R_tests2/hydroGOF.Rcheck'
* using R version 2.14.0 (2011-10-31)
* using platform: i386-pc-mingw32 (32-bit)
* using session charset: ISO8859-1
* checking for file 'hydroGOF/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'hydroGOF' version '0.3-2'
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking whether package 'hydroGOF' can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking for unstated dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking contents of 'data' directory ... OK
* checking data for non-ASCII characters ... OK
* checking data for ASCII and uncompressed saves ... OK
* checking examples ... ERROR
Running examples in 'hydroGOF-Ex.R' failed
The error most likely occurred in:

 ### Name: plot2
 ### Title: Plotting 2 Time Series
 ### Aliases: plot2
 ### Keywords: dplot

 ### ** Examples

 sim - 2:11
 obs - 1:10
 ## Not run:
 ##D plot2(sim, obs)
 ## End(Not run)

 ##
 # Loading daily streamflows of the Ega River (Spain), from 1961 to 1970
 require(zoo)
Loading required package: zoo

Attaching package: 'zoo'

The following object(s) are masked from 'package:base':

as.Date, as.Date.numeric

 data(EgaEnEstellaQts)
 obs <- EgaEnEstellaQts

 # Generating a simulated daily time series, initially equal to the observed series
 sim <- obs

 # Randomly changing the first 2000 elements of 'sim', by using a normal distribution
 # with mean 10 and standard deviation equal to 1 (default of 'rnorm').
 sim[1:2000] <- obs[1:2000] + rnorm(2000, mean=10)

 # Plotting 'sim' and 'obs' in 2 separate panels
 plot2(x=obs, y=sim)

 # Plotting 'sim' and 'obs' in the same window
 plot2(x=obs, y=sim, plot.type="single")
Error in as.POSIXlt.character(x, tz, ...) :
  character string is not in a standard unambiguous format
Calls: plot2 ... as.POSIXct.default -> as.POSIXct -> as.POSIXlt -> as.POSIXlt.character
Execution halted

b) Is it true that the --binary option is no longer available?  How can a binary 
zip of the package be built on Windows?

R CMD build --no-vignettes hydroGOF works, and so does R CMD INSTALL 
hydroGOFxx.tar.gz.

Many thanks

Ed


Re: [R] Building package problem

2011-11-08 Thread Joshua Wiley

Hi Ed,

If the only error is in examples then this should work:

R CMD check --no-examples foopkg

This should not have anything to do with vignettes (although those may also
not run, who knows).  As far as building a binary goes, look at:

R CMD INSTALL --help

which leads you to

R CMD INSTALL --build foopkg

HTH,

Josh



On Tue, Nov 8, 2011 at 4:35 AM, Eduardo M. A. M. Mendes
emammen...@gmail.com wrote:
 Dear R-users

 I am trying to recompile a CRAN package on Windows 32.   Rtools for 2.14 
 (that is the version I am running) and MiKTeX were successfully installed on 
 my machine.

 Problems:

 a) hydroGOF is a CRAN package, but R CMD check does not work on it.





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


Re: [R] ggplot2 reorder factors for faceting

2011-11-08 Thread Kenneth Frost

Hi, Iain-



You might want to have a look at 

?relevel



i.e.  plotData$infection <- relevel(plotData$infection, ref = 'InfC')
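
A small sketch of the reordering idea, assuming (as in the toy example quoted below) that the facet variable is called infection and that the wanted panel order is InfA, InfC, InfB, InfD. relevel() only moves one level to the front, so setting all levels explicitly with factor() may be more direct:

```r
# Toy facet variable, built the same way as in the quoted example
infection <- c(rep("InfA", 24), rep("InfB", 24), rep("InfC", 24), rep("InfD", 24))

# relevel() moves a single reference level to the front
f1 <- relevel(factor(infection), ref = "InfC")
levels(f1)

# factor(..., levels = ) sets the complete level order in one step
f2 <- factor(infection, levels = c("InfA", "InfC", "InfB", "InfD"))
levels(f2)
```

Since facet_wrap() lays out panels in the order of the factor levels, either version should put InfA and InfC next to each other once it is stored back in plotData$infection.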



Ken

On 11/08/11, Iain Gallagher   wrote:
 
 
 Dear List
 
 I am trying to draw a heatmap using ggplot2. In this heatmap I have faceted 
 my data by 'infection' of which I have four. These four infections break down 
 into two types and I would like to reorder the 'infection' column of my data 
 to reflect this. 
 
 Toy example below:
 
 library(ggplot2)
 
 # test data for ggplot reordering
 genes <- (rep (c(rep('a',4), rep('b',4), rep('c',4), rep('d',4), rep('e',4), 
 rep('f',4)) ,4))
 fcData <- rnorm(96)
 times <- rep(rep(c(2,6,24,48),6),4)
 infection <- c(rep('InfA', 24), rep('InfB', 24), rep('InfC', 24), rep('InfD', 
 24))
 infType <- c(rep('M', 24), rep('D',24), rep('M', 24), rep('D', 24))
 
 # data is long format for ggplot2
 plotData <- as.data.frame(cbind(genes, as.numeric(fcData), as.numeric(times), 
 infection, infType))
 
 hp2 <- ggplot(plotData, aes(factor(times), genes)) + geom_tile(aes(fill = 
 scale(as.numeric(fcData)))) + facet_wrap(~infection, ncol=4)
 
 # set scale
 hp2 <- hp2 + scale_fill_gradient2(name=NULL, low="#0571B0", mid="#F7F7F7", 
 high="#CA0020", midpoint=0, breaks=NULL, labels=NULL, limits=NULL, 
 trans="identity") 
 
 # set up text (size, colour etc etc)
 hp2 <- hp2 + labs(x = "Time", y = "") + scale_y_discrete(expand = c(0, 0)) + 
 opts(axis.ticks = theme_blank(), axis.text.x = theme_text(size = 10, angle = 
 360, hjust = 0, colour = "grey25"), axis.text.y = theme_text(size=10, colour 
 = 'gray25'))
 
 hp2 <- hp2 + theme_bw()
 
 In the resulting plot I would like infections infA and infC plotted next to 
 each other and likewise for infB and infD. I have a column in the data - 
 infType - which I could use to reorder the infection column but so far I have 
 no luck getting this to work.
 
 Could someone give me a pointer to the best way to reorder the infection 
 factor and accompanying data into the order I would like?
 
 Best
 
 iain
 
  sessionInfo()
 R version 2.13.2 (2011-09-30)
 Platform: x86_64-pc-linux-gnu (64-bit)
 
 locale:
  [1] LC_CTYPE=en_GB.utf8   LC_NUMERIC=C 
  [3] LC_TIME=en_GB.utf8    LC_COLLATE=en_GB.utf8    
  [5] LC_MONETARY=C LC_MESSAGES=en_GB.utf8   
  [7] LC_PAPER=en_GB.utf8   LC_NAME=C    
  [9] LC_ADDRESS=C  LC_TELEPHONE=C   
 [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C  
 
 attached base packages:
 [1] grid  stats graphics  grDevices utils datasets  methods  
 [8] base 
 
 other attached packages:
 [1] ggplot2_0.8.9 proto_0.3-9.2 reshape_0.8.4 plyr_1.6 
 
 loaded via a namespace (and not attached):
 [1] digest_0.5.0 tools_2.13.2
 
 

Re: [R] ggplot2 reorder factors for faceting

2011-11-08 Thread ONKELINX, Thierry

Dear Iain,

RSiteSearch("ggplot2 reorder", restrict = c("Rhelp10", "Rhelp08")) gives you 
the solution.

Best regards,

Thierry
 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Iain Gallagher
 Sent: Tuesday, 8 November 2011 12:51
 To: r-help
 Subject: [R] ggplot2 reorder factors for faceting
 
 
 

[R] Export interactive linked graphs?

2011-11-08 Thread Rainer M Krug

Hi

I am falling in love with iplots, especially the linked graphs, i.e.
selecting points in one will highlight the same points in the other graph
as well.
But I would like to be able to export the graphs, so that I can give them
to somebody else to be able to look at the graphs and explore the data. I
could write a script in R, but that would require installing the packages.

Is there a way of exporting these graphs, so that the user can select
points and that they are highlighted in the other graph as well? Giving the
data along would not be a problem.

Any help appreciated,

Rainer

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology,
UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax (F):   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug


Re: [R] Aggregate or extract function ?

2011-11-08 Thread Jean V Adams

Celine wrote on 11/08/2011 02:28:32 AM:
 
 Thanks for your help. I still have a little problem with this function:
 I don't have the same number of rows in my two data frames, so when I
 try to apply the data.frame function, I get this response saying that I
 have a different number of rows.
 
 Error in data.frame(train, test[rowz, ]) : 
   arguments imply differing number of rows: 50327, 66592
 Do you know how I could solve this problem?
 
 Thanks,
 
 Céline


The knn1() function finds, for each row of test (df1), the closest 
coordinates in train (df2). 
rowz <- knn1(train, test, cl)
Thus, rowz should have the same number of elements as test (df1) has 
rows, and df1 and df2[rowz, ] should have the same number of rows.

Without the data in hand, I don't know what's going on.  I suggest you 
look at the rowz variable and see if it makes sense.  Is it matching up 
the rows the way it should?  Try looking at a subset of the df1 and df2 
data within a certain narrow range of Xs and Ys.

Also, when replying to the r-help list, it is helpful for other readers to 
maintain the history of previous posts.

Jean
 

--- previous posts ---

Celine wrote on 11/07/2011 02:50:55 PM:
 
 Hi R user,
 
 I have two dataframe with different variables and coordinates :
X Y sp bio3 bio5 bio6 bio13 bio14
 1 -70.91667 -45.08333  0   47  194  -274712
 2 -86.58333  66.25000  0   16  119 -34542 3
 3 -62.58333 -17.91667  0   68  334  152   14428
 4 -68.91667 -31.25000  0   54  235  -4525 7
 5  55.58333  48.41667  0   23  319 -1722314
 6  66.25000  37.75000  0   34  363  -1849 0
 
 and this one :
 
  X  Y LU1 LU2 LU3 LU4 LU5 LU6 LU7 LU8 LU9 LU10 LU11 LU12 LU13   LU14
 LU15 LU16 LU17 LU18
 1 -36.5 84   0   0   0   0   0   0   0   0   00000 
 0.000000
 2 -36.0 84   0   0   0   0   0   0   0   0   00000 
 0.000000
 3 -35.5 84   0   0   0   0   0   0   0   0   00000 
 26.0854680000
 4 -35.0 84   0   0   0   0   0   0   0   0   00000 
 0.000000
 5 -34.5 84   0   0   0   0   0   0   0   0   00000 
 5.2677610000
 6 -34.0 84   0   0   0   0   0   0   0   0   00000
 105.3710690000
 
 I would like to add to my first dataframe the values of the LU variables at
 the coordinates of the first dataframe. Of course, the coordinates are not
 at the same resolution and are different; this is the problem.
 I would like to decrease the resolution of the first one, because the second
 dataframe has a coarser resolution, and obtain something like this:
 
X Y sp bio3 bio5 bio6 bio13 bio14 LU1 LU2 LU3 LU4 ...
 1 -70.91667 -45.08333  0   47  194  -274712 0 22.08 76.9
 2 -86.58333  66.25000  0   16  119 -34542 3 0 22.08 76.9
 3 -62.58333 -17.91667  0   68  334  152   14428 0 22.08 76.9
 4 -68.91667 -31.25000  0   54  235  -4525 7 0 22.08 76.9
 5  55.58333  48.41667  0   23  319 -1722314 0 22.08 76.9
 6  66.25000  37.75000  0   34  363  -1849 0 0 22.08 76.9
 
 Does someone know a function or a way to obtain that?
 
 Thanks in advance for the help,
 Céline


You could use 1-nearest neighbor classification to find the closest set of 
coordinates in the second data frame (df2) to each row of coordinates in 
the first data frame (df1).  The function knn1() is in the R package 
class.  For example:

library(class)

train <- df2[, c("X", "Y")]
test <- df1[, c("X", "Y")]
cl <- 1:dim(train)[1]
rowz <- knn1(train, test, cl)
data.frame(df1, df2[rowz, ])
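
As a rough, self-contained illustration of the recipe above: the toy values in df1 and df2 and the column LU1 are made up for this sketch and are not Céline's data. Because knn1() returns a factor, the sketch converts it back to plain row numbers before indexing, which also makes it easy to confirm that the result has one row per row of df1:

```r
library(class)  # provides knn1(); 'class' is a recommended package shipped with R

# Hypothetical fine-resolution points (df1) and coarse grid (df2)
df1 <- data.frame(X = c(0.1, 2.2, 3.9), Y = c(0.2, 1.8, 4.1), sp = c(0, 1, 0))
df2 <- data.frame(X = c(0, 2, 4, 6), Y = c(0, 2, 4, 6), LU1 = c(10, 20, 30, 40))

train <- df2[, c("X", "Y")]    # candidate coordinates
test  <- df1[, c("X", "Y")]    # coordinates we want matches for
cl    <- seq_len(nrow(train))  # label each train row with its row number

rowz <- knn1(train, test, cl)           # factor: nearest train row per test row
idx  <- as.integer(as.character(rowz))  # back to plain row numbers

# One matched df2 row for every row of df1
merged <- data.frame(df1, df2[idx, "LU1", drop = FALSE])
nrow(merged) == nrow(df1)
```

If the row counts still differ in practice, one possible cause is indexing the test-side data frame with rowz instead of the train-side one.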

Jean


Re: [R] How much time a process need?

Actually, I only want a rough approximation. For a process that takes a day and 
a half, it is enough to know how many hours it took. Converting the values 
returned by system.time is not a problem; the main issue is that I am not sure 
whether I should use the user or the elapsed time to estimate how long the 
process takes to finish once I press enter.
 
A



From: Jim Holtman jholt...@gmail.com
To: Rolf Turner rolf.tur...@xtra.co.nz
Cc: R-help@r-project.org R-help@r-project.org
Sent: Tuesday, November 8, 2011 12:34 PM
Subject: Re: [R] How much time a process need?

could be that the process was waiting for the user to select a file from a 
list, or some other input before proceeding.  Would have to see what the 
overall performance of the system was at that point.  It could also have been 
that the process was low on physical memory and there was a lot of paging going 
on.

Sent from my iPad

On Nov 7, 2011, at 21:50, Rolf Turner rolf.tur...@xtra.co.nz wrote:

 On 08/11/11 02:40, Joshua Wiley wrote:

 So I just need to get the
 
    user  system elapsed
   0.460   0.048  67.366
 
 
 user value and convert the seconds to days and then to hours ? Right?
 
 What about this elapsed field?
 It's all in seconds.  Convert whatever fields you want.
 
 That being said, doesn't having an elapsed time of over 67 seconds,
 when the actual calculation takes less than half a second, indicate
 that something weird is going on?  At the very least the R calculations
 are fighting for resources with some other very greedy processes.
 
    cheers,
 
        Rolf Turner
 

[R] save at relative directory

Dear all,
I have a variable called thres, and before I run a script I set it to a value
like
thres <- -10
At the end of the execution I call save(variablename, file='Results'),
which saves a file named Results in the current directory.
 
I would like to use the value of thres and do the following instead:
save into the directory called 10, so as to get ./10/Results (yes, I want this 
as a relative path).
 
My question is: how can I check whether the directory exists and, if not, have 
R create it?
 
I would like to thank you in advance for your help
 
B.R
Alex

Re: [R] save at relative directory

2011-11-08 Thread Joshua Wiley

Hi Alex,

Look at some of these functions:

apropos("dir")
apropos("exists")
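
For what it is worth, here is a minimal sketch of how the functions that turn up there might be combined. The object name results is hypothetical, abs(thres) is only an assumption about how -10 should become the folder name 10, and tempdir() is used as the base so the sketch writes nothing into the working directory (use "." for the relative path from the question). dir.exists() needs a more recent R than 2.14; file.exists() is an older alternative.

```r
thres <- -10
results <- c(1, 2, 3)   # hypothetical object to be saved

base_dir <- tempdir()   # swap for "." to get ./10/Results
out_dir  <- file.path(base_dir, as.character(abs(thres)))

# Create the directory only when it is not there yet
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

out_file <- file.path(out_dir, "Results")
save(results, file = out_file)
file.exists(out_file)
```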

Cheers,

Josh


On Tue, Nov 8, 2011 at 5:36 AM, Alaios ala...@yahoo.com wrote:





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


[R] match first consecutive list of capitalized words in string

2011-11-08 Thread Richter-Dumke, Jonas

Dear R-Helpers,

this is my first post ever to a mailing list, so please feel free to point out 
any misunderstandings on my side regarding the conventions of this mailing 
list.

My problem:

Assuming the following character vector is given:

names <- c("filia Maria",
           "vidua Joh Dirck Kleve (oo 02.02.1732)",
           "Bernardus Engelb Franciscus Linde j.u.Doktor referendarius sereniss Judex et gograven Rheinensis")

Is there a regular expression matching the first consecutive list of 
capitalized words in a single character string ("Maria", "Joh Dirck Kleve", 
"Bernardus Engelb Franciscus Linde")?
This expression would very reliably separate the person names from the 
additional information in my historic church register transcription.
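
One possible approach, sketched here under the assumption that a "capitalized word" is one upper-case letter followed by one or more lower-case letters and that the words are separated by single spaces. The vector is called entries in the sketch only to avoid masking base::names():

```r
# Same three entries as in the post
entries <- c("filia Maria",
             "vidua Joh Dirck Kleve (oo 02.02.1732)",
             "Bernardus Engelb Franciscus Linde j.u.Doktor referendarius sereniss Judex et gograven Rheinensis")

# One capitalized word, followed by any number of " Capitalized" words
pattern <- "[[:upper:]][[:lower:]]+( [[:upper:]][[:lower:]]+)*"

# regexpr() locates only the first match in each string;
# regmatches() then extracts the matched text
first_names <- regmatches(entries, regexpr(pattern, entries))
first_names
```

Note that regmatches() silently drops entries without any match, so the result can be shorter than the input; real register entries with abbreviations or hyphenated names would probably need a more permissive pattern.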

Thank you very much for your effort,

Jonas


[R] compare linear regressions

2011-11-08 Thread Holger Taschenberger

Hi,

I'm trying to compare two linear regressions. I'm using the
following approach:
##
xx <- 1:100
df1 <- data.frame(x = xx, y = xx * 2 + 30 + rnorm(n=length(xx),sd=10), g = 1)
df2 <- data.frame(x = xx, y = xx * 4 + 9  + rnorm(n=length(xx),sd=10), g = 2)
dta <- rbind(df1, df2)
dta$g <- factor(dta$g)
plot(df2$x,df2$y,type="l",col="red")
lines(df1$x,df1$y,col="blue")
summary(lm(formula = y ~ x + g + x:g, dta))
##
I learned that the coefficients (g2 and x:g2) tell me about the
differences in intercept and slope and the corresponding p-values.

Now I'm trying to do the same except that there should be no intercept
term:
##
xx <- 1:100
df1 <- data.frame(x = xx, y = xx * 2 + rnorm(n=length(xx),sd=10), g = 1)
df2 <- data.frame(x = xx, y = xx * 4 + rnorm(n=length(xx),sd=10), g = 2)
dta <- rbind(df1, df2)
dta$g <- factor(dta$g)
plot(df2$x,df2$y,type="l",col="red")
lines(df1$x,df1$y,col="blue")
summary(lm(formula = y ~ x - 1 + x:g, dta))
##
I assume that the last line is the correct way to specify a linear model
without intercept. But I'm not certain about that. Can someone please
confirm?

Thanks a lot,
Holger


Re: [R] How much time a process need?

The 'user' + 'system' times tell you how much CPU time was required, which is
an indication of how many CPU cycles you are using.  The
elapsed time is just the wall-clock time the run took.  If the script is CPU
intensive, and there is no paging going on, you should see the CPU
time close to the elapsed time.  An elapsed time much longer than the CPU time
is an indication of extensive I/O, or it might be an indication that
you are calling another process; e.g., you have a Perl script that is
parsing some data in preparation for use in R.

If you are really interested, then enable the monitoring on your
system and see what else is running.  Use 'perfmon' on Windows or a
combination of 'ps' and 'vmstat' on UNIX/Linux systems.  Run your
script on a system that has nothing else running and interfering with
resources, and then look at the results along with the performance data
from the entire system in order to create a model of how long
things will take.
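
To make the distinction concrete, here is a small sketch (the workload is made up; Sys.sleep() merely stands in for time spent waiting on I/O or on other processes) that pulls the CPU and elapsed figures out of system.time() and converts seconds to hours:

```r
# Time a small piece of work plus some pure waiting
timing <- system.time({
  x <- sum(sqrt(seq_len(1e6)))  # CPU work
  Sys.sleep(0.2)                # waiting: adds elapsed time but almost no CPU time
})

# CPU time is user + system; elapsed is wall-clock time
cpu_secs     <- unname(timing["user.self"] + timing["sys.self"])
elapsed_secs <- unname(timing["elapsed"])

# All values are in seconds; convert to hours (or days) as needed
elapsed_hours <- elapsed_secs / 3600
c(cpu = cpu_secs, elapsed = elapsed_secs, hours = elapsed_hours)
```

For a "how long until it finishes after I press enter" estimate, the elapsed value is the one to look at; the CPU figure mainly helps to judge how much of that time was actual computation.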

On Tue, Nov 8, 2011 at 8:31 AM, Alaios ala...@yahoo.com wrote:






-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


[R] Rename a directory in R

2011-11-08 Thread Gavin Blackburn

Hi,

I want to be able to rename a folder using R, similar to file.rename. I want to 
paste " - done" onto the folder name. The reason for this is that we run a loop 
on a large number of folders on a server, and it would be nice for people to be 
able to log in and instantly see whether their data has been processed so they 
can remove it.

I have searched the help list but have not found anything and was wondering if 
this is possible?

I realise I could probably do this through the R/system command line interface 
but I thought there may be a simpler way to do this through R.
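
For reference, a minimal sketch of the idea, assuming that file.rename() itself also accepts a directory path as long as the old and new names are on the same file system. The folder below is a throwaway one created under tempdir() purely for illustration:

```r
# Create a throwaway folder so the sketch touches nothing real
old_dir <- tempfile("sample_run_")
dir.create(old_dir)

# Append the marker to the folder name and rename in place
new_dir <- paste0(old_dir, " - done")
renamed <- file.rename(from = old_dir, to = new_dir)

renamed               # TRUE if the rename succeeded
file.exists(new_dir)  # the folder should now exist under the new name
```

Inside a processing loop, the same two lines could be applied to each folder path once its processing has finished.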

Many thanks,

Gavin.

Dr. Gavin Blackburn
SULSA Technologist

Strathclyde institute of Pharmacy and Biomedical Science
161 Cathedral Street,
Glasgow.
G4 0RE

Tel: +44 (0)1415483828

ScotMet: The Scottish Metabolomics Facility
http://www.metabolomics.strath.ac.uk



Re: [R] compare linear regressions

Yes, adding -1 or +0 both return a lm without an intercept term.
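
A quick way to check this, sketched with simulated data along the lines of the original post (the seed and object names are only for the sketch): fit the model both ways and compare the coefficients.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
xx <- 1:100
dta <- rbind(
  data.frame(x = xx, y = xx * 2 + rnorm(length(xx), sd = 10), g = 1),
  data.frame(x = xx, y = xx * 4 + rnorm(length(xx), sd = 10), g = 2)
)
dta$g <- factor(dta$g)

# Two equivalent ways of dropping the intercept
fit_minus1 <- lm(y ~ x - 1 + x:g, data = dta)
fit_plus0  <- lm(y ~ 0 + x + x:g, data = dta)

names(coef(fit_minus1))                       # no "(Intercept)" entry
all.equal(coef(fit_minus1), coef(fit_plus0))  # same fit either way
```

In this parameterisation the x coefficient is the slope for group 1 and x:g2 is the difference in slope between the two groups.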

Michael

On Tue, Nov 8, 2011 at 8:52 AM, Holger Taschenberger
holger.taschenber...@mpi-bpc.mpg.de wrote:

Re: [R] Simple coordinate transformations

2011-11-08 Thread Duncan Murdoch


On 11-11-08 7:24 AM, Matthew Young wrote:

Hi,

I need to do some simple coordinate transforms between cartesian,
cylindrical and spherical coordinates.  I can't find any built in
functions or packages to do this, is this because they do not exist?
Obviously I can write my own code, but don't want to re-invent the wheel
if I can avoid it.


There are several contributed packages that do at least spherical 
coordinates.  Try


RSiteSearch("spherical coordinates")

to find them.
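
If a package turns out to be more than is needed, the transformations are short enough to sketch in base R. The function names below are made up for illustration, and the angle convention (theta as the azimuth in the x-y plane, phi as the angle measured from the positive z axis) is an assumption that should be checked against whatever convention is required:

```r
# Cartesian -> spherical; theta = azimuth in the x-y plane,
# phi = angle from the +z axis; assumes r > 0
cart2sph <- function(x, y, z) {
  r <- sqrt(x^2 + y^2 + z^2)
  c(r = r, theta = atan2(y, x), phi = acos(z / r))
}

# Spherical -> Cartesian, the inverse of cart2sph()
sph2cart <- function(r, theta, phi) {
  c(x = r * sin(phi) * cos(theta),
    y = r * sin(phi) * sin(theta),
    z = r * cos(phi))
}

# Cartesian -> cylindrical (rho, theta, z)
cart2cyl <- function(x, y, z) {
  c(rho = sqrt(x^2 + y^2), theta = atan2(y, x), z = z)
}

# Round trip as a sanity check
p    <- cart2sph(1, 2, 3)
back <- sph2cart(p[["r"]], p[["theta"]], p[["phi"]])
```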

Duncan Murdoch


[R] from points in Lon/Lat to physical distance in dist class

2011-11-08 Thread Mao Jianfeng

Dear R-listers,

I would like to ask for your help.

I have GPS data (multiple points at the geographic scale) in
longitude/latitude. I intend to calculate the
distances (in kilometers) among these points and output the distance matrix
as an object of class dist.

I have made some progress, but I have not yet reached my final goal. Could
you please give me any directions or advice?

This email is cc'd to Mr. Pierre, the author of the BoSSA package.

Thanks a lot in advance.

Best wishes,

Jian-Feng, Mao

#
What I have so far:
(1) The distances among GPS points can be calculated by the distGPS() function
in the BoSSA package. But it does
not output the distances as a dist object, and I do not know how to convert
such a distance matrix to class dist.

(2) The dist() function in base R can calculate distances among units and
return them as a dist object. But it cannot
be used on points in lon/lat.

(3) Some dummy code
# (3.1) generate dummy points in lon/lat (in degrees)
points <- data.frame(lon=seq(95, 105),lat=seq(35, 45))

# (3.2) calculate distance between points using distGPS()

library(BoSSA)

Geodist <- distGPS(points)

str(Geodist)
class(Geodist)


Re: [R] Rekeying value denoting NA

2011-11-08 Thread Jean V Adams

SML wrote on 11/07/2011 09:10:30 PM:
 
 I'm trying to rekey values which denote that there is no value, i.e.,
 '-999', in a dataset which contains both '-999' and NA entries.
 When I try the following command I get the following error:
 
  data.frame[data.frame$MAR <= -9, "MAR"] <- NA
 
 missing values are not allowed in subscripted assignments of data
 frames
 
 Example of data:
 YEAR   JAN   FEB   MAR  ...  DEC
 1931     5  -999    NA         3
 1932     2     1  -999         2
 .
 .
 .
 2010  -999    NA     2         1
 
 
 I've tried to replace the NAs with -999 values first to remove the NA
 values, but got the same error.
 
 I'm quite new to R, and these little issues seem to be a stumbling
 block. Many thanks for any help you might be able to offer.


First of all, you should call your data frame something other than 
data.frame because data.frame is already a function in the base package of 
R.  Let's call it df, instead,
df <- data.frame
rm(data.frame)

Secondly, it looks like the variables in your data frame (JAN, FEB, MAR, 
..., DEC) are character not numeric, because their values are left aligned 
in your example print out.  You can test this out by showing the class of 
each variable in the data frame,
lapply(df, class)

If the variables are character, you can convert them to numeric,
df2 <- as.data.frame(lapply(df, as.numeric))

Then you can convert all the -999 values to NAs,
df2[df2 < -99] <- NA

YEAR JAN FEB MAR ... DEC
1931   5  NA  NA   3
1932   2  NA  NA   2
.
.
.
2010  NA  NA   2   1
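
The error quoted above arises when the logical index itself contains NA. A minimal sketch of one way around that, assuming numeric columns and made-up data shaped like the example (the values are illustrative, not the poster's file), is to build the index with %in%, which never returns NA:

```r
# Hypothetical data mirroring the example in this thread
df <- data.frame(YEAR = c(1931, 1932, 2010),
                 JAN  = c(5, 2, -999),
                 FEB  = c(-999, 1, NA),
                 MAR  = c(NA, -999, 2),
                 DEC  = c(3, 2, 1))

# df[df$MAR == -999, "MAR"] <- NA fails here because the comparison
# returns NA for the missing MAR value. %in% never returns NA.
df$MAR[df$MAR %in% -999] <- NA

# Apply the same recoding to every column; df[] keeps the data frame structure
df[] <- lapply(df, function(col) {
  col[col %in% -999] <- NA
  col
})
```

Wrapping the comparison in which() would also drop the NA entries from the index; %in% is used here only because it reads the same for one column and for all of them.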

Jean


Re: [R] Rename a directory in R

2011-11-08 Thread R. Michael Weylandt michael.weyla...@gmail.com

apropos('rename')

you will find 'file.rename' which also works on directories.
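
As a minimal sketch of that suggestion, the following renames a scratch folder by appending " - done" to its name. The folder name sample01 and the use of tempdir() are assumptions made for illustration only:

```r
base_dir <- tempdir()                         # assumption: a scratch location
old_dir  <- file.path(base_dir, "sample01")   # hypothetical folder name
new_dir  <- paste0(old_dir, " - done")

dir.create(old_dir)

# file.rename() returns TRUE on success and FALSE (with a warning) on failure
renamed <- file.rename(old_dir, new_dir)
renamed
dir.exists(new_dir)
```

Checking the return value inside the processing loop would flag folders that could not be renamed, for instance because a folder with the new name already exists.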

On Tue, Nov 8, 2011 at 7:45 AM, Gavin Blackburn
gavin.blackb...@strath.ac.uk wrote:
 Hi,

 I want to be able to rename a folder using R, similar to file.rename. I want 
 to paste " - done" onto the folder name. The reason for this is we run a loop 
 on a large number of folders on a server and it would be nice for people to 
 be able to log in and instantly see if their data has been processed so they 
 can remove it.

 I have searched the help list but have not found anything and was wondering 
 if this is possible?

 I realise I could probably do this through the R/system command line 
 interface but I thought there may be a simpler way to do this through R.

 Many thanks,

 Gavin.

 Dr. Gavin Blackburn
 SULSA Technologist

 Strathclyde institute of Pharmacy and Biomedical Science
 161 Cathedral Street,
 Glasgow.
 G4 0RE

 Tel: +44 (0)1415483828

 ScotMet: The Scottish Metabolomics Facility
 http://www.metabolomics.strath.ac.uk






-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] from points in Lon/Lat to physical distance in dist class

2011-11-08 Thread David L Carlson

Look at distm() in package geosphere.
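
The conversion step asked about in the question is as.dist(), which turns a square distance matrix into a dist object. A minimal base-R sketch follows, using a hand-written haversine formula and an assumed mean Earth radius of 6371 km so that it runs without extra packages; the helper name haversine_km is hypothetical, and a matrix returned by distm() could be passed to as.dist() in the same way:

```r
# Great-circle distance in km between points given in degrees.
# Assumption: spherical Earth with a mean radius of 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Dummy points from the original post
pts <- data.frame(lon = seq(95, 105), lat = seq(35, 45))
n <- nrow(pts)

# Full symmetric matrix of pairwise distances
dist_mat <- outer(seq_len(n), seq_len(n), function(i, j) {
  haversine_km(pts$lon[i], pts$lat[i], pts$lon[j], pts$lat[j])
})

# as.dist() keeps the lower triangle and sets the class to "dist"
geo_dist <- as.dist(dist_mat)
class(geo_dist)
```

The resulting object can be passed to functions that expect a dist, such as hclust().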

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352



Re: [R] how to use quadrature to integrate some complicated functions

Have you tried wrapping it in a function and using integrate()? R is pretty 
good at handling numerical integration. If integrate() isn't good for you, can 
you say more as to why?
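
A minimal sketch of what wrapping it in a function could look like, assuming the shifts a_1, ..., a_n are held in a numeric vector a. The helper name make_integrand and the example values are hypothetical. Summing on the log scale is a choice made here to avoid underflow when many Phi terms are multiplied:

```r
# Build the integrand Phi(x-a_1)*...*Phi(x-a_{n-1})*phi(x-a_n) for a given a
make_integrand <- function(a) {
  n <- length(a)
  function(x) {
    # integrate() passes a vector of x values, so evaluate them one at a time
    vapply(x, function(xi) {
      log_val <- sum(pnorm(xi - a[-n], log.p = TRUE)) +
        dnorm(xi - a[n], log = TRUE)
      exp(log_val)
    }, numeric(1))
  }
}

# Sanity check: with all shifts equal the integral is exactly 1/n
a_equal <- rep(0, 10)
check <- integrate(make_integrand(a_equal), lower = -Inf, upper = Inf)
check$value   # should be close to 1/10

# Hypothetical unequal shifts (the poster has n of about 5000)
a <- seq(-1, 1, length.out = 20)
res <- integrate(make_integrand(a), lower = -Inf, upper = Inf)
res$value
```

For n in the thousands the result can be very small, so the default tolerances of integrate() may need tightening; that part is untested here.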

Michael


On Nov 6, 2011, at 4:15 PM, JeffND zuofeng.shan...@nd.edu wrote:

 Hello to all,
 
 I am having trouble with integrating a complicated one-dimensional function
 of the following form
 
 Phi(x-a_1)*Phi(x-a_2)*...*Phi(x-a_{n-1})*phi(x-a_n).
 
 Here n is about 5000, Phi is the cumulative distribution function of
 the standard normal, 
 phi is the density function of the standard normal, and x ranges over
 (-infty,infty).
 
 My idea is to use quadrature to handle this integral. But since Phi has
 no closed form,
 I don't know how to do this efficiently. I would appreciate it very much if someone
 has any ideas about it.
 
 Thanks!
 
 Jeff
 

Re: [R] compare linear regressions

2011-11-08 Thread Holger Taschenberger

Thank you. I was confused because the output of the line summary(lm(formula
=... reads:

Coefficients: (1 not defined because of singularities),

which did not look like a normal message (which can safely be ignored)
to me.

--Holger

On Tue, 8 Nov 2011 15:06:28 +0100
Peter Konings peter.l.e.koni...@gmail.com wrote:

 On Tue, Nov 8, 2011 at 2:52 PM, Holger Taschenberger 
 holger.taschenber...@mpi-bpc.mpg.de wrote:
 snip
 
  summary(lm(formula = y ~ x - 1 + x:g, dta))
  ##
  I assume that the last line is the correct way to specify a linear model
  without intercept. But I'm not certain about that. Can someone please
  confirm?
 
 
 Yes, that's true. See chapter 11 of the Introduction To R manual that was
 installed with R for an overview of model specification in R.
 
 HTH
 Peter.


Re: [R] Building package problem




On 08.11.2011 13:50, Joshua Wiley wrote:

Hi Ed,

If the only error is in examples then this should work:

R CMD check --no-examples foopkg



Disabling the example checks is not the solution - well, it is the one 
to hide the errors, of course.




should not have anything to do with vignettes (although those may also
not run, who knows).  As far as building a binary, look at:

R CMD INSTALL --help

which leads you to

R CMD INSTALL --build foopkg



... and was the recommended way to build binaries as long as I am 
maintaining the Windows binaries on CRAN (which is more than 8 years, I 
think), if I remember correctly.





HTH,

Josh



On Tue, Nov 8, 2011 at 4:35 AM, Eduardo M. A. M. Mendes
emammen...@gmail.com  wrote:

Dear R-users

I am trying to recompile a CRAN package on Windows 32.   Rtools for 2.14 (that 
is the version I am running) and MiKTeX were successfully installed on my 
machine.

Problems:

a) hydroGOF is a CRAN package, but R CMD check does not work on it.



It works on it and tells you there is an error. Since the CRAN check 
summary pages do not show that error, I anticipate your unstated 
versions of zoo and/or other packages are outdated. Please update all 
your packages and try again.


Best,
Uwe Ligges








C:\Users\eduardo\Documents\R_tests2>R CMD check hydroGOF
* using log directory 'C:/Users/eduardo/Documents/R_tests2/hydroGOF.Rcheck'
* using R version 2.14.0 (2011-10-31)
* using platform: i386-pc-mingw32 (32-bit)
* using session charset: ISO8859-1
* checking for file 'hydroGOF/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'hydroGOF' version '0.3-2'
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking whether package 'hydroGOF' can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking for unstated dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking contents of 'data' directory ... OK
* checking data for non-ASCII characters ... OK
* checking data for ASCII and uncompressed saves ... OK
* checking examples ... ERROR
Running examples in 'hydroGOF-Ex.R' failed
The error most likely occurred in:


### Name: plot2
### Title: Plotting 2 Time Series
### Aliases: plot2
### Keywords: dplot

### ** Examples

sim <- 2:11
obs <- 1:10
## Not run:
##D plot2(sim, obs)
## End(Not run)

##
# Loading daily streamflows of the Ega River (Spain), from 1961 to 1970
require(zoo)

Loading required package: zoo

Attaching package: 'zoo'

The following object(s) are masked from 'package:base':

    as.Date, as.Date.numeric


data(EgaEnEstellaQts)
obs <- EgaEnEstellaQts

# Generating a simulated daily time series, initially equal to the observed series
sim <- obs

# Randomly changing the first 2000 elements of 'sim', by using a normal distribution
# with mean 10 and standard deviation equal to 1 (default of 'rnorm').
sim[1:2000] <- obs[1:2000] + rnorm(2000, mean=10)

# Plotting 'sim' and 'obs' in 2 separate panels
plot2(x=obs, y=sim)

# Plotting 'sim' and 'obs' in the same window
plot2(x=obs, y=sim, plot.type="single")

Error in as.POSIXlt.character(x, tz, ...) :
  character string is not in a standard unambiguous format
Calls: plot2 ... as.POSIXct.default -> as.POSIXct -> as.POSIXlt -> as.POSIXlt.character
Execution halted

b) option --binary is no longer available, is that so?  How can an extension 
zip can be built on Windows?

R CMD build --no-vignettes hydroGOF works.   And R CMD INSTALL 
hydroGOFxx.tar.gz too.

Many thanks

Ed


Re: [R] compare linear regressions

2011-11-08 Thread Jeff Newmiller

Just because your model specification is valid does not mean your data can be 
analyzed using that particular model specification.
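
To illustrate the general point, here is a deliberately contrived sketch: the formula is valid, but one predictor is an exact multiple of the other, so lm() cannot estimate both. The data are made up and are not the data from this thread:

```r
set.seed(1)
x1 <- 1:10
x2 <- 2 * x1                 # perfectly collinear with x1 (deliberate)
y  <- 3 * x1 + rnorm(10)

fit <- lm(y ~ x1 + x2)

# summary(fit) would print
# "Coefficients: (1 not defined because of singularities)"
coef(fit)                    # the coefficient for x2 is NA

# alias() shows which terms are linear combinations of the others
alias(fit)
```

An NA coefficient of this kind says that the data do not contain enough independent information for that term, not that the formula syntax is wrong.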
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.


Re: [R] Building package problem

2011-11-08 Thread Prof Brian Ripley

R CMD check is *not* 'building a package'.  Nor is making a Windows 
binary package.  'Building a package' is creating a source tarball 
from a source directory.


On Tue, 8 Nov 2011, Joshua Wiley wrote:


Hi Ed,

If the only error is in examples then this should work:

R CMD check --no-examples foopkg

should not have anything to do with vignettes (although those may also
not run, who knows).  As far as building a binary, look at:

R CMD INSTALL --help

which leads you to

R CMD INSTALL --build foopkg


And as for the hydroGOF issue, my guess is that the problem is your 
locale or timezone.  But despite the posting guide, you failed to tell 
us. AFAIK CRAN only checks packages in English locales.


(One thing we know is that in Columbia there was no midnight on one of 
the dates in that file.  So hydroGOF really ought to be specifying a 
timezone when reading character datetimes.)
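
A minimal sketch of what specifying a timezone looks like when parsing character datetimes. The date strings are invented for illustration, and this is not the code hydroGOF uses:

```r
stamps <- c("1961-01-01 00:00:00", "1961-01-02 00:00:00")

# With tz = "UTC" the parse does not depend on the local timezone,
# so a local clock change cannot make a midnight value invalid
utc_times <- as.POSIXct(stamps, tz = "UTC")
attr(utc_times, "tzone")

# For purely daily data, Date objects carry no time-of-day at all
days <- as.Date(c("1961-01-01", "1961-01-02"))
diff(days)
```

Which of the two is appropriate depends on whether the series needs a time of day.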




HTH,

Josh




Re: [R] save at relative directory

Hmm, I will try something like this:

    if (file.exists('threshold')==FALSE)
      dir.create(paste('./',abs(threshold)))

    save(var,file=paste('./',abs(threshold),'/',DataSource[[4]],sep=""))

I just need a bit of confirmation that I am not doing something terribly wrong that 
might harm my filesystem.

I also accidentally did
var <- -13
dir.create(paste('./',var)) 

which created a folder called -13, which I do not know how to remove:

 rmdir -13
rmdir: invalid option -- '1'
Try `rmdir --help' for more information.


B.R
Alex






From: Joshua Wiley jwiley.ps...@gmail.com

Cc: R-help@r-project.org R-help@r-project.org
Sent: Tuesday, November 8, 2011 2:43 PM
Subject: Re: [R] save at relative directory

Hi Alex,

Look at some of these functions:

apropos("dir")
apropos("exists")

Cheers,

Josh



 Dear all,
 I have a variable called thres, and before I run a script I set it to a value
 like
 thres <- -10
 At the end of the execution I am issuing a save(variablename,file='Results')
 which will end up with a file saved in the current directory with the name 
 Results

 I would like, though, to use the value of thres and do the following:
 save in the directory called 10, so as to get ./10/Results (yes, I want this as a 
 relative path)

 My question is: how can I also check whether the directory exists and, if not, have R create it?

 I would like to thank you in advance for your help

 B.R
 Alex





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

Re: [R] save at relative directory




On 08.11.2011 16:09, Alaios wrote:

Hmm, I will try something like this:

 if (file.exists('threshold')==FALSE)
   dir.create(paste('./',abs(threshold)))

 save(var,file=paste('./',abs(threshold),'/',DataSource[[4]],sep=""))

I just need a bit of confirmation that I am not doing something terribly wrong that 
might harm my filesystem.

I also accidentally did
var <- -13
dir.create(paste('./',var))

which created a folder called -13, which I do not know how to remove:

  rmdir -13
rmdir: invalid option -- '1'
Try `rmdir --help' for more information.



In R:
  unlink("-13", recursive=TRUE)

or in your shell (I guess your unstated OS can do this):

rm --help
which points you to:
rm -r -- -13

Uwe Ligges






Re: [R] save at relative directory

2011-11-08 Thread Joshua Wiley

Hi Alex,

For the R part, I would abstract it a bit:

mydir <- paste("./", abs(threshold), sep = "")
if (!file.exists(mydir)) dir.create(mydir)
save(var, file = paste(mydir, DataSource[[4]], sep = "/"))

if you use file.exists('threshold') you are testing for the existence
of a file named threshold, not the value contained in threshold, and anyway, you
seem not to want the value contained in threshold, but the absolute
value of the value in threshold, hence, in part, the value of
abstraction.

In R, see ?unlink for ways to delete things, rmdir looks like you are
using the command prompt, and for that I will refer you to the help
for your OS/shell on how to go about removing unwanted directories
(hint, rmdir --help is not a bad place to start ;)
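
The pieces discussed in this thread can be put together in a self-contained sketch. It works inside tempdir() so that nothing in a real working directory is touched; the object name result and the file name Results are placeholders:

```r
threshold <- -13                       # example value from this thread
base_dir  <- tempdir()                 # assumption: work in a scratch directory
out_dir   <- file.path(base_dir, as.character(abs(threshold)))

# Create the directory only if it is not there yet
if (!dir.exists(out_dir)) dir.create(out_dir)

result   <- c(a = 1, b = 2)            # placeholder for the object to save
out_file <- file.path(out_dir, "Results")
save(result, file = out_file)
saved_ok <- file.exists(out_file)

# A directory whose name starts with "-" is removed like any other from R,
# because unlink() receives the name as a string, not as a shell option
odd_dir <- file.path(base_dir, "-13")
dir.create(odd_dir)
unlink(odd_dir, recursive = TRUE)
removed_ok <- !dir.exists(odd_dir)

unlink(out_dir, recursive = TRUE)      # clean up the scratch directory
```

file.path() is used instead of paste() so that the separator never has to be spelled out by hand.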

Cheers,

Josh


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


[R] window?

2011-11-08 Thread Kevin Burton

Can someone enlighten me on why the following doesn't work?

setwd('C:/Temp/R')

d <- rep(1:53,2)
(s <- ts(d, frequency=53, start=c(2000,10)))
n <- length(s)
k <- n%/%3

for(i in (n-k):n)
{
    st <- c(start(s)[1] + (start(s)[2] + i)%/%frequency(s), (start(s)[2] + i) %% frequency(s))
    ed <- c(start(s)[1] + (start(s)[2]+k+i)%/%frequency(s), (start(s)[2]+i+k) %% frequency(s))
    xshort <- window(s, start=st, end=ed)
    cat("Start ", st, " End ", ed, "\n")
    cat("Length ", length(xshort), " start ", start(xshort), " end ", end(xshort), "\n")
}

I get a bunch of warnings like:

36: In window.default(x, ...) : 'end' value not changed

Thank you.

 

Kevin

rkevinbur...@charter.net



Re: [R] window?




On 08.11.2011 16:26, Kevin Burton wrote:

I get a bunch of warnings like:



36: In window.default(x, ...) : 'end' value not changed



Yes, your original s has
End = c(2002, 9)
and you try to set to, e.g.,
c(2002, 45)
in your last iteration which is later and hence cannot be changed.
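
A minimal sketch of the two usual ways to deal with this, using the same series: either keep the requested end within end(s), or ask window() to pad with NA via extend = TRUE. The particular start and end values below are chosen only for illustration:

```r
s <- ts(rep(1:53, 2), frequency = 53, start = c(2000, 10))
end(s)                                   # 2002 9, as stated above

# Requesting a window that stays inside the series gives no warning
short <- window(s, start = c(2002, 1), end = c(2002, 9))
length(short)

# extend = TRUE allows an end beyond the data and fills the gap with NA
padded <- window(s, start = c(2002, 1), end = c(2002, 45), extend = TRUE)
length(padded)
sum(is.na(padded))
```

Which of the two fits depends on whether the later code can cope with NA values at the end of the window.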

Uwe ligges









Re: [R] RpgSQL row names

This is great, thanks!

I have another unrelated question. I'll create a new email for that one.

ben

On Mon, Nov 7, 2011 at 4:16 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:

 On Mon, Nov 7, 2011 at 5:34 PM, Ben quant ccqu...@gmail.com wrote:
  Hello,
 
  Using the RpgSQL package, there must be a way to get the row names into
 the
  table automatically. In the example below, I'm trying to get rid of the
  cbind line, yet have the row names of the data frame populate a column.
 
  bentest = matrix(1:4,2,2)
  dimnames(bentest) = list(c('ra','rb'),c('ca','cb'))
  bentest
     ca cb
  ra  1  3
  rb  2  4
  bentest = cbind(item_name=rownames(bentest),bentest)
  dbWriteTable(con, "r.bentest", bentest)
  [1] TRUE
  dbGetQuery(con, "SELECT * FROM r.bentest")
    item_name ca cb
  1        ra  1  3
  2        rb  2  4
 
 

 The RJDBC based drivers currently don't support that. You can create a
 higher level function that does it.

 dbGetQuery2 <- function(...) {
   out <- dbGetQuery(...)
   i <- match("row_names", names(out), nomatch = 0)
   if (i > 0) {
     rownames(out) <- out[[i]]
     out <- out[-1]
   }
   out
 }

 rownames(BOD) <- letters[1:nrow(BOD)]
 dbWriteTable(con, "BOD", cbind(row_names = rownames(BOD), BOD))
 dbGetQuery2(con, "select * from BOD")

 --
 Statistics & Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com



Re: [R] Rekeying value denoting NA



On Nov 8, 2011, at 8:57 AM, Jean V Adams wrote:


SML wrote on 11/07/2011 09:10:30 PM:


I'm trying to rekey values which denote there is no values, i.e.,
'-999' in a dataset which contains both '-999' and NA entries.
When I try the following command I get the following error:


data.frame[data.frame$MAR <= -9, "MAR"] <- NA


missing values are not allowed in subscripted assignments of data
frames

Example of data:
YEAR   JAN    FEB    MAR    ...  DEC
1931   5      -999   NA          3
1932   2      1      -999        2
.
.
.
2010   -999   NA     2           1


I've tried to replace the NAs with -999 values first to remove the NA
values, but got the same error.

I'm quite new to R, and these little issues seem to be a stumbling
block. Many thanks for any help you might be able to offer.



First of all, you should call your data frame something other than
data.frame, because data.frame is already a function in the base package of
R.  Let's call it df instead:
   df <- data.frame
   rm(data.frame)

Secondly, it looks like the variables in your data frame (JAN, FEB, MAR,
..., DEC) are character, not numeric, because their values are left-aligned
in your example printout.  You can test this by showing the class of
each variable in the data frame:
   lapply(df, class)

If the variables are character, you can convert them to numeric:
   df2 <- as.data.frame(lapply(df, as.numeric))

Then you can convert all the -999 values to NAs:
   df2[df2 < -99] <- NA


Agreed this is what _should_ be done.


YEAR JAN FEB MAR ... DEC
1931   5  NA  NA   3
1932   2  NA  NA   2
.
.
.
2010  NA  NA   2   1


But ... I thought she wanted (unwisely, in my opinion) to go the other
way, NA's -> -999. In R the replacement of NA's is a bit convoluted
because nothing =='s NA. You might need to use the `is.na` function in
this manner:


 df[is.na(df[["MAR"]]), "MAR"] <- -999
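
Both directions can be sketched on a toy data frame (the values below are made up, modeled on the example in the question). The original error arises because the comparison yields NA wherever MAR is already NA, and NA is not allowed in a logical row index used for data frame assignment; wrapping the comparison in which() avoids that.

```r
# Toy data modeled on the example in the question (values are made up)
df <- data.frame(YEAR = c(1931, 1932, 2010),
                 JAN  = c(5, 2, -999),
                 MAR  = c(NA, -999, 2))

# -999 -> NA for one column: which() drops the NA comparisons,
# so the row index contains no missing values
to_na <- df
to_na[which(to_na$MAR == -999), "MAR"] <- NA

# -999 -> NA for all columns at once, using a logical matrix index
all_na <- df
all_na[all_na == -999] <- NA

# NA -> -999, the reverse direction
to_code <- df
to_code[is.na(to_code[["MAR"]]), "MAR"] <- -999
```

Recoding to NA is usually the safer choice, since summaries such as mean(x, na.rm = TRUE) then ignore the missing entries instead of averaging in -999.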

--
David Winsemius, MD
West Hartford, CT


Re: [R] Rename a directory in R

2011-11-08 Thread Gavin Blackburn

Thanks Jim,

I didn't think it worked on directories but I've got it working now.

Cheers,

Gavin.

-----Original Message-----
From: jim holtman [mailto:jholt...@gmail.com] 
Sent: 08 November 2011 14:18
To: Gavin Blackburn
Cc: r-help@r-project.org
Subject: Re: [R] Rename a directory in R

apropos('rename')

you will find 'file.rename' which also works on directories.
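
A minimal sketch of the idea, run inside a temporary directory so that nothing real is touched. The folder name "sample01" is hypothetical; the " - done" suffix is the one mentioned in the question.

```r
# Work inside a temporary directory so nothing real is touched
base <- tempfile("renamedemo_")
dir.create(base)
old_dir <- file.path(base, "sample01")   # hypothetical folder name
dir.create(old_dir)

# Append " - done" to the folder name once processing has finished
new_dir <- paste0(old_dir, " - done")
ok <- file.rename(old_dir, new_dir)      # TRUE on success

file.exists(new_dir)
```

file.rename() returns a logical, so a processing loop can check the result and warn when a folder could not be renamed (for example, because it is locked by another user).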

On Tue, Nov 8, 2011 at 7:45 AM, Gavin Blackburn
gavin.blackb...@strath.ac.uk wrote:
 Hi,

 I want to be able to rename a folder using R, similar to file.rename. I want
 to paste " - done" onto the folder name. The reason for this is that we run a loop
 on a large number of folders on a server and it would be nice for people to
 be able to log in and instantly see if their data has been processed so they
 can remove it.

 I have searched the help list but have not found anything and was wondering 
 if this is possible?

 I realise I could probably do this through the R/system command line 
 interface but I thought there may be a simpler way to do this through R.

 Many thanks,

 Gavin.

 Dr. Gavin Blackburn
 SULSA Technologist

 Strathclyde institute of Pharmacy and Biomedical Science
 161 Cathedral Street,
 Glasgow.
 G4 0RE

 Tel: +44 (0)1415483828

 ScotMet: The Scottish Metabolomics Facility
 www.metabolomics.strath.ac.uk






-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] how to use quadrature to integrate some complicated functions



On Nov 8, 2011, at 9:43 AM, R. Michael Weylandt wrote:

Have you tried wrapping it in a function and using integrate()? R is  
pretty good at handling numerical integration. If integrate() isn't  
good for you, can you say more as to why?
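
A minimal sketch of the integrate() route, under stated assumptions: the shift vector `a` is supplied by the caller, the function name integrate_product is made up for illustration, and the product is accumulated on the log scale so that thousands of Phi factors do not underflow one by one. Whether the default tolerances of integrate() are adequate for n near 5000 is not checked here.

```r
# Integrate Phi(x - a_1) * ... * Phi(x - a_{n-1}) * phi(x - a_n) over the real line
integrate_product <- function(a) {
  n <- length(a)
  integrand <- function(x) {
    # integrate() passes a vector of x values; evaluate one point at a time
    vapply(x, function(xi) {
      log_cdf_part <- sum(pnorm(xi - a[-n], log.p = TRUE))
      log_pdf_part <- dnorm(xi - a[n], log = TRUE)
      exp(log_cdf_part + log_pdf_part)
    }, numeric(1))
  }
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

# Sanity check: with all shifts equal the integrand is Phi(x)^(n-1) * phi(x),
# whose integral is 1/n
integrate_product(rep(0, 5))
```

The equal-shift case has a closed form, which makes it a convenient check before moving to the real a_i.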


Michael


On Nov 6, 2011, at 4:15 PM, JeffND zuofeng.shan...@nd.edu wrote:


Hello to all,

I am having trouble integrating a complicated one-dimensional function
of the following form:

Phi(x-a_1)*Phi(x-a_2)*...*Phi(x-a_{n-1})*phi(x-a_n).

Here n is about 5000, Phi is the cumulative distribution function of the
standard normal, phi is the density function of the standard normal, and
x ranges over (-infty, infty).



The density of a standard normal is very tractable mathematically. Why  
wouldn't you extract the arguments and sum them before submitting to  
integration ... which also might not be needed since pnorm could  
economically provide the answer. Perhaps with limits a, b:


suitable_norm_factor*
   pnorm(
 dnorm( sum(x-a_1, x-a_2, ..., x-a_{n-1}, x-a_n) ), a,  
lower.tail=FALSE) ) -

   pnorm(
 dnorm( sum(x-a_1, x-a_2, ..., x-a_{n-1}, x-a_n) ), b,  
lower.tail=FALSE) ) )



My idea is to use quadrature to handle this integral. But since Phi has
no closed form, I don't know how to do this efficiently. I would appreciate
it very much if someone has any ideas about it.


If efficiency is desired ... use mathematical theory to maximum extent  
before resorting to pickaxes.


--
David Winsemius, MD
West Hartford, CT


Re: [R] similar package in R like SKEW CALCULATOR?

2011-11-08 Thread Knut Krueger


Am 08.11.2011 13:01, schrieb R. Michael Weylandt:

See the replies given to you two days ago when you asked the same

Hi Michael,

thank you for your second answer.

I did not get my first question and I did not get your answer via mail - 
strange



Knut


[R] multi-line query

Hello,

I'm using package RpgSQL. Is there a better way to create a multi-line
query/character string? I'm looking for less to type and readability.

This is not very readable for large queries:
s <- 'create table r.BOD(id int primary key,name varchar(12))'

I write a lot of code, so I'm looking to type less than this, but it is
more readable from an SQL standpoint:
s <- gsub("\n", "", 'create table r.BOD(
id int primary key
,name varchar(12))
')

How it is used:
dbSendUpdate(con, s)

Regards,

Ben


Re: [R] rearrange set of items randomly

2011-11-08 Thread flokke

Thanks for the replies!

Indeed, I could use order() instead of sample(), but then it wouldn't be
random anymore, as it sorts data points in increasing(!) order. Second, I
have to use 1000 different samples.

I got the hint that I have to do something with indices, but I still can't
figure out what this should be.
Maybe one of you knows?

--
View this message in context: 
http://r.789695.n4.nabble.com/rearrange-set-of-items-randomly-tp4013723p4016613.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Sampling with conditions

2011-11-08 Thread SarahJoyes

Sorry about being confusing. I have so many loops in loops and ifelses that I
get mixed up sometimes. It was just a typo; it was supposed to be for(i in
1:5). Sorry,
Thanks for your help!
SJ

--
View this message in context: 
http://r.789695.n4.nabble.com/Sampling-with-conditions-tp4014036p4016058.html
Sent from the R help mailing list archive at Nabble.com.


[R] Error message after updating pkg spatstat

2011-11-08 Thread Robin W Hunnewell


Hello,

I recently downloaded an updated version of the {spatstat} package, 1.24-1. But
now, when I use help() to look up a certain function in the spatstat library,
I get an error message. It only seems to happen for spatstat functions. The error is:

Error in fetch(key) : internal error -3 in R_decompress1


Can anyone help?
Thanks greatly, my session info is below. I am running R 2.13.1 on a Mac.
Robin


sessionInfo()

R version 2.13.1 (2011-07-08)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


other attached packages:
[1] gpclib_1.5-1    rgdal_0.6-33    maptools_0.8-10 lattice_0.19-33 foreign_0.8-45  spatstat_1.24-1
[7] deldir_0.0-15   mgcv_1.7-6      sp_0.9-91


loaded via a namespace (and not attached):
[1] grid_2.13.1        Matrix_0.999375-50 nlme_3.1-102       tools_2.13.1



--
Robin Hunnewell
PhD student 
Department of Biology

University of New Brunswick
Fredericton, NB  
Canada 

Re: [R] Sampling with conditions

2011-11-08 Thread SarahJoyes

That is exactly what I want, and it's so simple! 
Thanks so much!

--
View this message in context: 
http://r.789695.n4.nabble.com/Sampling-with-conditions-tp4014036p4016050.html
Sent from the R help mailing list archive at Nabble.com.


[R] Help with SEM package: Error message

2011-11-08 Thread lisamp85

Hello.  

I started using the sem package in R and after a lot of searching and trying
things I am still having difficulty.  I get the following error message when
I use the sem() function:

Warning message:
In sem.default(ram = ram, S = S, N = N, param.names = pars, var.names =
vars,  :
  Could not compute QR decomposition of Hessian.
Optimization probably did not converge.

I started with a simple example using the specify.model() function, but it
is really straight forward.  I uploaded my specify.model script and my data
covariance matrix here too so I wouldn't clutter this email with the entire
model (20 observed variables, 5 factors).  Could this error message be from
the data itself and not from my path model?

I have my observed variables X and my unobserved variables F.  I have ONLY
exogenous latent variables (i.e. they never appear on the right side of the
single-headed arrow ->).  I include all possible factor covariances FjFk, and
the only constraint I've made was to restrict the factor variances to 1.
My model follows this basic format (as you can see from my uploaded
file):

# Factors (where I specify which observed variables load on to which
# factors)
# I have only exogenous latent variables
F.i -> X.j, lamj.i, NA
.
.
.
# Observed variable variances
X.j <-> X.j, ej, NA
.
.
.
# Factor variances (I fixed all factor variances to 1)
F.i <-> F.i, NA, 1
.
.
.
# Factor covariances (I represent all factor covariances, i.e. the upper or
# lower triangle of a covariance matrix)
F.i <-> F.k, FiFk, NA
.
.
.

Did I do something wrong here?  
Here are my uploaded files:
CFA script: http://r.789695.n4.nabble.com/file/n4016569/CFA_script.txt
Covariance matrix: http://r.789695.n4.nabble.com/file/n4016569/covariance_matrix.RData


Thank you so much for any and all of your help.
Lisa

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-SEM-package-Error-message-tp4016569p4016569.html
Sent from the R help mailing list archive at Nabble.com.


[R] Question

2011-11-08 Thread Parida, Mrutyunjaya

Hi
My name is Rocky and I am trying to use the org.Dm.eg.db library.
When I use org.Dm.egFLYBASE2EG[fb_ids], it stops at the point where
it cannot find a value for a given ID, with an error such as the following:
Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
  value for FBgn0004461 not found
Then the whole thing stops. I cannot retrieve any information on the values
that were found before this ID was reached.
Can anyone help me retrieve all the IDs without stopping at a certain ID, or,
if the value for an ID is not found, just generate an NA for that ID without
stopping?
Another thing: when I use the FlybaseID converter, the ID FBgn0004461 shows a value.
I am not sure why R is not able to retrieve it.
Please comment on this.
Waiting for your reply.
Thanking you
Rocky
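
The "return NA instead of stopping" pattern can be sketched in base R with an ordinary environment standing in for the annotation map. The keys and values below are made up, apart from the ID quoted in the error. Whether the org.Dm.eg.db map accepts the same mget(..., ifnotfound = NA) call is an assumption that should be checked against the AnnotationDbi documentation.

```r
# Stand-in lookup table (made-up keys and values), not the real annotation map
id_map <- new.env()
assign("FBgn0000001", "100001", envir = id_map)
assign("FBgn0000002", "100002", envir = id_map)

fb_ids <- c("FBgn0000001", "FBgn0004461", "FBgn0000002")

# mget() looks up every key; ifnotfound = NA fills the gaps instead of stopping
hits <- mget(fb_ids, envir = id_map, ifnotfound = NA)
result <- unlist(hits)
result
```

The result keeps one entry per requested ID, so missing IDs can be listed afterwards with names(result)[is.na(result)].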


[R] lapply to list of variables

2011-11-08 Thread Ana

Hi

Can someone help me with this?

How can I apply a function to a list of variables?

Something like this:


listvar=list("Monday","Tuesday","Wednesday")
func=function(x){x[which(x<=10)]=NA}

lapply(listvar, func)

where
Monday=[213,56,345,33,34,678,444]
Tuesday=[213,56,345,33,34,678,444]
...

in my case I have a neverending list of vectors.

Thanks!


[R] determine length of bivariate polynomial

2011-11-08 Thread Nicolas Schuck

Dear R-community,

I have a fitted bivariate polynomial, i.e:

fit = lm(cbind(x, y)~poly(t, 15))

and I would like to determine the length of the line in the interval t =
[a, b]. Obviously, I could use predict and go through all the points, i.e.

for (t in a:(b-1)) {
length = length + sqrt((x.pred[t] - x.pred[t+1])^2 + (y.pred[t] -
y.pred[t+1])^2)
}

but that would take very long given the amount of data I have. Do you know
of any better solutions?

Many thanks!
Nicolas
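
One way to avoid the explicit loop is to take differences of the predicted points in a single vectorized step. This is only a sketch: the helper name path_length and the toy data are assumptions, and it computes the same polyline approximation as the loop above rather than an exact arc length.

```r
# Length of a polyline given as a two-column matrix of (x, y) points
path_length <- function(xy) {
  d <- diff(xy)              # differences between successive rows
  sum(sqrt(rowSums(d^2)))    # sum of the segment lengths
}

# Toy stand-in for the real data (assumed): a parabola sampled at t = 1..50
t <- 1:50
x <- t / 10
y <- (t / 10)^2
fit <- lm(cbind(x, y) ~ poly(t, 2))
pred <- predict(fit)         # matrix with one column per response

a <- 10
b <- 40
len_ab <- path_length(pred[a:b, ])
len_ab
```

Because diff() and rowSums() work on the whole matrix at once, the cost is a few vector operations regardless of how many points lie between a and b.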


[R] GAM

2011-11-08 Thread Gyanendra Pokharel

Hi R community!
I am analyzing the data set motorins in the package faraway using a
generalized additive model. It shows the following error. Can someone
suggest the right way to do this?

library(faraway)
data(motorins)
motori <- motorins[motorins$Zone==1,]
library(mgcv)
amgam <- gam(log(Payment) ~ offset(log(Insured))+
s(as.numeric(Kilometres)) + s(Bonus) + Make + s(Claims),family = gaussian,
data = motori)
Error in smooth.construct.tp.smooth.spec(object, dk$data, dk$knots) :
  A term has fewer unique covariate combinations than specified maximum
degrees of freedom
summary(amgam)
Error in summary(amgam) : object 'amgam' not found
Gyan
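
One possible reading of the error (an assumption, not confirmed anywhere in this thread) is that a smoothed covariate has fewer distinct values than the default basis dimension of s(). A minimal sketch with simulated data, not the motorins data, showing the failure and the effect of lowering k:

```r
library(mgcv)

set.seed(1)
# Simulated stand-in: a covariate with only 5 distinct values
d <- data.frame(x = sample(1:5, 200, replace = TRUE))
d$y <- sin(d$x) + rnorm(200, sd = 0.2)

# The default basis dimension of s() exceeds the 5 unique values, so this fails
default_fit <- tryCatch(gam(y ~ s(x), data = d), error = function(e) e)

# Capping k below the number of unique values lets the fit proceed
small_fit <- gam(y ~ s(x, k = 4), data = d)
```

If this is indeed the cause, the same k argument could be tried on whichever terms of the original model have few distinct values; treating such a covariate as a factor instead of smoothing it is another option.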


Re: [R] rearrange set of items randomly



On Nov 8, 2011, at 11:31 AM, flokke wrote:


Thanks for the replies!

Indeed, I could use order() instead of sample(), but then it wouldn't be
random anymore, as it sorts data points in increasing(!) order. Second, I
have to use 1000 different samples.


Please re-read the help page for `order` _and_ the one for `sort`,  
examine their differences,  re-read the replies you have gotten so  
far, work through _all_ of the examples, and if illumination still  
does not arrive, then repeat the above process until it does.
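
For readers following the hint about indices, here is a small illustration of the difference between the two functions on made-up data: sort() returns the values, order() returns positions, and a random rearrangement is a subset by randomly permuted positions.

```r
set.seed(123)
x <- c(12, 7, 3, 25, 18)     # made-up data

sort(x)                      # the values in increasing order
order(x)                     # the positions that would sort x

# A random rearrangement: permute the positions, then subset
idx <- sample(length(x))
shuffled <- x[idx]

# 1000 independent rearrangements, one per column
perms <- replicate(1000, x[sample(length(x))])
dim(perms)
```

Each column of perms holds the same five values in a different random order.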




I got the hint that I have to do something with indices, but I still can't
figure out what this should be.
Maybe one of you knows?


Many (if not most)  of us do. We try to avoid having r-help becoming a  
homework tutoring site, however. You should be using your academic  
resources for this effort.


--

David Winsemius, MD
West Hartford, CT


Re: [R] Question



I suspect you are lost. This is almost certainly more appropriate on  
the BioConductor list. There are relatively few people on this list  
who will know what (un-named) packages you might be using. Read the  
Posting Guide, note the link to the Bioconductor mailing list page,   
configure your mailer for plain text, and subscribe to the help list  
at Bioc.


Then construct a post that shows your code, and mentions all the  
packages being required.


--
David.


On Nov 8, 2011, at 11:22 AM, Parida, Mrutyunjaya wrote:


Hi
My name is Rocky and I am trying to use the org.Dm.eg.db library.
When I am using the org.Dm.egFLYBASE2EG[fb_ids] it is stopping at a  
point where it cannot find any value for a given ID such as the  
following:

Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
 value for FBgn0004461 not found
Then the whole thing stops. I cannot retrieve any information on the  
values that has been found until this ID was reached.
Can anyone help in retrieving all the IDs without stopping at a  
certain ID or if the value for the ID not found just generate an NA  
for that ID but do not stop.
Another thing when I am using FlybaseID converter FBgn0004461 ID  
shows a value. I am not sure why R is not able to retrieve it.

Please comment on this.
Waiting for your reply.
Thanking you
Rocky



David Winsemius, MD
West Hartford, CT


[R] dbWriteTable with field data type

Hello,

When I do:

dbWriteTable(con, "r.BOD", cbind(row_names = rownames(BOD), BOD))

...can I specify the data types such as varchar(12), float, double
precision, etc. for each of the fields/columns?

If not, what is the best way to create a table with specified field data
types (with the RpgSQL package/R)?

Regards,

Ben
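
One approach that needs no driver support is to build the CREATE TABLE statement as a string and send it before loading any rows. The helper name make_create_table and the column types below are hypothetical. The sketch only constructs the string, which could then be passed to dbSendUpdate(con, s) as in the multi-line query post. How the rows are loaded afterwards depends on the driver and is not covered here.

```r
# Hypothetical helper: build a CREATE TABLE statement from a named
# character vector that maps column names to SQL types
make_create_table <- function(table, types) {
  cols <- paste(names(types), types, collapse = ", ")
  sprintf("create table %s (%s)", table, cols)
}

# Assumed types for the columns of cbind(row_names = rownames(BOD), BOD)
types <- c(row_names = "varchar(12)",
           Time      = "float",
           demand    = "double precision")

s <- make_create_table("r.BOD", types)
s
```

Keeping the types in a named vector means the column list lives in one place and can be reused when checking the data frame's columns against the table.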


Re: [R] lapply to list of variables




On 08.11.2011 17:59, Ana wrote:

Hi

Can someone help me with this?

How can I apply a function to a list of variables.

something like this


listvar=list("Monday","Tuesday","Wednesday")


This is a list of length one character vectors rather than a list of 
variables.





func=function(x){x[which(x<=10)]=NA}


To make it work, redefine:

func <- function(x){
  x <- get(x)
  is.na(x[x<=10]) <- TRUE
  x
}
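
An alternative sketch that avoids get(): put the vectors themselves into a named list and let the function work on values rather than names. The Tuesday values and the helper name set_small_to_na are made up for illustration. The function also returns the modified vector explicitly, which the version in the question does not do.

```r
Monday  <- c(213, 56, 345, 33, 34, 678, 444)
Tuesday <- c(213, 5, 345, 10, 34, 678, 444)   # made-up values

# Replace every value at or below the cutoff with NA and return the vector
set_small_to_na <- function(x, cutoff = 10) {
  x[which(x <= cutoff)] <- NA
  x
}

# A named list of the vectors themselves, not of their names
days <- list(Monday = Monday, Tuesday = Tuesday)
cleaned <- lapply(days, set_small_to_na)
cleaned$Tuesday
```

With many vectors, the named list is usually easier to build at the point where the data are read in than to assemble later from separate variables.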




lapply(listvar, func)

where
Monday=[213,56,345,33,34,678,444]
Tuesday=[213,56,345,33,34,678,444]


This is not R syntax.


...

in my case I have a neverending list of vectors.


Then your function will take an infinite amount of time - or you will 
get amazing reputation in computer sciences.


Uwe Ligges




Thanks!


Re: [R] why NA coefficients

2011-11-08 Thread array chip

Sure it does, but still struggling with what is going on...

Thanks

John

From: David Winsemius dwinsem...@comcast.net
To: David Winsemius dwinsem...@comcast.net
Cc: r-help@r-project.org r-help@r-project.org
Sent: Monday, November 7, 2011 10:27 PM
Subject: Re: [R] why NA coefficients

But this output suggests there may be alligators in the swamp:

 predict(lmod, newdata=data.frame(treat=1, group=2))
         1
0.09133691
Warning message:
In predict.lm(lmod, newdata = data.frame(treat = 1, group = 2)) :
  prediction from a rank-deficient fit may be misleading

--David.

 --David.

 Thanks

 John

 From: David Winsemius dwinsem...@comcast.net

 Cc: r-help@r-project.org r-help@r-project.org
 Sent: Monday, November 7, 2011 5:13 PM
 Subject: Re: [R] why NA coefficients

 On Nov 7, 2011, at 7:33 PM, array chip wrote:

  Hi, I am trying to run ANOVA with an interaction term on 2 factors (treat 
  has 7 levels, group has 2 levels). I found the coefficient for the last 
  interaction term is always 0, see attached dataset and the code below:

  test <- read.table("test.txt",sep='\t',header=T,row.names=NULL)
  lm(y~factor(treat)*factor(group),test)

  Call:
  lm(formula = y ~ factor(treat) * factor(group), data = test)

  Coefficients:
                    (Intercept)                 factor(treat)2                 factor(treat)3
                       0.429244                       0.499982                       0.352971
                 factor(treat)4                 factor(treat)5                 factor(treat)6
                      -0.204752                       0.142042                       0.044155
                 factor(treat)7                 factor(group)2  factor(treat)2:factor(group)2
                      -0.007775                      -0.337907                      -0.208734
  factor(treat)3:factor(group)2  factor(treat)4:factor(group)2  factor(treat)5:factor(group)2
                      -0.195138                       0.800029                       0.227514
  factor(treat)6:factor(group)2  factor(treat)7:factor(group)2
                       0.331548                             NA

  I guess this is due to model matrix being singular or collinearity among 
  the matrix columns? But I can't figure out how the matrix is singular in 
  this case? Can someone show me why this is the case?

 Because you have no cases in one of the crossed categories.

 --David Winsemius, MD
 West Hartford, CT

 David Winsemius, MD
 West Hartford, CT


David Winsemius, MD
West Hartford, CT

Re: [R] why NA coefficients

2011-11-08 Thread array chip

True, but why does it have to omit treat 7-group 2?

Thanks again

From: David Winsemius dwinsem...@comcast.net

Cc: r-help@r-project.org r-help@r-project.org
Sent: Monday, November 7, 2011 10:19 PM
Subject: Re: [R] why NA coefficients

On Nov 7, 2011, at 10:07 PM, array chip wrote:

 Thanks David. The only category that has no cases is treat 1-group 2:

  with(test,table(treat,group))
      group
 treat 1 2
     1 8 0
     2 1 5
     3 5 5
     4 7 3
     5 7 4
     6 3 3
     7 8 2

 But why the coefficient for treat 7-group 2 is not estimable?

Well, it had to omit one of them didn't it?

(But I don't know why that level was chosen.)
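
A sketch that reproduces the situation from the cell counts in the table above, with simulated noise in place of the real y (an assumption, since the dataset is not included). It shows that the model matrix has 14 columns but rank 13, and that the group-2 dummy equals the sum of the six interaction dummies because no group-2 case has treat 1. As far as I understand lm()'s pivoting, it marks as NA the last column it finds to be redundant in model order, which here is the treat 7:group 2 interaction.

```r
# Rebuild the design from the cell counts shown above; y is simulated noise
cells <- expand.grid(treat = 1:7, group = 1:2)
cells$n <- c(8, 1, 5, 7, 7, 3, 8,    # group 1
             0, 5, 5, 3, 4, 3, 2)    # group 2 (treat 1 is empty)
test <- cells[rep(seq_len(nrow(cells)), cells$n), c("treat", "group")]
set.seed(42)
test$y <- rnorm(nrow(test))

fit <- lm(y ~ factor(treat) * factor(group), data = test)
X <- model.matrix(fit)

ncol(X)       # 14 columns in the model matrix
qr(X)$rank    # only 13 non-empty cells, hence rank 13

# Every group-2 row has treat in 2..7, so the group-2 dummy is the
# sum of the six interaction dummies: one column is redundant
inter <- grep(":", colnames(X), fixed = TRUE)
all(X[, "factor(group)2"] == rowSums(X[, inter]))

# The coefficient that lm() could not estimate
names(which(is.na(coef(fit))))
```

Any one of the seven columns in that dependency could be dropped; which one is reported as NA is a consequence of column order, not of anything special about treat 7.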

--David.

 Thanks

 John

 From: David Winsemius dwinsem...@comcast.net

 Cc: r-help@r-project.org r-help@r-project.org
 Sent: Monday, November 7, 2011 5:13 PM
 Subject: Re: [R] why NA coefficients

 On Nov 7, 2011, at 7:33 PM, array chip wrote:

  Hi, I am trying to run ANOVA with an interaction term on 2 factors (treat 
  has 7 levels, group has 2 levels). I found the coefficient for the last 
  interaction term is always 0, see attached dataset and the code below:

  test <- read.table("test.txt",sep='\t',header=T,row.names=NULL)
  lm(y~factor(treat)*factor(group),test)

  Call:
  lm(formula = y ~ factor(treat) * factor(group), data = test)

  Coefficients:
                    (Intercept)                 factor(treat)2                 factor(treat)3
                       0.429244                       0.499982                       0.352971
                 factor(treat)4                 factor(treat)5                 factor(treat)6
                      -0.204752                       0.142042                       0.044155
                 factor(treat)7                 factor(group)2  factor(treat)2:factor(group)2
                      -0.007775                      -0.337907                      -0.208734
  factor(treat)3:factor(group)2  factor(treat)4:factor(group)2  factor(treat)5:factor(group)2
                      -0.195138                       0.800029                       0.227514
  factor(treat)6:factor(group)2  factor(treat)7:factor(group)2
                       0.331548                             NA

  I guess this is due to model matrix being singular or collinearity among 
  the matrix columns? But I can't figure out how the matrix is singular in 
  this case? Can someone show me why this is the case?

 Because you have no cases in one of the crossed categories.

 --David Winsemius, MD
 West Hartford, CT

David Winsemius, MD
West Hartford, CT

Re: [R] multi-line query

Why not just send it in as is.  I use SQLite (via sqldf) and here is
the way I write my SQL statements:

inRange <- sqldf('
select t.*
, r.start
, r.end
from total t, commRange r
where t.comm = r.comm and
t.loc between r.start and r.end and
t.loc != t.new
')
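
A short sketch of the same idea applied to the CREATE TABLE statement from the question: R string literals may span lines, so the statement can be laid out freely. If a single-line form is wanted (for logging, say), replacing each run of whitespace with one space is safer than deleting the newlines, because it cannot glue two tokens together. The variable names are illustrative only.

```r
# A multi-line string literal: the newlines are simply part of the string
s <- '
create table r.BOD(
  id int primary key
  ,name varchar(12)
)'

# Optional: collapse every run of whitespace into a single space
one_line <- trimws(gsub("\\s+", " ", s))
one_line
```

The string s can be passed to the database call unchanged; one_line is only a convenience for display.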

On Tue, Nov 8, 2011 at 11:43 AM, Ben quant ccqu...@gmail.com wrote:
 Hello,

 I'm using package RpgSQL. Is there a better way to create a multi-line
 query/character string? I'm looking for less to type and readability.

 This is not very readable for large queries:
 s <- 'create table r.BOD(id int primary key,name varchar(12))'

 I write a lot of code, so I'm looking to type less than this, but it is
 more readable from an SQL standpoint:
 s <- gsub("\n", "", 'create table r.BOD(
 id int primary key
 ,name varchar(12))
 ')

 How it is used:
 dbSendUpdate(con, s)

 Regards,

 Ben





-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] Sampling with conditions

2011-11-08 Thread Nordlund, Dan (DSHS/RDA)

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of SarahJoyes
 Sent: Tuesday, November 08, 2011 5:57 AM
 To: r-help@r-project.org
 Subject: Re: [R] Sampling with conditions

 That is exactly what I want, and it's so simple!
 Thanks so much!

Sarah,

I want to point out that my post was qualified by something like "I am not
sure it is exactly what you want."  Since you didn't quote my post, let me show
my suggestion and then express my concern.

n <- matrix(0,nrow=5, ncol=10)
repeat{
  c1 <- sample(0:10, 4, replace=TRUE)
  if(sum(c1) <= 10) break
}
n[,1] <- c(c1,10-sum(c1))
n

This nominally meets your criteria, but it will tend to result in larger digits
being under-represented.  For example, you are unlikely to get a result like
c(0,8,0,0,2) or c(9,0,0,1,0).

That may be OK for your purposes, but I wanted to point it out.

You could use something like 

n <- matrix(0,nrow=5, ncol=10)
c1 <- rep(0,4)
for(i in 1:4){
  upper <- 10-sum(c1)
  c1[i] <- sample(0:upper, 1, replace=TRUE)
  if(sum(c1) == 10) break
}
n[,1] <- c(c1,10-sum(c1))
n

if that would suit your purposes better.
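
The second scheme can be wrapped in a function so that every column of the matrix is filled the same way. This is only a sketch: the function name draw_column and its arguments are made up, and sample.int() is used in place of sample(0:upper, 1) so that the draw never depends on sample()'s special handling of a length-one vector. Note that filling left to right is not symmetric across positions: the first entry is uniform on 0..10, while later entries tend to be smaller.

```r
# Draw k non-negative integers that sum to `total`, filling left to right
draw_column <- function(total = 10, k = 5) {
  c1 <- rep(0, k - 1)
  for (i in seq_len(k - 1)) {
    upper <- total - sum(c1)
    c1[i] <- sample.int(upper + 1, 1) - 1   # uniform on 0..upper
    if (sum(c1) == total) break
  }
  c(c1, total - sum(c1))                    # last entry takes the remainder
}

set.seed(2011)
n <- matrix(0, nrow = 5, ncol = 10)
for (j in seq_len(ncol(n))) n[, j] <- draw_column()
colSums(n)
```

If the order of the entries within a column should not matter, each column could additionally be shuffled with sample() after it is drawn.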

Good luck,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204


Re: [R] multi-line query

Because I don't know anything about sqldf. :)

Here is what happens, but I'm sure it is happening because I haven't read
the manual yet:

> s <- sqldf('create table r.dat(id int primary key,val int)')
Error in ls(envir = envir, all.names = private) :
  invalid 'envir' argument
Error in !dbPreExists : invalid argument type

ben

On Tue, Nov 8, 2011 at 10:41 AM, jim holtman jholt...@gmail.com wrote:

 Why not just send it in as is.  I use SQLite (via sqldf) and here is
 the way I write my SQL statements:

inRange <- sqldf('
select t.*
, r.start
, r.end
from total t, commRange r
where t.comm = r.comm and
t.loc between r.start and r.end and
t.loc != t.new
')

 On Tue, Nov 8, 2011 at 11:43 AM, Ben quant ccqu...@gmail.com wrote:
  Hello,
 
  I'm using package RpgSQL. Is there a better way to create a multi-line
  query/character string? I'm looking for less to type and readability.
 
  This is not very readable for large queries:
  s <- 'create table r.BOD(id int primary key,name varchar(12))'
 
  I write a lot of code, so I'm looking to type less than this, but it is
  more readable from an SQL standpoint:
  s <- gsub("\n", "", 'create table r.BOD(
  id int primary key
  ,name varchar(12))
  ')
 
  How it is used:
  dbSendUpdate(con, s)
 
  Regards,
 
  Ben
 
 



 --
 Jim Holtman
 Data Munger Guru

 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.



[R] passing dataframe col name through cbind()

2011-11-08 Thread Eric Rupley




Hi all ---

I note that the column name of the first column in a dataframe does not 
necessarily get passed on when using cbind (example below)…

I'm looking for help in clarifying why this behavior occurs, and how I can get 
all col names, including the first, passed on to the result…while I suspect 
it's obvious and documented to the cognoscenti, it's puzzling me…

Many thanks for any help on this...
Eric

 scores <- data.frame(name=c("Bob","Ron","Bud"), round1=c(40,30,20), round2=c(5,6,4))  # some toy data

 scores
  name round1 round2
1  Bob     40      5
2  Ron     30      6
3  Bud     20      4


 cbind(scores[,1],total=rowSums(scores[,2:3]),scores[,2:3])
  scores[, 1] total round1 round2
1         Bob    45     40      5
2         Ron    36     30      6
3         Bud    24     20      4

...first column renamed...

…yet this passes all column names:

 cbind(scores[,1:3])
  name round1 round2
1  Bob     40      5
2  Ron     30      6
3  Bud     20      4

…but this doesn't:

 cbind(scores[,1],scores[,2:3])
  scores[, 1] round1 round2
1         Bob     40      5
2         Ron     30      6
3         Bud     20      4


--
 Eric Rupley
 University of Michigan, Museum of Anthropology
 1109 Geddes Ave, Rm. 4013
 Ann Arbor, MI 48109-1079
 
 erup...@umich.edu
 +1.734.276.8572

Re: [R] why NA coefficients



On Nov 8, 2011, at 12:36 PM, array chip wrote:


Sure it does, but still struggling with what is going on...


Have you considered redefining the implicit base level for treat so  
it does not create the missing crossed-category?


 test$treat2_ <- factor(test$treat, levels=c(2:7, 1) )
 lm(y~treat2_*factor(group),test)

Call:
lm(formula = y ~ treat2_ * factor(group), data = test)

Coefficients:
            (Intercept)                 treat2_3                 treat2_4
              0.9292256               -0.1470106               -0.7047343
               treat2_5                 treat2_6                 treat2_7
             -0.3579398               -0.4558269               -0.5077571
               treat2_1           factor(group)2  treat2_3:factor(group)2
             -0.4999820               -0.5466405                0.0135963
treat2_4:factor(group)2  treat2_5:factor(group)2  treat2_6:factor(group)2
              1.0087628                0.4362479                0.5402821
treat2_7:factor(group)2  treat2_1:factor(group)2
              0.2087338                       NA

All the group-less coefficients are for group 1, so you now get a
prediction for group=1:treat=2 (the Intercept), group=1:treat=3, and so on,
for a total of 7 values.


And there are 6 predictions for group2.

The onus is obviously on you to check the predictions against the  
data. 'aggregate' is a good function for that purpose.
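
A minimal sketch of that check, using made-up data rather than the poster's test.txt (the data frame d, its layout, and the seed are all assumptions): compute the observed cell means with aggregate() and put the model's predictions next to them.

```r
set.seed(42)
# Made-up data: 3 treatments x 2 groups, with the (treat = 1, group = 2) cell empty
d <- expand.grid(treat = 1:3, group = 1:2, rep = 1:2)
d <- d[!(d$treat == 1 & d$group == 2), ]
d$y <- rnorm(nrow(d))
d$treat <- factor(d$treat)
d$group <- factor(d$group)

fit <- lm(y ~ treat * group, data = d)

# Observed cell means; only cells that actually contain data appear here
cell_means <- aggregate(y ~ treat + group, data = d, FUN = mean)

# Predictions for the same cells (the rank-deficiency warning is silenced on purpose)
cell_means$fitted <- as.numeric(suppressWarnings(predict(fit, newdata = cell_means)))
cell_means
```

For every cell that has data, the fitted value from the interaction model should match the cell average; the empty cell never shows up in the aggregate() output, which is the reminder that any prediction for it is not backed by data.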



--
david.





Thanks

John

From: David Winsemius dwinsem...@comcast.net
To: David Winsemius dwinsem...@comcast.net
Cc: array chip arrayprof...@yahoo.com; r-help@r-project.org r-help@r-project.org 


Sent: Monday, November 7, 2011 10:27 PM
Subject: Re: [R] why NA coefficients

But this output suggests there may be alligators in the swamp:

 predict(lmod, newdata=data.frame(treat=1, group=2))
1
0.09133691
Warning message:
In predict.lm(lmod, newdata = data.frame(treat = 1, group = 2)) :
  prediction from a rank-deficient fit may be misleading

--David.

 --David.

 Thanks

 John


 From: David Winsemius dwinsem...@comcast.net
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org r-help@r-project.org
 Sent: Monday, November 7, 2011 5:13 PM
 Subject: Re: [R] why NA coefficients


 On Nov 7, 2011, at 7:33 PM, array chip wrote:

  Hi, I am trying to run ANOVA with an interaction term on 2  
factors (treat has 7 levels, group has 2 levels). I found the  
coefficient for the last interaction term is always 0, see attached  
dataset and the code below:

 
  test <- read.table("test.txt", sep='\t', header=T, row.names=NULL)
  lm(y~factor(treat)*factor(group),test)
 
  Call:
  lm(formula = y ~ factor(treat) * factor(group), data = test)
 
  Coefficients:
                    (Intercept)                 factor(treat)2                 factor(treat)3
                       0.429244                       0.499982                       0.352971
                 factor(treat)4                 factor(treat)5                 factor(treat)6
                      -0.204752                       0.142042                       0.044155
                 factor(treat)7                 factor(group)2  factor(treat)2:factor(group)2
                      -0.007775                      -0.337907                      -0.208734
  factor(treat)3:factor(group)2  factor(treat)4:factor(group)2  factor(treat)5:factor(group)2
                      -0.195138                       0.800029                       0.227514
  factor(treat)6:factor(group)2  factor(treat)7:factor(group)2
                       0.331548                             NA
 
 
  I guess this is due to model matrix being singular or  
collinearity among the matrix columns? But I can't figure out how  
the matrix is singular in this case? Can someone show me why this is  
the case?


 Because you have no cases in one of the crossed categories.

 --David Winsemius, MD
 West Hartford, CT





Re: [R] why NA coefficients

2011-11-08 Thread Dennis Murphy

The cell mean mu_{12} is non-estimable because it has no data in the
cell. How can you estimate something that's not there (at least
without imputation :)? Every parametric function that involves mu_{12}
will also be non-estimable - in particular, the interaction term and
the population marginal means. That's why you get the NA estimates
and the warning. All this follows from the linear model theory
described in, for example, Milliken and Johnson (1992), Analysis of
Messy Data, vol. 1, ch. 13.

Here's an example from Milliken and Johnson (1992) to illustrate:
          B1         B2       B3
T1      2, 6                   8, 6
T2        3          14      12, 9
T3        6           9

Assume a cell means model E(Y_{ijk}) = \mu_{ij}, where the cell means
are estimated by the cell averages.

From M & J (p. 173, whence this example is taken):
Whenever treatment combinations are missing, certain
hypotheses cannot be tested without making some
additional assumptions about the parameters in the model.
Hypotheses involving parameters corresponding to the
missing cells generally cannot be tested. For example,
for the data [above] it is not possible to estimate any
linear combinations (or to test any hypotheses) that
involve parameters \mu_{12} and \mu_{33} unless one
is willing to make some assumptions about them.

They continue:
One common assumption is that there is no
interactions between the levels of T and the levels of B.
In our opinion, this assumption should not be made
without some supporting experimental evidence.

In other words, removing the interaction term makes the
non-estimability problem disappear, but it's a copout unless there is
some tangible scientific justification for an additive rather than an
interaction model.

For the above data, M & J note that it is not possible to estimate all
of the expected marginal means - in particular, one cannot estimate
the population marginal means $\bar{\mu}_{1.}$, $\bar{\mu}_{3.}$,
$\bar{\mu}_{.2}$ or $\bar{\mu}_{.3}$, since these functions of the
parameters involve terms associated with the means of the missing
cells. OTOH, $\bar{\mu}_{2.}$ and $\bar{\mu}_{.1}$ are estimable.
Moreover, any hypotheses involving parametric functions that contain
non-estimable cell means are not testable. In this example, the test
of equal row population marginal means is not testable because
$\bar{\mu}_{1.}$ and $\bar{\mu}_{3.}$ are not estimable.

[Aside: if the term "parametric function" is not familiar, in this
context it refers to linear combinations of model parameters.  In the
M & J example, $\bar{\mu}_{1.} = (\mu_{11} + \mu_{12} + \mu_{13})/3$ is a
parametric function.]

Hopefully this sheds some light on the situation.
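
As a rough illustration of the argument above, the M & J table can be typed into R and the rank of the interaction model matrix compared with its number of columns. The names mj, trt and blk are made up for this sketch; the data values are the ones in the table.

```r
# The M & J example: cells (T1, B2) and (T3, B3) contain no observations
mj <- data.frame(
  trt = factor(c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3)),
  blk = factor(c(1, 1, 3, 3, 1, 2, 3, 3, 1, 2)),
  y   = c(2, 6, 8, 6, 3, 14, 12, 9, 6, 9)
)

tab <- table(mj$trt, mj$blk)   # zero counts mark the empty cells
tab

X <- model.matrix(~ trt * blk, data = mj)
c(columns = ncol(X), rank = qr(X)$rank)   # more columns than the data can support

fit <- lm(y ~ trt * blk, data = mj)
names(which(is.na(coef(fit))))            # coefficients lm() could not estimate
```

The model matrix has one column per parameter of the full 3 x 3 interaction model, but only seven cells contain data, so its rank is seven and lm() reports NA for two coefficients. Which two depends on the coding and column order, not on which cells are empty.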

Dennis

On Mon, Nov 7, 2011 at 10:17 PM, array chip arrayprof...@yahoo.com wrote:
 Hi Dennis,
 The cell mean mu_12 from the model involves the intercept and factor 2:
 Coefficients:
                   (Intercept)                 factor(treat)2                 factor(treat)3
                      0.429244                       0.499982                       0.352971
                factor(treat)4                 factor(treat)5                 factor(treat)6
                     -0.204752                       0.142042                       0.044155
                factor(treat)7                 factor(group)2  factor(treat)2:factor(group)2
                     -0.007775                      -0.337907                      -0.208734
 factor(treat)3:factor(group)2  factor(treat)4:factor(group)2  factor(treat)5:factor(group)2
                     -0.195138                       0.800029                       0.227514
 factor(treat)6:factor(group)2  factor(treat)7:factor(group)2
                      0.331548                             NA
 So mu_12 = 0.429244-0.337907 = 0.091337. This can be verified by:
 predict(fit,data.frame(list(treat=1,group=2)))
  1
 0.09133691
 Warning message:
 In predict.lm(fit, data.frame(list(treat = 1, group = 2))) :
   prediction from a rank-deficient fit may be misleading

 But as you can see, it gave a warning about a rank-deficient fit... why is
 this a rank-deficient fit?
 Treat 1_group 2 has no cases, so why is it still estimable while,
 on the contrary, treat 7_group 2, which has 2 cases, is not?
 Thanks
 John




 
 From: Dennis Murphy djmu...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Sent: Monday, November 7, 2011 9:29 PM
 Subject: Re: [R] why NA coefficients

 Hi John:

 What is the estimate of the cell mean \mu_{12}? Which model effects
 involve that cell mean? With this data arrangement, the expected
 population marginal means of treatment 1 and group 2 are not estimable
 either, unless you're willing to assume a no-interaction model.

 Chapters 13 and 14 of Milliken and Johnson's Analysis of Messy Data
 (vol. 1) cover this topic in some detail, but it assumes you're
 familiar with the matrix form of a linear statistical model. Both
 chapters cover the two-way model with interaction - Ch.13 from the
 cell means model approach and Ch. 14 from the model effects approach.
 Because this was written in

[R] rpanel package - retrieve data from panel

2011-11-08 Thread michalseneca

Dear  Co-Forumeees

Does anybody have experience with rpanel, or know how to retrieve data from
a created panel?

For example my panel draws some interactive graph and computes something
inside the panel.

Question : is there a way to retrieve those data ?

For illustration:


if (interactive()) {
   library(rpanel)
   plot.hist <- function(panel) {
      with(panel, {
         xlim <- range(c(x, mean(x) + c(-3, 3) * sd(x)))
         if (panel$cbox[3])
            clr <- "lightblue" else clr <- NULL
         hist(x, freq = FALSE, col = clr, xlim = xlim)
         y <- x + 2
         if (panel$cbox[1]) {
            xgrid <- seq(xlim[1], xlim[2], length = 50)
            dgrid <- dnorm(xgrid, mean(x), sd(x))
            lines(xgrid, dgrid, col = "red", lwd = 3)
            }
         if (panel$cbox[2])
            box()
         })
      panel
      }
   x <- rnorm(50)
   panel <- rp.control(x = x)
   rp.checkbox(panel, cbox, plot.hist, 
      labels = c("normal density", "box", "shading"), title = "Options")
   rp.do(panel, plot.hist)
   }

and I want to retrieve y in my further computations outside the panel.
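
One general way to get a value out of a callback is to store it in an environment that outlives the call. The sketch below is plain R and deliberately makes no rpanel calls; results and plot_hist_action are hypothetical names, and whether this pattern is the best fit inside a real panel action function is an assumption that would need checking against the rpanel documentation.

```r
# Hypothetical stand-in for a panel action function; no rpanel calls are used.
results <- new.env()

plot_hist_action <- function(x) {
  y <- x + 2                        # the value computed "inside the panel"
  assign("y", y, envir = results)   # stash it where later code can find it
  invisible(x)
}

x <- rnorm(50)
plot_hist_action(x)

# Later, outside the callback:
y_outside <- get("y", envir = results)
head(y_outside)
```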

Thanks and regards

Mike


Re: [R] geoR, variofit/likfit

2011-11-08 Thread katiab81

up


Re: [R] passing dataframe col name through cbind()

Try this:

 cbind(scores[,1,drop = FALSE], scores[,2:3])
  name round1 round2
1  Bob 40  5
2  Ron 30  6
3  Bud 20  4


Then do ?'[' to learn about 'drop'
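
A short sketch of what 'drop' changes, reusing the toy scores data from the question (the names v and one_col are only for illustration):

```r
scores <- data.frame(name   = c("Bob", "Ron", "Bud"),
                     round1 = c(40, 30, 20),
                     round2 = c(5, 6, 4))

v       <- scores[, 1]                # a plain vector: the column name is gone
one_col <- scores[, 1, drop = FALSE]  # still a one-column data frame called "name"

is.null(dim(v))      # TRUE
names(one_col)       # "name"

# The name survives cbind() when the first piece is still a data frame ...
with_total <- cbind(one_col, total = rowSums(scores[, 2:3]), scores[, 2:3])
names(with_total)

# ... or when the vector is given a name explicitly in the call.
named <- cbind(name = v, scores[, 2:3])
names(named)
```

With a bare vector, cbind() has no name to carry over, so it falls back to the deparsed expression scores[, 1] as the column label.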

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] geoR, variofit/likfit




On Nov 8, 2011, at 1:20 PM, katiab81 wrote:


up


down  ... for a variety of reasons.

And...


PLEASE do read the posting guide http://www.R-project.org/posting-guide.html


It's very possible that you are not communicating with the package  
authors by posting to rhelp (by way of Nabble) , and since you are  
asking in your initial posting if there is an error, that would seem  
to be the proper avenue.


--

David Winsemius, MD
West Hartford, CT


Re: [R] why NA coefficients

2011-11-08 Thread array chip

Hi Dennis, Thanks very much for the details. All those explanations about 
non-estimable mu_{12} when it has no data make sense to me! 

Regarding my specific example data: mu_{12} should NOT be estimable in a 
linear model with interaction because it has no data, yet the linear model I 
created by using lm() in R still CAN estimate the mean mu_{12}. On the 
other hand, mu_{72} is NOT estimable from lm() even though this category does 
have data. Does this contradiction with the theory imply that the linear model 
from lm() in R on my specific example data is NOT reliable/trustworthy and 
should not be used?

Thanks

John





Re: [R] ggplot2 reorder factors for faceting

2011-11-08 Thread Dennis Murphy

Hi:

(1) Here is one way to reorganize the levels of a factor:
plotData[['infection']] <- factor(plotData[['infection']],
   levels = c('InfA', 'InfC', 'InfB', 'InfD'))

Do this ahead of the call to ggplot(), preferably after plotData is defined.

relevel() resets the baseline category of a factor, but here you want
to make multiple changes.

(2) You probably want a better title for the legend. Assuming you want
'Scale' as the title, you can add the following to labs:

labs(..., fill = 'Scale')
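
To make point (1) concrete, here is a small base-R sketch contrasting factor(levels = ...) with relevel(); the vector infection below is a stand-in for the column in plotData.

```r
# Stand-in for plotData[['infection']]
infection <- factor(rep(c("InfA", "InfB", "InfC", "InfD"), each = 2))
levels(infection)        # default: alphabetical order

# Full reordering: list every level in the order the facets should appear
reordered <- factor(infection, levels = c("InfA", "InfC", "InfB", "InfD"))
levels(reordered)

# relevel() only moves a single level to the front
moved <- relevel(infection, ref = "InfC")
levels(moved)
```

Only the level order changes; the values themselves stay attached to the same rows, which is why the reordering is safe to do before calling ggplot().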

HTH,
Dennis


On Tue, Nov 8, 2011 at 3:51 AM, Iain Gallagher
iaingallag...@btopenworld.com wrote:


 Dear List

 I am trying to draw a heatmap using ggplot2. In this heatmap I have faceted 
 my data by 'infection' of which I have four. These four infections break down 
 into two types and I would like to reorder the 'infection' column of my data 
 to reflect this.

 Toy example below:

 library(ggplot2)

 # test data for ggplot reordering
 genes <- (rep (c(rep('a',4), rep('b',4), rep('c',4), rep('d',4), rep('e',4), 
 rep('f',4)) ,4))
 fcData <- rnorm(96)
 times <- rep(rep(c(2,6,24,48),6),4)
 infection <- c(rep('InfA', 24), rep('InfB', 24), rep('InfC', 24), rep('InfD', 
 24))
 infType <- c(rep('M', 24), rep('D',24), rep('M', 24), rep('D', 24))

 # data is long format for ggplot2
 plotData <- as.data.frame(cbind(genes, as.numeric(fcData), as.numeric(times), 
 infection, infType))

 hp2 <- ggplot(plotData, aes(factor(times), genes)) + geom_tile(aes(fill = 
 scale(as.numeric(fcData)))) + facet_wrap(~infection, ncol=4)

 # set scale
 hp2 <- hp2 + scale_fill_gradient2(name=NULL, low="#0571B0", mid="#F7F7F7", 
 high="#CA0020", midpoint=0, breaks=NULL, labels=NULL, limits=NULL, 
 trans="identity")

 # set up text (size, colour etc etc)
 hp2 <- hp2 + labs(x = "Time", y = "") + scale_y_discrete(expand = c(0, 0)) + 
 opts(axis.ticks = theme_blank(), axis.text.x = theme_text(size = 10, angle = 
 360, hjust = 0, colour = "grey25"), axis.text.y = theme_text(size=10, colour 
 = 'gray25'))

 hp2 <- hp2 + theme_bw()

 In the resulting plot I would like infections infA and infC plotted next to 
 each other and likewise for infB and infD. I have a column in the data - 
 infType - which I could use to reorder the infection column but so far I have 
 no luck getting this to work.

 Could someone give me a pointer to the best way to reorder the infection 
 factor and accompanying data into the order I would like?

 Best

 iain

 sessionInfo()
 R version 2.13.2 (2011-09-30)
 Platform: x86_64-pc-linux-gnu (64-bit)

 locale:
  [1] LC_CTYPE=en_GB.utf8   LC_NUMERIC=C
  [3] LC_TIME=en_GB.utf8    LC_COLLATE=en_GB.utf8
  [5] LC_MONETARY=C LC_MESSAGES=en_GB.utf8
  [7] LC_PAPER=en_GB.utf8   LC_NAME=C
  [9] LC_ADDRESS=C  LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C

 attached base packages:
 [1] grid  stats graphics  grDevices utils datasets  methods
 [8] base

 other attached packages:
 [1] ggplot2_0.8.9 proto_0.3-9.2 reshape_0.8.4 plyr_1.6

 loaded via a namespace (and not attached):
 [1] digest_0.5.0 tools_2.13.2



Re: [R] how to use quadrature to integrate some complicated functions

It's polite to include the whole list in your replies so the threads
get archived properly.

Yes, underflow is occasionally a problem: hence the common use of
log-likelihood in MLE and other applications. Might that help you out
here? You can get log values directly from pnorm (log.p = TRUE) and
dnorm (log = TRUE).
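
A minimal sketch of that idea, assuming a short made-up vector of shifts a (the real problem has n around 5000): sum the logs of the factors, exponentiate once, and hand the result to integrate(). With all shifts equal, the integrand is Phi(x)^(n-1) * phi(x), whose integral is exactly 1/n, which gives something to check against.

```r
# Made-up shifts a_1, ..., a_n; with all shifts equal the exact answer is 1/n
a <- rep(0, 5)
n <- length(a)

# log of Phi(x-a_1) * ... * Phi(x-a_{n-1}) * phi(x-a_n), one value per x
log_integrand <- function(x) {
  vapply(x, function(xi) {
    sum(pnorm(xi - a[-n], log.p = TRUE)) + dnorm(xi - a[n], log = TRUE)
  }, numeric(1))
}

res <- integrate(function(x) exp(log_integrand(x)), lower = -Inf, upper = Inf)
res$value   # should be close to 1/n = 0.2
```

Summing logs avoids multiplying thousands of numbers below one, but the final exp() can still underflow if the integral itself is tiny; in that case subtracting a constant from the log integrand before exponentiating, and adding it back on the log scale afterwards, is one possible workaround.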

Michael

On Tue, Nov 8, 2011 at 11:35 AM, Zuofeng Shang zuofeng.shan...@nd.edu wrote:
 Hi Michael , Thanks for your suggestion!

 integrate() possibly works for my problem. However, my function is a product
 of a sequence of values between zero and one, which could be extremely small
 when the length of the sequence is large. Note: my integrand is like

 Phi(x-a_1)*Phi(x-a_2)*...*Phi(x-a_{n-1})*phi(x-a_n).

 So when n is very large, the above product will be very close to zero.

 So my concern is whether integrate() can handle accurately such integrand.


 Thanks a lot for your time!
 Best wishes,
 Zuofeng



 On 2011/11/8 8:43, R. Michael Weylandt michael.weyla...@gmail.com wrote:

 Have you tried wrapping it in a function and using integrate()? R is
 pretty good at handling numerical integration. If integrate() isn't good for
 you, can you say more as to why?

 Michael


 On Nov 6, 2011, at 4:15 PM, JeffND zuofeng.shan...@nd.edu wrote:

 Hello to all,

 I am having trouble integrating a complicated one-dimensional
 function of the following form

 Phi(x-a_1)*Phi(x-a_2)*...*Phi(x-a_{n-1})*phi(x-a_n).

 Here n is about 5000, Phi is the cumulative distribution function of
 the standard normal, phi is the density function of the standard normal,
 and x ranges over (-infty,infty).

 My idea is to use quadrature to handle this integral. But since Phi
 has no closed form, I don't know how to do this efficiently. I would
 appreciate it very much if someone has any ideas about it.

 Thanks!

 Jeff


Re: [R] why NA coefficients

2011-11-08 Thread William Dunlap

It might make the discussion easier to follow if you used
a smaller dataset that anyone can make and did some experiments
with contrasts. E.g.,

 D <- data.frame(expand.grid(X1=LETTERS[1:3], X2=letters[24:26])[-1,], Y=2^(1:8))
 D
  X1 X2   Y
2  B  x   2
3  C  x   4
4  A  y   8
5  B  y  16
6  C  y  32
7  A  z  64
8  B  z 128
9  C  z 256
 lm(data=D, Y ~ X1 * X2)

Call:
lm(formula = Y ~ X1 * X2, data = D)

Coefficients:
(Intercept)          X1B          X1C
       -188          190          192
        X2y          X2z      X1B:X2y
        196          252         -182
    X1C:X2y      X1B:X2z      X1C:X2z
       -168         -126           NA

 lm(data=D, Y ~ X1 * X2, contrasts=list(X2=contr.SAS))

Call:
lm(formula = Y ~ X1 * X2, data = D, contrasts = list(X2 = contr.SAS))

Coefficients:
(Intercept)          X1B          X1C
         64           64          192
        X2x          X2y      X1B:X2x
       -252          -56          126
    X1C:X2x      X1B:X2y      X1C:X2y
         NA          -56         -168
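
A small follow-up sketch on the same toy data (the names fit1 and fit2 are made up here): which coefficient comes out NA changes with the contrast coding, but the fitted values for the cells that have data do not.

```r
D <- data.frame(expand.grid(X1 = LETTERS[1:3], X2 = letters[24:26])[-1, ],
                Y = 2^(1:8))

fit1 <- lm(Y ~ X1 * X2, data = D)
fit2 <- lm(Y ~ X1 * X2, data = D, contrasts = list(X2 = "contr.SAS"))

na1 <- names(which(is.na(coef(fit1))))   # NA coefficient under the default coding
na2 <- names(which(is.na(coef(fit2))))   # NA coefficient with contr.SAS for X2
c(na1, na2)

# Both parameterisations reproduce the observed cells equally well
all.equal(unname(fitted(fit1)), unname(fitted(fit2)))
all.equal(unname(fitted(fit1)), D$Y)
```

So the NA label says more about how the columns of the model matrix were ordered than about which cell is empty; the fit to the observed cells is the same either way.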


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 


[R] nesting scale_manual caracteristics in ggplot

2011-11-08 Thread Sigrid

Hi there,
I am having a little problem with combining three scale_manual commands in a
facet plot.  I am not able to combine the three different characteristics,
instead ending up with three different descriptions next to the graph for
the same geom.  I would like to see two separate labels (not three); one
describing lines 1-7 and the other 8-14. For each of the treatments (A-B) I
want a combination of color, line type and symbol.  How do I do this?

Here are my codes (Feel free to modify the example to make it easier to work
with. I was not able to do this while keeping the problem I wanted help
with)
df <- structure(list(year = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), treatment =
structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 
6L, 7L, 7L, 7L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 
5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 
3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 1L, 1L, 
1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 
7L, 7L, 7L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 
5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 
3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 1L, 1L, 2L, 
2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 
7L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 
6L, 6L, 6L, 7L, 7L, 7L), .Label = c("A", "B", "C", "D", "E", 
"F", "G"), class = "factor"), total = c(135L, 118L, 121L, 64L, 
53L, 49L, 178L, 123L, 128L, 127L, 62L, 129L, 126L, 99L, 183L, 
45L, 57L, 45L, 72L, 30L, 71L, 123L, 89L, 102L, 60L, 44L, 59L, 
124L, 145L, 126L, 103L, 67L, 97L, 66L, 76L, 108L, 36L, 48L, 41L, 
69L, 47L, 57L, 167L, 136L, 176L, 85L, 36L, 82L, 222L, 149L, 171L, 
145L, 122L, 192L, 136L, 164L, 154L, 46L, 57L, 57L, 70L, 55L, 
102L, 111L, 152L, 204L, 41L, 46L, 103L, 156L, 148L, 155L, 103L, 
124L, 176L, 111L, 142L, 187L, 43L, 52L, 75L, 64L, 91L, 78L, 196L, 
314L, 265L, 44L, 39L, 98L, 197L, 273L, 274L, 89L, 91L, 74L, 91L, 
112L, 98L, 140L, 90L, 121L, 120L, 161L, 83L, 230L, 266L, 282L, 
35L, 53L, 57L, 315L, 332L, 202L, 90L, 79L, 89L, 67L, 116L, 109L, 
44L, 68L, 75L, 29L, 52L, 52L, 253L, 203L, 87L, 105L, 234L, 152L, 
247L, 243L, 144L, 167L, 165L, 95L, 300L, 128L, 125L, 84L, 183L, 
88L, 153L, 185L, 175L, 226L, 216L, 118L, 118L, 94L, 224L, 259L, 
176L, 175L, 147L, 197L, 141L, 176L, 187L, 87L, 92L, 148L, 86L, 
139L, 122L), country = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L 
), .Label = c("high", "low"), class = "factor")), .Names = c("year", 
"treatment", "total", "country"), class = "data.frame", row.names = c(NA, 
-167L))


lines <- structure(list(`Line #` = 1:14, country = structure(c(2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("high", "low"), class =
"factor"), treatment = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L,
3L, 4L, 5L, 6L, 7L), .Label = c("A", "B", "C", "D", "E", "F", "G", "H"), class =
"factor"), Intercept = c(81.47, 31.809, 69.892, 82.059, 106.392, 45.059, 38.809,
67.024, 17.357, 105.107, 79.191, 91.357, 5.691, 24.357), Slope = c(47.267,
20.234, 33.717, 14.667, 13.434, 25.817, 21.967, 47.267, 20.234,
33.717, 14.667, 13.434, 25.817, 21.967)), .Names = c("Line #", "country", 
"treatment", "Intercept", "Slope"), class = "data.frame", row.names = c(NA, 
-14L))

ggplot(data = df, aes(x = year, y = total, colour = treatment,
                      linetype = treatment)) + 
  geom_point(aes(shape = treatment)) +
  facet_wrap(~country) + 
  scale_colour_manual(breaks = c('A','B','C','D','E','F','G'),
    values = c('A'='black', 'B'='black', 'C'='grey', 'D'='grey',
               'E'='red', 'F'='grey', 'G'='red'),
    labels = c('A: Line 1', 'B: Line 2', 'C: Line3', 'D: Line 4', 
               'E: Line 5 ', 'F:Line 6', 'G:Line 7')) +
scale_linetype_manual(breaks=c('A','B','C','D','E','F','G'),

[R] Question about R mantissa and number of bits

2011-11-08 Thread Lafaye de Micheaux


Dear all,

I think that every number x in R can be represented in floating point 
arithmetic as:

x = (-1)^s (1+f) 2^(e-1023)
where s is coded on 1 bit, e (positive integer) is coded on 11 bits, and 
f (real in [0,1)) is coded on 52 bits.

Am I right?

We have f=\sum_{i=1}^{52} k_i 2^{-i} for some values k_i in {0,1}.

If this is the case (for the 52 bits), we should have:

The number next to 2^150 should be (-1)^0 2^150 (1+2^(-52)) = 2^150 + 2^98.
I can check this:
> a <- 2^150; b <- a + 2^97; b == a
[1] TRUE
> a <- 2^150; b <- a + 2^98; b == a
[1] FALSE

So it seems that the mantissa is really coded on 52 bits.
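
The same check can be sketched without choosing the exponents by hand. This is only an illustration and assumes IEEE 754 double precision; the name ulp is introduced here, while .Machine$double.eps and .Machine$double.digits are the constants reported by R itself.

```r
# Spacing between adjacent doubles near a = 2^150, assuming IEEE 754 doubles.
a   <- 2^150
ulp <- a * .Machine$double.eps   # 2^150 * 2^-52

print(ulp == 2^98)               # the gap to the next double is 2^98
print((a + ulp) > a)             # adding a full gap moves to the next number
print((a + ulp / 2) == a)        # adding half a gap rounds back to a
print(.Machine$double.digits)    # 53: 52 stored bits plus the implicit leading 1
```

The half-gap case returns to a because the exact tie is rounded to the neighbour whose last mantissa bit is 0.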

But now, if I issue the following commands (using some functions 
provided below to translate from decimal to binary):

> dec2bin(0.1,52)
[1] "0.0001100110011001100110011001100110011001100110011001"
> formatC(sum(as.numeric(strsplit(dec2bin(0.1,52),"")[[1]][-(1:2)])*2^(-(1:52))),digits=50)
[1] "0.099999999999999866773237044981215149164199829101562"
> formatC(0.1,digits=50)
[1] "0.1000000000000000055511151231257827021181583404541"
> formatC(sum(as.numeric(strsplit(dec2bin(0.1,55),"")[[1]][-(1:2)])*2^(-(1:55))),digits=50)
[1] "0.1000000000000000055511151231257827021181583404541"
> formatC(0.1,digits=50)
[1] "0.1000000000000000055511151231257827021181583404541"

So now, using formatC() it seems that f is coded on 55 bits!

Do you have an explanation for this fact?

Many thanks!

Pierre


dec2bin.ent <- function(x) {
  as.integer(paste(rev(as.integer(intToBits(x))), collapse=""))
}

dec2bin.frac <- function(x,prec=52) {
 res <- rep(NA,prec)
 for (i in 1:prec) {
  res[i] <- as.integer(x*2)
  x <- (x*2) %% 1
 }
 return(paste(res,collapse=""))
}

dec2bin <- function(x,prec=52) {
 x <- as.character(x)
 res <- strsplit(x,".",fixed=TRUE)[[1]]
 return(paste(dec2bin.ent(as.numeric(res[1])),
              dec2bin.frac(as.numeric(paste("0.",res[2],sep="")),prec),sep="."))
}
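
One possible way to look at the 52 versus 55 question, sketched under the assumption of IEEE 754 doubles (the names k, enough, bits_needed and lead are illustrative only): the 52 bits of f are counted from the leading 1 of the number, not from the binary point. Since 0.1 lies between 2^-4 and 2^-3, its binary expansion starts with three zeros after the point, so the stored bits can reach beyond position 52.

```r
x <- 0.1

# Multiplying a double by a power of two is exact (barring overflow), so
# x * 2^k is a whole number exactly when k fractional bits are enough
# to write down the stored value of x.
k <- 1:60
v <- x * 2^k
enough <- v == floor(v)

bits_needed <- min(k[enough])   # fractional binary digits of the stored 0.1
lead <- -floor(log2(x))         # position of the leading 1 bit after the point

print(bits_needed)
print(lead + 52)                # last position the 52 bits of f can reach
```

For 0.1 the leading 1 sits at position 4, so f can extend to position 56; the stored value needs only 55 digits because its last bit happens to be 0.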


--
Pierre Lafaye de Micheaux

Adresse courrier:
Département de Mathématiques et Statistique
Université de Montréal
CP 6128, succ. Centre-ville
Montréal, Québec H3C 3J7
CANADA

Adresse physique:
Département de Mathématiques et Statistique
Bureau 4249, Pavillon André-Aisenstadt
2920, chemin de la Tour
Montréal, Québec H3T 1J4
CANADA

Tél.: (00-1) 514-343-6607 / Fax: (00-1) 514-343-5700
laf...@dms.umontreal.ca
http://www.biostatisticien.eu


Re: [R] rpanel package - retrieve data from panel

2011-11-08 Thread michalseneca

I found that adding

names(y) <- y

before the end works ;)

Any more ideas?


--
View this message in context: 
http://r.789695.n4.nabble.com/rpanel-package-retrieve-data-from-panel-tp4016953p4017082.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Estimate of intercept in loglinear model

2011-11-08 Thread Mark Difford

On Nov 08, 2011 at 11:16am Colin Aitken wrote:

> An unresolved problem is:  what does R do when the explanatory factors 
> are not defined as factors when it obtains a different value for the 
> intercept but the correct value for the fitted value?

Colin,

I don't think that happens (that the fitted values are identical if
predictors are cast as numeric), but the following could; it is really
answered by my initial reply. Once again I use the example I gave above,
this time with the second level of outcome as the reference level for a new
fit, called glm.D93R. (For this part of the question a corpse would have been
nice, though not really needed---yours was unfortunately buried too deeply
for me to find it.)

## Dobson (1990) Page 93: Randomized Controlled Trial : 
counts <- c(18,17,15,20,10,20,25,13,12) 
outcome <- gl(3,1,9) 
treatment <- gl(3,3) 
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
glm.D93R <- glm(counts ~ C(outcome, base=2) + treatment, family=poisson())

## treat predictor as numeric
glm.D93N <- glm(counts ~ as.numeric(as.character(outcome)) +
                as.numeric(as.character(treatment)), family=poisson())

> coef(glm.D93)
  (Intercept)      outcome2      outcome3    treatment2    treatment3 
 3.044522e+00 -4.542553e-01 -2.929871e-01  1.337909e-15  1.421085e-15 

## Different value for the Intercept but same fitted values (see below) as
## the earlier fit (above)
##
> summary(glm.D93R)
<snipped and edited for clarity>
Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)  2.590e+00  1.958e-01  13.230   <2e-16 ***
outcome1     4.543e-01  2.022e-01   2.247   0.0246 *  
outcome3     1.613e-01  2.151e-01   0.750   0.4535    
treatment2  -3.349e-16  2.000e-01   0.000   1.0000    
treatment3  -6.217e-16  2.000e-01   0.000   1.0000    
<snip>

> fitted(glm.D93)
       1        2        3        4        5        6        7        8        9 
21.00000 13.33333 15.66667 21.00000 13.33333 15.66667 21.00000 13.33333 15.66667 

> fitted(glm.D93R)
       1        2        3        4        5        6        7        8        9 
21.00000 13.33333 15.66667 21.00000 13.33333 15.66667 21.00000 13.33333 15.66667 

## if predictors treated as numeric---check summary(glm.D93N) yourself
> fitted(glm.D93N)
       1        2        3        4        5        6        7        8        9 
19.40460 16.52414 14.07126 19.40460 16.52414 14.07126 19.40460 16.52414 14.07126 
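
The comparison between the first two fits can also be checked programmatically. The following is only a sketch: it swaps in relevel() for the C(outcome, base=2) call used above, and the names fit, fit_r, same_fit, shifted and same_int are illustrative. It checks that changing the reference level leaves the fitted values alone and moves the intercept by the old coefficient of the new reference level.

```r
## Dobson (1990) Page 93: Randomized Controlled Trial
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

fit   <- glm(counts ~ outcome + treatment, family = poisson())
# relevel() is used here instead of C(outcome, base = 2); both make
# level 2 of outcome the reference level.
fit_r <- glm(counts ~ relevel(outcome, ref = "2") + treatment,
             family = poisson())

# Same fitted values under either reference level
same_fit <- isTRUE(all.equal(fitted(fit), fitted(fit_r), tolerance = 1e-6))

# New intercept = old intercept + old coefficient of the new reference level
shifted  <- unname(coef(fit)["(Intercept)"] + coef(fit)["outcome2"])
same_int <- isTRUE(all.equal(unname(coef(fit_r)["(Intercept)"]), shifted,
                             tolerance = 1e-6))

print(c(same_fit = same_fit, same_int = same_int))
```

This agrees with the output above: 3.044522 - 0.4542553 is about 2.590, the intercept reported for glm.D93R.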

Regards, Mark.

-
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: 
http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4017091.html
Sent from the R help mailing list archive at Nabble.com.


[R] How to handle empty arguments