date:20100112


In situations where an ancient version and a less ancient
version of R give different results for the same data it's
a good idea to check the NEWS file. For HoltWinters() you
will find an entry for R VERSION 2.8.0 that indicates a
change in the default number of start periods used to
autodetect start values from 3 to 2. While this may be
good for 'tidy' series, your series has a pretty messy
start. Another way to look for obvious differences in an
older version of a function and a newer one is to check
the help page and note the arguments and their default
values. Either way, you would find:

start.periods = 3 (R 2.7.2)
start.periods = 2 (R 2.9.1)

So, to get results more reasonable than the obvious junk
produced with start.periods = 2, try 3.
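
A quick way to see the effect of this argument is to fit one series with both settings. The sketch below uses the built-in co2 series purely as a stand-in (it is not the series from the post), so the numbers will not match those quoted below:

```r
# Stand-in series: the built-in monthly co2 data, NOT the series from the post.
fit2 <- HoltWinters(co2, start.periods = 2)  # default since R 2.8.0
fit3 <- HoltWinters(co2, start.periods = 3)  # the older default

# Put the estimated smoothing parameters side by side.
params <- rbind(
  start.periods.2 = c(alpha = unname(fit2$alpha),
                      beta  = unname(fit2$beta),
                      gamma = unname(fit2$gamma)),
  start.periods.3 = c(alpha = unname(fit3$alpha),
                      beta  = unname(fit3$beta),
                      gamma = unname(fit3$gamma))
)
print(params)
```

For the series in the post, the corresponding call would be HoltWinters(x.ts, start.periods = 3).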

Do upgrade to 2.10.1, though.

 -Peter Ehlers

RobertNZ wrote:

Hi R-users,

I have a question relating to the HoltWinters() function.  I am trying to
forecast a series using the Holt Winters methodology but I am getting some
unusual results. I had previously been using R for Windows version 2.7.2 and
have just started using R 2.9.1.  While using version 2.7.2 I was getting
reasonable results however upon changing versions I found I started to see
unusual results.  If anybody would be able provide assistance with this it
would be much appreciated!

The series in question is ‘x’ below.

x = c(18, 18, 16, 19, 12, 12, 13, 12, 7, 9, 9, 9, 12.5, 16, 20,
22, 22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 23, 23,
23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 22, 17, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 14, 14, 14, 14, 14, 14,
14, 14, 14, 14, 14, 14, 14, 14, 14, 14.5, 15, 14, 14, 14,
14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
14, 14, 14, 14, 14, 14, 14, 14, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
9, 9, 9, 9, 9, 9, 6, 5, 5, 5)

x.ts = ts(data = x, start = c(1999,1), frequency = 12)

## USING R 2.9.1 I get the following results using HoltWinters().  I see
that the smoothing parameters are greater than 1 for alpha and beta which I
believe is unusual as I think the optim() function that is used to calculate
these  
## parameters in HoltWinters() should constrain the results to [0,1]? 


hw.ts = HoltWinters(x.ts)
hw.ts

Holt-Winters exponential smoothing with trend and additive seasonal
component.

Call:
 HoltWinters(x = x.ts) 


Smoothing parameters:
 alpha:  1.046340 
 beta :  3.198345 
 gamma:  1 


Coefficients:
 [,1]
a   -6.491044e+44
b   -3.313740e+46
s1  -4.035877e+40
s2   9.734997e+40 
s3  -2.348192e+41

s4   5.664108e+41
s5  -1.366248e+42
s6   3.295545e+42
s7  -7.949232e+42
s8   1.917446e+43
s9  -4.625098e+43
s10  1.115626e+44
s11 -2.691018e+44
s12  6.491044e+44

Subsequently using the predict() function the following results are produced
which are quite unusual.


x.pred = predict(hw.ts, n.ahead = 10,

+  prediction.interval = T, conf.level = 0.8)

x.pred

   fit   upr   lwr
Jan 2010 -3.378655e+46 -3.102583e+46 -3.654726e+46
Feb 2010 -6.692381e+46 -5.448602e+46 -7.936160e+46
Mar 2010 -1.000615e+47 -7.533863e+46 -1.247845e+47
Apr 2010 -1.331981e+47 -9.385468e+46 -1.725416e+47
May 2010 -1.663375e+47 -1.103422e+47 -2.223327e+47
Jun 2010 -1.994702e+47 -1.250080e+47 -2.739324e+47
Jul 2010 -2.326189e+47 -1.380352e+47 -3.272025e+47
Aug 2010 -2.657291e+47 -1.494943e+47 -3.819640e+47
Sep 2010 -2.989320e+47 -1.596167e+47 -4.382472e+47
Oct 2010 -3.319116e+47 -1.681697e+47 -4.956534e+47


### Applying the same code to time series x.ts using R 2.7.2 yields the
following results


hw.ts = HoltWinters(x.ts)
hw.ts

Holt-Winters exponential smoothing without trend and with additive seasonal
component.

Call:
 HoltWinters(x = x.ts) 


Smoothing parameters:
 alpha:  0.8560487 
 beta :  0 
 gamma:  1 


Coefficients:
          [,1]
a    6.3820972
s1  -1.6592841
s2  -1.4172832
s3  -0.3896275
s4   1.1195576
s5   1.3899338
s6   1.8304666
s7   1.3751008
s8   0.5919732
s9  -0.5971810
s10 -0.7390197
s11 -1.0104958
s12 -1.3820972

The subsequent forecast this time is more reasonable


x.pred

              fit       upr        lwr
Jan 2010 4.722813  7.943789  1.5018372
Feb 2010 4.964814  9.204797  0.7248308
Mar 2010 5.992470 11.050160  0.9347798
Apr 2010 7.501655 13.262123  1.7411862
May 2010 7.772031 14.158405  1.3856573
Jun 2010 8.212564 15.168751  1.2563766
Jul 2010 7.757198 15.239932  0.2744638
Aug 2010 6.974070 14.948660 -1.0005194
Sep 2010 5.784916 14.222739 -2.6529065
Oct 2010 5.643078 14.519993 -3.2338377


It would be much appreciated if anyone could help me with understanding why
I am seeing these unusual results when using R 2.9.1 compared with R 2.7.2?
I wonder if there is something that I have not considered or if there are
any remedies that I could take to fix this? 


Thanks in advance,
Robert



--
Peter Ehlers
University of Calgary
403.202.3921


[R] Problems with betareg()

2010-01-12 Thread Al Leong

Hi,

In using the betareg package, I encounter the following error message:

Error in lm.wfit(x, linkfun(y), weights, offset = offset) : 
  NA/NaN/Inf in foreign function call (arg 4)

Any help will be most appreciated. Thanks in advance.

Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] apply a function down each column


See inline below.

Laetitia Schmid wrote:

Dear Steve,
my solution looks like it would work, but it does not.
I attached a text file with an extract of my data. Maybe you can try it 
yourself. I want to compare C1 with M1, C2 with M2, C3 with M3,,, for 
each column.

I do not really know what the problem is. R complains about a syntax error.
The function I am applying counts the common strings between the two. 
Greg Hirson helped me to write it.


lettermatch <- function(a, b) {
   tb <- merge(as.data.frame(table(strsplit(a, ""))),
               as.data.frame(table(strsplit(b, ""))), by="Var1")

   sum(apply(tb[-1], 1, min))
}

For example for the second column I tried:

for (x in 1:(nrow(dat)-1)) {
a <- as.character(dat[(2x-1),1])


Shouldn't that be 2*x-1??

 -Peter Ehlers


b <- as.character(dat[(2x),1])
 lettermatch(a,b)
}

or

 a <- as.character(dat[seq(1, nrow(dat), by=2),2])
 b <- as.character(dat[seq(2, nrow(dat), by=2), 2])
 all.results <- lettermatch(a,b)

With dat <- read.delim("data_lgs.txt", stringsAsFactors=FALSE) I can 
leave out the as.character in the formula above.
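
The pairing of odd and even rows described above can be sketched with mapply(), which calls a function once per pair. lettermatch() as written appears to expect a single string per argument, so passing whole vectors to it is unlikely to give per-pair counts. The helper count_common() and the toy data frame below are hypothetical stand-ins, not the code or data from the thread:

```r
# Count the letters two strings have in common (multiset intersection),
# e.g. "AGGG" and "AATT" share one "A".
count_common <- function(a, b) {
  la <- strsplit(a, "")[[1]]
  lb <- strsplit(b, "")[[1]]
  lev <- union(la, lb)
  sum(pmin(as.vector(table(factor(la, levels = lev))),
           as.vector(table(factor(lb, levels = lev)))))
}

# Hypothetical toy data: rows alternate C1, M1, C2, M2, ...
dat <- data.frame(
  Individuals = c("C1", "M1", "C2", "M2"),
  Seq1 = c("AATT", "AATT", "AGGG", "AATT"),
  stringsAsFactors = FALSE
)

odd  <- seq(1, nrow(dat), by = 2)  # rows 2*x - 1
even <- seq(2, nrow(dat), by = 2)  # rows 2*x

# One call of count_common() per (odd, even) pair of rows.
res <- mapply(count_common, dat$Seq1[odd], dat$Seq1[even], USE.NAMES = FALSE)
names(res) <- paste(dat$Individuals[odd], dat$Individuals[even], sep = "-")
print(res)
```

Wrapping the mapply() call in sapply() over the sequence columns would extend this to every column.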


Laetitia

IndividualsSeq1Seq2Seq3Seq4
C1AATTCCGGCTTT
M1
C2AATTCCGGCTTT
M2AGGGAACTCCGGCGTT
C3AGGGAACTCCGGCGTT
M3AGGGAACTCCGGCGTT
C4AATTCCGGCCTT
M4AAATCGGGCTTT
C5AGGGACTTCCCGCTTT
M5AGGGCTTTCCTT
C6AGGGCTTTCCTT
M6AAAGCCTTCTTT
C7AAAGACCCCCCGGTTT
M7AAGGAACCCCGG
C8AATTCCGGCCTT
M8AATTCCGGCCTT
C9
M9
C11AGGGAAACCGGGGGTT
M11AATTCCGGCCTT



On 11.01.2010 at 15:18, Steve Lianoglou wrote:


Hi,

On Mon, Jan 11, 2010 at 8:41 AM, Laetitia Schmid laeti...@gmt.su.se 
wrote:

Hello World,
I have a function that makes pairwise comparisons between two 
strings. I would like to apply this function to my data (which 
consists of columns with different strings) in the way that it 
compares the first with the second entry, and then the third with the 
fourth, and then the fifth with the sixth, and so on down each column...

So (2x-1) and (2x) would be the different entries to be compared!

dat= my data:

for the first column: compare dat[(2x-1),1] with dat[(2x),1] and x 
would be 1:i, i=length(dat[,1])


I think the best way to do that is a loop:

a <- as.character(dat[(2x-1),1])
b <- as.character(dat[(2x),1])

for (i in 1:length(dat[,1]) my_function(a, b))

Can somebody help me to apply a function with a loop in the way I 
want to a column?


It seems as if you got it already, don't you?

for (x in 1:(nrow(dat)-1)) {
 a <- dat[(2x-1),1]
 b <- dat[(2x), 1]
 my_function(a,b)
}


Is there a specification of tapply for that?


I don't think so, but depending on what you want to do, the size of
your data, and the amount of RAM you have, it might be faster to
compare everything at once (assuming `my_function` can be
vectorized), for instance:

a <- dat[seq(1, nrow(dat), by=2),1]
b <- dat[seq(2, nrow(dat), by=2), 1]
all.results <- my_function(a,b)

Also, as an aside, I see you keep calling as.character on your data
when you extract it from your data.frame. Is your data being converted
to factors? You can look to set stringsAsFactors=FALSE if this is the
case and you are reading in data using read.table/delim/etc (see:
?read.table)

Hope that helps,

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact






--
Peter Ehlers
University of Calgary
403.202.3921


Re: [R] Problems with betareg()

2010-01-12 Thread Achim Zeileis


On Tue, 12 Jan 2010, Al Leong wrote:


Hi,

In using the betareg package, I encounter the following error message:

Error in lm.wfit(x, linkfun(y), weights, offset = offset) :
 NA/NaN/Inf in foreign function call (arg 4)

Any help will be most appreciated. Thanks in advance.


We can't possibly help you with this amount of information. Please provide 
a small reproducible example, preferably with a (small) artificial data 
set or with one available in R. (Also see the posting guide, linked at the 
end of this e-mail.)

Z


Alex


Re: [R] find the corresponding mean y

2010-01-12 Thread Keith Jewell

Not very clear what you want, but perhaps...
?sortedXyData
...might help.

hth

KJ
luciferyan anniehyh...@googlemail.com wrote in message 
news:1263231740556-1011427.p...@n4.nabble.com...

 Hello, I have 49 paired data, x, y.
 I have sampled x (where replacement is true), and find its mean.
 How can I find the corresponding mean y, which is the paired data of above
 sample x?
 Thank you very much,
 Annie



[R] plot prices and dates in a nice way

2010-01-12 Thread Trafim Vanishek

Dear all,
I currently experience the problem with nicely plotting price data against
the dates.

Data <- read.csv("C:/IBM.csv", header = TRUE, sep = ",")
plot(Data[,1], Data[,2])

I cannot find the way how can I choose the # of breaks for the x axis -
dates in this case?

Thanks a lot

Re: [R] warning inside loop

2010-01-12 Thread Rense


Dear William,

thank you kindly for this solution: it provides exactly what I need,
especially due to the fact that the encapsulating function returns a list,
from which I can extract all the information I need.

kind regards,

Rense Nieuwenhuis



William Dunlap wrote:
 
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Rense
 Sent: Monday, January 11, 2010 3:07 PM
 To: r-help@r-project.org
 Subject: [R] warning inside loop
 
 
 Hi,
 
 I'm running some data simulations using (mixed effects)* 
 regression models
 that show difficulty to converge. Therefore, I seek a way of capturing
 warnings (of false convergence) inside a loop.
 
 Inside that loop, I modify data and estimate a model. I do so 
 many times
 with slightly different modifications of the data. Next, I 
 extract some of
 the model parameters and store these in a matrix. However, as 
 some of the
 models do not converge well, some of the stored parameters 
 are extracted
 from the ill-converged models. Therefore, I seek a way of 
 automatically
 detecting whether the estimation procedure has resulted in a 
 warning, so I
 can distinguish between the well- and ill-converged models.
 
 I have been trying to use functions as warnings(), as well as 
 using the
 object last.warning, but unfortunately to no avail.
 
 Try withCallingHandlers(), as in the following function
 with returns the value of the expression along with
 any warning messages as a list:
 
 withWarnings
 function (expr) 
 {
     warnings <- character()
     retval <- withCallingHandlers(expr, warning = function(ex) {
         warnings <<- c(warnings, conditionMessage(ex))
         invokeRestart("muffleWarning")
     })
     list(Value = retval, Warnings = warnings)
 }
 <environment: R_GlobalEnv>
 
 Typical usage would be:
 lapply(-1:1, function(i)withWarnings(log(i)))
 [[1]]
 [[1]]$Value
 [1] NaN
 
 [[1]]$Warnings
 [1] "NaNs produced"
 
 
 [[2]]
 [[2]]$Value
 [1] -Inf
 
 [[2]]$Warnings
 character(0)
 
 
 [[3]]
 [[3]]$Value
 [1] 0
 
 [[3]]$Warnings
 character(0)
 
 Perhaps there is some encapsulation of this already in some
 package, as try() encapsulates error catching.
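
The same handler can be dropped into a simulation loop to record, per iteration, whether a warning was raised. This is a minimal sketch: with_warnings() repeats the function above under a different name, and log(i) merely stands in for the model fit, which the original post does not show:

```r
# Evaluate an expression, collecting any warning messages instead of printing them.
with_warnings <- function(expr) {
  msgs <- character()
  value <- withCallingHandlers(expr, warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  })
  list(Value = value, Warnings = msgs)
}

results <- data.frame(i = integer(), estimate = numeric(), warned = logical())

for (i in -1:1) {
  out <- with_warnings(log(i))          # stand-in for "modify data, fit model"
  results <- rbind(results, data.frame(
    i        = i,
    estimate = out$Value,               # stand-in for an extracted parameter
    warned   = length(out$Warnings) > 0 # TRUE if the fit raised a warning
  ))
}

print(results)
```

Rows with warned equal to TRUE can then be excluded or inspected separately.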
 
 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com 
 
 
 Although I cannot provide a reproducible example, I 
 schematically represent
 the procedure I seek to use below:
 
 
 for (i in 1:10)
 {
 modify data
 estimate model
 
 evaluate whether estimation produced warning
 
 extract model parameters, and store whether warning occured
 }
 
 I hope any one can give some guidelines on how to deal with 
 warnings inside
 a loop.
 
 With Kind regards,
 
 Rense
 
 
 
 
 
 *Although I use the lme4 package for that actual analysis, I sent my
 question to this mailinglist (instead of the R mixed list) 
 because I believe
 this is a general issue, rather than one associated 
 exclusively with mixed
 models.
 
 
 



[R] read.spss: option to.data.frame and string variables

2010-01-12 Thread RINNER Heinrich

Dear R-users,

I am using R version 2.10.1 and package foreign version 0.8-39 under windows.

When reading .sav-Files (PASW Statistics 18.0.1) containing string variables, 
these are automatically converted to factors when using option to.data.frame = 
TRUE (see example below).
It's clear to me why this happens (the default behaviour of a call to 
as.data.frame). But this is not always what one might want (or even be aware 
of).

So maybe one of the following improvements could be made?
* Add a description of this behaviour in ?read.spss.
* Or (even better): Add an extra argument, like: 
read.spss("C:\\temp\\test.sav", to.data.frame = TRUE, stringsAsFactors = FALSE).

Just a suggestion;
kind regards
Heinrich.

# EXAMPLE:
Suppose there is a simple file test.sav, containing one variable (x) of 
type STRING with 3 values (a,b,c).
 library(foreign)
 test <- read.spss("C:\\temp\\test.sav")
 test
$x
[1] abc

attr(,"label.table")
attr(,"label.table")$x
NULL

attr(,"codepage")
[1] 1252
 is.factor(test$x)
[1] FALSE
 is.character(test$x)
[1] TRUE
# Ok, that's just fine. But things change when using option to.data.frame = 
TRUE:
 test <- read.spss("C:\\temp\\test.sav", to.data.frame = TRUE)
 test
 x
1 a
2 b
3 c
 is.factor(test$x)
[1] TRUE
 is.character(test$x)
[1] FALSE


Re: [R] R for windows 64 bit

2010-01-12 Thread alessia matano

Dear all,

I just downloaded and set up this new version of R. I am now trying to
install the packages I need, which are SparseM and quantreg. I
downloaded the quantreg package and put it into the library folder, and
it seems to work. However, when I try to do the same with SparseM I
get the following error message:

Loading required package: SparseM
Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared library
'C:/PROGRA~1/R/R-211~1.0DE/library/SparseM/libs/SparseM.dll':
  LoadLibrary failure:  %1 is not a valid Win32 application.


Any help for it?
Thanks a lot
alessia

2010/1/11 Henrique Dallazuanna www...@gmail.com:
 Try this version (beta of development version):

 http://www.stats.ox.ac.uk/pub/RWin/Win64/R-2.11.0dev-win64.exe

 On Mon, Jan 11, 2010 at 2:29 PM, alessia matano alexis@gmail.com wrote:
 Dear all,

 do you know if there is any particular version of R to implement with
 windows 64 bit, in such a way to increase the amount of memory it can
 use?

 How should I increase the memory, and more importantly to set a higher
 max vector size? It still stops me saying Could not allocate vector
 of size 145

 thanks to all
 alessia





 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O



[R] plot ylab on the right

2010-01-12 Thread Mister Vanhalen

Hello,

I have a graphic and I want to plot the yaxis AND ylab on the right.
I manage to plot axis on the right with axis(4) but I don't know how to
write the ylab on the right.

barplot(data, name=leg, xlab="Probability", ylab="Number of links", axes=F)
axis(4)

And another question :
Is there an easy way to indicate that directly on barplot command ?


Thank you,

M


Re: [R] R for windows 64 bit




On 12.01.2010 11:33, alessia matano wrote:

Loading required package: SparseM
Error in inDL(x, as.logical(local), as.logical(now), ...) :
   unable to load shared library
'C:/PROGRA~1/R/R-211~1.0DE/library/SparseM/libs/SparseM.dll':
   LoadLibrary failure:  %1 is not a valid Win32 application.


These packages have libraries that work for 32-bit R.

You need to compile 64-bit versions from sources using the setup as 
described in Brian Ripley's message on R-devel and in the R Installation 
and Administration manual.
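
Before installing binaries it may help to confirm which build of R is actually running, since 32-bit package libraries cannot be loaded by a 64-bit R. A small sketch using only base R (the variable names are illustrative):

```r
# 8-byte pointers indicate a 64-bit build of R; 4-byte pointers a 32-bit build.
pointer_bytes <- .Machine$sizeof.pointer
is_64bit <- pointer_bytes == 8

cat("R architecture:", R.version$arch, "\n")
cat("64-bit build:  ", is_64bit, "\n")
```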


Package repositories for 64-bit versions are not yet online for that 
*experimental* 64-bit version - although I am currently syncing them. 
The latter means an also *experimental* repository of 64-bit packages 
for Windows may go online in the CRAN network within very few days.


Note that you will need to set that repository manually for now.

In the meantime, you can get packages from my more or less private 
repository as follows:


install.packages(c("SparseM", "quantreg"),
    contriburl="http://www.statistik.tu-dortmund.de/~ligges/CRAN/bin/windows64/contrib/2.11",
    dependencies = TRUE)

Best,
Uwe Ligges


Re: [R] plot ylab on the right

2010-01-12 Thread Jim Lemon


On 01/12/2010 09:41 PM, Mister Vanhalen wrote:

Hello,

I have a graphic and I want to plot the yaxis AND ylab on the right.
I manage to plot axis on the right with axis(4) but I don't know how to
write the ylab on the right.

barplot(data, name=leg, xlab="Probability", ylab="Number of links", axes=F)
axis(4)

   

Hi Mister Vanhalen,
Try this:

mtext("Number of links", 4)

And another question :
Is there an easy way to indicate that directly on barplot command ?


   

A quick look at the axis command has not revealed it.
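
Putting the pieces together, a complete sketch might look like the following. The data are made up, and the margin and line values are assumptions chosen so that the right-hand label is not clipped:

```r
# Hypothetical data standing in for the poster's 'data' and 'leg'.
links <- c(12, 7, 5, 3, 1)
prob_labels <- c("0.1", "0.2", "0.3", "0.4", "0.5")

# Widen the right margin (4th value) to make room for the axis and its label.
old_par <- par(mar = c(5, 2, 4, 5) + 0.1)

# Draw the bars without any axes or left-hand label.
mids <- barplot(links, names.arg = prob_labels, xlab = "Probability",
                axes = FALSE)

axis(4)                                       # y axis on the right
mtext("Number of links", side = 4, line = 3)  # its label, 3 lines out

par(old_par)                                  # restore previous settings
```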

Jim


Re: [R] drop1: warning message

No idea, you may want to tell us all your calls and give a reproducible 
example as well as properly formatted code that is more readable than 
the stuff below.


I guess you omitted some other warnings that tell you about too few 
observations to fit the model (the "fitted probabilities numerically 0 or 
1 occurred" warning is a weak indication that your model may be overspecified).


Best,
Uwe Ligges



On 12.01.2010 08:17, Martin Bulla wrote:

Dear R colleagues,

I am new to R and do not understand the warning messages that appear
when I try to apply the type III procedure to my glm:


inkm$Gamie glm.inkmZ drop1(glm.inkmZ,test=Ch)

Single term deletions

Model:
InkMb ~ IncStart + Vol + Gamie + PC1 + PC2 + PC3 + SpotsN + PC1:SpotsN
           Df Deviance    AIC     LRT   Pr(Chi)
<none>          9.3879 27.388
IncStart    1  14.6274 30.627  5.2395 0.0220791 *
Vol         1  15.9723 31.972  6.5844 0.0102876 *
Gamie       1  13.8659 29.866  4.4780 0.0343330 *
PC2         1   9.6899 25.690  0.3020 0.5826585
PC3         1  10.7326 26.733  1.3447 0.2462124
PC1:SpotsN  1  21.6517 37.652 12.2638 0.0004618 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning messages:
1: In glm.fit(x[, jj, drop = FALSE], y, wt, offset = object$offset,  :
   fitted probabilities numerically 0 or 1 occurred
2: In glm.fit(x[, jj, drop = FALSE], y, wt, offset = object$offset,  :
   fitted probabilities numerically 0 or 1 occurred

Where did PC1 disappear?  What is the meaning of these warning messages?

Similarly, I do not understand where the PCs disappeared in the
following calculations:

inkm$Gamie glm.inkm0 drop1(glm.inkm0,test=Ch)

Single term deletions

Model:
InkMb ~ IncStart + Vol + Gamie + PC1 + PC2 + PC3 + SpotsN + Gamie:PC1 +
 Gamie:PC2 + Gamie:PC3
          Df Deviance    AIC    LRT  Pr(Chi)
<none>         18.368 40.368
IncStart   1   26.078 46.078 7.7106 0.005490 **
Vol        1   26.515 46.515 8.1475 0.004312 **
SpotsN     1   18.663 38.664 0.2958 0.586537
Gamie:PC1  1   20.418 40.418 2.0505 0.152155
Gamie:PC2  1   18.460 38.460 0.0919 0.761818
Gamie:PC3  1   19.832 39.832 1.4647 0.226180
---
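
A probable explanation for the rows that seem to be missing is that drop1() respects marginality: a main effect that is part of an interaction still in the model is not offered for deletion. The sketch below illustrates this with a linear model on the built-in mtcars data, which is only a stand-in for the poster's glm and data:

```r
# Stand-in model: wt and hp both appear in the interaction wt:hp.
fit <- lm(mpg ~ wt + hp + qsec + wt:hp, data = mtcars)

# Only qsec and wt:hp are candidates for deletion; wt and hp are skipped
# because they are marginal to wt:hp.
tab <- drop1(fit, test = "Chisq")
print(tab)

dropped_terms <- rownames(tab)
```

In the first table above, PC1 and SpotsN are absent for the same reason (PC1:SpotsN is in the model); in the second, Gamie and PC1 to PC3 are absent because of the Gamie:PC interactions.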

Any suggestions would help me greatly.
Best regards,
Martin




Re: [R] read.spss: option to.data.frame and string variables

2010-01-12 Thread David Winsemius

It would be a significant undertaking to annotate all the places  
where the default behavior of strings-to-factors conversion might trip  
up the unwary. You are not the first by any means to complain. You  
might:
a) take the step that the Mayo Clinic has taken of setting the default  
in options() to FALSE, or
b) make your own read.spss with your desired arguments, and then put  
it in your .Rprofile.
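
Option (b) could be sketched roughly as follows. Both function names are hypothetical, and the wrapper simply undoes the conversion after reading rather than changing read.spss itself; it is only defined here, not run, since no .sav file is at hand:

```r
# Convert every factor column of a data frame back to character.
factors_to_character <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  if (any(is_fac)) {
    df[is_fac] <- lapply(df[is_fac], as.character)
  }
  df
}

# Hypothetical wrapper: read an SPSS file as a data frame, then undo the
# factor conversion. Caveat: this also turns variables that carry SPSS
# value labels into character.
read_spss_chr <- function(file, ...) {
  factors_to_character(foreign::read.spss(file, to.data.frame = TRUE, ...))
}

# Stand-in for what read.spss(..., to.data.frame = TRUE) returned in the post.
test <- data.frame(x = factor(c("a", "b", "c")), n = 1:3)
fixed <- factors_to_character(test)
str(fixed)
```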


--
David

On Jan 12, 2010, at 5:28 AM, RINNER Heinrich wrote:


Dear R-users,

I am using R version 2.10.1 and package foreign version 0.8-39 under  
windows.


When reading .sav-Files (PASW Statistics 18.0.1) containing string  
variables, these are automatically converted to factors when using  
option to.data.frame = TRUE (see example below).
It's clear to me why this happens (the default behaviour of a call  
to as.data.frame). But this is not always what one might want (or  
even be aware of).


So maybe one of the following improvements could be made?
* Add a description of this behaviour in ?read.spss.
* Or (even better): Add an extra argument, like:
read.spss("C:\\temp\\test.sav", to.data.frame = TRUE, stringsAsFactors = FALSE).


Just a suggestion;
kind regards
Heinrich.

# EXAMPLE:
Suppose there is a simple file test.sav, containing one variable  
(x) of type STRING with 3 values (a,b,c).

library(foreign)
test <- read.spss("C:\\temp\\test.sav")
test

$x
[1] abc   

attr(,"label.table")
attr(,"label.table")$x
NULL

attr(,"codepage")
[1] 1252

is.factor(test$x)

[1] FALSE

is.character(test$x)

[1] TRUE
# Ok, that's just fine. But things change when using option  
to.data.frame = TRUE:

test <- read.spss("C:\\temp\\test.sav", to.data.frame = TRUE)
test

x
1 a
2 b
3 c

is.factor(test$x)

[1] TRUE

is.character(test$x)

[1] FALSE


Re: [R] plot prices and dates in a nice way




On 12.01.2010 10:39, Trafim Vanishek wrote:

Dear all,
I currently experience the problem with nicely plotting price data against
the dates.

Data <- read.csv("C:/IBM.csv", header = TRUE, sep = ",")
plot(Data[,1], Data[,2])

I cannot find the way how can I choose the # of breaks for the x axis -
dates in this case?



No idea, since we do not know what is in IBM.csv and hence we do not know what 
Data actually includes ...
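
If the first column holds dates stored as text, one common approach is to convert it with as.Date(), suppress the default x axis, and place the ticks explicitly. The data frame below is invented for illustration (the contents of IBM.csv are unknown), and the date format is an assumption:

```r
# Hypothetical stand-in for the CSV: weekly dates as text plus a price column.
prices <- data.frame(
  Date  = format(seq(as.Date("2009-01-01"), by = "week", length.out = 52)),
  Close = 100 + cumsum(rep(c(1, -0.5), 26)),
  stringsAsFactors = FALSE
)

# Convert the text column to Date (assumes ISO format, yyyy-mm-dd).
prices$Date <- as.Date(prices$Date)

# Plot without the default x axis, then add ticks every two months.
plot(prices$Date, prices$Close, type = "l", xaxt = "n",
     xlab = "Date", ylab = "Close")
ticks <- seq(min(prices$Date), max(prices$Date), by = "2 months")
axis.Date(1, at = ticks, format = "%b %Y")
```

Changing the by argument of seq() controls how many breaks appear on the axis.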


Uwe Ligges



Thanks a lot




Re: [R] Drought Severity Index (DSI)

2010-01-12 Thread Jim Lemon


On 01/12/2010 07:21 PM, Muhammad Rahiz wrote:

Does anyone have the code to calculate the drought severity index?



Hi Muhammad,
If you mean does anyone have the algorithm? it seems pretty hard to 
find. Even that old standby Wikipedia didn't have a description of 
Palmer's algorithm. If you happen to have the algorithm but not the R 
code, perhaps mentioning that might get some responses.


Jim


[R] Beginer data.frame

2010-01-12 Thread Jean-Baptiste Combes

Hello,

I use R 2.10, and I am new in R (I used to use SAS and lately Stata), I am
using XP.

I have a data which has a data.frame format called x.df (read from a csv
file). I want to take from this data observations for which the variable
Code starts with an R. I took all the Code and put them into a vector
vec <- grep("R[A-Z][A-Z]", x.df$Code, value=TRUE)

Then I created a function that is supposed to take all the lines in the my
data x.df for which Code equals one value of vec. See the code below
where I created a loop to do that.

 myfunc <- function(data,var2,var1)
+ {
+ i=1
+ while (i<632){
+ line <- subset(data,var2==var1[i])
+ if (i==1){
+ df <- line
+ df <- data.frame(df)
+ }
+ else {
+ line <- data.frame(line)
+ df <- rbind(df,line)
+ }
+ i <- i+1
+ }
+ fix(df)
+ }


The results of my program depend heavily on the last few lines of the function.
If I put fix(df), as above, the function opens a window with my data and
the result seems sensible (I have not checked in detail, but I have roughly
what I am supposed to get).
 myfunc <- function(data,var2,var1)
...
+ }
+ df <- data.frame(df)
+ print(is.data.frame(df))
+ }
 myfunc(x.df,x.df$Code,vec)
[1] TRUE
 print(is.data.frame(df))
[1] FALSE

In the case above I ask whether or not the df is a data.frame and the
answer is true, when the program has ended, I ask again and the answer is
false.

Could anyone tell me what to do to get this data and could anyone tell me
why those differences in the results?

 as.data.frame(df)
Error in as.data.frame.default(df) :
  cannot coerce class "function" to a data.frame
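
Two things seem to be going on here. First, the df built inside myfunc is local to the function and disappears when it returns; at top level, df refers to the F-distribution density function from the stats package, which is why the error mentions class "function". Second, the loop can be replaced by logical indexing. A minimal sketch with made-up data; select_codes() is a hypothetical name, and the "^" anchor is an assumption based on the stated goal that Code should start with R:

```r
# Hypothetical stand-in for x.df.
x.df <- data.frame(
  Code  = c("RAB", "XYZ", "RCD", "ABC"),
  value = 1:4,
  stringsAsFactors = FALSE
)

# Direct subsetting: keep rows whose Code starts with R plus two capitals.
keep <- grepl("^R[A-Z][A-Z]", x.df$Code)
r.df <- x.df[keep, ]

# The same thing as a function that RETURNS the result, so that the caller
# can assign it to a name.
select_codes <- function(data, pattern) {
  data[grepl(pattern, data$Code), , drop = FALSE]
}

result <- select_codes(x.df, "^R[A-Z][A-Z]")
print(is.data.frame(result))
print(result)
```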



Re: [R] Drought Severity Index (DSI)

2010-01-12 Thread Muhammad Rahiz


Hello Jim,

Thanks for the response. I have the algorithm based on the original 
paper (Bryant et al. 1994) and subsequent modification by Philips and 
McGregor(1994). The latter gives the formula for calculating the index 
in MS Excel. I'm trying to translate it to R.






Muhammad

Muhammad Rahiz  |  Doctoral Student in Regional Climate Modeling

Climate Research Laboratory, School of Geography & the Environment  
Oxford University Centre for the Environment
South Parks Road, Oxford, OX1 3QY, United Kingdom 
Tel: +44 (0)1865-285194	 Mobile: +44 (0)7854-625974

Email: muhammad.ra...@ouce.ox.ac.uk






Jim Lemon wrote:

On 01/12/2010 07:21 PM, Muhammad Rahiz wrote:
  

Does anyone have the code to calculate the drought severity index?




Hi Muhammad,
If you mean does anyone have the algorithm? it seems pretty hard to 
find. Even that old standby Wikipedia didn't have a description of 
Palmer's algorithm. If you happen to have the algorithm but not the R 
code, perhaps mentioning that might get some responses.


Jim





[R] Reading a file with mixed cyrillic/latin characters

2010-01-12 Thread Stephan Kolassa


Dear useRs,

I am trying to read a tab-delimited Unicode text file containing both 
latin and cyrillic characters and failing miserably. The file looks like 
this (I hope it comes across right):


A   B   C
3   foo ФОО
5   bar БАР

read.table("foo.txt",sep="\t",header=TRUE)

I am guessing that I can use the fileEncoding argument to read.table() 
to read this, but I can find no list of supported values of 
fileEncoding, and fileEncoding="Unicode" gives an error.
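
One way to see which names fileEncoding might accept is to ask R for the encodings known to the local iconv. This is only a sketch: the names returned differ between platforms, and the read.table() call in the comment assumes (without any confirmation from the post) that foo.txt was saved as little-endian UTF-16. Even with the right name, a CP1252 locale may not be able to represent the Cyrillic column.

```r
# List every encoding name the local iconv implementation reports.
encodings <- iconvlist()

# Narrow the list to the Unicode variants (UTF-8, UTF-16LE, ...).
unicode.names <- grep("^UTF", encodings, value = TRUE, ignore.case = TRUE)
print(unicode.names)

# Hypothetical call, assuming foo.txt is little-endian UTF-16
# (an assumption, not something stated in the post):
# tab <- read.table("foo.txt", sep = "\t", header = TRUE,
#                   fileEncoding = "UTF-16LE")
```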


The FAQ and the FAQ for Windows don't help. I have searched both the 
list archives and RSeek and am still seeking enlightenment. I am running 
R 2.10.1 on Windows XP, sessionInfo() below.


Cheers
Stephan


R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252 
LC_MONETARY=German_Germany.1252 LC_NUMERIC=C

[5] LC_TIME=German_Germany.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


Re: [R] R for windows 64 bit

2010-01-12 Thread alessia matano

Fine, it worked. I will try in this way.

One last question and I won't bother you further today. My
machine currently has just 6 GB of RAM (it will be increased to 16 GB
in a few days), and I see that with this experimental version
memory.limit() is 6135.

What is the command to raise the memory limit to the maximum I
can use (5 GB?). When I call memory.limit(5000) it still gives me
the error:

don't be silly! Your machine has a 4Gb address limit

which is quite odd.

Many thanks
Best
A.

2010/1/12 alessia matano alexis@gmail.com:
 ok, perfect!
 I will try with it...many many thanks. Have you got there also the
 quantreg package, which has actually the same problem of sparseM
 (32bit version)?

 best
 alessia

 2010/1/12 Uwe Ligges lig...@statistik.tu-dortmund.de:


 On 12.01.2010 12:09, alessia matano wrote:

 I am sorry, I know it is an experimental version, and I have been
 misleading saying a new version.

 Therefore, I will wait for when they will be available officially,
 since it is just a few days.

 Or just use today my private repository I indicated in the other mail.

 Uwe Ligges


 However, I tried also to go to the cran pages and download them and
 insert into the library. For quantreg it worked, for sparseM it did
 not probably because it's a win32 version, as you said.



 2010/1/12 Prof Brian Ripley rip...@stats.ox.ac.uk:

 On Tue, 12 Jan 2010, alessia matano wrote:

 Dear all,

 I just downloaded and set up this new version of R. I am now trying to
 install the packages I need, which are SparseM and quantreg. I
 downloaded the quantreg package and placed it in the library folder, and
 it seems to work. However, when I try to do the same with SparseM I
 get the following error message:

 Loading required package: SparseM
 Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared library
 'C:/PROGRA~1/R/R-211~1.0DE/library/SparseM/libs/SparseM.dll':
  LoadLibrary failure:  %1 is not a valid Win32 application.


 Any help for it?

 Please do refer to the posting referred to in that thread (and Henrique,
 please do not post just the URL without the explanations).

 https://stat.ethz.ch/pipermail/r-devel/2010-January/056301.html

 You cannot mix 32-bit Windows binary packages with this experimental port
 (it is not a 'new version'): you need to install from the package
 sources.
  If that is too difficult for you, please do not try to use unsupported
 experimental builds (and Uwe Ligges may have some binary packages
 available
 for test in a few days).


 Thanks a lot
 alessia

 2010/1/11 Henrique Dallazuanna www...@gmail.com:

 Try this version (beta of development version):

 http://www.stats.ox.ac.uk/pub/RWin/Win64/R-2.11.0dev-win64.exe

 On Mon, Jan 11, 2010 at 2:29 PM, alessia matano alexis@gmail.com
 wrote:

 Dear all,

 do you know if there is any particular version of R to implement with
 windows 64 bit, in such a way to increase the amount of memory it can
 use?

 How should I increase the memory, and more importantly to set a higher
 max vector size? It still stops me saying Could not allocate vector
 of size 145

 thanks to all
 alessia





 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40" S 49° 16' 22" O




 --
 Brian D. Ripley,                  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford,             Tel:  +44 1865 272861 (self)
 1 South Parks Road,                     +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595





Re: [R] Beginer data.frame

2010-01-12 Thread Stephan Kolassa


Hi Jean-Baptiste,

three points:

1) Your variable df is a *local* variable which you define in your 
function myfunc(), so it is not known outside myfunc(). When you ask 
is.data.frame(df), R looks at the global definition of df - which is the 
density function of the F distribution. Making your function work 
(especially interactively) will require a major rewrite.


2) Generally, using variable names that are already used as R objects 
(like df in your example) is a bad idea. For an example of the 
problems you can run into, see 1) above.


3) Loops are not the R way. Depending on what you want to do with the 
subset of your data.frame, you may want to do something like this:


x.df[substr(x.df$Code,1,1)=="R",]

Look at ?substr to learn more - this function is vectorized, meaning 
that it takes a vector input and returns a vector output. Look at 
section 2.7 in "An Introduction to R".
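
To make points 1) and 3) concrete, here is a minimal sketch on a toy data frame. The Code values and the helper name pick.r are made up for illustration; they are not from the original data.

```r
# Toy stand-in for x.df; the Code values are invented for illustration.
x.df <- data.frame(Code  = c("RAB", "QCD", "RXY", "ABC"),
                   Value = 1:4,
                   stringsAsFactors = FALSE)

# Point 3: substr() is vectorized, so one comparison selects all rows.
r.rows <- x.df[substr(x.df$Code, 1, 1) == "R", ]

# Point 1: a function should hand its result back to the caller instead
# of relying on a local variable being visible afterwards.
pick.r <- function(data) {
  data[substr(data$Code, 1, 1) == "R", ]
}
result <- pick.r(x.df)
print(is.data.frame(result))
```

Assigning the return value to a name other than df also avoids the clash described in point 2).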


Good luck!
Stephan


Jean-Baptiste Combes schrieb:

Hello,

I use R 2.10, and I am new in R (I used to use SAS and lately Stata), I am
using XP.

I have a data which has a data.frame format called x.df (read from a csv
file). I want to take from this data observations for which the variable
Code starts with an R. I took all the Code and put them into a vector
vec<-grep("R[A-Z][A-Z]",x.df$Code,value=TRUE)

Then I created a function that is supposed to take all the lines in the my
data x.df for which Code equals one value of vec. See the code below
where I created a loop to do that.


myfunc<-function(data,var2,var1)
+ {
+ i=1
+ while (i<632){
+ line<-subset(data,var2==var1[i])
+ if (i==1){
+ df<-line
+ df<-data.frame(df)
+ }
+ else {
+ line<-data.frame(line)
+ df<-rbind(df,line)
+ }
+ i<-i+1
+ }
+ fix(df)
+ }

The results of my program higly depend on the few last lines of the program.
If I put fix(df), as above, the function opens a window with my data and
it seems a sensible results (I have not checked in details but I barely have
what I am suppose to get).

myfunc<-function(data,var2,var1)
...
+ }
+ df<-data.frame(df)
+ print(is.data.frame(df))
+ }

myfunc(x.df,x.df$Code,vec)

[1] TRUE

print(is.data.frame(df))

[1] FALSE

In the case above I ask whether or not the df is a data.frame and the
answer is true, when the program has ended, I ask again and the answer is
false.

Could anyone tell me what to do to get this data and could anyone tell me
why those differences in the results?


as.data.frame(df)

Error in as.data.frame.default(df) :
  cannot coerce class "function" to a data.frame


Re: [R] Beginer data.frame

2010-01-12 Thread K. Elo

Hi!

Jean-Baptiste Combes wrote:
 Hello,
 
 I use R 2.10, and I am new in R (I used to use SAS and lately Stata), I am
 using XP.
 
 I have a data which has a data.frame format called x.df (read from a csv
 file). I want to take from this data observations for which the variable
 Code starts with an R. I took all the Code and put them into a vector
 vec<-grep("R[A-Z][A-Z]",x.df$Code,value=TRUE)


I am not sure if I understood you correctly, but could a simple:

subset(x.df, substring(Code,1,1)=="R")

be an appropriate solution?

HTH,
Kimmo


Re: [R] Beginer data.frame

2010-01-12 Thread David Winsemius



On Jan 12, 2010, at 6:17 AM, Jean-Baptiste Combes wrote:


Hello,

I use R 2.10, and I am new in R (I used to use SAS and lately  
Stata), I am

using XP.

I have a data which has a data.frame format called x.df (read from a  
csv
file). I want to take from this data observations for which the  
variable
Code starts with an R. I took all the Code and put them into a  
vector

vec<-grep("R[A-Z][A-Z]",x.df$Code,value=TRUE)


With the default value=FALSE, vec would be a vector of row numbers that can be
used to address the data.frame; with value=TRUE, as above, it holds the
matching strings instead.


Then I created a function that is supposed to take all the lines in  
the my
data x.df for which Code equals one value of vec. See the code  
below

where I created a loop to do that.


That seems to be a very short R one-liner:

data[vec, ]

?"["
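
A small sketch of the difference between the two forms of grep(), since only the index form can be used directly for row subsetting. The codes and the toy data frame are made up for illustration.

```r
codes <- c("RAB", "QCD", "RXY", "ABC")   # made-up example codes

# value = TRUE returns the matching strings themselves ...
matched.values <- grep("R[A-Z][A-Z]", codes, value = TRUE)

# ... while the default returns their positions, usable as row indices.
matched.rows <- grep("R[A-Z][A-Z]", codes)

toy <- data.frame(Code = codes, Value = 1:4, stringsAsFactors = FALSE)
toy[matched.rows, ]
```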

--
David.

myfunc<-function(data,var2,var1)
+ {
+ i=1
+ while (i<632){ #where does that come from ?
+ line<-subset(data,var2==var1[i])
+ if (i==1){
+ df<-line
+ df<-data.frame(df)
+ }
+ else {
+ line<-data.frame(line)
+ df<-rbind(df,line)
+ }
+ i<-i+1
+ }
+ fix(df)
+ }




The results of my program higly depend on the few last lines of the  
program.
If I put fix(df), as above, the function opens a window with my  
data and
it seems a sensible results (I have not checked in details but I  
barely have

what I am suppose to get).

myfunc<-function(data,var2,var1)
...
+ }
+ df<-data.frame(df)
+ print(is.data.frame(df))
+ }

myfunc(x.df,x.df$Code,vec)

[1] TRUE

print(is.data.frame(df))

[1] FALSE

In the case above I ask whether or not the df is a data.frame and  
the
answer is true, when the program has ended, I ask again and the  
answer is

false.

Could anyone tell me what to do to get this data and could anyone  
tell me

why those differences in the results?


as.data.frame(df)

Error in as.data.frame.default(df) :
  cannot coerce class "function" to a data.frame





[R] time series analys by resting the effect of a covariate

2010-01-12 Thread Simone Santoro


Hi,

Does anyone know a way to test for a temporal trend (each unit 
of the sample is a count) after removing the possible effect of a covariate 
(e.g. a climatic factor)?
I have periodic counts of several species of waterbirds over the last 13 
years and I want to know whether, after removing the effect of the available 
flooded area, there is a temporal trend.
I know the flooded area is correlated with the time series data of most of 
the species I am considering.

I had a look at the time series section of http://cran.r-project.org
but I didn't find anything on this issue.

Thanks for any response
  

[R] coerce vector into array - change filling sequence

2010-01-12 Thread Kohleth Chia

Dear all,

When I coerce a vector into a multi dimensional array, I would like R to start 
filling the array along the last dimension, then the 2nd last etc.
Let's jump straight into an example.

x <- 1 : 24
y <- array(dim=c(2,2,6))

I would like to have:
y[1,1,1] = 1
y[1,1,2] = 2
...
y[1,1,6] = 6
y[1,2,1] = 7
y[1,2,2] = 8
...
y[2,1,1] = 13
...
y[2,2,1] = 19

If I do y <- array(x, dim=c(2,2,6)), I think I will get
y[1,1,1] = 1
y[2,1,1] = 2
(or something I do not want) instead.

Of course, I need a fast solution, as I am actually dealing with array of much 
larger size.
Any input will be appreciated
Thanks a lot

Re: [R] coerce vector into array - change filling sequence

2010-01-12 Thread Stephan Kolassa


Hi,

you can permute array dimensions using aperm():

x <- 1 : 24
z <- array(x, dim=c(6,2,2))
y <- aperm(z,perm=c(3,2,1))
y[1,1,]
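
The same idea can be written for arbitrary dimensions. This is a sketch, assuming the target dimensions are held in a vector (here called dims, a name chosen for illustration): fill an array with the dimensions reversed, then let aperm() reverse them back, which is what it does when perm is omitted.

```r
x <- 1:24
dims <- c(2, 2, 6)   # the dimensions wanted in the end

# Fill an array whose dimensions are reversed, then reverse them back.
# aperm() without a perm argument reverses the dimension order.
y <- aperm(array(x, dim = rev(dims)))

y[1, 1, ]   # 1 2 3 4 5 6
```

aperm() copies the data once, so this should stay reasonably fast for larger arrays.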

HTH,
Stephan


Kohleth Chia schrieb:

Dear all,

When I coerce a vector into a multi dimensional array, I would like R to start 
filling the array along the last dimension, then the 2nd last etc.
Let's jump straight into an example.

x <- 1 : 24
y <- array(dim=c(2,2,6))

I would like to have:
y[1,1,1] = 1
y[1,1,2] = 2
...
y[1,1,6] = 6
y[1,2,1] = 7
y[1,2,2] = 8
...
y[2,1,1] = 13
...
y[2,2,1] = 19

If I do y <- array(x, dim=c(2,2,6)), I think I will get
y[1,1,1] = 1
y[2,1,1] = 2
(or something not I want) instead.

Of course, I need a fast solution, as I am actually dealing with array of much 
larger size.
Any input will be appreciated
Thanks a lot

[R] Conditional Sampling

2010-01-12 Thread ehcpieterse


Hi,

I am hoping someone can help me with a sampling question.

I am using the following call to sample 10 unique observations:
x <- sample(1:100, 10, replace=F)
Given the first 10 observations, I need to sample another 5 unique
observations from the remainder. I essentially want to do a Monte Carlo type
analysis on the results.

I would appreciate any feedback.

Thanks

Re: [R] plot ylab on the right

2010-01-12 Thread Mister Vanhalen

It works perfectly! ;) Thank you very much!!

M  A

On Tue, Jan 12, 2010 at 11:56 AM, Jim Lemon j...@bitwrit.com.au wrote:

 On 01/12/2010 09:41 PM, Mister Vanhalen wrote:

 Hello,

 I have a graphic and I want to plot the yaxis AND ylab on the right.
 I manage to plot axis on the right with axis(4) but I don't know how to
 write the ylab on the right.

 barplot(data, name=leg, xlab="Probability", ylab="Number of links",
 axes=F)
 axis(4)



 Hi Mister Vanhalen,
 Try this:

 mtext("Number of links",4)

  And another question :
 Is there an easy way to indicate that directly on barplot command ?




 A quick look at the axis command has not revealed it.
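
Putting the pieces of this thread together, a minimal sketch might look as follows. The data, the bar labels and the margin sizes are made up for illustration. The default right margin is usually too narrow for a label, so it is widened first.

```r
pdf(NULL)   # null device so the sketch also runs without a display

counts <- c(3, 7, 5)            # made-up data
leg <- c("0.1", "0.5", "0.9")   # made-up bar labels

# Widen the right margin (4th value) so the label has room.
old.par <- par(mar = c(5, 2, 4, 5))

mids <- barplot(counts, names.arg = leg, xlab = "Probability", axes = FALSE)
axis(4)                                        # y axis on the right
mtext("Number of links", side = 4, line = 3)   # y label on the right

par(old.par)
dev.off()
```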

 Jim




Re: [R] Beginer data.frame

2010-01-12 Thread Gabor Grothendieck

See help(grepl). Using the built-in data frame CO2, this gets the rows whose
Plant column starts with Qn:

subset(CO2, grepl("^Qn", Plant))
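
To see what the filter is doing, the logical vector can be inspected on its own. A short sketch, where the names starts.qn and qn.rows are chosen here for illustration:

```r
# grepl() returns one TRUE/FALSE per element; subset() keeps the TRUE rows.
starts.qn <- grepl("^Qn", CO2$Plant)
qn.rows <- subset(CO2, starts.qn)

table(starts.qn)                       # how many rows match and how many do not
unique(as.character(qn.rows$Plant))    # which plants were kept
```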

On Tue, Jan 12, 2010 at 6:17 AM, Jean-Baptiste Combes
jeanbaptiste.combes.a...@googlemail.com wrote:
 Hello,

 I use R 2.10, and I am new in R (I used to use SAS and lately Stata), I am
 using XP.

 I have a data which has a data.frame format called x.df (read from a csv
 file). I want to take from this data observations for which the variable
 Code starts with an R. I took all the Code and put them into a vector
 vec<-grep("R[A-Z][A-Z]",x.df$Code,value=TRUE)

 Then I created a function that is supposed to take all the lines in the my
 data x.df for which Code equals one value of vec. See the code below
 where I created a loop to do that.

 myfunc<-function(data,var2,var1)
 + {
 + i=1
 + while (i<632){
 + line<-subset(data,var2==var1[i])
 + if (i==1){
 + df<-line
 + df<-data.frame(df)
 + }
 + else {
 + line<-data.frame(line)
 + df<-rbind(df,line)
 + }
 + i<-i+1
 + }
 + fix(df)
 + }


 The results of my program higly depend on the few last lines of the program.
 If I put fix(df), as above, the function opens a window with my data and
 it seems a sensible results (I have not checked in details but I barely have
 what I am suppose to get).
 myfunc<-function(data,var2,var1)
 ...
 + }
 + df<-data.frame(df)
 + print(is.data.frame(df))
 + }
 myfunc(x.df,x.df$Code,vec)
 [1] TRUE
 print(is.data.frame(df))
 [1] FALSE

 In the case above I ask whether or not the df is a data.frame and the
 answer is true, when the program has ended, I ask again and the answer is
 false.

 Could anyone tell me what to do to get this data and could anyone tell me
 why those differences in the results?

 as.data.frame(df)
 Error in as.data.frame.default(df) :
   cannot coerce class "function" to a data.frame



[R] Pharmacokinetic and pharmacodynamic modeling and simulation

2010-01-12 Thread Dick Verkerk



 Pharmacokinetic and pharmacodynamic modeling and simulation
 By Dr. Jan Freijer
 March 18, 2010
 Amsterdam, The Netherlands

 http://www.can.nl/events/details.php?id=57

 This course is aimed at users of R or S-PLUS in the bio-pharmaceutical
 sciences who would like to use R for clinical trial simulations.

 topics include

 Working with packages in R
 - MASS
 - odesolve

 Random generation from univariate distributions
 - density, distribution function, quantile function and random generation
 - various distributions

 Random generation from multivariate distributions
 - normal distribution
 - working with the covariance matrix
 - simulating PK and PK-PD model parameters

 Solving differential equations
 - solving differential equations in R
 - testing the numerical solution versus analytical solution
 - PK models for single dose oral or IV administration
 - PK models for multiple dose oral or IV administration
 - implementing PD models

 Clinical trial simulations
 - combining the structural model with the random effects model
 - uncertainty versus variability
 - example: two compartment PK model with indirect response model


 Location:  Amsterdam
 Date:  March 18th
 Time:  10:00h.-16:30h.
 Price   :  EURO 395,- excluding VAT

 Register:
 - phone : +31-(0)20-560-8400
 - Email : pau...@can.nl
 - Web   : http://www.can.nl/events/details.php?id=57

 There is a maximum of 12 participants.

 You may register by replying to this email and provide us with the
 following information.

 Name   :  M / F
 Title  :
 Department :
 Institute  :
 Address:
 City   :
 Zip:
 Telephone  :
 Fax:
 Email  :

 Please let us know if you have any questions.

 Please feel free to send this message on to your colleagues and friends
 for whom it might be interesting!!

 Kind regards,

 Dick Verkerk


 _

 CANdiensten, Nieuwpoortkade 23-25, NL-1055 RX Amsterdam
 tel: +31 20 5608410 fax: +31 20 5608448 verk...@candiensten.nl
 _
 Your Partner in Mathematics and Statistics!


Re: [R] Conditional Sampling

2010-01-12 Thread Ted Harding

On 12-Jan-10 12:58:13, ehcpieterse wrote:
 Hi,
 
 I am hoping someone can help me with a sampling question.
 
 I am using the following function to sample 10 unique observations:
 x <- sample(1:100, 10, replace=F)
 Given the first 10 observations, I need to sample another 5 unique
 observations from the remainder. I essentially want to do a Monte
 Carlo type analysis on the results.
 
 I would appreciate any feedback.
 
 Thanks
 -- 

If your second sampling is to be on the same footing as the first,
i.e. a random subset of 5 out of the remaining 90, then this is
equivalent to sampling 15 in the first place and using the first
10 of these as your first sample:

  set.seed(54321)
  x0 <- sample(1:100, 15, replace=F)
  x  <- x0[1:10]
  y  <- x0[-(1:10)]
  x0
  # [1] 43 50 18 27 21 83  5 20 32 34 13 61  4 57 86
  x
  # [1] 43 50 18 27 21 83  5 20 32 34
  y
  # [1] 13 61  4 57 86

  set.seed(54321)
  sample(1:100, 10, replace=F)
  # [1] 43 50 18 27 21 83  5 20 32 34

However, if the manner of taking the second sample from the
remainder will depend on the results of the first sample, then
further consideration is necessary. If this is the case, can you
indicate how the values in the first sample would influence how
the second sample is to be obtained?

Another approach, which leaves more options, is to use sample.int():

  set.seed(54321)
  X <- (1:100) ## (or any other 100 values)
  n <- sample.int(100,10,replace=FALSE) ## returns subset of (1:100)
  x <- X[n]
  Y <- X[-n]
  y <- sample(Y,5,replace=FALSE)

  x
  # [1] 43 50 18 27 21 83  5 20 32 34 ## (as before)
  y
  # [1] 14 70  4 66 96## (as before)
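
For the Monte Carlo part of the question, the two-stage draw can be wrapped in a function and repeated. This is only a sketch: the function name two.stage, the seed and the 1000 replicates are arbitrary choices made here, and it assumes the second sample does not depend on the values in the first.

```r
set.seed(54321)

# One replicate: n1 draws, then n2 more from whatever is left.
two.stage <- function(pop, n1 = 10, n2 = 5) {
  first  <- sample(pop, n1)
  second <- sample(setdiff(pop, first), n2)
  list(first = first, second = second)
}

# Repeat the two-stage draw; 1000 replicates is an arbitrary choice.
draws <- replicate(1000, two.stage(1:100), simplify = FALSE)

# Sanity check: TRUE here would mean some pair of samples overlaps.
any(sapply(draws, function(d) length(intersect(d$first, d$second)) > 0))
```

Each element of draws holds one pair of samples, so any statistic of interest can then be computed with sapply() over the list.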

Hoping this helps,
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 12-Jan-10   Time: 13:21:21
-- XFMail --


[R] trouble with installing SJava

2010-01-12 Thread Jiiindo


Colleagues,
How can I solve this error when I install the SJava package?
Thanks

R CMD INSTALL -c /usr/local/lib/R/SJava_0.69-0.tar.gz

* installing to library ‘/usr/local/lib/R/site-library’
* installing *source* package ‘SJava’ ...
checking for java... /usr/lib/jvm/java-6-sun/bin/java
Java VM /usr/lib/jvm/java-6-sun/bin/java
checking for javah... /usr/lib/jvm/java-6-sun/bin/javah
Looking in /usr/lib/jvm/java-6-sun/include
Looking in /usr/lib/jvm/java-6-sun/include/linux
checking for g++... g++
checking for C++ compiler default output... a.out
checking whether the C++ compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables... 
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ANSI C... none needed
checking for Rf_initEmbeddedR in -lR... no
No R shared library found
configure: creating ./config.status
config.status: creating Makevars
config.status: creating src/Makevars
config.status: creating src/RSJava/Makefile
config.status: creating Makefile_rules
config.status: creating inst/scripts/RJava.bsh
config.status: creating inst/scripts/RJava.csh
config.status: creating R/zzz.R
config.status: creating cleanup
config.status: creating inst/scripts/RJava
Copying the cleanup script to the scripts/ directory
Building libRSNativeJava.so in
/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava
if  test ! -d /usr/local/lib/R/site-library/SJava/libs ; then \
mkdir /usr/local/lib/R/site-library/SJava/libs ; \
fi
gcc -std=gnu99 -g -O2 -D_R_ -I/usr/local/lib/R/include
-I/usr/local/lib/R/include/R_ext
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava  -I.
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include 
-I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux 
-c CtoJava.c
CtoJava.cweb:148: error: expected '=', ',', ';', 'asm' or '__attribute__'
before 'vm1_args'
CtoJava.cweb:215: error: static declaration of 'std_env' follows non-static
declaration
CtoJava.cweb:195: error: previous declaration of 'std_env' was here
CtoJava.cweb: In function 'create_Java_vm':
CtoJava.cweb:256: error: 'vm1_args' undeclared (first use in this function)
CtoJava.cweb:256: error: (Each undeclared identifier is reported only once
CtoJava.cweb:256: error: for each function it appears in.)
make: *** [CtoJava.o] Error 1
Generating JNI header files from Java classes.
   RForeignReference, RManualFunctionActionListener, ROmegahatInterpreter 
REvaluator
*
Warning:
At present, to use the library you must set the 
LD_LIBRARY_PATH environment variable
to
 
/usr/local/lib/R/site-library/SJava/libs:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/lib/i386/server:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/lib/i386:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/../lib/i386::/usr/java/packages/lib/i386:/lib:/usr/lib
or use one of the RJava.bsh or RJava.csh scripts
*
** libs
gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include
-I/usr/local/lib/R/include/R_ext
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava  -I.
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include  -IRSJava
-I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux
-I/usr/local/include-fpic  -g -O2 -c ConverterExamples.c -o
ConverterExamples.o
ConverterExamples.cweb: In function ‘RS_JAVA_setFunctionConverter’:
ConverterExamples.cweb:213: warning: assignment discards qualifiers from
pointer target type
ConverterExamples.cweb: In function ‘RS_JAVA_toJavaFunctionConverter’:
ConverterExamples.cweb:312: warning: passing argument 1 of
‘getOmegahatReferenceValue’ discards qualifiers from pointer target type
gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include
-I/usr/local/lib/R/include/R_ext
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava  -I.
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include  -IRSJava
-I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux
-I/usr/local/include-fpic  -g -O2 -c Converters.c -o Converters.o
Converters.cweb: In function ‘RS_JAVA_removeConverter’:
Converters.cweb:399: warning: assignment discards qualifiers from pointer
target type
gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include
-I/usr/local/lib/R/include/R_ext
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava  -I.
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include  -IRSJava
-I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux
-I/usr/local/include-fpic  -g -O2 -c REmbed.c -o REmbed.o
gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include
-I/usr/local/lib/R/include/R_ext
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava  -I.
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include

Re: [R] Drought Severity Index (DSI)

2010-01-12 Thread Steve_Friedman

A few years ago, I worked with Stuart Gage, who had developed a Heat /
Precipitation Index as a measure of drought severity.  It works just as
well as, if not better than, the Palmer Drought Index.

You can find the formula in this online PDF report:  Climate Variability in
the North Central Region:
  http://goes.msu.edu/publications/pdfs_ps/CGCEO%2085.pdf

Google search:  Stuart Gage Heat Precipitation Index

If you cannot download the file, write to me individually and I'll send
it directly.

Steve

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147


   
Muhammad Rahiz muhammad.ra...@ouce.ox.ac.uk
Sent by: r-help-boun...@r-project.org
01/12/2010 06:21 AM

To: Jim Lemon j...@bitwrit.com.au
cc: r-help@r-project.org
Subject: Re: [R] Drought Severity Index (DSI)




Hello Jim,

Thanks for the response. I have the algorithm based on the original
paper (Bryant et al. 1994) and subsequent modification by Philips and
McGregor(1994). The latter gives the formula for calculating the index
in MS Excel. I'm trying to translate it to R.





Muhammad

Muhammad Rahiz  |  Doctoral Student in Regional Climate Modeling

Climate Research Laboratory, School of Geography & the Environment

Oxford University Centre for the Environment
South Parks Road, Oxford, OX1 3QY, United Kingdom
Tel: +44 (0)1865-285194 Mobile: +44 (0)7854-625974
Email: muhammad.ra...@ouce.ox.ac.uk






Jim Lemon wrote:
 On 01/12/2010 07:21 PM, Muhammad Rahiz wrote:

 Does anyone have the code to calculate the drought severity index?



 Hi Muhammad,
 If you mean does anyone have the algorithm? it seems pretty hard to
 find. Even that old standby Wikipedia didn't have a description of
 Palmer's algorithm. If you happen to have the algorithm but not the R
 code, perhaps mentioning that might get some responses.

 Jim




Re: [R] Drought Severity Index (DSI)

2010-01-12 Thread Muhammad Rahiz


Thanks Steve!

Muhammad Rahiz  |  Doctoral Student in Regional Climate Modeling

Climate Research Laboratory, School of Geography & the Environment
Oxford University Centre for the Environment
South Parks Road, Oxford, OX1 3QY, United Kingdom 
Tel: +44 (0)1865-285194	 Mobile: +44 (0)7854-625974

Email: muhammad.ra...@ouce.ox.ac.uk






steve_fried...@nps.gov wrote:

A few years ago, I worked with Stuart Gage, who had developed a Heat /
Precipitation Index as a measure of drought severity.  It works just as
well as, if not better than, the Palmer Drought Index.

You can find the formula in this online PDF report:  Climate Variability in
the North Central Region:
  http://goes.msu.edu/publications/pdfs_ps/CGCEO%2085.pdf

Google search:  Stuart Gage Heat Precipitation Index

If you cannot download the file, write to me individually and I'll send
it directly.

Steve

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147


   
From: Muhammad Rahiz muhammad.ra...@ouce.ox.ac.uk
Sent by: r-help-boun...@r-project.org
Date: 01/12/2010 06:21 AM
To: Jim Lemon j...@bitwrit.com.au
cc: r-help@r-project.org
Subject: Re: [R] Drought Severity Index (DSI)





Hello Jim,

Thanks for the response. I have the algorithm based on the original
paper (Bryant et al. 1994) and subsequent modification by Philips and
McGregor(1994). The latter gives the formula for calculating the index
in MS Excel. I'm trying to translate it to R.





Muhammad


Re: [R] Conditional Sampling

2010-01-12 Thread ehcpieterse


Thanks Ted, your solution does make perfect sense.

The only question I still have is that I would like to sample the remaining
5 observations after I have randomly selected the first 10. Given the
initial 10, I would like to sample the following 5 say 1,000 times to get a
simulated conditional sample, if that makes any sense.

I want to build this into an iterative process to see how the first sample
affects the resulting samples. Even though all the observations have the
same probability of being sampled, they each have a different expected value.
-- 
View this message in context: 
http://n4.nabble.com/Conditional-Sampling-tp1012072p1012114.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] sparseM and kronecker product_R latest version

2010-01-12 Thread Martin Maechler

>>>>> "am" == alessia matano alexis@gmail.com
>>>>>     on Mon, 11 Jan 2010 16:20:57 +0100 writes:

    am> Many thanks for it.
    am> However it is strange that when I put the numbers rather than ncol(R)
    am> (a matrix with ncol=36698) it worked. Look below

    >> dim(res2)
    am> [1] 170471  25822
    >> D <- as.matrix.csr(0,nrow(tmpb),25822)
    >> D <- as.matrix.csr(0,nrow(tmpb),ncol(res2))
    am> Error in if (length(x) == nrow * ncol) x <- matrix(x, nrow, ncol) else {:
    am> missing value where TRUE/FALSE needed
    am> In addition: Warning message:
    am> In nrow * ncol : NAs produced by integer overflow

    am> But probably it is true what you said, anyway.

yes, it is true.
The clue is that  typeof(25822)   is double and not integer.
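
A small sketch of that point, using the dimensions of res2 quoted above purely as example numbers (the variable names are invented for the illustration, and the as.numeric() coercion is only one possible workaround):

```r
# Example numbers only: the dimensions of res2 quoted above.
n_row     <- 170471L   # integer, as returned by nrow()
n_col_int <- 25822L    # integer, as returned by ncol()
n_col_dbl <- 25822     # a number typed at the prompt is a double

typeof(n_col_int)      # "integer"
typeof(n_col_dbl)      # "double"

# integer * integer stays integer and overflows past .Machine$integer.max
prod_int <- suppressWarnings(n_row * n_col_int)   # NA

# integer * double is computed in double precision, so nothing overflows
prod_dbl <- n_row * n_col_dbl                     # 4401902162

# One possible workaround: coerce one factor before multiplying
prod_ok <- as.numeric(n_row) * n_col_int
```

This is why the call with the literal 25822 went through while the call with ncol(res2) did not.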

    am> So, do you suggest me to use directly the simple matrix command? or a
    am> kind of sparse matrix within your package?!?

the latter, e.g.,

 library(Matrix)
 D <- Matrix(0, nrow = 113289, ncol=36698)
 ## or
 D. <- sparseMatrix(x=double(0), i=integer(0), j=integer(0),
                    dims = c(113289,36698))
 identical(D, D.) ##--> TRUE

## and, e.g.,

 > Dk <- kronecker(D, Diagonal(x=5:2))
 > identical(Dk, D %x% Diagonal(x = 5:2))
 [1] TRUE
 > dim(D)
 [1] 113289  36698
 > dim(Dk)
 [1] 453156 146792
 

Regards,
Martin Maechler, ETH Zurich


Re: [R] Conditional Sampling

2010-01-12 Thread Magnus Torfason


Would the following work, or is there a reason why it would not?

risk.set  <- 1:100
first.10  <- sample(risk.set, 10)
remainder <- setdiff(risk.set, first.10)

for ( i in 1:1000 )
{
    next.5 <- sample(remainder, 5)
    do.something.with(next.5)
}

Best,
Magnus


Re: [R] Conditional Sampling

2010-01-12 Thread Ted Harding

On 12-Jan-10 14:00:24, ehcpieterse wrote:
> Thanks Ted, your solution does make perfect sense.
>
> The only question I still have is that I would like to sample
> the remaining 5 observations after I have randomly selected the
> first 10. Given the initial 10, I would like to sample the
> following 5 say 1,000 times to get a simulated conditional sample,
> if that makes any sense.
>
> I want to build this into an iterative process to see how the
> first sample affects the resulting samples. Even though all the
> observations have the same probability to get sampled, they each
> have a different expected value.

OK, if I now understand you, you are interested in the properties
of the remaining (90) observations, given that they do not include
any of the (10) cases sampled in the first round.

In that case, I think you should adopt the sample.int() approach
I also suggested:

  X <- (1:100) ## (or any other 100 values)
  n <- sample.int(100,10,replace=FALSE) ## returns subset of (1:100)
  x <- X[n]
  Y <- X[-n]   ## The set remaining after the first 10 were taken
  ## Now you can sample repeatedly from Y until your eyes fall out.
  ## So build up a matrix of (say) 1000 samples from Y:
  M <- sample(Y,5,replace=FALSE)
  for(i in (2:1000)){ M <- rbind(M,sample(Y,5,replace=FALSE)) }

The repeated samples M of 5 from Y of course imply replacing each
sample of 5 back in Y, so they are available at each turn. You can
not, of course, sample 1000*5 from 100 without replacement! (Each
sample of 5 is obtained without replacement, however).
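
As a variation on the same idea, here is a sketch that builds the matrix in one step with replicate() instead of growing it with rbind(). The seed is arbitrary and only there so the run can be repeated; nothing else differs from the approach described above.

```r
set.seed(123)                      # arbitrary seed, only for reproducibility
X <- 1:100                         # or any other 100 values
n <- sample.int(100, 10)           # indices of the first 10 drawn
Y <- X[-n]                         # the 90 values left over

# replicate() returns a 5 x 1000 matrix (one draw per column),
# so transpose to get one draw of 5 per row
M <- t(replicate(1000, sample(Y, 5)))
dim(M)
```

Each row of M is drawn without replacement from Y, and rows are independent of each other, exactly as in the loop version.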

I hope this is getting close!
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 12-Jan-10   Time: 14:34:13
-- XFMail --


Re: [R] Solving graph theory problems with R ? (minimum vertex cover)

2010-01-12 Thread Magnus Torfason


On 1/12/2010 12:12 AM, Johannes Hüsing wrote:

Tal Galili schrieb:

My specific problem is called:
Minimum vertex cover for a hypergraph


I know nothing about the problem at hand, but on the Wikipedia
page it says that the problem can be formulated as an integer
linear program. There is an R package that interfaces to a
linear programming package (Rglpk), which may or may not
help you.


There are also two graph/network analysis packages available for R, 
'igraph' and 'sna'. I don't think either of them has formal support 
for hypergraphs, but it is possible that they could be jerry-rigged to 
solve your problem. Even if not, the people involved may be able to 
help. For example, the igraph mailing list (igraph-h...@nongnu.org) is 
pretty active and the developers are very helpful.
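
To make the integer-programming formulation mentioned above concrete: with one 0/1 variable x_v per vertex, the task is to minimise sum(x) subject to A %*% x >= 1, where A has one row per hyperedge and one column per vertex. The sketch below builds A for a made-up toy hypergraph and, instead of calling a solver, simply enumerates subsets in base R. That is only feasible for tiny problems; for real data the same A would be handed to an ILP solver.

```r
# Hypothetical toy hypergraph: 5 vertices, 4 hyperedges (each a set of vertices)
edges <- list(c(1, 2, 3), c(2, 4), c(3, 5), c(1, 4, 5))
n_vertices <- 5

# Incidence matrix A: one row per hyperedge, one column per vertex
A <- t(sapply(edges, function(e) as.integer(seq_len(n_vertices) %in% e)))

# A 0/1 vector x is a cover when every hyperedge contains a chosen vertex
is_cover <- function(x) all(A %*% x >= 1)

# Exhaustive search, smallest subsets first
min_cover <- NULL
for (k in seq_len(n_vertices)) {
  for (s in combn(n_vertices, k, simplify = FALSE)) {
    x <- as.integer(seq_len(n_vertices) %in% s)
    if (is_cover(x)) {
      min_cover <- s
      break
    }
  }
  if (!is.null(min_cover)) break
}
min_cover   # one minimum cover; there may be others of the same size
```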


Best,
Magnus


[R] The TeX-source for the package manual.

2010-01-12 Thread BXC (Bendix Carstensen)

I have noticed that later versions of Rcmd check clean out the directory 
pkg.Rcheck so that only package-manual.log and package-manual.pdf are left.
Formerly package-manual.tex was kept too, which was very handy for various 
purposes.

Is there a way to generate the .tex version of the manual for a package?

br.
Bendix
__

Bendix Carstensen 
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2-4
DK-2820 Gentofte
Denmark
+45 44 43 87 38 (direct)
+45 30 75 87 38 (mobile)
b...@steno.dk   http://www.biostat.ku.dk/~bxc
www.steno.dk


[R] List arguments from data frame columns in formula

2010-01-12 Thread npobedina


Hi!
I'm trying to run a logistic regression on a dataset contained in the
data frame `data` (y is in the first column, followed by 28 parameters for the model).
How can I write the formula for `glm` without explicitly listing all 28
parameters?
`glm(data[,1]~data[,2]+data[,3]+data[,4]+...,family=binomial)`

As an alternative I can use `glm.fit(data[,-1],data[,1],family =
binomial(link=logit))`, but the resulting object cannot be used with
`predict.glm`.

Thanks,
Natalia
-- 
View this message in context: 
http://n4.nabble.com/List-arguments-from-data-frame-columns-in-formula-tp1012146p1012146.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] List arguments from data frame columns in formula

2010-01-12 Thread Gabor Grothendieck

Try this:

glm(y ~ ., family = binomial, data = data, ...)
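
For illustration, a self-contained sketch of the `y ~ .` idiom on simulated data. The data frame `dat` and its columns are invented for the example (three predictors instead of 28); the point is that `.` expands to every column other than the response, and that the fitted object works with predict().

```r
set.seed(1)                        # arbitrary seed for reproducibility
n <- 100
# Made-up data: binary response y plus three numeric predictors
dat <- data.frame(y  = rbinom(n, 1, 0.5),
                  x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# "." stands for all columns of dat except the response
fit <- glm(y ~ ., family = binomial, data = dat)
names(coef(fit))

# Unlike the result of glm.fit(), this object can be passed to predict()
pred <- predict(fit, newdata = dat[1:5, ], type = "response")
```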


[R] Multiple symbols per single line in a legend

2010-01-12 Thread Primoz PETERLIN

Hello everybody,

Is it possible to coax legend() into displaying more than one symbol per
line in a legend? I have a graph like the one attached to this mail; I would
like to reorganize the legend so that the duplicate text is
omitted, i.e., the first line would read "square triangledown increasing
frequency" and the second one would read "circle triangleup decreasing
frequency". Before resorting to box() and text() I would like to check
whether some clever method already exists that would solve my problem. :)

Thanks in advance.

All the best,
Primoz
attachment: example.png

[R] problem with bio3d package

2010-01-12 Thread rgfrance


Hello,
I have a problem: when I run read.fasta.pdb I get this error:
pdb/seq: 1   name: dcluster.1
Error: subscript out of bounds

My FASTA file contains the sequence of bovine insulin.
Thanks in advance,
Rg
-- 
View this message in context: 
http://n4.nabble.com/problem-with-bio3d-package-tp1012116p1012116.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Conditional Sampling

2010-01-12 Thread ehcpieterse


Thanks Ted, it's exactly what I'm after. Thanks for the help.
-- 
View this message in context: 
http://n4.nabble.com/Conditional-Sampling-tp1012072p1012180.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] List arguments from data frame columns in formula

2010-01-12 Thread npobedina


Thanks a lot for help! :)
I've invented a more complicated way to make it work: 
f <- as.formula(paste("data.y~", paste(names(data.x), collapse = "+")))
fml <- glm(f, data = data.x, family = binomial)


Try this:

glm(y ~ ., family = binomial, data = data, ...)


-- 
View this message in context: 
http://n4.nabble.com/List-arguments-from-data-frame-columns-in-formula-tp1012146p1012230.html
Sent from the R help mailing list archive at Nabble.com.


[R] optim: abnormal termination in lnsrch

2010-01-12 Thread Mario Valle


I'm using optim() to minimize a certain function.
Often the minimization ends with the message:
ERROR: ABNORMAL_TERMINATION_IN_LNSRCH

What is optim() trying to say?
What do I have to change in my function to make the minimization succeed?
Do you think using BBoptim() instead of optim() changes anything?

Thanks for your help!
mario


--
Ing. Mario Valle
Data Analysis and Visualization Group| 
http://www.cscs.ch/~mvalle

Swiss National Supercomputing Centre (CSCS)  | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82


[R] optim: abnormal termination in lnsrch (resend)

2010-01-12 Thread Mario Valle


[sorry, forgot some details...]

I'm using optim(param, fun, method='L-BFGS-B', lower=lo, upper=up) to 
minimize a certain function.

Often the minimization ends with the message:
ERROR: ABNORMAL_TERMINATION_IN_LNSRCH

What is optim() trying to say?
What do I have to change in my function to make the minimization succeed?
Do you think using BBoptim() instead of optim() changes anything?
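
For readers following along, a minimal sketch of this kind of call and of where optim() reports its status. The objective `fun`, the gradient `grad`, the bounds and the starting values are all made up for the example and are not the function from this post. The sketch does not reproduce the error; it only shows which components of the result carry the L-BFGS-B status.

```r
# Hypothetical objective: a smooth quadratic with its minimum at (1, 2)
fun <- function(p) sum((p - c(1, 2))^2)
# Analytic gradient; without `gr`, optim() falls back to finite differences
grad <- function(p) 2 * (p - c(1, 2))

res <- optim(par = c(0, 0), fn = fun, gr = grad, method = "L-BFGS-B",
             lower = c(-5, -5), upper = c(5, 5))

res$convergence   # 0 on success; 51 or 52 flag an L-BFGS-B warning or error
res$message       # the text passed on from the L-BFGS-B code
res$par           # the estimated minimiser
```

According to ?optim, a message such as the one quoted above arrives in the `message` component together with a non-zero `convergence` code, so checking both after each run is a reasonable first diagnostic.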

Thanks for your help!
mario


--
Ing. Mario Valle
Data Analysis and Visualization Group| 
http://www.cscs.ch/~mvalle

Swiss National Supercomputing Centre (CSCS)  | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82


Re: [R] optim: abnormal termination in lnsrch (resend)

2010-01-12 Thread Ravi Varadhan

You forgot a lot of details.  Can you send us more information about the
fn and also some minimal code that can reproduce the problem?

Ravi.


---

Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

 






[R] Placing eps files from R into Adobe InDesign documents: specifying fontfamily

2010-01-12 Thread dwalcerz


This is a solution I am posting for a problem that others may have.

If you want to:
1.  Place lattice graphics from R into an Adobe InDesign document, and
2.  Use the export as eps function in R to maximize resolution (it is much
better than exporting as a metafile or bitmap), and
3.  Use long strings of text in your titles or captions or to label your
axes.

Then you will have problems because:
1.  Adobe InDesign doesn't recognize R fontfamilies in eps files, and
2.  Adobe InDesign replaces the default R font, Helvetica, with a fontfamily
that looks like Courier, which significantly changes the physical length of
each character string and disrupts the spacing and justification of titles,
captions, and axis labels.

The way I solved this problem is:
1.  When you create a graph in R, use the Hershey family of fonts.  I
particularly like HersheySans.  You can specify the fontfamily for axes and
strips and titles as shown in the example below.
2.  Export the file in eps by right-clicking the on-screen display and
selecting 'save as postscript'.
3.  Place the file in InDesign using the usual command (ctrl-D) and
selecting the file.  You will still get the Adobe error about unrecognized
fonts, but the automatic replacement that Adobe uses for the Hershey
fontfamily is much better than the one they use for Helvetica, your spacing
and justification will be almost perfect, and you get the excellent
resolution and small size of vector-graphics.
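
As a scripted alternative to the right-click export in step 2, the following sketch writes an EPS file directly with the postscript() device. It is not the workflow described above, just one way to get a similar file; the output path, the plot size and the use of the built-in iris data are assumptions made for the example.

```r
library(lattice)

eps_file <- file.path(tempdir(), "example.eps")   # hypothetical output path

# EPS-style settings: one page, portrait, paper size taken from width/height
postscript(eps_file, onefile = FALSE, horizontal = FALSE,
           paper = "special", width = 6, height = 4)
p <- bwplot(Species ~ Sepal.Length, data = iris,
            scales = list(fontfamily = "HersheySans"),
            xlab = list("Sepal length", fontfamily = "HersheySans"))
print(p)          # lattice objects must be printed to be drawn
dev.off()
```

The resulting file can then be placed in InDesign in the same way as in step 3.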

In the following example the fontfamily is specified for strips, axes, and
titles with these lines:
par.strip.text=list(fontfamily="HersheySans") #for strips
scales=list(alternating=1, tck=c(1,0), fontfamily="HersheySans") #for axes
xlab=list("Combined Score", fontfamily="HersheySans") #for titles such as 'main', 'sub', 'xlab' and 'ylab'

Here is the example:
bwplot(school.name~score|assessment+course_code, data=temp2.stack,
    plot.points=FALSE, drop.unused.levels=TRUE,
    panel=function(..., box.ratio, varwidth) {
        panel.violin(..., col="cornsilk", varwidth=FALSE,
            box.ratio=box.ratio)
        panel.bwplot(..., box.ratio=0.1)
    },
    layout=c(2,3,1),
    par.strip.text=list(fontfamily="HersheySans"),
    scales=list(alternating=1, tck=c(1,0),
        fontfamily="HersheySans",
        x=list(relation="same", cex=0.7, rot=90),
        y=list(relation="same", cex=0.7, rot=0)),
    xlab=list("Combined Score", fontfamily="HersheySans"),
    ylab=list("School(State)(students)", fontfamily="HersheySans")
)

-- 
View this message in context: 
http://n4.nabble.com/Placing-eps-files-from-R-into-Adobe-InDesign-documents-specifying-fontfamily-tp1012186p1012186.html
Sent from the R help mailing list archive at Nabble.com.


[R] getting p values

2010-01-12 Thread Rosario Garcia Gil

Dear colleagues,

I need to get the p values for a table with 15000 entries of t values. Does any 
of you know how to do it? I can, of course, get one by one but that is not 
sensible.
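
To illustrate what "all at once" could look like: pt() is vectorised, so a whole vector of t values can be converted in one call. In the sketch below the t values and the degrees of freedom are invented placeholders (the real df depends on the test that produced the table), and a two-sided p value is assumed.

```r
# Placeholder values standing in for the 15000 t values
t_values <- c(-2.5, 0.3, 1.96, 4.1)
df <- 20                                  # assumption: must match the actual test

# Two-sided p values for the whole vector in one call
p_values <- 2 * pt(-abs(t_values), df = df)
p_values
```

If the degrees of freedom differ between entries, `df` can itself be a vector of the same length as `t_values`.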

Thanks
Rosario


[R] barplot: border color when stacked

2010-01-12 Thread RINNER Heinrich

Dear R-users,

I am using R version 2.10.1 under windows.

In a barplot, I want to mark one of the bars with a special border color.
For example:
barplot(c(3, 7, 11), border = c(NA, "red", NA))

But how to do this when the bars are stacked?
For example:
barplot(matrix(1:6, ncol=3)) # border of second bar (i.e. the one with total
# height = 7) should be red again. I try:
barplot(matrix(1:6, ncol=3), border = c(NA, "red", NA))

Obviously, this doesn't give me what I want.
Your advice would be appreciated;
kind regards
Heinrich.


Re: [R] barplot: border color when stacked

2010-01-12 Thread Henrique Dallazuanna

You can edit the barplot function to do this:

mybarplot <-
function (height, width = 1, space = NULL, names.arg = NULL,
    legend.text = NULL, beside = FALSE, horiz = FALSE, density = NULL,
    angle = 45, col = NULL, border = par("fg"), main = NULL,
    sub = NULL, xlab = NULL, ylab = NULL, xlim = NULL, ylim = NULL,
    xpd = TRUE, log = "", axes = TRUE, axisnames = TRUE,
    cex.axis = par("cex.axis"), cex.names = par("cex.axis"),
    inside = TRUE, plot = TRUE,
    axis.lty = 0, offset = 0, add = FALSE, args.legend = NULL,
    ...)
{
    if (!missing(inside))
        .NotYetUsed("inside", error = FALSE)
    if (is.null(space))
        space <- if (is.matrix(height) && beside)
            c(0, 1)
        else 0.2
    space <- space * mean(width)
    if (plot && axisnames && is.null(names.arg))
        names.arg <- if (is.matrix(height))
            colnames(height)
        else names(height)
    if (is.vector(height) || (is.array(height) && (length(dim(height)) ==
        1))) {
        height <- cbind(height)
        beside <- TRUE
        if (is.null(col))
            col <- "grey"
    }
    else if (is.matrix(height)) {
        if (is.null(col))
            col <- grey.colors(nrow(height))
    }
    else stop("'height' must be a vector or a matrix")
    if (is.logical(legend.text))
        legend.text <- if (legend.text && is.matrix(height))
            rownames(height)
    stopifnot(is.character(log))
    logx <- logy <- FALSE
    if (log != "") {
        logx <- length(grep("x", log)) > 0L
        logy <- length(grep("y", log)) > 0L
    }
    if ((logx || logy) && !is.null(density))
        stop("Cannot use shading lines in bars when log scale is used")
    NR <- nrow(height)
    NC <- ncol(height)
    if (beside) {
        if (length(space) == 2)
            space <- rep.int(c(space[2L], rep.int(space[1L],
                NR - 1)), NC)
        width <- rep(width, length.out = NR)
    }
    else {
        width <- rep(width, length.out = NC)
    }
    offset <- rep(as.vector(offset), length.out = length(width))
    delta <- width/2
    w.r <- cumsum(space + width)
    w.m <- w.r - delta
    w.l <- w.m - delta
    log.dat <- (logx && horiz) || (logy && !horiz)
    if (log.dat) {
        if (min(height + offset, na.rm = TRUE) <= 0)
            stop("log scale error: at least one 'height + offset' value <= 0")
        if (logx && !is.null(xlim) && min(xlim) <= 0)
            stop("log scale error: 'xlim' <= 0")
        if (logy && !is.null(ylim) && min(ylim) <= 0)
            stop("log scale error: 'ylim' <= 0")
        rectbase <- if (logy && !horiz && !is.null(ylim))
            ylim[1L]
        else if (logx && horiz && !is.null(xlim))
            xlim[1L]
        else 0.9 * min(height, na.rm = TRUE)
    }
    else rectbase <- 0
    if (!beside)
        height <- rbind(rectbase, apply(height, 2L, cumsum))
    rAdj <- offset + (if (log.dat)
        0.9 * height
    else -0.01 * height)
    delta <- width/2
    w.r <- cumsum(space + width)
    w.m <- w.r - delta
    w.l <- w.m - delta
    if (horiz) {
        if (is.null(xlim))
            xlim <- range(rAdj, height + offset, na.rm = TRUE)
        if (is.null(ylim))
            ylim <- c(min(w.l), max(w.r))
    }
    else {
        if (is.null(xlim))
            xlim <- c(min(w.l), max(w.r))
        if (is.null(ylim))
            ylim <- range(rAdj, height + offset, na.rm = TRUE)
    }
    if (beside)
        w.m <- matrix(w.m, ncol = NC)
    if (plot) {
        opar <- if (horiz)
            par(xaxs = "i", xpd = xpd)
        else par(yaxs = "i", xpd = xpd)
        on.exit(par(opar))
        if (!add) {
            plot.new()
            plot.window(xlim, ylim, log = log, ...)
        }
        xyrect <- function(x1, y1, x2, y2, horizontal = TRUE,
            ...) {
            if (horizontal)
                rect(x1, y1, x2, y2, ...)
            else rect(y1, x1, y2, x2, ...)
        }
        if (beside)
            xyrect(rectbase + offset, w.l, c(height) + offset,
                w.r, horizontal = horiz, angle = angle, density = density,
                col = col, border = border)
        else {
            for (i in 1L:NC) {
                xyrect(height[1L:NR, i] + offset[i], w.l[i],
                  height[-1, i] + offset[i], w.r[i], horizontal = horiz,
                  angle = angle, density = density, col = col,
                  border = border[ifelse(i > length(border), 1, i)])
                ## <-- Line edited
            }
        }
        if (axisnames && !is.null(names.arg)) {
            at.l <- if (length(names.arg) != length(w.m)) {
                if (length(names.arg) == NC)
                  colMeans(w.m)
                else stop("incorrect number of names")
            }
            else w.m
            axis(if (horiz)
                2
            else 1, at = at.l, labels = names.arg, lty = axis.lty,
                cex.axis = cex.names, ...)
        }
        if (!is.null(legend.text)) {
            legend.col <- rep(col, length.out = length(legend.text))
            if ((horiz & beside) || (!horiz & !beside)) {
                legend.text <-

Re: [R] apply a function down each column


Laetitia,

I was just responding to your comment that R complains
about a syntax error. But I realize now that 2x would
probably cause an unexpected symbol error.

Here's what I get when I run your loop; what do you get?

> for (x in 1:(nrow(dat)-1)) {
+  a <- as.character(dat[(2x-1),1])
Error: unexpected symbol in:
"for (x in 1:(nrow(dat)-1)) {
 a <- as.character(dat[(2x"
>  b <- as.character(dat[(2x),1])
Error: unexpected symbol in " b <- as.character(dat[(2x"
>  lettermatch(a,b)
Error in strsplit(a, "") : object 'a' not found
> }
Error: unexpected '}' in "}"


and here's what I get when I fix the obvious syntax
error:

> for (x in 1:(nrow(dat)-1)) {
+  a <- as.character(dat[(2*x-1),1])
+  b <- as.character(dat[(2*x),1])
+  lettermatch(a,b)
+ }
Error in fix.by(by.x, x) : 'by' must specify valid column(s)


That leaves two problems:
1) you're looking at the wrong column in dat[,1]; that
   should be dat[,2], etc.
2) that error message indicates that your index variable (x)
   gets to invalid values.

Try this:

for (x in 1:(nrow(dat)/2)) {
 a <- dat[(2*x-1),2]  # odd rows
 b <- dat[(2*x),2]    # even rows
 print(lettermatch(a,b))
}

You don't need the as.character() if you have character data.
Always do a str(dat) before you do any analysis.
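
For anyone who wants to try the pairing logic without the attached file, here is a self-contained sketch. The small data frame is made up (two C/M pairs, one sequence column), and mapply() is used in place of the explicit loop; lettermatch() is the function quoted further down in this thread.

```r
# lettermatch() as posted in this thread: counts letters shared by two strings
lettermatch <- function(a, b) {
  tb <- merge(as.data.frame(table(strsplit(a, ""))),
              as.data.frame(table(strsplit(b, ""))), by = "Var1")
  sum(apply(tb[-1], 1, min))
}

# Made-up data in the same layout: rows alternate C1, M1, C2, M2, ...
dat <- data.frame(
  Individuals = c("C1", "M1", "C2", "M2"),
  Seq1 = c("AATT", "AGGG", "CCGG", "CCGT"),
  stringsAsFactors = FALSE
)

odd  <- seq(1, nrow(dat), by = 2)   # C rows
even <- seq(2, nrow(dat), by = 2)   # M rows

# Compare each C row with the M row that follows it
scores <- mapply(lettermatch, dat$Seq1[odd], dat$Seq1[even],
                 USE.NAMES = FALSE)
scores
```

Note that lettermatch() is called once per pair here; passing whole vectors to it in a single call would not give one result per pair.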

 -Peter Ehlers

Laetitia Schmid wrote:

Dear Peter,
thank you for the suggestion.
Unfortunately the star did not help. Did it work for you? For me it seems 
incomplete somehow.
Laetitia


From: Peter Ehlers [ehl...@ucalgary.ca]
Sent: Tuesday, January 12, 2010 09:54 AM
To: Laetitia Schmid
Cc: Steve Lianoglou; r-help@r-project.org
Subject: Re: [R] apply a function down each column

See inline below.

Laetitia Schmid wrote:

Dear Steve,
my solution looks like it would work, but it does not.
I attached a text file with an extract of my data. Maybe you can try it
yourself. I want to compare C1 with M1, C2 with M2, C3 with M3,,, for
each column.
I do not really know what the problem is. R complains about a syntax error.
The function I am applying counts the common strings between the two.
Greg Hirson helped me to write it.

lettermatch <- function(a, b) {
   tb <- merge(as.data.frame(table(strsplit(a, ""))),
               as.data.frame(table(strsplit(b, ""))), by="Var1")
   sum(apply(tb[-1], 1, min))
}

For example for the second column I tried:

for (x in 1:(nrow(dat)-1)) {
a <- as.character(dat[(2x-1),1])


Shouldn't that be 2*x-1??

  -Peter Ehlers


b <- as.character(dat[(2x),1])
 lettermatch(a,b)
}

or

 a <- as.character(dat[seq(1, nrow(dat), by=2),2])
 b <- as.character(dat[seq(2, nrow(dat), by=2), 2])
 all.results <- lettermatch(a,b)

With dat <- read.delim("data_lgs.txt", stringsAsFactors=FALSE) I can
leave the as.character away in the formula above.

Laetitia

IndividualsSeq1Seq2Seq3Seq4
C1AATTCCGGCTTT
M1
C2AATTCCGGCTTT
M2AGGGAACTCCGGCGTT
C3AGGGAACTCCGGCGTT
M3AGGGAACTCCGGCGTT
C4AATTCCGGCCTT
M4AAATCGGGCTTT
C5AGGGACTTCCCGCTTT
M5AGGGCTTTCCTT
C6AGGGCTTTCCTT
M6AAAGCCTTCTTT
C7AAAGACCCCCCGGTTT
M7AAGGAACCCCGG
C8AATTCCGGCCTT
M8AATTCCGGCCTT
C9
M9
C11AGGGAAACCGGGGGTT
M11AATTCCGGCCTT



Am 11.01.2010 um 15:18 schrieb Steve Lianoglou:


Hi,

On Mon, Jan 11, 2010 at 8:41 AM, Laetitia Schmid laeti...@gmt.su.se
wrote:

Hello World,
I have a function that makes pairwise comparisons between two
strings. I would like to apply this function to my data (which
consists of columns with different strings) in the way that it
compares the first with the second entry, and then the third with the
fourth, and then the fifth with the sixth, and so on down each column...
So (2x-1) and (2x) would be the different entries to be compared!

dat= my data:

for the first column: compare dat[(2x-1),1] with dat[(2x),1] and x
would be 1:i, i=length(dat[,1])

I think the best way to do that is a loop:

a <- as.character(dat[(2x-1),1])
b <- as.character(dat[(2x),1])

for (i in 1:length(dat[,1]) my_function(a, b))

Can somebody help me to apply a function with a loop in the way I
want to a column?

It seems as if you got it already, don't you?

for (x in 1:(nrow(dat)-1)) {
 a <- dat[(2x-1),1]
 b <- dat[(2x), 1]
 my_function(a,b)
}


Is there a specification of tapply for that?

I don't think so, but depending on what you want to do, the size of
your data, and the amount of RAM you have, it might be faster to
compare everything at once (assuming `my_function` can be
vectorized), for instance:

a <- dat[seq(1, nrow(dat), by=2),1]
b <- dat[seq(2, nrow(dat), by=2), 1]
all.results <- my_function(a,b)

Also, as an aside, I see you keep calling as.character on your data
when you extract it from your data.frame.

Re: [R] sparseM and kronecker product_R latest version

2010-01-12 Thread alessia matano

I see, now I got it.

and thanks for the example with matrix.
best
alessia

2010/1/12 Martin Maechler maech...@stat.math.ethz.ch:
 am == alessia matano alexis@gmail.com
     on Mon, 11 Jan 2010 16:20:57 +0100 writes:

    am Many thanks for it.
    am However it is strange that when I put the numbers rather than ncol(R)
    am (a matrix with ncol=36698) it worked. Look below

     dim(res2)
    am [1] 170471  25822
     D <- as.matrix.csr(0,nrow(tmpb),25822)
     D <- as.matrix.csr(0,nrow(tmpb),ncol(res2))
    am Error in if (length(x) == nrow * ncol) x <- matrix(x, nrow, ncol) else { :
    am missing value where TRUE/FALSE needed
    am In addition: Warning message:
    am In nrow * ncol : NAs produced by integer overflow

    am But probably it is true what you said, anyway.

 yes, it is true.
 The clue is that  typeof(25822)   is double and not integer.
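
 A small sketch of that point, using only base R. The row count 113289 is taken from the example further down and is assumed here to stand in for nrow(tmpb); the variable names are illustrative.

```r
n_row     <- 113289L   # an integer, like the value returned by nrow()
n_col_int <- 25822L    # an integer, like the value returned by ncol()
n_col_dbl <- 25822     # a plain numeric literal is a double

typeof(n_col_int)      # "integer"
typeof(n_col_dbl)      # "double"

# integer * integer exceeds .Machine$integer.max and becomes NA (with a warning)
prod_int <- suppressWarnings(n_row * n_col_int)

# integer * double is computed in double precision, so no overflow
prod_dbl <- n_row * n_col_dbl

prod_int
prod_dbl
```

 Converting explicitly, e.g. as.numeric(ncol(res2)), would sidestep the integer product in the same way.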

    am So, do you suggest me to use directly the simple matrix command? or a
    am kind of sparse matrix within your package?!?

 the latter, e.g.,

  library(Matrix)
  D <- Matrix(0, nrow = 113289, ncol=36698)
  ## or
  D. <- sparseMatrix(x=double(0), i=integer(0), j=integer(0),
                     dims = c(113289,36698))
  identical(D, D.) ##-- TRUE

 ## and, e.g.,

 Dk <- kronecker(D, Diagonal(x=5:2))
 identical(Dk, D %x% Diagonal(x = 5:2))
 [1] TRUE
 dim(D)
 [1] 113289  36698
 dim(Dk)
 [1] 453156 146792


 Regards,
 Martin Maechler, ETH Zurich


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] optim: abnormal termination in lnsrch (resend)

2010-01-12 Thread Mario Valle


Attached a script that reproduces the problem.
My function is fold.val() and at the end seems the curve contained in 
lnsrch.dat is fitted quite well, but optim generates the error.


Thanks again!
mario
-

I'm using optim(param, fun, method='L-BFGS-B', lower=lo, upper=up) to 
minimize a certain function.

Often the minimization ends with the message:
ERROR: ABNORMAL_TERMINATION_IN_LNSRCH

What is optim() trying to say?
What have I to change in my function to make the minimization succeed?
Do you think using BBoptim() instead of optim() changes anything?

Thanks for your help!
mario


--
Ing. Mario Valle
Data Analysis and Visualization Group| 
http://www.cscs.ch/~mvalle

Swiss National Supercomputing Centre (CSCS)  | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82

x1 <- seq(0, 1, 0.1)

fold.val <- function(x, mu, vr, thresh, smooth) {

  a <- pnorm(x1, mean=thresh, sd=smooth)
  d <- dnorm(x1, mean=mu, sd=sqrt(vr))
  y <- d*(1-a)

  yr <- rev(d*a)
  xr <- rev(2*thresh-x1)

  yr1 <- approx(xr, yr, x1)$y

  yr1[is.na(yr1)] <- 0
  return(approx(x1, y+yr1, x)$y)
}

fold.err <- function(p, xx, yy) {

  r <- fold.val(xx, mu=p[1], vr=p[2], thresh=p[3], smooth=p[4])
  return(sum((r-yy)^2))
}


h <- read.table("lnsrch.dat", col.names=c('x', 'y'))

param <- c(.413, 0.00687, .5228, .01255)
up    <- c(.45,  0.0072,  .53,   .014)
lo    <- c(.4,   0.0060,  .52,   .011)

m <- optim(param, fold.err, xx=h$x, yy=h$y, method='L-BFGS-B', lower=lo,
           upper=up)

cat(m$message, "\n")

plot(h$x, h$y, type='l', xlab='Distance', ylab='Density')
lines(h$x, fold.val(h$x, mu=m$par[1], vr=m$par[2], thresh=m$par[3],
      smooth=m$par[4]), lty=3, lwd=2, col='orange')
0 0.000176362361606926
0.000586510263929619 0.000231378280914905
0.00117302052785924 0.000295589496143618
0.00175953079178886 0.000368478469738331
0.00234604105571848 0.000449107043434096
0.00293255131964809 0.00053621721990612
0.00351906158357771 0.000628406534146135
0.00410557184750733 0.000724352089678939
0.00469208211143695 0.000823019379524589
0.00527859237536657 0.000923769963807285
0.00586510263929619 0.0010263098797736
0.0064516129032258 0.00113046157071014
0.00703812316715543 0.00123574038696884
0.00762463343108504 0.00134128566231542
0.00821114369501466 0.00144585907333914
0.00879765395894428 0.00154814766557669
0.0093841642228739 0.00164720012094507
0.00997067448680352 0.00174279078421419
0.0105571847507331 0.00183552090932921
0.0111436950146628 0.00192657682698567
0.0117302052785924 0.00201721475915703
0.012316715542522 0.0021081579678769
0.0129032258064516 0.00219912372548760
0.0134897360703812 0.00228864332012703
0.0140762463343109 0.00237423713314793
0.0146627565982405 0.00245290831576602
0.0152492668621701 0.00252185164895379
0.0158357771260997 0.00257923683549348
0.0164222873900293 0.00262490019714924
0.0170087976539589 0.00266075875058930
0.0175953079178886 0.00269076521650816
0.0181818181818182 0.00272028739787151
0.0187683284457478 0.00275494219114877
0.0193548387096774 0.00279911977041479
0.0199413489736070 0.00285462086919127
0.0205278592375367 0.00291989826644678
0.0211143695014663 0.00299027032424890
0.0217008797653959 0.00305916799687892
0.0222873900293255 0.0031200913977286
0.0228739002932551 0.00316865054306491
0.0234604105571848 0.00320399165284089
0.0240469208211144 0.00322911512411607
0.024633431085044 0.00325009141115046
0.0252199413489736 0.00327448251393837
0.0258064516129032 0.00330863483141881
0.0263929618768328 0.00335664878021895
0.0269794721407625 0.00341998836253680
0.0275659824046921 0.00349792087037627
0.0281524926686217 0.00358846480856140
0.0287390029325513 0.00368937637895126
0.0293255131964809 0.00379878944173669
0.0299120234604106 0.00391534984847043
0.0304985337243402 0.00403793448762949
0.0310850439882698 0.00416520590831426
0.0316715542521994 0.00429527045546386
0.032258064516129 0.00442560376781198
0.0328445747800587 0.00455325817307603
0.0334310850439883 0.00467525233059903
0.0340175953079179 0.00478900552079331
0.0346041055718475 0.00489270623934628
0.0351906158357771 0.0049809321012
0.0357771260997067 0.00506785770230776
0.0363636363636364 0.00514095495458048
0.036950146627566 0.00520698251869515
0.0375366568914956 0.00526847101068982
0.0381231671554252 0.00532783817372676
0.0387096774193548 0.00538686868362264
0.0392961876832845 0.00544630067045661
0.0398826979472141 0.00550562536463334
0.0404692082111437 0.00556316103457661
0.0410557184750733 0.00561640167932586
0.0416422287390029 0.00566258222546389
0.0422287390029326 0.00569935446480522
0.0428152492668622 0.00572527363548914
0.0434017595307918 0.00574066288496805
0.0439882697947214 0.00574780398040942
0.044574780058651 0.00575013267506571
0.0451612903225806 0.00575179443458933
0.0457478005865103 0.00575684258138564
0.0463343108504399 0.00576851671527049
0.0469208211143695

[R] how to handle missing values . when importing data in R

2010-01-12 Thread karena


hi, I have a question about importing data in R.

I want to import a file which has missing values in it, and the missing
values are denoted as ".". I want to first read in the file, and then change
the "." into the number zero (0).

how can I do that?

thank you,

karena

Re: [R] Functions for QUAIDS and nonlinear SUR?

2010-01-12 Thread Arne Henningsen

On Sat, Jan 9, 2010 at 1:21 AM, Werner W. pensterfuz...@yahoo.de wrote:
 I would like to estimate a quadratic almost ideal demand system in R which is
 estimated usually by nonlinear seemingly unrelated regression. But there is no
 such function in R yet

The systemfit package has the function nlsystemfit() for estimating
systems of non-linear equations, e.g. by non-linear SUR. However, in
contrast to the systemfit() function for estimating systems of linear
equations, the function nlsystemfit() is still under development and
has convergence problems rather often. So, I cannot recommend using
nlsystemfit() for an important analysis. :-(

 but it is readily available in STATA (nlsur), see B. Poi (2008): Demand-system
 estimation: Update, Stata Journal 8(4).
 Now I am thinking, what is quicker learning to program STATA which seems
 not really comfortable for programming or implement the method in R which
 might be above my head in terms of econometrics.

You do not have to start from scratch but you could improve the
nlsystemfit() function, e.g. by implementing analytical gradients of
the objective function -- and I could assist you with this. If you are
interested in improving nlsystemfit(), please apply at R-Forge [1] for
getting write access to systemfit's SVN repository.

[1] http://r-forge.r-project.org/projects/systemfit/

/Arne

-- 
Arne Henningsen
http://www.arne-henningsen.name


Re: [R] how to handle missing values . when importing data in R

2010-01-12 Thread jim holtman

?read.table

na.strings='.'

Then change all NAs to zero: df$col[is.na(df$col)] <- 0
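
Spelled out for a new user, the two steps might look like the sketch below. The inline text, the data frame name `df` and the column names are made up for illustration; with a real file, the first argument of read.table() would be the file name instead of `text =`.

```r
# Hypothetical file contents, with "." marking missing values
txt <- "id value score
1 2.5 .
2 . 7
3 4.1 9"

# Step 1: read the data, telling R that "." means missing (NA)
df <- read.table(text = txt, header = TRUE, na.strings = ".")
n_missing <- sum(is.na(df))   # how many values were missing

# Step 2: replace the NAs in one column by zero
df$value[is.na(df$value)] <- 0

# The same pattern works for any other column
df$score[is.na(df$score)] <- 0
df
```

Note that a field is treated as missing only when it consists of "." alone, so numbers such as 2.5 are read normally.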


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] getting p values

2010-01-12 Thread Duncan Murdoch


On 12/01/2010 10:47 AM, Rosario Garcia Gil wrote:

Dear colleagues

I need to get the p values for a table with 15000 entries of t values. Does any 
of you know how to do it? I can, of course, get one by one but that is not 
sensible.

  
Put the t values into a vector, then use pt() in an appropriate way to 
calculate them all at once.  (An appropriate way depends on details 
like whether you want one or two tailed value, degrees of freedom, etc.)
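
As a rough sketch of what "all at once" could look like: the t values and the 20 degrees of freedom below are invented for illustration, and a real analysis would substitute its own (the `df` argument may also be a vector with one entry per t value).

```r
# Hypothetical t statistics; in practice this would be the column of 15000 values
t_values <- c(-2.5, 0.3, 1.96, 4.1)
deg_free <- 20   # assumed degrees of freedom

# Two-tailed p-values: twice the upper-tail area beyond |t|
p_two_sided <- 2 * pt(abs(t_values), df = deg_free, lower.tail = FALSE)

# One-tailed (upper) p-values: P(T > t)
p_upper <- pt(t_values, df = deg_free, lower.tail = FALSE)

round(cbind(t = t_values, p_two_sided, p_upper), 4)
```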


Duncan Murdoch


[R] Calculate the percentages of the numbers in every column.

2010-01-12 Thread Kelvin

Dear friends,

I have a table like this, I have A B C D ... levels, the first column
you see is just the index, and there are different numbers in the
table.

     A   B   C   D  ...
1    0   2   1   0
2    1   0   2   1
3    2   3   0   0
4    0   0   1   0
5    0   2   3   1
...

I want to calculate the frequencies or the percentages of the numbers
in every column.

How do I get a table like this, the first column is the levels of
numbers, and the numbers inside the table are the percentages. All the
percentages should add up to 1 in every column.

       A     B     C     D   ...
  0  0.2   0.3   0.1   0.1
  1  0.1   0.1   0.2   0.1
  2  0.1   0.2   0.2   0.2
  3  0.2   0.1   0.1   0
  ...

Thanks for your help!

Kelvin
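
One possible way to build such a table, offered only as a sketch: the data frame `dat` below reproduces the five example rows shown above, and the names `lev` and `props` are illustrative.

```r
# The five example rows from the question
dat <- data.frame(A = c(0, 1, 2, 0, 0),
                  B = c(2, 0, 3, 0, 2),
                  C = c(1, 2, 0, 1, 3),
                  D = c(0, 1, 0, 0, 1))

# All distinct values occurring anywhere in the table
lev <- sort(unique(unlist(dat)))

# For each column: count each level (including levels with zero count),
# then convert the counts to proportions
props <- sapply(dat, function(col) prop.table(table(factor(col, levels = lev))))

props            # rows are the values 0..3, columns are A..D
colSums(props)   # every column sums to 1
```

Using factor(col, levels = lev) keeps a row for values that never occur in a given column, so all columns have the same length and sapply() can return a matrix.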


Re: [R] how to handle missing values . when importing data in R

2010-01-12 Thread karena

Hi Jim,

thank you very much for the reply, but I am really a new user. How to change
all NAs to zero?

thanks again.

karena


[R] optimization challenge

I have a challenge that I want to share with the group.

This is not homework (but I may assign it as such if I teach the appropriate 
class again) and I have found one solution, so don't need anything urgent.  
This is more for fun to see if others can find a better solution than I did.

The challenge:

I want to read a book in a given number of days.  I want to read an integer 
number of chapters each day (there are more chapters than days), no stopping 
part way through a chapter, and at least 1 chapter each day.  The chapters are 
very non uniform in length (some very short, a few very long, many in between) 
so I would like to come up with a reading schedule that minimizes the variance 
of the length of the days readings (read multiple short chapters on the same 
day, long chapters are the only one read that day).  I also want to read 
through the book in order (no skipping ahead to combine short chapters that are 
not naturally next to each other).

My thought was that the optim function with method="SANN" would be an 
appropriate approach, but my first couple of tries did not give very good 
results.  I have since come up with an optim with SANN solution that gives what 
I consider good results (but I accept that better is possible).
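
To make the objective concrete, here is a minimal sketch on toy data. It is not the optim/SANN solution mentioned above: it swaps in a plain dynamic program over contiguous splits, which is one alternative way to attack the same problem. The chapter lengths, the function name split_chapters and all variable names are hypothetical. Because the total length and the number of days are fixed, minimizing the sum of squared daily totals is the same as minimizing their variance.

```r
# Hypothetical chapter lengths (not the bom3 data) and number of reading days
words  <- c(900, 200, 1000, 1200, 250, 950, 1300)
n_days <- 4

split_chapters <- function(len, k) {
  n <- length(len)
  stopifnot(k >= 1, k <= n)
  cs <- c(0, cumsum(len))            # cs[j + 1] = total length of chapters 1..j
  # cost[d, j]: smallest sum of squared daily totals for chapters 1..j in d days
  cost <- matrix(Inf, k, n)
  prev <- matrix(NA_integer_, k, n)  # last chapter of day d - 1 in the best split
  cost[1, ] <- cs[-1]^2
  for (d in seq_len(k)[-1]) {
    for (j in d:n) {
      for (i in (d - 1):(j - 1)) {   # day d covers chapters (i + 1)..j
        cand <- cost[d - 1, i] + (cs[j + 1] - cs[i + 1])^2
        if (cand < cost[d, j]) {
          cost[d, j] <- cand
          prev[d, j] <- i
        }
      }
    }
  }
  # Walk back through prev to recover the last chapter read on each day
  ends <- integer(k)
  j <- n
  for (d in k:1) {
    ends[d] <- j
    if (d > 1) j <- prev[d, j]
  }
  ends
}

ends <- split_chapters(words, n_days)
day_totals <- diff(c(0, cumsum(words)[ends]))
ends         # index of the last chapter read on each day
day_totals   # amount read on each day
var(day_totals)
```

The triple loop does on the order of days x chapters^2 steps, so 128 days and 239 chapters should be well within reach.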

Below is a data frame with the lengths of the chapters for the book that 
originally sparked the challenge for me (but the general idea should work for 
any book).  Each row represents a chapter (in order) with 3 different measures 
of the length of the chapter.

For this challenge I want to read the book in 128 days (there are 239 chapters).

I will post my solutions in a few days, but I want to wait so that my direction 
does not influence people from trying other approaches (if there is something 
better than optim, that is fine).

Good luck for anyone interested in the challenge,

The data frame:

bom3 <- structure(list(Chapter = structure(1:239, .Label = c("1 Nephi 1", 
"1 Nephi 2", "1 Nephi 3", "1 Nephi 4", "1 Nephi 5", "1 Nephi 6", 
"1 Nephi 7", "1 Nephi 8", "1 Nephi 9", "1 Nephi 10", "1 Nephi 11", 
"1 Nephi 12", "1 Nephi 13", "1 Nephi 14", "1 Nephi 15", "1 Nephi 16", 
"1 Nephi 17", "1 Nephi 18", "1 Nephi 19", "1 Nephi 20", "1 Nephi 21", 
"1 Nephi 22", "2 Nephi 1", "2 Nephi 2", "2 Nephi 3", "2 Nephi 4", 
"2 Nephi 5", "2 Nephi 6", "2 Nephi 7", "2 Nephi 8", "2 Nephi 9", 
"2 Nephi 10", "2 Nephi 11", "2 Nephi 12", "2 Nephi 13", "2 Nephi 14", 
"2 Nephi 15", "2 Nephi 16", "2 Nephi 17", "2 Nephi 18", "2 Nephi 19", 
"2 Nephi 20", "2 Nephi 21", "2 Nephi 22", "2 Nephi 23", "2 Nephi 24", 
"2 Nephi 25", "2 Nephi 26", "2 Nephi 27", "2 Nephi 28", "2 Nephi 29", 
"2 Nephi 30", "2 Nephi 31", "2 Nephi 32", "2 Nephi 33", "Jacob 1", 
"Jacob 2", "Jacob 3", "Jacob 4", "Jacob 5", "Jacob 6", "Jacob 7", 
"Enos 1", "Jarom 1", "Omni 1", "Words of Mormon 1", "Mosiah 1", 
"Mosiah 2", "Mosiah 3", "Mosiah 4", "Mosiah 5", "Mosiah 6", "Mosiah 7", 
"Mosiah 8", "Mosiah 9", "Mosiah 10", "Mosiah 11", "Mosiah 12", 
"Mosiah 13", "Mosiah 14", "Mosiah 15", "Mosiah 16", "Mosiah 17", 
"Mosiah 18", "Mosiah 19", "Mosiah 20", "Mosiah 21", "Mosiah 22", 
"Mosiah 23", "Mosiah 24", "Mosiah 25", "Mosiah 26", "Mosiah 27", 
"Mosiah 28", "Mosiah 29", "Alma 1", "Alma 2", "Alma 3", "Alma 4", 
"Alma 5", "Alma 6", "Alma 7", "Alma 8", "Alma 9", "Alma 10", 
"Alma 11", "Alma 12", "Alma 13", "Alma 14", "Alma 15", "Alma 16", 
"Alma 17", "Alma 18", "Alma 19", "Alma 20", "Alma 21", "Alma 22", 
"Alma 23", "Alma 24", "Alma 25", "Alma 26", "Alma 27", "Alma 28", 
"Alma 29", "Alma 30", "Alma 31", "Alma 32", "Alma 33", "Alma 34", 
"Alma 35", "Alma 36", "Alma 37", "Alma 38", "Alma 39", "Alma 40", 
"Alma 41", "Alma 42", "Alma 43", "Alma 44", "Alma 45", "Alma 46", 
"Alma 47", "Alma 48", "Alma 49", "Alma 50", "Alma 51", "Alma 52", 
"Alma 53", "Alma 54", "Alma 55", "Alma 56", "Alma 57", "Alma 58", 
"Alma 59", "Alma 60", "Alma 61", "Alma 62", "Alma 63", "Helaman 1", 
"Helaman 2", "Helaman 3", "Helaman 4", "Helaman 5", "Helaman 6", 
"Helaman 7", "Helaman 8", "Helaman 9", "Helaman 10", "Helaman 11", 
"Helaman 12", "Helaman 13", "Helaman 14", "Helaman 15", "Helaman 16", 
"3 Nephi 1", "3 Nephi 2", "3 Nephi 3", "3 Nephi 4", "3 Nephi 5", 
"3 Nephi 6", "3 Nephi 7", "3 Nephi 8", "3 Nephi 9", "3 Nephi 10", 
"3 Nephi 11", "3 Nephi 12", "3 Nephi 13", "3 Nephi 14", "3 Nephi 15", 
"3 Nephi 16", "3 Nephi 17", "3 Nephi 18", "3 Nephi 19", "3 Nephi 20", 
"3 Nephi 21", "3 Nephi 22", "3 Nephi 23", "3 Nephi 24", "3 Nephi 25", 
"3 Nephi 26", "3 Nephi 27", "3 Nephi 28", "3 Nephi 29", "3 Nephi 30", 
"4 Nephi 1", "Mormon 1", "Mormon 2", "Mormon 3", "Mormon 4", 
"Mormon 5", "Mormon 6", "Mormon 7", "Mormon 8", "Mormon 9", "Ether 1", 
"Ether 2", "Ether 3", "Ether 4", "Ether 5", "Ether 6", "Ether 7", 
"Ether 8", "Ether 9", "Ether 10", "Ether 11", "Ether 12", "Ether 13", 
"Ether 14", "Ether 15", "Moroni 1", "Moroni 2", "Moroni 3", "Moroni 4", 
"Moroni 5", "Moroni 6", "Moroni 7", "Moroni 8", "Moroni 9", "Moroni 10"
), class = "factor"), Words = c(908L, 879L, 1067L, 1262L, 761L, 
202L, 992L, 1221L, 259L, 924L, 1315L, 860L, 1899L, 1284L, 1488L, 
1618L, 2523L, 1217L, 1292L, 698L, 945L, 1506L, 1543L, 1460L, 
1170L, 1300L, 1169L, 895L, 405L, 812L, 2388L, 966L, 338L, 647L, 
587L, 203L, 857L, 370L, 687L, 570L, 587L, 928L, 520L, 134L, 587L, 
891L, 1699L, 1483L, 1461L, 1240L, 804L, 708L, 988L, 426L, 647L, 
719L, 1365L, 619L, 929L, 3758L, 511L, 1242L, 1160L, 734L, 1398L, 
857L, 966L, 2112L, 1117L, 1605L, 740L, 309L, 1555L, 938L, 864L, 
957L, 1271L,

[R] Non-metric multidimensional scaling (NMDS) help

2010-01-12 Thread kellys17


Hi,

 I am currently working on some data and feel that NMDS would return an
excellent result. With my current data set however I have been experiencing
some problems and cannot carry out metaMDS. I have tried with a few smaller
data sets which I created for practice sake and this has worked fine.

 I think it is the set up of my data set that is causing me trouble. I have
18 columns and 18 rows, as needed for the n x n matrix. However, within the
data set I have a lot of zeros, i.e. more than just the zeros where column B
meets row B. Do I need to get rid of these excess zeros in order for metaMDS
to work?

 Any help is much appreciated,

 Seán Kelly.
 

Re: [R] svm

2010-01-12 Thread Amy Hessen


Hi Steve,

 

Thank you so much for your reply. I really needed to know how SVM works without 
removing the class label while receiving it in the formula parameter. It does 
not if I remove the class label.
 
 
Cheers,
Amy
 

 Date: Sat, 9 Jan 2010 15:48:49 -0500
 Subject: Re: [R] svm
 From: mailinglist.honey...@gmail.com
 To: amy_4_5...@hotmail.com
 CC: r-help@r-project.org
 
 Hi,
 
 On Fri, Jan 8, 2010 at 11:57 AM, Amy Hessen amy_4_5...@hotmail.com wrote:
  Hi Steve,
 
  Thank you very much for your reply. Your code is more readable and obvious 
  than mine
 
 No Problem.
 
  Could you please help me in these questions?:
 
  1) Formula is an alternative to y parameter in SVM. is it correct?
 
 No, that's not correct.
 
 There are two svm functions, one that takes a formula object
 (svm.formula), and one that takes an x matrix, and a y vector
 (svm.default). The svm.formula function is called when the first
 argument in your svm(..) call is a formula object. This function
 simply parses the formula and manipulates your data object into an x
 matrix and y vector, then calls the svm.default function with those
 params ... I usually prefer to just skip the formula and provide the x
 and y objects directly.
 
 Load the e1071 library and look at the source code:
 
 R> library(e1071)
 R> e1071:::svm.formula
 
 You'll see what I mean.
 
  2) I forgot to remove the class label from the dataset besides I gave the
  program the class label in formula parameter but the program works! Could
  you please clarify this point to me?
 
 The author of the e1071 package did you a favor. The predict.svm
 function checks to see if your svm object was built using the formula
 interface. If so, it looks for your label column in the data you are
 trying to predict on and ignores it.
 
 Look at the function's source code (eg, type e1071:::predict.svm at
 the R prompt), and look for the call to the delete.response function
 ... you can also look at the help in ?delete.response.
 
 -steve
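
 What delete.response does can be seen with base R alone. The sketch below uses a made-up data frame and only illustrates the mechanism; it is not a copy of the code inside predict.svm.

```r
# Hypothetical data: a class label plus two predictors
d <- data.frame(label = factor(c("a", "b", "a")),
                x1 = c(1, 2, 3),
                x2 = c(4, 5, 6))

# Terms object for the formula "label ~ ." (the dot expands to x1 + x2)
tt <- terms(label ~ ., data = d)

# Drop the response so that only the predictors remain
tt_noresp <- delete.response(tt)

# The design matrix built from these terms ignores the label column,
# even though it is still present in the data
mm <- model.matrix(tt_noresp, data = d)
colnames(mm)
```

 This is why a data set that still contains the label column can be handed to predict() when the model was fitted through the formula interface.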
 
 
  Date: Wed, 6 Jan 2010 18:44:13 -0500
  Subject: Re: [R] svm
  From: mailinglist.honey...@gmail.com
  To: amy_4_5...@hotmail.com
  CC: r-help@r-project.org
 
  Hi Amy,
 
  On Wed, Jan 6, 2010 at 4:33 PM, Amy Hessen amy_4_5...@hotmail.com wrote:
   Hi Steve,
  
   Thank you very much for your reply.
  
   I'm trying to do something systematic/general in the program so that I
   can try different datasets without changing much in the program (without
   knowing the name of the class label, which differs from one dataset to
   another).
  
   Could you please tell me your opinion about this code:
  
   library(e1071)
  
   mydata <- read.delim("the_whole_dataset.txt")
  
   class_label <- names(mydata)[1]  # I'll always put the class label in the first column.
  
   myformula <- formula(paste(class_label, "~ ."))
  
   x <- subset(mydata, select = - mydata[, 1])
  
   mymodel <- (svm(myformula, x, cross=3))
  
   summary(model)
  
   
 
  Since you're not doing anything funky with the formula, a preference
  of mine is to just skip this way of calling SVM and go straight to
  the svm(x,y,...) method:
 
  R> mydata <- as.matrix(read.delim("the_whole_dataset.txt"))
  R> train.x <- mydata[,-1]
  R> train.y <- mydata[,1]
 
  R> mymodel <- svm(train.x, train.y, cross=3, type="C-classification")
  ## or
  R> mymodel <- svm(train.x, train.y, cross=3, type="eps-regression")
 
  As an aside, I also like to be explicit about the type= parameter to
  tell what I want my SVM to do (regression or classification). If it's
  not specified, the SVM picks which one to do based on whether or not
  your y vector is a vector of factors (does classification), or not
  (does regression)
 
   Do I have to do the same steps with the testing set? I.e., the testing
   set must not contain the label either, but must have the same structure
   as the training set? Is that correct?
 
  I guess you'll want to report your accuracy/MSE/something on your
  model for your testing set? Just load the data in the same way, then
  use `predict` to calculate the metric you're after. You'll have to have
  the labels for your data to do that, though, e.g.:
 
  testdata <- as.matrix(read.delim('testdata.txt'))
  test.x <- testdata[,-1]
  test.y <- testdata[,1]
  preds <- predict(mymodel, test.x)
 
  Let's assume you're doing classification, so let's report the accuracy:
 
  acc <- sum(preds == test.y) / length(test.y)
 
  Does that help?
  -steve
 
  --
  Steve Lianoglou
  Graduate Student: Computational Systems Biology
  | Memorial Sloan-Kettering Cancer Center
  | Weill Medical College of Cornell University
  Contact Info: http://cbio.mskcc.org/~lianos/contact
 
  
 
 
 
 -- 
 Steve Lianoglou
 Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
 Contact Info: http://cbio.mskcc.org/~lianos/contact

Re: [R] how to handle missing values . when importing data in

2010-01-12 Thread Ted Harding

On 12-Jan-10 17:46:47, karena wrote:
 hi, I have a question about importing data in R.
 
 I want to import a file which has missing value in it, and the missing
 values are denoted as ., I want to first read in the file, and then
 change the . into the number zero 0.
 
 how can I do that?
 
 thank you,
 
 karena

It may depend on what format the file is in, but if it is a tabular
text file or a CSV file then you can use the na.strings parameter.
Here is an example of a little CSV file with "." used for missing:

file temp.csv:
--
A,B,C,D
1.1,1.2,1.3,1.4
2.1,2.2,.,2.4
3.1,.,3.3,3.4
4.1,.,.,4.4

  D <- read.csv("temp.csv", na.strings=".")
  D
  #     A   B   C   D
  # 1 1.1 1.2 1.3 1.4
  # 2 2.1 2.2  NA 2.4
  # 3 3.1  NA 3.3 3.4
  # 4 4.1  NA  NA 4.4

So the "." entries have gone in as NA (the right thing to do in the first
instance with missing data). Now you can replace these by zeros:

  D[is.na(D)] <- 0
  D
  #     A   B   C   D
  # 1 1.1 1.2 1.3 1.4
  # 2 2.1 2.2 0.0 2.4
  # 3 3.1 0.0 3.3 3.4
  # 4 4.1 0.0 0.0 4.4

Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 12-Jan-10   Time: 18:42:40
-- XFMail --


Re: [R] optim: abnormal termination in lnsrch (resend)

2010-01-12 Thread Ravi Varadhan

Mario,

It seems likely that your function is not smooth in the parameters.  This
may create problems for some optimizers that require smoothness.  However, I
was able to get good convergence with `spg' function in my BB package.  

Here is how it works:

> require(BB)
Loading required package: BB
Loading required package: numDeriv

> m <- spg(param, fold.err, xx=h$x, yy=h$y, lower=lo, upper=up)
iter:  0  f-value:  7.597257  pgrad:  0.037 
iter:  10  f-value:  7.551395  pgrad:  0.03674868 
iter:  20  f-value:  7.5513  pgrad:  0.02421642 
iter:  30  f-value:  7.551299  pgrad:  1.865619e-05 
> m
$par
          [,1]        [,2]     [,3]       [,4]
[1,] 0.4132586 0.006864837 0.522723 0.01279469

$value
[1] 7.551299

$gradient
[1] 3.019807e-08

$fn.reduction
[1] 0.04595781

$iter
[1] 34

$feval
[1] 44

$convergence
[1] 0

$message
[1] "Successful convergence"
 
It is also interesting that `spg' converges well from a random, infeasible
starting point.

> set.seed(123)
> m <- spg(runif(4), fold.err, xx=h$x, yy=h$y, lower=lo, upper=up)
iter:  0  f-value:  102.6793  pgrad:  0.05 
iter:  10  f-value:  7.552826  pgrad:  0.01328252 
iter:  20  f-value:  7.551299  pgrad:  0.03674152 
iter:  30  f-value:  7.551299  pgrad:  0.003764237 


Hope this helps,
Ravi.


---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html


Re: [R] Conditional Sampling

The last 2 lines of your code can be replaced with:

M <- replicate(1000, sample(Y,5,replace=FALSE) )
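
A self-contained sketch of the whole procedure follows. One detail worth noting as an observation about replicate() rather than anything stated in the thread: replicate() returns one sample per column (a 5 x 1000 matrix), whereas the rbind() loop builds one sample per row, so t() is applied here to get the same 1000 x 5 layout. The seed is arbitrary.

```r
set.seed(1)                                 # arbitrary seed, for reproducibility only

X <- 1:100                                  # or any other 100 values
n <- sample.int(100, 10, replace = FALSE)   # positions of the first 10 draws
x <- X[n]                                   # the initial sample
Y <- X[-n]                                  # the 90 values that remain

# 1000 conditional samples of size 5 from the remaining values;
# t() puts one sample in each row, as in the rbind() loop
M <- t(replicate(1000, sample(Y, 5, replace = FALSE)))

dim(M)
head(M, 3)
```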



-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Ted Harding
 Sent: Tuesday, January 12, 2010 7:34 AM
 To: r-help@r-project.org
 Cc: ehcpieterse
 Subject: Re: [R] Conditional Sampling
 
 On 12-Jan-10 14:00:24, ehcpieterse wrote:
  Thanks Ted, your solution does make perfect sense.
 
  The only question I still have is that I would like to sample
  the remaining 5 observations after I have randomly selected the
  first 10. Given the initial 10, I would like to sample the
  following 5 say 1,000 times to get a simulated conditional sample,
  if that makes any sense.
 
  I want to build this into an iterative process to see how the
  first sample affects the resulting samples. Even though all the
  observations have the same probabilty to get sampled, they each
  have a different expected value.
  --
 
 OK, if I now understand you, you are interested in the properties
 of the remaining (90) observations, given that they do not include
 any of the (10) cases sampled in the first round.
 
 In that case, I think you should adopt the sample.int() approach
 I also suggested:
 
   X <- (1:100) ## (or any other 100 values)
   n <- sample.int(100,10,replace=FALSE) ## returns subset of (1:100)
   x <- X[n]
   Y <- X[-n]   ## The set remaining after the first 10 were taken
   ## Now you can sample repeatedly from Y until your eyes fall out.
   ## So build up a matrix of (say) 1000 samples from Y:
   M <- sample(Y,5,replace=FALSE)
   for(i in (2:1000)){ M <- rbind(M,sample(Y,5,replace=FALSE)) }
 
 The repeated samples M of 5 from Y of course imply replacing each
 sample of 5 back in Y, so they are available at each turn. You can
 not, of course, sample 1000*5 from 100 without replacement! (Each
 sample of 5 is obtained without replacement, however).
 
 I hope this is getting close!
 Ted.
 
 
 E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
 Fax-to-email: +44 (0)870 094 0861
 Date: 12-Jan-10   Time: 14:34:13
 -- XFMail --
 

[R] post-hoc after ancova

2010-01-12 Thread Mahua Ghara

I have done ancova with categorical and continuous predictor variables.
The categorical predictor variable shows significant effect on the dependent
variable.
I would like to do a post-hoc test to see which groups in the categorical
variable differ.

I have explored Tukey test in multcomp package. My study is similar to the
litter data. In the code it's mentioned that the contrast matrix also has
some trends like otrend, atrend and ltrend.

otrend = c(-1.5, -0.5, 0.5, 1.5),
atrend = doselev - mean(doselev),
ltrend = log(1:4) - mean(log(1:4)))

Here are my questions:

Are these trends absolutely essential for conducting the Tukey test?
If yes, how can I set these trends?


thanks,
mahua


Re: [R] optim: abnormal termination in lnsrch (resend)

2010-01-12 Thread Berend Hasselman



Mario Valle wrote:
 
 [sorry, forgot some details...]
 
 I'm using optim(param, fun, method='L-BFGS-B', lower=lo, upper=up) to 
 minimize a certain function.
 Often the minimization ends with the message:
 ERROR: ABNORMAL_TERMINATION_IN_LNSRCH
 
 What is optim() trying to say?
 What have I to change in my function to make the minimization succeed?
 Do you think using BBoptim() instead of optim() changes anything?
 

We need more information.
You can also pass the control argument to optim() to trace the
optimization. Use
optim(param, fun, method='L-BFGS-B', lower=lo, upper=up,
      control=list(trace=6))
to get more information.
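
As a rough sketch of what such a traced call looks like end to end, the example below uses a made-up quadratic objective in place of the poster's function (the objective, bounds and starting values are all assumptions). Besides the trace output, the returned list carries a convergence code and a message that are worth inspecting when the line search fails.

```r
# Hypothetical objective, standing in for the poster's 'fun'
fun <- function(p) sum((p - c(1, 2))^2)

res <- optim(c(0, 0), fun, method = "L-BFGS-B",
             lower = c(-5, -5), upper = c(5, 5),
             control = list(trace = 6))

res$convergence   # 0 indicates successful completion
res$message       # text returned by the optimizer (where an LNSRCH error would appear)
res$par           # parameter values at termination
```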

Berend

[R] Making routine faster by using apply instead of for-loop

2010-01-12 Thread Etienne Stockhausen


Hey everybody,

I have a small problem with a routine, which prepares some data for
plotting.
I've made a small example:

   c=10
   mat=data.frame(matrix(1:(c*c),c,c))
   row.names(mat)=seq(c,1,length=c)
   names(mat)=c(seq(2,c,length=c/2),seq(c,2,length=c/2))
   v=as.numeric(row.names(mat))
   w=as.numeric(names(mat))
   for(i in 1:c)
   { for(j in 1:c)
   {
   if(v[j]+w[i]<=c)(mat[i,j]=NA)
   }}

This produces exactly the data I need to go on, but if I increase the
constant c to, for instance, 500, it takes a very long time to set the NAs.
I've heard there is a much faster way to set the NAs using the command
apply(), but I don't know how.
I'm looking forward to any ideas or hints that might help me.

Best regards

Etienne


Re: [R] how to handle missing values . when importing data in R

2010-01-12 Thread jim holtman

What is the structure of the data that you are reading in? Are you using
'read.table', 'scan', etc.? Are all the columns numeric, or do you just
want to change some of them? If you have used 'na.strings' to cause the
values of the missing data to be set to NA, then you can iterate through the
appropriate columns to change NAs to zero, but this depends on your
structure. For example if you know the names of the columns, you could do:

for (i in c('col1', 'col3', 'col8')) df[[i]][is.na(df[[i]])] <- 0
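
For a fully numeric table, the two steps (read '.' as NA, then replace NA with zero) can be sketched as follows. The inline text and column names are invented for illustration; with a real file you would pass its path to read.table() instead of the text argument.

```r
# Made-up data in which '.' marks a missing value
txt <- "col1 col2 col3
1 . 3
. 5 6
7 8 ."

df <- read.table(text = txt, header = TRUE, na.strings = ".")
df[is.na(df)] <- 0    # replace every NA in the data frame at once
df
```

Replacing all NAs in one assignment only makes sense when every column is numeric; otherwise the column-by-column loop above is the safer choice.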

On Tue, Jan 12, 2010 at 1:06 PM, karena dr.jz...@gmail.com wrote:

Hi, tim,

thank you very much for the reply, but I am really a new user. How to
change
all NAs to zero?

thanks again.

karena

jholtman wrote:

?read.table

na.strings='.'

Then change all NAs to zero: df$col[is.na(df$col)] <- 0

On Tue, Jan 12, 2010 at 12:46 PM, karena dr.jz...@gmail.com wrote:

hi, I have a question about importing data in R.

I want to import a file which has missing values in it, and the missing
values are denoted as '.'. I want to first read in the file, and then
change the '.' into the number zero (0).

how can I do that?

thank you,

karena

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] expand.grid game

This also has a closed form solution:

 choose(16+8-1,7) - choose(7+8-1, 7) - 7*choose(6+8-1,7)
[1] 229713
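
One way to read the three terms (this interpretation is an assumption, not spelled out in the message) is inclusion-exclusion on a stars-and-bars count: write the leading digit as 1 + a, so a plus the other seven digits must sum to 16, then subtract the cases where a >= 9 and the cases where one of the seven other digits is >= 10. The sketch below recomputes the closed form and checks it against a direct digit-by-digit count.

```r
closed <- choose(16 + 8 - 1, 7) -      # non-negative solutions summing to 16
          choose(7 + 8 - 1, 7) -       # leading digit would exceed 9
          7 * choose(6 + 8 - 1, 7)     # one of the other 7 digits exceeds 9

# Direct count: counts[s + 1] = number of digit strings so far with digit sum s
counts <- c(0, rep(1, 9))              # first digit is 1..9, so sums 1..9
for (k in 2:8) {
  new <- numeric(length(counts) + 9)
  for (d in 0:9) {
    idx <- seq_along(counts) + d       # appending digit d shifts each sum by d
    new[idx] <- new[idx] + counts
  }
  counts <- new
}
c(closed = closed, direct = counts[17 + 1])
```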


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Brian Diggs
 Sent: Thursday, December 31, 2009 3:08 PM
 To: baptiste auguie; David Winsemius
 Cc: r-help
 Subject: Re: [R] expand.grid game
 
 baptiste auguie wrote:
  2009/12/19 David Winsemius dwinsem...@comcast.net:
  On Dec 19, 2009, at 9:06 AM, baptiste auguie wrote:
 
  Dear list,
 
  In a little numbers game, I've hit a performance snag and I'm not
 sure
  how to code this in C.
 
  The game is the following: how many 8-digit numbers have the sum of
  their digits equal to 17?
  And are you considering the number 00000089 to be in the
 acceptable set?
  Or is the range of possible numbers in 10000079:98000000 ?
 
 
  The latter, the first digit should not be 0. But if you have an
  interesting solution for the other case, let me know anyway.
 
  I should also stress that this is only for entertainment and
 curiosity's sake.
 
  baptiste
 
 
 I realize I'm late coming to this, but I was reading it in my post-
 vacation catch-up and it sounded interesting so I thought I'd give it a
 shot.
 
 After coding a couple of solutions that were exponential in time (for
 the number of digits), I rearranged things and came up with something
 that is linear in time (for the number of digits) and gives the count
 of numbers for all sums at once:
 
 library(plyr)
 nsum3 <- function(digits) {
   digits <- as.integer(digits)[[1L]]
   if (digits==1) {
     rep(1,9)
   } else {
     dm1 <- nsum3(digits-1)
     Reduce("+", llply(0:9, function(x) {c(rep(0,x),dm1,rep(0,9-x))}))
   }
 }
 
 nsums <- llply(1:8, nsum3)
 nsums[[5]][17]
 # [1] 3675
 nsums[[8]][17]
 # [1] 229713
 
 The whole thing runs in well under a second on my machine (a several
 years old dual core Windows machine).  In the results of nsum3, the
 i-th element is the number of numbers whose digits sum to i.  The basic
 idea is recursion on the number of digits; if n_{t,d} is the number of
 d-digit numbers that sum to t, then
 n_{t,d} = \sum_{i=0}^{9} n_{t-i,d-1}.
 (Adding the digit i to each of those numbers makes their sum
 t and increases the digits to d).  When digits==1, then 0 isn't a valid
 choice and that also implies the sum of digits can't be 0, which fits
 well with the 1 indexing of arrays.
 
 --
 Brian Diggs, Ph.D.
 Senior Research Associate, Department of Surgery, Oregon Health 
 Science University
 

Re: [R] The TeX-source for the package manual.

2010-01-12 Thread Rolf Turner



On 13/01/2010, at 3:40 AM, BXC (Bendix Carstensen) wrote:

I have noted that the later versions of Rcmd check cleans out the  
directory pkg.Rcheck so that only package-manual.log and package- 
manual.pdf are left.
Formerly the package-manual.tex was around too --- very handy for  
various purposes.


Is there a way to generate the .tex - version of the manual for a  
package?


On unix-alike systems one can do

R CMD Rd2dvi --no-clean <package name>

and then look in a (hidden) directory .Rd2dvinnn where ``nnn''
represents a 3 digit number.

You get a message saying

 You may want to clean up by 'rm -rf .Rd2dvinnn' 

which tells you the value of ``nnn''.  The tex file you want
is called Rd2.tex.

There is probably a similar incantation that works under Windoze.

cheers,

Rolf Turner


Re: [R] expand.grid game

2010-01-12 Thread baptiste auguie

Nice --- am I missing something or was this closed form solution not
entirely trivial to find?

I ought to compile the various clever solutions given in this thread
someday, it's fascinating!

Thanks,

baptiste


[R] Drop last numeral

2010-01-12 Thread LCOG1


Hello all, 
  Frustrated and i know you can help 

I need to drop the last numeral of each of the values in my data set. I have
tried ?substring, but since I have to specify the length and my data are of
varying lengths, it doesn't work so well.

Data <- c(1131, 1132, 1731, 1732, 1821, 1822, 2221, 2222,
2241, 2242, 414342, 414371, 414372)
Bldgid <- substring(as.character(Data), 1, 3)

returns:
113 113 173 173 182 182 222 222 224 224 414 414
414

but I want

113, 113, 173, 173, 182, 182, 222, 222, 224, 224, 41434,
41437, 41437

The values that have more than 4 numerals are what's messing things up.
I tried ?formatC as well but couldn't get it to coerce things correctly.
Thanks for the help

JR

Re: [R] Drop last numeral

2010-01-12 Thread Nutter, Benjamin

Data <- c(1131, 1132, 1731, 1732, 1821, 1822, 2221, 2222,
2241, 2242, 414342, 414371, 414372)
Bldgid <- substring(as.character(Data), 1, nchar(Data) - 1)
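
The point of the change is that the end position is computed per element from nchar() rather than fixed at 3. A small self-contained sketch, using a shortened illustrative vector rather than the full data:

```r
Data <- c(1131, 1732, 414342)            # illustrative subset of the values
s <- as.character(Data)                  # convert once, reuse below
Bldgid <- substring(s, 1, nchar(s) - 1)  # keep everything but the last character
Bldgid
```

The result is a character vector; wrap it in as.numeric() if numbers are needed downstream.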



Re: [R] Non-metric multidimensional scaling (NMDS) help

2010-01-12 Thread Gavin Simpson

On Tue, 2010-01-12 at 10:28 -0800, kellys17 wrote:
 Hi,
 
  I am currently working on some data and feel that NMDS would return an
 excellent result. With my current data set however I have been experiencing
 some problems and cannot carry out metaMDS. I have tried with a few smaller
 data sets which I created for practice sake and this has worked fine.

What were the errors/warnings you received that led you to this
conclusion? Please read the posting guide:

http://www.R-project.org/posting-guide.html

before replying to the list as from the above, there is almost no way
that we can help you.

 
  I think it is the set up of my data set that is causing me trouble. I have
 18 columns and 18 rows, as needed for the n x n matrix. However, within the
 data set I have a lot of zeros, i.e. more than just the zeros where column B
 meets row B. Do I need to get rid of these excess zeros in order for metaMDS
 to work?

You can provide a dissimilarity matrix *or* a community matrix to
metaMDS. If the latter, it will compute the dissimilarity for you (via
metaMDSdist() ) and you can avail yourself of the argument 'zerodist' to
that function --- see ?metaMDS for the arguments.

From the above, I deduce that the error from metaMDS is along the lines
of:

...zero or negative distance between objects X and Y...

which is because isoMDS can't work with samples that have zero
dissimilarity to one another. Indeed - by definition they should be
placed in the same location so why ordinate them at all, just use one of
the samples.

Is your matrix your own dissimilarity matrix? If so, is there some
reason you can't use the ones provided in vegan or metaMDS?

If there is a good reason, and you want to include all samples, then
you'll need to come up with a means for handling them. metaMDSdist allows
you to add a small value to the zero dissimilarities. The details are in
the code, but effectively all zero distances are replaced by half the
smallest non-zero distance. You could do a similar replacement yourself
if you feel this is warranted and/or justified.

minDij <- min(Dij[Dij > 0]) / 2
Dij[Dij <= 0] <- minDij

Will do this replacement if Dij is your matrix (replace Dij with
whatever the name of your matrix is). Then supply the new matrix to
metaMDS.
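
A minimal sketch of that replacement on a toy example follows. The data, the object names and the choice of Euclidean distance via dist() are assumptions for illustration only. Working on a "dist" object rather than a full square matrix avoids touching the zero diagonal.

```r
# Toy community data: samples a and b are identical, so their distance is 0
x <- rbind(a = c(1, 0, 2),
           b = c(1, 0, 2),
           c = c(0, 3, 1),
           d = c(4, 1, 0))
Dij <- dist(x)                        # assumed Euclidean distances

minDij <- min(Dij[Dij > 0]) / 2       # half the smallest non-zero distance
Dij[Dij <= 0] <- minDij               # replace the zero dissimilarities
Dij
```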

For most applications I have needed nMDS for, I would delete the samples
with duplicated species composition rather than add ad hoc amounts to
samples just to get the software to produce a result.

HTH

G

 
  Any help is much appreciated,
 
  Seán Kelly.
  
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


Re: [R] Drop last numeral

2010-01-12 Thread Dennis Murphy

Also try
sub('[0-9]$', '', Data)

[1] 113   113   173   173   182   182   222   222   224
[10] 224   41434 41437 41437
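
In the pattern, `[0-9]` matches a single digit and `$` anchors it to the end of the string, so only the final digit is removed whatever the length. A short sketch on an illustrative subset of the values:

```r
Data <- c(1131, 1732, 414342)       # illustrative subset of the values
out <- sub("[0-9]$", "", Data)      # numeric input is coerced to character
out
```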

HTH,
Dennis



Re: [R] how to handle missing values . when importing data in R

2010-01-12 Thread karena


thank you guys. All the columns of my data are numeric. 
I tried both methods, and they both work.

I appreciate your help.

-k

Re: [R] R for windows 64 bit

2010-01-12 Thread Alexander Shenkin

Hi Alessia,

Note that, while your physical limit might be 6 GB, Windows memory
management allows more memory than that to be allocated (aka Virtual
Memory, or at least that's what they called it in XP).  Windows swaps
out memory from RAM to the hard disk and back when necessary (please
excuse the explanation if you already know all this).  For processing
large vectors, this swapping might bring your system to a standstill. 
Regardless, the maximum memory for a windows process is larger than
the physical RAM you have available.

allie

On 1/12/2010 6:27 AM, alessia matano wrote:
 Fine, it worked. I will try in this way.

 Just the last question and I won't bother you further today. My
 machine right now has just 6 giga of RAM (it will be increased to 16
 in a few days), and I see that with this experimental version
 memory.limit is 6135.

 What is the command to increase the memory usage to the maximum I
 can use (5 giga?)? If I write memory.limit(5000) it still gives me
 the error:

 don't be silly! Your machine has a 4Gb address limit

 which is quite odd.

 Many thanks
 Best
 A.

 2010/1/12 alessia matano alexis@gmail.com:
   
 ok, perfect!
 I will try with it...many many thanks. Have you got there also the
 quantreg package, which has actually the same problem of sparseM
 (32bit version)?

 best
 alessia

 2010/1/12 Uwe Ligges lig...@statistik.tu-dortmund.de:
 

 On 12.01.2010 12:09, alessia matano wrote:
   
 I am sorry, I know it is an experimental version, and I have been
 misleading saying a new version.

 Therefore, I will wait for when they will be available officially,
 since it is just a few days.
 
 Or just use today my private repository I indicated in the other mail.

 Uwe Ligges


   
 However, I tried also to go to the cran pages and download them and
 insert into the library. For quantreg it worked, for sparseM it did
 not probably because it's a win32 version, as you said.



 2010/1/12 Prof Brian Ripley rip...@stats.ox.ac.uk:
 
 On Tue, 12 Jan 2010, alessia matano wrote:

   
 Dear all,

 I just downloaded and set up this new version of R. I am now trying to
 download the packages I need, which are SparseM and quantreg. I
 downloaded the quantreg package and inserted it into the library folder,
 and it seems to work. However, when I try to do the same with SparseM I
 get the following error message:

 Loading required package: SparseM
 Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared library
 'C:/PROGRA~1/R/R-211~1.0DE/library/SparseM/libs/SparseM.dll':
  LoadLibrary failure:  %1 non è un'applicazione di Win32 valida.
  [English: %1 is not a valid Win32 application.]


 Any help for it?
 
 Please do refer to the posting referred to in that thread (and Henrique,
 please do not post just the URL without the explanations).

 https://stat.ethz.ch/pipermail/r-devel/2010-January/056301.html

 You cannot mix 32-bit Windows binary packages with this experimental port
 (it is not a 'new version'): you need to install from the package
 sources.
  If that is too difficult for you, please do not try to use unsupported
 experimental builds (and Uwe Ligges may have some binary packages
 available
 for test in a few days).


   
 Thanks a lot
 alessia

 2010/1/11 Henrique Dallazuanna www...@gmail.com:
 
 Try this version (beta of development version):

 http://www.stats.ox.ac.uk/pub/RWin/Win64/R-2.11.0dev-win64.exe

 On Mon, Jan 11, 2010 at 2:29 PM, alessia matano alexis@gmail.com
 wrote:
   
 Dear all,

 do you know if there is any particular version of R to implement with
 windows 64 bit, in such a way to increase the amount of memory it can
 use?

 How should I increase the memory, and more importantly to set a higher
 max vector size? It still stops me saying Could not allocate vector
 of size 145

 thanks to all
 alessia


 


 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O

   

 
 --
 Brian D. Ripley,  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
   
 
   
 

Re: [R] Drop last numeral

In addition to the substring and regular expression solutions, if you are 
certain that everything will be numeric (and integer as in your examples), then 
you could just convert to numeric, divide by 10, and then drop the decimal 
(floor or as.integer).
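
A minimal sketch of the arithmetic route, on an illustrative subset of the values. Integer division with %/% combines the "divide by 10" and "drop the decimal" steps, and it stays numeric throughout, which sidesteps any surprises from converting numbers to strings.

```r
Data <- c(1131, 1732, 414342)     # illustrative subset of the values
Bldgid <- Data %/% 10             # integer division drops the last digit
Bldgid
floor(Data / 10)                  # the two-step equivalent described above
```

This assumes non-negative whole numbers; for negative values floor() and as.integer() round in different directions.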

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




Re: [R] Drop last numeral

2010-01-12 Thread Steve Taylor

Try this:

substr(Data,1,nchar(Data)-1)
Steve



Re: [R] Making routine faster by using apply instead of for-loop

2010-01-12 Thread William Dunlap

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Etienne Stockhausen
 Sent: Tuesday, January 12, 2010 10:59 AM
 To: r-help@r-project.org
 Subject: [R] Making routine faster by using apply instead of for-loop

 Hey everybody,

 I have a small problem with a routine, which prepares some data for
 plotting.
 I've made a small example:

 c=10
 mat=data.frame(matrix(1:(c*c),c,c))
 row.names(mat)=seq(c,1,length=c)
 names(mat)=c(seq(2,c,length=c/2),seq(c,2,length=c/2))
 v=as.numeric(row.names(mat))
 w=as.numeric(names(mat))
 for(i in 1:c)
 { for(j in 1:c)
 {
 if(v[j]+w[i]<=c)(mat[i,j]=NA)
 }}

 This produces exactly the data I need to go on, but if I increase the
 constant c ,to for instance 500 , it takes a very long time 
 to set the NA's.

The first problem is that random (element-by-element)
access to a data.frame is much slower than the equivalent
access to a matrix.  Rewriting your code a bit to
use a matrix speeds up the c=500 case by a factor of 750.
f0 <- function (c = 10)  {
    mat = matrix(1:(c * c), c, c)
    rownames(mat) = seq(c, 1, length = c)
    colnames(mat) = c(seq(2, c, length = c/2), seq(c, 2, length = c/2))
    v = as.numeric(rownames(mat))
    w = as.numeric(colnames(mat))
    for (i in 1:c) {
        for (j in 1:c) {
            if (v[j] + w[i] <= c) {
                mat[i, j] = NA
            }
        }
    }
    mat
}
Rewriting that to insert the NAs in one operation speeds it up by
another factor of 10 (in the c=500 case):
f1 <- function (c = 10) {
    v <- seq(c, 1, length = c)
    w <- c(seq(2, c, length = c/2), seq(c, 2, length = c/2))
    mat <- matrix(1:(c * c), nrow = c, ncol = c, dimnames = list(v, w))
    mat[outer(w, v, `+`) <= c] <- NA
    mat
}

If you really want a matrix, pass the output of these functions
into data.frame (with check.names=FALSE since the column
names are not considered legal on data.frame: the contain
duplicates and look numeric).
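
A minimal sketch of that last step, assuming a small n = 10 case. The variable names (n, mat, df) are chosen for illustration; n is used instead of c only to avoid shadowing the c() function.

```r
n <- 10
v <- seq(n, 1)                                   # row labels 10, 9, ..., 1
w <- c(seq(2, n, by = 2), seq(n, 2, by = -2))    # column labels, with duplicates

mat <- matrix(seq_len(n * n), nrow = n, ncol = n, dimnames = list(v, w))
mat[outer(w, v, `+`) <= n] <- NA                 # NA wherever w[i] + v[j] <= n

# check.names = FALSE keeps the duplicated, numeric-looking column names as they are
df <- data.frame(mat, check.names = FALSE)
names(df)
```

Without check.names = FALSE, data.frame() would rewrite the column names to make them syntactically valid and unique.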

By the way, it is generally a bad idea to use apply() on
a data.frame.  It is meant for matrices.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

 I've heard there is a much faster way to set the NA's using 
 the command
 apply( ), but I don't know how.
 I'm looking forward for any ideas or hints, that might help me.

 Best regards

 Etienne


Re: [R] Making routine faster by using apply instead of for-loop


Your code is doing too many needless things.
The following takes about one second on my slow Vista laptop.

n <- 500
mat <- matrix(1:(n*n), n)
v <- n:1
z <- 2*1:(n/2)
w <- c(z, rev(z))
for(i in seq_len(n)){
  for(j in seq_len(n)){
    if(v[j] + w[i] <= n)(mat[i,j] <- NA)
  }
}
rownames(mat) <- v
colnames(mat) <- w

str(mat)

You end up with a matrix, but if you really want a data.frame
with duplicate names, that's easy to get. Do you actually
want those row/col names or are they just used to identify
the cells that get NA?

Depending on what you really need, the following may be
good enough; takes about 0.1 seconds.

n <- 500
mat <- matrix(1:(n*n), n)
for(i in 1:(n/2)){mat[i, -(1:(2*i))] <- mat[n+1-i, -(1:(2*i))] <- NA}
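
As a quick sanity check, the two approaches can be compared on a small case. This is only a sketch with n = 10; the names loop_mat and fast_mat are illustrative.

```r
n <- 10
v <- n:1
z <- 2 * 1:(n / 2)
w <- c(z, rev(z))

# Reference: element-by-element double loop
loop_mat <- matrix(1:(n * n), n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (v[j] + w[i] <= n) loop_mat[i, j] <- NA
  }
}

# Row-pair version: rows i and n + 1 - i lose every column past 2 * i
fast_mat <- matrix(1:(n * n), n)
for (i in 1:(n / 2)) {
  fast_mat[i, -(1:(2 * i))] <- fast_mat[n + 1 - i, -(1:(2 * i))] <- NA
}

identical(loop_mat, fast_mat)
```

The shortcut works because v[j] + w[i] <= n reduces to j > 2*i for rows in the top half, and the column labels are mirrored for the bottom half.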

 -Peter Ehlers

Etienne Stockhausen wrote:

Hey everybody,

I have a small problem with a routine, which prepares some data for
plotting.
I've made a small example:

   c=10
   mat=data.frame(matrix(1:(c*c),c,c))
   row.names(mat)=seq(c,1,length=c)
   names(mat)=c(seq(2,c,length=c/2),seq(c,2,length=c/2))
   v=as.numeric(row.names(mat))
   w=as.numeric(names(mat))
   for(i in 1:c)
   { for(j in 1:c)
   {
   if(v[j]+w[i]<=c)(mat[i,j]=NA)
   }}

This produces exactly the data I need to go on, but if I increase the
constant c ,to for instance 500 , it takes a very long time to set the 
NA's.

I've heard there is a much faster way to set the NA's using the command
apply( ), but I don't know how.
I'm looking forward for any ideas or hints, that might help me.

Best regards

Etienne





--
Peter Ehlers
University of Calgary
403.202.3921


Re: [R] expand.grid game

How trivial is probably subjective, I don't think it is much above trivial.  I 
would not have been surprised to see this question on an exam in my 
undergraduate (300 or junior level) probability course (the hard part was 
remembering the details from that class from over 20 years ago).  My favorite 
test question of all time came from that course: You have a deck of poker 
cards with the 3's removed (and jokers), you deal yourself 5 cards at random, 
what is the probability of getting a straight (not including straight flushes)?

This problem is simpler.  Just think of the 8 places in the number as urns, and 
the 17 1's as balls to be put into the urns.  One ball has to go in the first 
urn, so you have 16 left; there are choose(16+8-1,8-1) ways to distribute 16 
indistinguishable balls among 8 distinguishable urns. But that includes some 
solutions with more than 9 balls in an urn, which violates the digits 
restriction, so subtract off the illegal counts.  If we place 10 balls in the 
first urn, then we have 7 remaining balls to distribute between the 8 urns, or 
choose(7+8-1, 7) ways.  If we place 1 ball in the first urn and 10 balls in one 
of the 7 other urns (hence the 7*), then there are choose(6+8-1, 7) ways to 
distribute the remaining 6 balls in the 8 urns.  Not too complicated once you 
remember (or look up) the formula for urns and balls.  
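
The closed form can be checked against an independent count. The sketch below uses a small dynamic-programming helper, count_digit_sum(), which is a made-up name for illustration and not part of the thread. Note that no further terms are needed: two urns cannot both hold 10 or more balls, since 10 + 10 already exceeds 17.

```r
# Closed form by inclusion-exclusion
closed_form <- choose(16 + 8 - 1, 7) - choose(7 + 8 - 1, 7) - 7 * choose(6 + 8 - 1, 7)

# Independent check: counts[s + 1] is the number of digit strings with sum s
# (assumes target >= 1)
count_digit_sum <- function(n_digits, target) {
  counts <- numeric(target + 1)
  counts[1:min(9, target) + 1] <- 1            # leading digit is 1..9
  for (k in seq_len(n_digits - 1)) {
    new_counts <- numeric(target + 1)
    for (d in 0:min(9, target)) {              # append a digit d
      idx <- seq_len(target + 1 - d)           # sums 0..(target - d)
      new_counts[idx + d] <- new_counts[idx + d] + counts[idx]
    }
    counts <- new_counts
  }
  counts[target + 1]
}

closed_form
count_digit_sum(8, 17)
```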

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: baptiste auguie [mailto:baptiste.aug...@googlemail.com]
 Sent: Tuesday, January 12, 2010 12:20 PM
 To: Greg Snow
 Cc: r-help
 Subject: Re: [R] expand.grid game
 
 Nice --- am I missing something or was this closed form solution not
 entirely trivial to find?
 
 I ought to compile the various clever solutions given in this thread
 someday, it's fascinating!
 
 Thanks,
 
 baptiste
 
 2010/1/12 Greg Snow greg.s...@imail.org:
  This also has a closed form solution:
 
  choose(16+8-1,7) - choose(7+8-1, 7) - 7*choose(6+8-1,7)
  [1] 229713
 
 
  --
  Gregory (Greg) L. Snow Ph.D.
  Statistical Data Center
  Intermountain Healthcare
  greg.s...@imail.org
  801.408.8111
 
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
  project.org] On Behalf Of Brian Diggs
  Sent: Thursday, December 31, 2009 3:08 PM
  To: baptiste auguie; David Winsemius
  Cc: r-help
  Subject: Re: [R] expand.grid game
 
  baptiste auguie wrote:
   2009/12/19 David Winsemius dwinsem...@comcast.net:
   On Dec 19, 2009, at 9:06 AM, baptiste auguie wrote:
  
   Dear list,
  
   In a little numbers game, I've hit a performance snag and I'm not
   sure how to code this in C.
  
   The game is the following: how many 8-digit numbers have the sum of
   their digits equal to 17?
   And are you considering the number 00000089 to be in the
   acceptable set?
   Or is the range of possible numbers in 10000079:98000000 ?
  
  
   The latter, the first digit should not be 0. But if you have an
   interesting solution for the other case, let me know anyway.
  
   I should also stress that this is only for entertainment and
   curiosity's sake.
  
   baptiste
  
 
  I realize I'm late coming to this, but I was reading it in my
  post-vacation catch-up and it sounded interesting so I thought I'd
  give it a shot.
 
  After coding a couple of solutions that were exponential in time (for
  the number of digits), I rearranged things and came up with something
  that is linear in time (for the number of digits) and gives the count
  of numbers for all sums at once:
 
  library(plyr)
  nsum3 <- function(digits) {
    digits <- as.integer(digits)[[1L]]
    if (digits==1) {
      rep(1,9)
    } else {
      dm1 <- nsum3(digits-1)
      Reduce("+", llply(0:9, function(x) {c(rep(0,x),dm1,rep(0,9-x))}))
    }
  }
 
  nsums <- llply(1:8, nsum3)
  nsums[[5]][17]
  # [1] 3675
  nsums[[8]][17]
  # [1] 229713
 
  The whole thing runs in well under a second on my machine (a several
  years old dual core Windows machine).  In the results of nsum3, the
  i-th element is the number of numbers whose digits sum to i.  The basic
  idea is recursion on the number of digits; if n_{t,d} is the number of
  d-digit numbers that sum to t, then n_{t,d} = \sum_{i=0}^{9} n_{t-i,d-1}.
  (Adding the digit i to each of those numbers makes their sum t and
  increases the digits to d).  When digits==1, then 0 isn't a valid
  choice and that also implies the sum of digits can't be 0, which fits
  well with the 1 indexing of arrays.
 
  --
  Brian Diggs, Ph.D.
  Senior Research Associate, Department of Surgery, Oregon Health 
  Science University
 

Re: [R] LD50 and SE in GLMM (lmer)

2010-01-12 Thread Linda Bürgi


Thank you very much for the code Bill! I only had to change very few things to 
make it work for probit and lmer (instead of glmmPQL) and it works perfectly! 

Here's my code (I had some trouble with the output style glm.dose, so I just 
have it come out as an ugly list now, which isn't a problem since I only have 
to put it into a table).


model4 <- lmer (y~time + (1|blc/instar),REML=FALSE, 
family=binomial(link=probit))
summary(model4)

 

dose.p.glmm <- function(model4, cf = 1:2, p = 0.5) {
  eta <- probit(p)  # probit() is not in base R; qnorm(p) is the base equivalent
  b <- fixef(model4)[cf]
  k <- (eta - b[1])/b[2]
  names(k) <- paste("p = ", format(p), ":", sep = "")
  pd <- -cbind(1, k)/b[2]
  SE <- sqrt(((pd %*% vcov(model4)[cf,cf]) * pd) %*% c(1, 1))
  list(k, SE)
}
dose.p.glmm (model4, cf=1:2, p=0.5)
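
For readers who want to see the calculation in isolation, here is a sketch of the same delta-method step on an ordinary probit glm() fit. The simulated data and the helper name ld_delta() are assumptions made for illustration; for a mixed model the inputs would instead be the fixed-effect estimates and their covariance matrix, as in the function above.

```r
# Simulated bioassay (illustrative data, not from the thread)
set.seed(42)
time <- rep(seq(50, 400, by = 50), each = 20)
dead <- rbinom(length(time), size = 1, prob = pnorm((time - 220) / 60))
fit <- glm(dead ~ time, family = binomial(link = "probit"))

# b: intercept and slope; V: their 2 x 2 covariance matrix
ld_delta <- function(b, V, p = 0.5, linkfun = qnorm) {
  eta <- linkfun(p)                    # target on the linear-predictor scale
  x_p <- (eta - b[1]) / b[2]           # dose at which the fitted probability is p
  grad <- -c(1, x_p) / b[2]            # derivative of x_p with respect to (b1, b2)
  se <- sqrt(drop(t(grad) %*% V %*% grad))
  c(estimate = unname(x_p), se = se)
}

res <- ld_delta(coef(fit), vcov(fit))
res
```

With p = 0.5 the probit of p is 0, so the estimate reduces to minus the intercept divided by the slope.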

 

With this, I don't even need the pnorm- probit transformation.

 

Thank you very much!

 

Linda
 
 Date: Mon, 11 Jan 2010 10:21:50 -0500
 Subject: Re: [R] LD50 and SE in GLMM (lmer)
 From: billpikou...@gmail.com
 To: patili_bue...@hotmail.com
 CC: r-help@r-project.org
 
 Sorry for the delay in response. I had a somewhat similar need
 recently with the difference that I used a logit link for a bioassay.
 The design had different dose-response replicates that I modeled as
 blocks. It looks like you are concentrating on estimation of fixed
 effects and thus the population / marginal LD50 estimate. If so, then
 there is a function called dose.p in the MASS package, courtesy of
 Venables and Ripley, which is used in the context of an example on
 pages 190-194 of the 4th edition of their book (2002), that I
 think would be very helpful to study. The example code can also be
 found in the ch07.R file in the scripts sub-directory/folder of the
 MASS package directory/folder. The example illustrates the use of GLM
 with a logit link. To adapt it for use with a GLMM, I came up with the
 following, which is nearly identical to how dose.p is defined in R
 2.10.0:
 
 dose.p.glmm <- function(obj, cf = 1:2, p = 0.5) {
   eta <- obj$family$linkfun(p)
   b <- fixef(obj)[cf]
   x.p <- (eta - b[1L])/b[2L]
   names(x.p) <- paste("p = ", format(p), ":", sep = "")
   pd <- -cbind(1, x.p)/b[2L]
   SE <- sqrt(((pd %*% vcov(obj)[cf, cf]) * pd) %*% c(1, 1))
   res <- structure(x.p, SE = SE, p = p)
   class(res) <- "glm.dose"
   res
 }
 
 Essentially only the fixef() call in the 2nd line of the body was
 needed to replace the coef() call. Please also note that I used this
 for a glmmPQL() call from the MASS package, not lmer().
 
 
  And one more question: is it correct to use pnorm (where John Maindonald 
  used exp(hat)/(1+exp(hat)))?
 
 
 Unfortunately I don't know offhand, and do not have a reference handy
 to check to be sure, so perhaps you can find a local statistician to
 help? I myself always have a preference to use the logit / logistic
 over probit, as they are both symmetric around 0.5 and are often
 reported to provide similar results.
 
 Hope that helps,
 Bill
 
 ###
 
 Bill Pikounis
 Statistician
 
 
 
 2010/1/7 Linda Bürgi patili_bue...@hotmail.com:
 
  Hi All!
 
 
 
  I am desperately needing some help figuring out how to calculate LD50 with 
  a GLMM (probit link) or, more importantly, the standard error of the LD50.
 
 
 
  I conducted a cold temperature experiment and am trying to assess after how 
  long 50% of the insects had died (I had 3 different instars (non 
  significant fixed effect) and several different blocks (I did 4 replicates 
  at a time)= random effect).
 
 
 
  Since there is no predict function for lmer, I used the following to get 
  predicted values (thanks to a post by John Maindonald (I'll attach his post 
  below)):
 
 
 
 
  model4 <- lmer (y~time + (1|blc/instar),family=binomial(link=probit))
  summary(model4)
 
 
 
   b <- fixef(model4)
   X <- (model.matrix(terms(model4),zerotest))
   hat <- X%*%b
   pval <- pnorm(hat)  # probit link; for logit it would be: pval <- exp(hat)/(1+exp(hat))
   pval
 
 
  Once I get the pval, I see where the 0.5 predicted value lies and I adjust 
  the x's in zerotest to be more detailed in that range, eg. x: 1-420hours, I 
  see that 0.5 is in the 320hours area, so I adjust x to be 320.1, 320.2, 
  320.3, etc. to get the precise 0.500. Very clumsy but I guess it's correct?
 
 
 
  Now my biggest problem: how do I get the SE?
 
 
 
  John Maindonald goes on to do this:
 
  U <- chol(as.matrix(summary(model4)@vcov))
 
  se <- sqrt(apply(X%*%t(U), 1, function(x)sum(x^2)))
 
  list(hat=hat, se=se, x=X[,xcol])
 
 
 
  Unfortunately, I could not figure out what the chol(as.matrix...) part is 
  about (chol does what?) and therefore I have no idea, how to use this code 
  to get my LD50 SE (I would need the SE to be expressed in terms of x).
 
  Could anybody help me with this?
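
On the chol() question: chol(V) returns an upper-triangular matrix U with t(U) %*% U equal to V, so the row sums of squares of X %*% t(U) are the diagonal of X V X', that is, the variances of the fitted linear predictor. A small sketch with toy matrices (V and X below are invented for illustration):

```r
V <- matrix(c(4, 1, 1, 2), 2, 2)     # toy covariance matrix of the coefficients
X <- cbind(1, c(0, 1, 2))            # toy model matrix: intercept plus one covariate

U <- chol(V)                         # upper triangular, t(U) %*% U reproduces V

se_chol   <- sqrt(apply(X %*% t(U), 1, function(x) sum(x^2)))
se_direct <- sqrt(diag(X %*% V %*% t(X)))

all.equal(se_chol, se_direct)
```

These are standard errors on the scale of the linear predictor, not on the scale of x, which is why a separate delta-method step is needed for the LD50.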
 
 
 
  And one more question: is it correct to use pnorm (where John Maindonald 
  used exp(hat)/(1+exp(hat)))?
 
 
 
  Thanks so much in advance!
 
 
 
  Linda
 
 
 
 
 
  Previous post by John Maindonald:
 
 
 
 
  ciplot -  function(obj=model4,

[R] [Solved][Code Snippets] Dropping Empty Regressors

2010-01-12 Thread Idgarad

To make a long story short I was doing some in-sample testing in which some
dynamically created regressors would end up either all true or all false
based on the validation portion. In my case a new mainframe configuration
(this is a crappy way to handle a level shift but I do what I can.) So here
is the code snippet that finally let me pre-check my regressors and drop any
of them that were all true or all false.

First the automagic STL outlier grabber that caused part of the problem:


# tsSource being my time series source.
# sh2 is a table of all my regressors that have been previously pulled in;
# this has historic and future values in it also, it gets sliced later.
# The EOM is the regressor holding weeks that contain an 'End of Month'.
#
# This appends the found IOs to the regressor table. Stepwise tends to
# remove them later on. I needed a programmatic way of removing useless
# regressors for model verification since I would not know their names
# if any are found.

tsSourceDiag <- stl(tsSource,s.window="per", robust=TRUE)
#
tsSourceIO <- which(tsSourceDiag $ weights < 1e-8)
#
# This is how to append run-time regressors
for(z in tsSourceIO) {
  tmpname <- paste("PreIO",z,sep="")
  #COPY EOM AS A TEMPLATE
  sh2[[tmpname]] <- sh2[["EOM"]]
  #SET IT ALL TO FALSE
  sh2[[tmpname]][] <- FALSE
  #SET the proper index to TRUE
  sh2[[tmpname]][z] <- TRUE
}


So to get rid of them (those empty useless regressors) I cooked up this:

###
#Prune Empty Regressors (All false or all true)
# the newmcReg you see is a copy of the sh2 from earlier
# newmcReg = New Model Current Regressors
# sh2 later became cReg.
#
# Yes it makes my eyes bleed. in short we count all the trues
# and all the false and if they happen to be the same number
# as the length we know they are all true or false.
#
# the trick I finally found was that you could in fact -c()
# a list (e.g. ask for everything but the following) but you
# can't apparently do that inline so we just make a list of
# regressors that get shown the door then after hunting
# them down we give em the boot. This mess is solely
# so my in-sample Arima doesn't choke on xreg=newmcReg
# in which one of the newmcReg happen to be all true or false.
#
# God I wish I had taken more than a Trig course. Where was I?
#
# Yes that phantom 'i' you see is that this is all in a big loop
# for 6 possible models
# lm1 = all regressors w/ intercept
# lm2 = lm1 stepwise removal
# lm3 = all regressors wo/ intercept
# lm4 = lm3 stepwise removal
# lm5 = Hand Tuned
# lm6 = lm5 stepwise removal
###
toPurge=c()
for(k in names(newmcReg[[i]])) {
 print (paste("check to see if",k,"is a useless regressor for model",i))
 if(sum(newmcReg[[i]][k][,1])==length(newmcReg[[i]][k][,1])) {
  print(paste("All of",k,"are TRUE"))
  getLost=which(names(newmcReg[[i]])==k)
  toPurge=c(toPurge,getLost)
  print(paste(k, "has been added to the purge list for model", i,"!"))
 }
 if(sum(newmcReg[[i]][k][,1]==FALSE)==length(newmcReg[[i]][k][,1])) {
  print(paste("All of",k,"are FALSE"))
  getLost=which(names(newmcReg[[i]])==k)
  toPurge=c(toPurge,getLost)
  print(paste(k, "has been added to the purge list for model", i,"!"))
 }
}
toPurge
# Do this only if there are any or R will beat you senseless and
# steal all your MMs!
if(length(toPurge)!=0) {
 names(newmcReg[[i]])
 names(newmcReg[[i]][-c(toPurge)])
 newmcReg[[i]] <- newmcReg[[i]][-c(toPurge)]
 newmfReg[[i]] <- newmfReg[[i]][-c(toPurge)]
 names(newmcReg[[i]])
}
##
# End Regressor Pruning
##
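
A more compact variant of the same pruning idea, sketched on a toy table. The data frame regs and its column names are invented for illustration. Logical indexing sidesteps the empty-index problem noted above: negating a zero-length index selects nothing, so the length check is needed there but not here.

```r
# Toy regressor table (column names are made up)
regs <- data.frame(
  EOM     = c(TRUE, FALSE, FALSE, TRUE),
  PreIO3  = c(FALSE, FALSE, FALSE, FALSE),   # all FALSE: carries no information
  AllOn   = c(TRUE, TRUE, TRUE, TRUE),       # all TRUE: likewise
  Holiday = c(FALSE, TRUE, FALSE, FALSE)
)

# A column is constant when it has a single distinct value
is_constant <- vapply(regs, function(col) length(unique(col)) == 1L, logical(1))

# drop = FALSE keeps a data.frame even if only one column survives
pruned <- regs[, !is_constant, drop = FALSE]
names(pruned)
```

The same logical vector could be applied to a second table with identical columns, such as the future-value regressors.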


Big thanks to the help so far. Now about those darn transfer functions...
hmm and pulse detection...


Re: [R] expand.grid game

2010-01-12 Thread Rolf Turner



On 13/01/2010, at 9:19 AM, Greg Snow wrote:

How trivial is probably subjective, I don't think it is much above  
trivial.  I would not have been surprised to see this question on  
an exam in my undergraduate (300 or junior level) probability  
course (the hard part was remembering the details from that class  
from over 20 years ago).  My favorite test question of all time  
came from that course: You have a deck of poker cards with the 3's  
removed (and jokers), you deal yourself 5 cards at random, what is  
the probability of getting a straight (not including straight  
flushes)?


This problem is simpler.  Just think of the 8 places in the number
as urns, and the 17 1's as balls to be put into the urns.  One ball
has to go in the first urn, so you have 16 left, there are
choose(16+8-1,8-1) ways to distribute 16 undistinguishable balls among 8
distinguishable urns. But that includes some solutions with more
than 9 balls in an urn which violates the digits restriction, so
subtract off the illegal counts.  If we place 10 balls in the first
urn, then we have 7 remaining balls to distribute between the 8
urns or choose( 7+8-1, 7), If we place 1 ball in the first urn and
10 balls in one of the 7 other urns (7*), then there are
choose( 6+8-1, 7 ) ways to distribute the remaining 6 balls in the 8 urns.
Not too complicated once you remember (or look up) the formula for
urns and balls.


Sorry to be a thicko --- but doesn't the foregoing solution *leave in*
the possibility of putting all 17 balls in the first urn?  Or 3 balls
in the first urn, 12 in the second, and the remaining 2 in any of the
other six urns?  Etc.  I.e. don't more terms have to be subtracted?

cheers,

Rolf Turner


Re: [R] Drop last numeral

2010-01-12 Thread LCOG1


The below worked best for my purposes.  Thanks everyone.
Data <- c(1131, 1132, 1731, 1732, 1821, 1822, 2221, 2222,
2241, 2242, 414342, 414371, 414372)
substr(Data,1,nchar(Data)-1)
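
For whole numbers, integer division gives the same result without going through strings. A short sketch on a few of the values from the post (the names by_string and by_division are illustrative):

```r
Data <- c(1131, 1132, 1731, 2241, 414342, 414371)

by_string   <- as.integer(substr(Data, 1, nchar(Data) - 1))  # string route, converted back
by_division <- Data %/% 10                                    # arithmetic route

by_string
by_division
```

The arithmetic route also avoids surprises when a number is converted to text in scientific notation, for example as.character(100000) gives "1e+05".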







LCOG1 wrote:
 
 Hello all, 
   Frustrated and i know you can help 
 
 I need to drop the last numeral of each of the values in my data set.  I
 have tried ?substring, but since I have to specify the length and my
 data are of varying lengths, it doesn't work so well:
 
 Data <- c(1131, 1132, 1731, 1732, 1821, 1822, 2221, 2222,
 2241, 2242, 414342, 414371, 414372)
 Bldgid <- substring(as.character(Data),1,3)
 
 returns:
 113 113 173 173 182 182 222 222 224 224 414 414
 414
 
 but I want
 
 113, 113, 173, 173, 182, 182, 222, 222, 224,
 224, 41434, 41437, 41437
 
 The values that have more than 4 numerals are what's messing things up.
 Tried ?formatC as well but couldn't get it to coerce things correctly.
 Thanks for the help
 
 JR
 

-- 
View this message in context: 
http://n4.nabble.com/Drop-last-numeral-tp1012347p1012492.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] some help regarding combining columns from different files

2010-01-12 Thread Harikrishnadhar

Hi Jim,

I want to merge two files into one file.

Here is my code. The problem is that the 2nd file gets appended to the
first when I write temp3 to the text file. I am not sure what mistake I
am making.

Also find the test files to run the code.

Please help me with this !!!

temp1 <- NULL
temp2 <- NULL
x.col.names <- c("genesymbol","geneDescription","orgSymbol","orgName")
y.col.names <- c("genesymbol","geneDescription","orgSymbol","orgName")
for (i in 1:length(list1.bp.files.names)){
  temp1 <- read.table(list1.bp.files.names[i],sep="\t",header=T,stringsAsFactors=F,quote="\"")
  for (j in 1:length(list2.bp.files.names)){
    temp2 <- read.table(list2.bp.files.names[j],sep="\t",header=T,stringsAsFactors=F,quote="\"")
    temp3 <- merge(temp1,temp2,by.x = x.col.names,by.y=y.col.names,all=T)
    myfile <- gsub("( )", "", paste("1_",merge.bp.files.names[i],".txt"))
    write.table(temp3,file=myfile,sep="\t",quote=FALSE,row.names=F)
  }
}
Thanks
--Hari--
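
A possible reading of the symptom, sketched on two tiny made-up tables (file1 and file2 stand in for the real files, and the in_file1/in_file2 flag columns are hypothetical): when every column is a merge key and all=TRUE is set, merge() has nothing to place side by side, so the result is simply the union of the rows.

```r
keys <- c("genesymbol", "geneDescription", "orgSymbol", "orgName")

file1 <- data.frame(genesymbol = c("E2f5", "Msh2"),
                    geneDescription = c("e2f transcription factor 5",
                                        "muts homolog 2 (e. coli)"),
                    orgSymbol = "RG", orgName = "Rattus norvegicus",
                    stringsAsFactors = FALSE)
file2 <- data.frame(genesymbol = c("Msh2", "Ccnb1"),
                    geneDescription = c("muts homolog 2 (e. coli)", "cyclin b1"),
                    orgSymbol = "RG", orgName = "Rattus norvegicus",
                    stringsAsFactors = FALSE)

# Every column is a key, so all = TRUE returns the union of rows
union_rows <- merge(file1, file2, by = keys, all = TRUE)
nrow(union_rows)

# Hypothetical flag columns record which file each gene came from
file1$in_file1 <- TRUE
file2$in_file2 <- TRUE
flagged <- merge(file1, file2, by = keys, all = TRUE)
flagged
```

Using all = FALSE instead would keep only the genes present in both files. Note also that the posted loop builds the output file name from i alone, so each pass of the inner loop overwrites the file written by the previous pass.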
genesymbol  geneDescription  orgSymbol  orgName
E2f5  e2f transcription factor 5  RG  Rattus norvegicus
Msh2  muts homolog 2 (e. coli)  RG  Rattus norvegicus
Kpna2  karyopherin (importin) alpha 2  RG  Rattus norvegicus
Gtpbp4  gtp binding protein 4  RG  Rattus norvegicus
Dtymk_predicted  deoxythymidylate kinase (predicted)  RG  Rattus norvegicus
Ruvbl1  ruvb-like protein 1  RG  Rattus norvegicus
Cetn2  centrin 2  RG  Rattus norvegicus
Foxm1  forkhead box m1  RG  Rattus norvegicus
Abtb1  ankyrin repeat and btb (poz) domain containing 1  RG  Rattus norvegicus
Myc  myelocytomatosis viral oncogene homolog (avian)  RG  Rattus norvegicus
Il1b  interleukin 1 beta  RG  Rattus norvegicus
Cdc20  cell division cycle 20 homolog (s. cerevisiae)  RG  Rattus norvegicus
Cdc25a  cell division cycle 25 homolog a (s. cerevisiae)  RG  Rattus norvegicus
Kifc1  kinesin family member c1  RG  Rattus norvegicus
Fancd2  fanconi anemia d2 protein  RG  Rattus norvegicus
Rhob  rhob gene  RG  Rattus norvegicus
Clp1  cardiac lineage protein 1  RG  Rattus norvegicus
Psmd1  proteasome (prosome, macropain) 26s subunit, non-atpase, 1  RG  Rattus norvegicus
Mad2l1_predicted  mad2 (mitotic arrest deficient, homolog)-like 1 (yeast) (predicted)  RG  Rattus norvegicus
Dhcr24  24-dehydrocholesterol reductase  RG  Rattus norvegicus
Ahr  aryl hydrocarbon receptor  RG  Rattus norvegicus
Rnd3  ras homolog gene family, member e  RG  Rattus norvegicus
Acvr1b  activin a receptor, type 1b  RG  Rattus norvegicus
Mcm2_predicted  minichromosome maintenance deficient 2 mitotin (s. cerevisiae) (predicted)  RG  Rattus norvegicus
Mapre3  microtubule-associated protein, rp/eb family, member 3  RG  Rattus norvegicus
Mapre1  microtubule-associated protein, rp/eb family, member 1  RG  Rattus norvegicus
Tardbp  tar dna binding protein  RG  Rattus norvegicus
Cdca3  cell division cycle associated 3  RG  Rattus norvegicus
Ccnb1  cyclin b1  RG  Rattus norvegicus
Npm1  nucleophosmin 1  RG  Rattus norvegicus
Pcaf  p300/cbp-associated factor  RG  Rattus norvegicus
Cdc2a  cell division cycle 2 homolog a (s. pombe)  RG  Rattus norvegicus
Dnajc2  dnaj (hsp40) homolog, subfamily c, member 2  RG  Rattus norvegicus
Dab2ip  disabled homolog 2 (drosophila) interacting protein  RG  Rattus norvegicus
Id2  inhibitor of dna binding 2, dominant negative helix-loop-helix protein  RG  Rattus norvegicus
Kif23_predicted  kinesin family member 23 (predicted)  RG  Rattus norvegicus
Nek6  nima (never in mitosis gene a)-related expressed kinase 6  RG  Rattus norvegicus
Pola1  polymerase (dna directed), alpha 1  RG  Rattus norvegicus
Il1a  interleukin 1 alpha  RG  Rattus norvegicus
Ccnc  cyclin c  RG  Rattus norvegicus
Ccnb2  cyclin b2  RG  Rattus norvegicus
Pbef1  pre-b-cell colony enhancing factor 1  RG  Rattus norvegicus
Rad17  rad17 homolog (s. pombe)  RG  Rattus norvegicus
Racgap1_predicted  rac gtpase-activating protein 1 (predicted)  RG  Rattus norvegicus
Ccna2  cyclin a2  RG  Rattus norvegicus
Cdca8  cell division cycle associated 8  RG  Rattus norvegicus
Sesn1_predicted  sestrin 1 (predicted)  RG  Rattus norvegicus
Tpx2_predicted  tpx2, microtubule-associated protein homolog (xenopus laevis) (predicted)  RG  Rattus norvegicus
Dmtf1  cyclin d binding myb-like transcription factor 1  RG  Rattus norvegicus
Chek1  checkpoint kinase 1 homolog (s. pombe)  RG  Rattus norvegicus
Mlh1  mutl homolog 1 (e. coli)  RG  Rattus norvegicus
Cgref1  cell growth regulator with ef hand domain 1  RG  Rattus norvegicus

[R] Strange behavior when trying to piggyback off of fitdistr

2010-01-12 Thread Adler, Avraham

Hello.

I am not certain even how to search the archives for this particular question, 
so if there is an obvious answer, please smack me with a large halibut and send 
me to the URLs.

I have been experimenting with fitting curves by using both maximum likelihood 
and maximum spacing estimation techniques. Originally, I have been writing 
distribution-specific functions in 'R' which work rather well. As the procedure 
is identical for all distributions, other than the actual distribution function 
itself, I thought I would try to build a single function that accepted the 
distribution as an input and returned the results. Never having played with 
calls, formals, and arguments before, I figured I would rip off, cough, cough, 
be inspired by the venerable fitdistr in MASS, which accepts distribution 
inputs.

After a few hours, I actually got it working decently (although unfinished). 
However, I am finding something very weird. At its core, the technique requires 
the difference between the value of the cumulative distribution function at 
neighboring evaluations. I implement this by running p($DIST) on the vector of 
sorted losses (call it SP), creating two new vectors, one c(0, SP) and one 
c(SP, 1), and then taking the latter minus the former. If there happen to be 
two instances of the same value, unless it is known rounding error, one 
substitutes the density at that point for the difference in the cumulative 
distribution (which would be 0, as the CDF of two identical values is the 
same). So, I run d($DIST), add a 0 in front to make it the same length, and 
return a new vector equal to pmax.int(DIFFERENCE, DENSITY), with the idea that 
the density is always > 0 and always less than the difference in cumulative 
distributions, so it will only be max in the case of DIFFERENCE being truly 0. 
I then take negative the sum of the log of the differences and that is 
the function passed to optim.
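
For comparison, a minimal sketch of maximum spacing estimation for a log-normal sample. It is not the poster's function: neg_log_spacing() is a made-up helper, the data are simulated, and it deliberately differs from the pmax.int() approach by substituting the density only where adjacent sorted values are exactly tied. Whether that difference accounts for the behaviour described is an open guess, since a density value is not guaranteed to be smaller than a CDF difference.

```r
# Negative log of the product of spacings for a log-normal model
neg_log_spacing <- function(par, x) {
  meanlog <- par[1]
  sdlog <- exp(par[2])                       # optimise log(sdlog) to keep it positive
  xs <- sort(x)
  spacing <- diff(c(0, plnorm(xs, meanlog, sdlog), 1))   # n + 1 spacings
  tied <- c(FALSE, diff(xs) == 0, FALSE)     # spacing k is zero when xs[k] == xs[k - 1]
  dens <- c(dlnorm(xs, meanlog, sdlog), 0)   # padded to the same length as spacing
  spacing[tied] <- dens[tied]                # use the density only at exact ties
  -sum(log(pmax(spacing, .Machine$double.xmin)))   # floor guards against log(0)
}

set.seed(123)
x <- rlnorm(200, meanlog = 1, sdlog = 0.5)
start <- c(mean(log(x)), log(sd(log(x))))    # moment-style starting values
fit <- optim(start, neg_log_spacing, x = x)
c(meanlog = fit$par[1], sdlog = exp(fit$par[2]))
```

Comparing the objective with and without the tie substitution on the same data is one way to see how many spacings the density is actually replacing.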

What is weird is when I leave out the density correction (which is safe 
99.9999% of the time as the chances of two identical losses is almost 0 
(assuming no clustering/capping) ), I get a very similar result to my 
distribution-customized function which calls the proper plnorm or pgenpareto 
directly. When I add in the correction, the value is orders of magnitude 
higher, which not only affects the fit (slightly) but also affects the goodness 
of fit statistics. I have no idea why this happens, although in theory, if the 
function is pulling too many density values, it would return a higher value as 
the densities are much closer to 0 so the neg-log is a larger number.

In the code pasted below, if spacing returns -sum(log(SP2)), it works fine. If 
it returns -sum(log(SP3)), it gives strange results.

I do not have the S programming language book (perhaps I should invest in it) 
and the online help wasn't that helpful to me, so I would very much appreciate 
any responses y'all may have.

Thank you very much,

--Avi

#
Code (Unfinished)

MSEFit <- function (x, distfun, start, ...)
{
    require (MASS); require (actuar);
    Call <- match.call(expand.dots = TRUE)
    if (missing(start))
        start <- NULL
    dots <- names(list(...))
    if (missing(x) || length(x) == 0L || mode(x) != "numeric")
        stop("'x' must be a non-empty numeric vector")
    if (any(!is.finite(x)))
        stop("'x' contains missing or infinite values")
    if (missing(distfun) || !(is.function(distfun) || is.character(distfun)))
        stop("'density' must be supplied as a function or name")
    n <- length(x)
    if (is.character(distfun)) {
        distname <- tolower(distfun)
        densfun <- switch(distname, exp = dexp, exponential = dexp, gamma = dgamma,
            `log-normal` = dlnorm, lnorm = dlnorm, lognormal = dlnorm, weibull = dweibull,
            pareto = dpareto, loglogistic = dllogis, transbeta = dtrbeta,
            `transformed beta` = dtrbeta, burr = dburr, paralogistic = dparalogis,
            genpareto = dgenpareto, generalizedpareto = dgenpareto,
            `generalized pareto` = dgenpareto, invburr = dinvburr,
            `inverse burr` = dinvburr, invpareto = dinvpareto, `inverse pareto` = dinvpareto,
            invparalogistic = dinvparalogis, `inverse paralogistic` = dinvparalogis,
            transgamma = dtrgamma, `transformed gamma` = dtrgamma, invexp = dinvexp,
            `inverse exponential` = dinvexp, invtransgamma = dinvtrgamma,
            `inverse transformed gamma` = dinvtrgamma, invgamma = dinvgamma,
            `inverse gamma` = dinvgamma, invweibull = dinvweibull, `inverse weibull` = dinvweibull,
            loggamma = dlgamma, genbeta = dgenbeta, `generalized beta` = dgenbeta, NULL)
        if (is.null(densfun))
            stop("unsupported distribution")
        distfun <- switch(distname, exp = pexp, exponential = pexp, gamma = pgamma,
            `log-normal` = plnorm, lnorm = plnorm, lognormal = plnorm, weibull =

Re: [R] trouble with installing SJava

2010-01-12 Thread Martin Morgan

Jiiindo wrote:
 Colleagues, 
 How i can solve this error when i install SJava package

A more recent version of SJava is available with

   source('http://bioconductor.org/biocLite.R')
   biocLite('SJava')

rJava is an alternative.

Martin

 Thanks
 
 R CMD INSTALL -c /usr/local/lib/R/SJava_0.69-0.tar.gz
 
 * installing to library ‘/usr/local/lib/R/site-library’
 * installing *source* package ‘SJava’ ...
 checking for java... /usr/lib/jvm/java-6-sun/bin/java
 Java VM /usr/lib/jvm/java-6-sun/bin/java
 checking for javah... /usr/lib/jvm/java-6-sun/bin/javah
 Looking in /usr/lib/jvm/java-6-sun/include
 Looking in /usr/lib/jvm/java-6-sun/include/linux
 checking for g++... g++
 checking for C++ compiler default output... a.out
 checking whether the C++ compiler works... yes
 checking whether we are cross compiling... no
 checking for suffix of executables... 
 checking for suffix of object files... o
 checking whether we are using the GNU C++ compiler... yes
 checking whether g++ accepts -g... yes
 checking for gcc... gcc
 checking whether we are using the GNU C compiler... yes
 checking whether gcc accepts -g... yes
 checking for gcc option to accept ANSI C... none needed
 checking for Rf_initEmbeddedR in -lR... no
 No R shared library found
 configure: creating ./config.status
 config.status: creating Makevars
 config.status: creating src/Makevars
 config.status: creating src/RSJava/Makefile
 config.status: creating Makefile_rules
 config.status: creating inst/scripts/RJava.bsh
 config.status: creating inst/scripts/RJava.csh
 config.status: creating R/zzz.R
 config.status: creating cleanup
 config.status: creating inst/scripts/RJava
 Copying the cleanup script to the scripts/ directory
 Building libRSNativeJava.so in
 /tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava
 if  test ! -d /usr/local/lib/R/site-library/SJava/libs ; then \
   mkdir /usr/local/lib/R/site-library/SJava/libs ; \
   fi
 gcc -std=gnu99 -g -O2 -D_R_ -I/usr/local/lib/R/include
 -I/usr/local/lib/R/include/R_ext
 -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava  -I.
 -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include 
 -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux 
 -c CtoJava.c
 CtoJava.cweb:148: error: expected '=', ',', ';', 'asm' or '__attribute__'
 before 'vm1_args'
 CtoJava.cweb:215: error: static declaration of 'std_env' follows non-static
 declaration
 CtoJava.cweb:195: error: previous declaration of 'std_env' was here
 CtoJava.cweb: In function 'create_Java_vm':
 CtoJava.cweb:256: error: 'vm1_args' undeclared (first use in this function)
 CtoJava.cweb:256: error: (Each undeclared identifier is reported only once
 CtoJava.cweb:256: error: for each function it appears in.)
 make: *** [CtoJava.o] Error 1
 Generating JNI header files from Java classes.
RForeignReference, RManualFunctionActionListener, ROmegahatInterpreter 
 REvaluator
 *
 Warning:
 At present, to use the library you must set the 
 LD_LIBRARY_PATH environment variable
 to
  
 /usr/local/lib/R/site-library/SJava/libs:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/lib/i386/server:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/lib/i386:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/../lib/i386::/usr/java/packages/lib/i386:/lib:/usr/lib
 or use one of the RJava.bsh or RJava.csh scripts
 *
 ** libs
 gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include
 -I/usr/local/lib/R/include/R_ext
 -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava  -I.
 -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include  -IRSJava
 -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux
 -I/usr/local/include-fpic  -g -O2 -c ConverterExamples.c -o
 ConverterExamples.o
 ConverterExamples.cweb: In function ‘RS_JAVA_setFunctionConverter’:
 ConverterExamples.cweb:213: warning: assignment discards qualifiers from
 pointer target type
 ConverterExamples.cweb: In function ‘RS_JAVA_toJavaFunctionConverter’:
 ConverterExamples.cweb:312: warning: passing argument 1 of
 ‘getOmegahatReferenceValue’ discards qualifiers from pointer target type
 gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include
 -I/usr/local/lib/R/include/R_ext
 -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava  -I.
 -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include  -IRSJava
 -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux
 -I/usr/local/include-fpic  -g -O2 -c Converters.c -o Converters.o
 Converters.cweb: In function ‘RS_JAVA_removeConverter’:
 Converters.cweb:399: warning: assignment discards qualifiers from pointer
 target type
 gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include
 -I/usr/local/lib/R/include/R_ext
 -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava  -I.
 -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include  -IRSJava
 -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux
 -I/usr/local/include

Re: [R] expand.grid game --- never mind, I figured it out!

2010-01-12 Thread Rolf Turner



I re-read the solution that you posted and realized where my thinking
was going wrong.  Sorry (again!) for being a thicko.

cheers,

Rolf Turner

On 13/01/2010, at 9:19 AM, Greg Snow wrote:

How trivial it is is probably subjective, but I don't think it is much
above trivial.  I would not have been surprised to see this question on
an exam in my undergraduate (300 or junior level) probability course
(the hard part was remembering the details from that class from over 20
years ago).  My favorite test question of all time came from that
course: You have a deck of poker cards with the 3's (and jokers)
removed, you deal yourself 5 cards at random, what is the probability
of getting a straight (not including straight flushes)?


This problem is simpler.  Just think of the 8 places in the number as
urns, and the 17 1's as balls to be put into the urns.  One ball has to
go in the first urn, so you have 16 left; there are
choose(16+8-1, 8-1) ways to distribute 16 indistinguishable balls among
8 distinguishable urns.  But that includes some solutions with more
than 9 balls in an urn, which violates the digits restriction, so
subtract off the illegal counts.  If we place 10 balls in the first
urn, then we have 7 remaining balls to distribute among the 8 urns, or
choose(7+8-1, 7) ways.  If we place 1 ball in the first urn and 10
balls in one of the 7 other urns (hence the factor 7*), then there are
choose(6+8-1, 7) ways to distribute the remaining 6 balls in the 8
urns.  Not too complicated once you remember (or look up) the formula
for urns and balls.
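
For anyone who wants to check the urn argument numerically, here is a
minimal sketch in R.  The function names count_sum17 and brute_sum17 are
made up for illustration, and the closed form is written only for a digit
sum of 17, where at most one digit can break the 0..9 limit, so no further
inclusion-exclusion terms are needed.  The brute-force check is restricted
to small digit counts to keep it fast.

```r
# Closed form for the number of d-digit numbers (no leading zero) whose
# digits sum to 17.  Hypothetical helper, valid for this digit sum only.
count_sum17 <- function(d) {
  k <- d - 1
  choose(16 + k, k) - choose(7 + k, k) - k * choose(6 + k, k)
}

# Brute force: add up the decimal digits of every d-digit number.
brute_sum17 <- function(d) {
  n <- seq(10^(d - 1), 10^d - 1)
  s <- numeric(length(n))
  for (p in 0:(d - 1)) s <- s + (n %/% 10^p) %% 10
  sum(s == 17)
}

count_sum17(8)
# [1] 229713

# The two approaches should agree wherever brute force is cheap.
stopifnot(all(sapply(2:5, count_sum17) == sapply(2:5, brute_sum17)))
```

With d = 8 this is the same expression as the one-liner quoted further
down the thread.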


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111



-----Original Message-----
From: baptiste auguie [mailto:baptiste.aug...@googlemail.com]
Sent: Tuesday, January 12, 2010 12:20 PM
To: Greg Snow
Cc: r-help
Subject: Re: [R] expand.grid game

Nice --- am I missing something or was this closed form solution not
entirely trivial to find?

I ought to compile the various clever solutions given in this thread
someday, it's fascinating!

Thanks,

baptiste

2010/1/12 Greg Snow greg.s...@imail.org:

This also has a closed form solution:


choose(16+8-1,7) - choose(7+8-1, 7) - 7*choose(6+8-1,7)

[1] 229713


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111



-----Original Message-----
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Brian Diggs
Sent: Thursday, December 31, 2009 3:08 PM
To: baptiste auguie; David Winsemius
Cc: r-help
Subject: Re: [R] expand.grid game

baptiste auguie wrote:

2009/12/19 David Winsemius dwinsem...@comcast.net:

On Dec 19, 2009, at 9:06 AM, baptiste auguie wrote:


Dear list,

In a little numbers game, I've hit a performance snag and I'm not sure
how to code this in C.

The game is the following: how many 8-digit numbers have the sum of
their digits equal to 17?

And are you considering the number 0089 to be in the acceptable set?
Or is the range of possible numbers in 1079:9800 ?

The latter, the first digit should not be 0. But if you have an
interesting solution for the other case, let me know anyway.

I should also stress that this is only for entertainment and
curiosity's sake.


baptiste



I realize I'm late coming to this, but I was reading it in my
post-vacation catch-up and it sounded interesting so I thought I'd give
it a shot.

After coding a couple of solutions that were exponential in time (for
the number of digits), I rearranged things and came up with something
that is linear in time (for the number of digits) and gives the count
of numbers for all sums at once:

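library(plyr)
nsum3 <- function(digits) {
  digits <- as.integer(digits)[[1L]]
  if (digits == 1) {
    # one-digit numbers 1..9: exactly one number for each sum 1..9
    rep(1, 9)
  } else {
    dm1 <- nsum3(digits - 1)
    # appending digit x shifts every sum up by x; add the ten shifted copies
    Reduce("+", llply(0:9, function(x) {c(rep(0, x), dm1, rep(0, 9 - x))}))
  }
}

nsums <- llply(1:8, nsum3)
nsums[[5]][17]
# [1] 3675
nsums[[8]][17]
# [1] 229713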

The whole thing runs in well under a second on my machine (a several
years old dual core Windows machine).  In the results of nsum3, the
i-th element is the number of numbers whose digits sum to i.  The basic
idea is recursion on the number of digits; if n_{t,d} is the number of
d-digit numbers that sum to t, then
n_{t,d} = \sum_{i=0}^{9} n_{t-i,d-1}. (Adding the digit i to each of
those numbers makes their sum t and increases the digits to d).  When
digits==1, then 0 isn't a valid choice and that also implies the sum of
digits can't be 0, which fits well with the 1 indexing of arrays.
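
The same recurrence can also be written as a loop in base R, without
plyr.  This is only a sketch of the idea described above, not a
replacement for the posted code; the name digit_sum_counts is made up for
illustration.

```r
# Counts of d-digit numbers (first digit non-zero) by digit sum.
# Element t of the result is the number of such numbers with digit sum t.
# digit_sum_counts is a hypothetical name, not from the original post.
digit_sum_counts <- function(digits) {
  counts <- rep(1, 9)  # one-digit numbers 1..9 give sums 1..9 once each
  for (d in seq_len(digits - 1)) {
    # Appending digit i (0..9) shifts every existing sum up by i.
    shifted <- lapply(0:9, function(i) c(rep(0, i), counts, rep(0, 9 - i)))
    counts <- Reduce(`+`, shifted)
  }
  counts
}

digit_sum_counts(8)[17]
# [1] 229713
```

A quick sanity check is that the counts for 8 digits add up to 9e7, the
number of 8-digit numbers.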

--
Brian Diggs, Ph.D.
Senior Research Associate, Department of Surgery, Oregon Health &
Science University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal,

[R] parsing protocol of states

2010-01-12 Thread Andreas Wittmann


Dear R-users,

I am currently trying to parse some state protocols for my work. In an
easy setting the code below works fine, as long as each state is reached
only once. In harder settings a state may be visited several times. In
that case I am interested in the total waiting time that lies between
two states.


By the way, I have not used R as a parsing tool so far, so any advice on
doing this more effectively is very welcome.


# one long comma-separated string; split here only to keep lines short
str01 <- paste(
  "2007-10-12 11:50:05 state B. ,2007-10-12 11:50:05 state C. ,",
  "2007-10-12 13:23:24 state D. ,2007-10-12 13:23:43 state E. ,",
  "2007-10-14 15:43:19 state F. ,2007-10-14 15:43:20 state E. ,",
  "2007-10-14 15:43:25 state G. ,2007-10-14 15:43:32 state H. ,",
  "2007-10-14 15:43:41 state I. ,2007-10-14 15:43:47 state F. ,",
  "2007-10-14 15:43:47 state G. ,2007-10-14 15:48:08 state H. ,",
  "2007-10-16 10:10:20 state J. ,2007-10-19 11:12:54 state K ,",
  "2007-10-19 11:17:37 state D. ,2007-10-19 11:17:42 state E. ,",
  "2007-10-19 11:17:49 state F. ,2007-10-19 11:17:51 state E. ,",
  "2007-10-19 11:17:58 state H. ,2007-10-19 11:18:05 state J. ,",
  "2007-10-19 11:21:45 state L.", sep = "")


str02 <- unlist(strsplit(str01, "\\,"))

x1 <- grep("state B", str02)
x2 <- grep("state C", str02)
x3 <- grep("state D", str02)
x4 <- grep("state E", str02)
x5 <- grep("state F", str02)
x6 <- grep("state G", str02)
x7 <- grep("state H", str02)
x8 <- grep("state I", str02)
x9 <- grep("state J", str02)
x10 <- grep("state K", str02)
x11 <- grep("state L", str02)

t1 <- substr(str02[x1], 1, 19)
t1 <- as.POSIXct(strptime(t1, "%Y-%m-%d %H:%M:%S"))
t2 <- substr(str02[x2], 1, 19)
t2 <- as.POSIXct(strptime(t2, "%Y-%m-%d %H:%M:%S"))
t3 <- substr(str02[x3], 1, 19)
t3 <- as.POSIXct(strptime(t3, "%Y-%m-%d %H:%M:%S"))
t4 <- substr(str02[x4], 1, 19)
t4 <- as.POSIXct(strptime(t4, "%Y-%m-%d %H:%M:%S"))
t5 <- substr(str02[x5], 1, 19)
t5 <- as.POSIXct(strptime(t5, "%Y-%m-%d %H:%M:%S"))
t6 <- substr(str02[x6], 1, 19)
t6 <- as.POSIXct(strptime(t6, "%Y-%m-%d %H:%M:%S"))
t7 <- substr(str02[x7], 1, 19)
t7 <- as.POSIXct(strptime(t7, "%Y-%m-%d %H:%M:%S"))
t8 <- substr(str02[x8], 1, 19)
t8 <- as.POSIXct(strptime(t8, "%Y-%m-%d %H:%M:%S"))
t9 <- substr(str02[x9], 1, 19)
t9 <- as.POSIXct(strptime(t9, "%Y-%m-%d %H:%M:%S"))
t10 <- substr(str02[x10], 1, 19)
t10 <- as.POSIXct(strptime(t10, "%Y-%m-%d %H:%M:%S"))
t11 <- substr(str02[x11], 1, 19)
t11 <- as.POSIXct(strptime(t11, "%Y-%m-%d %H:%M:%S"))

as.numeric(difftime(t11, t1, units="days"))

## waiting times between state E and F
sum(as.numeric(difftime(t5, t4, units="days")))


best regards

Andreas
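
One possible way to handle repeated visits, sketched under an assumption
the post does not settle: "waiting time between E and F" is read here as,
for each visit to E, the time until the next visit to F, summed over all
visits.  The names entries, log_df and waiting_seconds are made up for
illustration, the four entries are taken from the protocol above, and the
time zone is fixed to UTC only to keep the arithmetic reproducible.

```r
# A few entries from the protocol, already split on ",".
entries <- c("2007-10-12 13:23:43 state E. ",
             "2007-10-14 15:43:19 state F. ",
             "2007-10-14 15:43:20 state E. ",
             "2007-10-14 15:43:47 state F. ")

# One row per entry: timestamp (first 19 characters) and state letter.
log_df <- data.frame(
  time  = as.POSIXct(substr(entries, 1, 19),
                     format = "%Y-%m-%d %H:%M:%S", tz = "UTC"),
  state = sub("^.*state ([A-Z]).*$", "\\1", entries),
  stringsAsFactors = FALSE
)

# Total seconds from each visit to `from` until the next visit to `to`.
# Visits to `from` with no later `to` are ignored (an assumption).
waiting_seconds <- function(log_df, from, to) {
  from_secs <- as.numeric(log_df$time[log_df$state == from])
  to_secs   <- as.numeric(log_df$time[log_df$state == to])
  waits <- vapply(from_secs, function(t0) {
    later <- to_secs[to_secs >= t0]
    if (length(later) == 0) NA_real_ else min(later) - t0
  }, numeric(1))
  sum(waits, na.rm = TRUE)
}

waiting_seconds(log_df, "E", "F") / 86400   # total waiting time in days
```

Building the data frame from the full str02 vector works the same way,
and it avoids keeping one index and one time variable per state.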


Re: [R] R for windows 64 bit