Re: [R] [R-SIG-Finance] AsOf join in R

2011-10-06 Thread Ulrich Staudinger

A bit late,  but here is what I always do:

m = merge(bid, ask, tick)
m- interpNA(m, method=before)

intrepNA can also interpolate NAs in different ways, for example linearly.


Hth,
Ulrich


Am 06.10.2011 03:08, schrieb Robert A'gata:

Hi Roupell,

Yes I am aware of RTAQ function matchTradesQuotes. But my time series
does not follow the TAQ format like they suggest. So I gave it a try
and find that it doesn't work. In particular, my time series contain
full level 2 order book and trades. I want to do asof join of the book
to the trade. Is there any better way? Or I misunderstand anything
about the RTAQ's function? Thank you.

Robert

On Wed, Oct 5, 2011 at 1:59 AM, Roupell, Darkodarko.roup...@cba.com.au  wrote:

na.locf {zoo}, should do a job, also if you look in RTAQ they wrote a function 
that looks on previous tick and carries forward value in case its not available.

__
Commonwealth Bank
Darko Roupell
Associate Quantitative Analyst
Institutional Banking  Markets
Equities Research
Darling Park Tower 1
Level 23, 201 Sussex Street
Sydney, NSW 200
P:  +61 2 9117 1254
F:  +61 2 9118 1000
M: +61 400 170 515
E: darko.roup...@cba.com.au
Our vision is to be Australia's finest financial services organisation through 
excelling in customer service.

Email Security
This email is sent solely for informational purposes. Hoax emails, commonly 
referred to as phishing, can appear to be from the Commonwealth Bank and ask 
you to update or confirm details such as client numbers, passwords, personal 
identification questions, contact details or account numbers. The Commonwealth 
Bank will never send you an email asking you to confirm, update or reveal your 
confidential banking information.
Important Information
Produced by Global Markets Research, a business unit of Commonwealth Bank of 
Australia ABN 48 123 123 124 - AFSL 234945 (Commonwealth Bank). This 
publication is based on information available at the time of publishing.  We 
believe that the information in this communication is correct and any opinions, 
conclusions or recommendations are reasonably held or made as at the time of 
its compilation, but no warranty is made as to accuracy, reliability or 
completeness.  To the extent permitted by law, neither Commonwealth Bank nor 
any of its subsidiaries accept liability to any person for loss or damage 
arising from the use of this communication. This communication does not purport 
to be a complete statement or summary.
The information provided has been prepared without considering your objectives, 
financial situation or needs, and before acting on the information, you should 
consider its appropriateness to your circumstances. No person should act on the 
basis of this report without considering and if necessary taking appropriate 
professional advice upon their own particular circumstances.
Commonwealth Bank of Australia, as a provider of investment, borrowing and 
other financial services undertakes financial transactions with many corporate 
entities in Australia. This may include any corporate issuer referred to in 
this communication. Commonwealth Bank and its subsidiaries have effected or may 
effect transactions for their own account in any investments or related 
investments referred to herein. In the case of certain securities Commonwealth 
Bank is or may be the only market maker.

-Original Message-
From: r-sig-finance-boun...@r-project.org 
[mailto:r-sig-finance-boun...@r-project.org] On Behalf Of Robert A'gata
Sent: Wednesday, 5 October 2011 2:41 PM
To: r-help@r-project.org; r-sig-fina...@stat.math.ethz.ch
Subject: [R-SIG-Finance] AsOf join in R

Hi,

I tried to google for any solution for asof join operator in R. But I
couldn't find one. The asof join operator AsOf(A,B) merges 2 time
series by looking for latest available value of B prior to each time
point in A. For example,

A- xts(c(10,15,20,25),
order.by=as.POSIXct(c(2011-09-01,2011-09-09,2011-09-10,2011-09-15))

B- xts(c(1.1,1.5,1.3,1.7),
order.by=as.POSIXct(c(2011-08-31,2011-09-09,2011-09-11,2011-09-12))

AsOf(A,B) should return

A   B
2011-09-0110 1.1
2011-09-0915 1.1 #  (because latest value B prior to
2011-09-09 is 1.1)
2011-09-1020 1.5
2011-09-1525 1.7

How do I write the above AsOf function in R? The merge function does
not do what I want because it will align points that have the same
time stamp together while what I want is actually latest value prior
to timestamp in A. Any example would be greatly appreciated. Thank
you.

Cheers,

Robert

___
r-sig-fina...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should 
go.

** IMPORTANT MESSAGE *
This e-mail message is 

Re: [R] any way to convert back to DateTime class when accidental conversion to numeric?

2011-10-06 Thread Prof Brian Ripley
A more portable way (that function only works in some versions of R) 
is


as.POSIXct(1317857320, origin=1970-01-01)

possibly with a 'tz' argument if you need to restore the timezone.

On Wed, 5 Oct 2011, jim holtman wrote:


Here is what I use:

unix2POSIXct(1317857320)
[1] 2011-10-05 19:28:40 EDT


unix2POSIXct  -  function (time) structure(time, class = c(POSIXt,
POSIXct))


On Wed, Oct 5, 2011 at 7:38 PM, Mike Williamson this.is@gmail.com wrote:

Hi,

   In short, I would like to know if there is any way to convert a numeric
into a date, similar to how strptime() can convert a string to a date time
class?

   There are some functions, etc. which don't work well with dates, and
tend to force them into numerics.  I understand that the number it spits
back is the number of seconds since the beginning of 1970 (see the first few
sentences of the Details portion of ?DateTimeClasses).
   However, it's a bit of a hassle to convert that by hand.  I can create a
function to do this, and it isn't so hard, but I found it hard to believe
such a function didn't already exist, so I wanted to ask the community.

   As an example, today (Oct 5th 2011 at approximately 4:30pm, Pacific
time) is approximately 1317857320 as a numeric, but I would like to know how
to go from that number back to the 2011-10-05 16:28:39 PDT date time class
which originally generated it.

                       Thanks!
                             Mike

---
XKCD http://www.xkcd.com

       [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Mean(s) from values in different row?

2011-10-06 Thread SML
Hello:

Is there a way to get a mean from values stored in different rows? 

The data looks like this:
  YEAR-1, JAN, FEB, ..., DEC
  YEAR-2, JAN, FEB, ..., DEC
  YEAR-3, JAN, FEB, ..., DEC

What I want is the mean(s) for just the consecutive winter months:
  YEAR-1.DEC, YEAR-2.JAN, YEAR-2.FEB
  YEAR-2.DEC, YEAR-3.JAN, YEAR-3.FEB
  etc.

Thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Concecutive zeros and ones

2011-10-06 Thread Alaios
Dear all,
I have  a data series (might be vector or matrix) which is composed only from 
zeros and ones like the following example

0 0 0 1 1 0 1 0 0 1 1 1 1 0 0 0

I want to be able to return back the length of concecutive zeros and the length 
of concecutive ones.

For that I want to have something like that returned:


zeros= [3 1 2 3];
ones=[2 1 4];

How I can do that simply in R?

I would like to thank you in advance for your help


B.R
Alex

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Concecutive zeros and ones

2011-10-06 Thread Dimitris Rizopoulos

Check function rle().


I hope it helps.

Best,
Dimitris



On 10/6/2011 9:18 AM, Alaios wrote:

Dear all,
I have  a data series (might be vector or matrix) which is composed only from 
zeros and ones like the following example

0 0 0 1 1 0 1 0 0 1 1 1 1 0 0 0

I want to be able to return back the length of concecutive zeros and the length 
of concecutive ones.

For that I want to have something like that returned:


zeros= [3 1 2 3];
ones=[2 1 4];

How I can do that simply in R?

I would like to thank you in advance for your help


B.R
Alex

[[alternative HTML version deleted]]




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Populate a matrix

2011-10-06 Thread fernando.cabrera
This last solution is what I was looking for, I was trying to avoid loops. 
Thanks!

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Rainer Schuermann
Sent: 5. oktober 2011 18:29
To: r-help@r-project.org
Subject: Re: [R] Populate a matrix

m - matrix( rep( y, length( x ) ), length( y ), length( x ) )


On Wednesday 05 October 2011 18:11:18 fernando.cabr...@nordea.com wrote:
 Hi guys
 
 I have vectors x - c(1,2,3,4) and y - c(4,3,9) and would like to generate a 
 matrix which has 3 rows (length(y)) and 4 
columns (length(x)), and each row is the corresponding y element repeated 
length(x) times.
 
 4,4,4,4
 3,3,3,3
 9,9,9,9
 
 Thanks.
 
 Fernando Álvarez
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] kriging shapefiles

2011-10-06 Thread Paul Hiemstra
On 10/05/2011 09:07 AM, Leynnard Rey Matillano wrote:
 Hi! Im new to R and I need to interpolate a shapefile using kriging. I've 
 been able to plot/read the shapefile using the package maptools or rgdal. 
 I've searched the internet for sample codes but most of the kriging codes 
 that I've found done in R is done using txtfiles or CSVs.  An example could 
 be of great help. Thanks.
   [[alternative HTML version deleted]]



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

...you could also take a look at the automap package (some blatant self
promotion ;)). This builds on gstat...

Paul

-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reporting multiple objects out of a function

2011-10-06 Thread Paul Hiemstra
On 10/05/2011 09:14 PM, andrewH wrote:
 Thanks, Sina! This is very helpful and informative, but still not quite what
 I want.

 So, here is the thing: When a function returns an object, that object is
 available in the calling environment.  If it is returned inside a function,
 it is available in the function, but not outside of the function.  What I
 want to do is simply to return more than one object in the usual sense in
 which functions return objects.

Hi,

As I understand it, you want to return multiple arguments without
returning them explicitly as an object. This can probably be done, by I
would advice against it because it makes your code harder to read. You
dump something in the calling environment, and a new user (maybe
yourself in a few months) has to do a lot of reasoning of what is
happening under the hood, which object is dumped in which environment. I
would just return a list. Alternatively, take a look at object oriented
programming like Gabor suggested. This, however, still involves
returning an object...

Again, I would recommend doing this the standard R way

cheers,
Paul

 Here is a test to see if a function fun does this, at least to the depth of
 1.

 obj1 - 1
 obj2 - 2

 cat(obj1 in global=, obj1)
 cat(obj2 in global=, obj2)

 wrapFun - function(fun) {
obj1 - 3
obj2 - 4
cat(obj1 in calling=, obj1)
cat(obj2 in calling=, obj2)
fun()
cat(obj in calling=, obj)
cat(obj1 in calling=, obj1)
cat(obj2 in calling=, obj2)
 }

 cat(obj1 in global=, obj1)
 cat(obj2 in global=, obj2)


 Suppose the function fun assigns the values 5 and 6 to obj1 and obj2.  If
 the function does what I want, this code should print:
 obj1 in global=  1
 obj2 in global=  2
 obj1 in calling= 3
 obj2 in calling= 4
 obj1 in calling= 5
 obj2 in calling= 6
 obj1 in global=  1
 obj2 in global=  2

 I turned Paul’s and Sina’s code into functions as follows:
 paulFun - function() {
 obj1 - 5; 
 obj2 - 6; 
 }

 sinaFun - function() {
 attach(what = NULL, name = my_env)
 assign(obj1, 5, envir = as.environment(my_env))
 assign(obj1, 5, envir = as.environment(my_env))
 }

 Running these two functions in the code above yields:

 paulFun:
 obj1 in global= 1
 obj2 in global= 2
 obj1 in calling= 3
 obj2 in calling= 4
 obj1 in calling= 3
 obj2 in calling= 4
 obj1 in global= 5
 obj2 in global= 6

 So paulFun puts the objects in the global environment but not in the calling
 environment. Let’s try sinaFun:

 sinaFun:
 obj1 in global= 1
 obj2 in global= 2 
 obj1 in calling= 3
 obj2 in calling= 4
 obj1 in calling= 3
 obj2 in calling= 4 
 obj1 in global= 1
 obj2 in global= 2

 sinaFun puts the objects in the new environment it defines, but they are
 available in neither the calling nor the global environment.  However, I was
 immediately convinced that Sina had given me the tool I was missing: the
 assign function. (Thanks, Sina!)  But I was wrong (or used it wrong), and
 now I am even more deeply confused.  Here is a function that I thought would
 do what I want:

 andrewFun - function() {
 assign(obj1, 5, pos = sys.parent(n = 1))
 assign(obj2, 6, pos = sys.parent(n = 1))
 NULL
 }

 However, when I tried it, my results were the same as paulFun: assigned in
 the global environment, but not in the calling environment.  Setting n = 0
 seemed to limit the assignment to the interior of andrewFun: none of the
 printed obj values were affected.

 Help?

 andrewH


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/reporting-multiple-objects-out-of-a-function-tp3873380p3876201.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] apply and functions with many arguments

2011-10-06 Thread Alaios
Dear all,
I would like to use the following function

fitdist(data, distr, method=c(mle, mme, qme, mge),
start=NULL, fix.arg=NULL, ...)


for many different distr values like distr=c(norm,lnorm,pois) (just a 
small example)
and take back into a list the parameter name which is what is inside distr plus 
what the function fitdist returns (another list).

How can I do that ?  


B.R
Alex

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] anova.rq {quantreg) - Why do different level of nesting changes the P values?!

2011-10-06 Thread Tal Galili
Hello dear R help members.

I am trying to understand the anova.rq, and I am finding something which I
can not explain (is it a bug?!):

The example is for when we have 3 nested models.  I run the anova once on
the two models, and again on the three models.  I expect that the p.value
for the comparison of model 1 and model 2 would remain the same, whether or
not I add a third model to be compared with.
However, the P values change, and I do not understand why.

Here is an example code (following with it's input):

data(barro)
fit0 - rq(y.net ~  lgdp2 + fse2 , data = barro)
fit1 - rq(y.net ~  lgdp2 + fse2 + gedy2 , data = barro)
fit2 - rq(y.net ~  lgdp2 + fse2 + gedy2 + Iy2 , data = barro)
anova(fit0,fit1,fit2, R = 1000)
anova(fit0,fit1, R = 1000)


Output:

 data(barro)
 fit0 - rq(y.net ~  lgdp2 + fse2 , data = barro)
 fit1 - rq(y.net ~  lgdp2 + fse2 + gedy2 , data = barro)
 fit2 - rq(y.net ~  lgdp2 + fse2 + gedy2 + Iy2 , data = barro)
 anova(fit0,fit1,fit2, R = 1000)
Quantile Regression Analysis of Deviance Table

Model 1: y.net ~ lgdp2 + fse2 + gedy2 + Iy2
Model 2: y.net ~ lgdp2 + fse2 + gedy2
Model 3: y.net ~ lgdp2 + fse2
  Df Resid Df F valuePr(F)
1  1  156  29.494 2.110e-07 ***
2  2  156  18.194 7.901e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 anova(fit0,fit1, R = 1000)
Quantile Regression Analysis of Deviance Table

Model 1: y.net ~ lgdp2 + fse2 + gedy2
Model 2: y.net ~ lgdp2 + fse2
  Df Resid Df F value  Pr(F)
1  1  157  3.9532 0.04852 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Hebrew_Israel.1255  LC_CTYPE=Hebrew_Israel.1255
[3] LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C
[5] LC_TIME=Hebrew_Israel.1255

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods
base

other attached packages:
 [1] rms_3.3-1Hmisc_3.8-3
 survival_2.36-9
 [4] colorspace_1.1-0 quantreg_4.71SparseM_0.89

 [7] PerformanceAnalytics_1.0.3.2 xts_0.8-2zoo_1.7-4

[10] reporttools_1.0.6xtable_1.5-6

loaded via a namespace (and not attached):
[1] cluster_1.14.0  grid_2.13.1 lattice_0.19-33 tools_2.13.1






Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] speed up this algorithm (apply-fuction / 4D array)

2011-10-06 Thread Claudia Beleites

here's another one - which is easier to generalize:

x - array(rnorm(50 * 50 * 50 * 91, 0, 2), dim=c(50, 50, 50, 91))
y - x [,,,1:90] # decide yourself what to do with slice 91, but
 # 91 is not divisible by 3
system.time ({
dim (y) - c (50, 50, 50, 3, 90 %/% 3)
y - aperm (y, c (4, 1:3, 5))
v2 - colMeans (y)
})
   User  System verstrichen
   0.320.080.40

(my computer is a bit slower than Bill's:)
 system.time (v1 - f1 (x))
   User  System verstrichen
  0.360   0.030   0.396

Claudia


Am 05.10.2011 20:24, schrieb William Dunlap:

I corrected your code a bit and put it into a function, f0, to
make testing easier.  I also made a small dataset to make
testing easier.  Then I made a new function f1 which does
what f0 does in a vectorized manner:

   x- array(rnorm(50 * 50 * 50 * 91, 0, 2), dim=c(50, 50, 50, 91))
   xsmall- array(log(seq_len(2 * 2 * 2 * 91)), dim=c(2, 2, 2, 91))

   f0- function(x) {
   data_reduced- array(0, dim=c(dim(x)[1:3], trunc(dim(x)[4]/3)))
   reduce- seq(1, dim(x)[4]-1, by=3)
   for( i in 1:length(reduce) ) {
   data_reduced[ , , , i]- apply(x[ , , , reduce[i] : (reduce[i]+2) ], 
1:3, mean)
  }
  data_reduced
   }

   f1- function(x) {
  reduce- seq(1, dim(x)[4]-1, by=3)
  data_reduced- (x[, , , reduce] + x[, , , reduce+1] + x[, , , reduce+2]) 
/ 3
  data_reduced
   }

The results were:

 system.time(v1- f1(x))
  user  system elapsed
 0.280   0.040   0.323
 system.time(v0- f0(x))
  user  system elapsed
73.760   0.060  73.867
 all.equal(v0, v1)
   [1] TRUE


I thought apply would already vectorize, rather than loop over every 
coordinate.

No, you have that backwards.  Use *apply functions when you cannot figure
out how to vectorize.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Martin Batholdy
Sent: Wednesday, October 05, 2011 10:40 AM
To: R Help
Subject: [R] speed up this algorithm (apply-fuction / 4D array)

Hi,


I have this sample-code (see above) and I was wondering wether it is possible 
to speed things up.



What this code does is the following:

x is 4D array (you can imagine it as x, y, z-coordinates and a time-coordinate).

So x contains 50x50x50 data-arrays for 91 time-points.

Now I want to reduce the 91 time-points.
I want to merge three consecutive time points to one time-points by calculating 
the mean of this three
time-points for every x,y,z coordinate.

The reduce-sequence defines which time-points should get merged.
And the apply-function in the for-loop calculates the mean of the three 
3D-Arrays and puts them into a
new 4D array (data_reduced).



The problem is that even in this example it takes really long.
I thought apply would already vectorize, rather than loop over every coordinate.

But for my actual data-set it takes a really long time ... So I would be really 
grateful for any
suggestions how to speed this up.




x- array(rnorm(50 * 50 * 50 * 90, 0, 2), dim=c(50, 50, 50, 91))



data_reduced- array(0, dim=c(50, 50, 50, 90/3))

reduce- seq(1,90, 3)



for( i in 1:length(reduce) ) {

data_reduced[ , , , i]-apply(x[ , , , reduce[i] : (reduce[i]+3) ], 
1:3, mean)
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany

email: claudia.belei...@ipht-jena.de
phone: +49 3641 206-133
fax:   +49 2641 206-399

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] apply and functions with many arguments

2011-10-06 Thread Paul Hiemstra
On 10/06/2011 08:03 AM, Alaios wrote:
 Dear all,
 I would like to use the following function

 fitdist(data, distr, method=c(mle, mme, qme, mge),
 start=NULL, fix.arg=NULL, ...)


 for many different distr values like distr=c(norm,lnorm,pois) (just a 
 small example)
 and take back into a list the parameter name which is what is inside distr 
 plus what the function fitdist returns (another list).

 How can I do that ?  


 B.R
 Alex

   [[alternative HTML version deleted]]



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

Hi,

Take a look at 'expand.grid' and 'ddply' from the plyr package or
foreach from the foreach package.

cheers,
Paul

-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mean(s) from values in different row?

2011-10-06 Thread Tal Galili
One way for doing it would be to combine the columns using paste and then
use tapply to get the means.

For example:

set.seed(32341)
a1 = sample(c(a,b), 100,replace = T)
a2 = sample(c(a,b), 100,replace = T)
y = rnorm(100)
tapply(y,paste(a1,a2), mean)



Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Thu, Oct 6, 2011 at 8:40 AM, SML paral...@lafn.org wrote:

 Hello:

 Is there a way to get a mean from values stored in different rows?

 The data looks like this:
  YEAR-1, JAN, FEB, ..., DEC
  YEAR-2, JAN, FEB, ..., DEC
  YEAR-3, JAN, FEB, ..., DEC

 What I want is the mean(s) for just the consecutive winter months:
  YEAR-1.DEC, YEAR-2.JAN, YEAR-2.FEB
  YEAR-2.DEC, YEAR-3.JAN, YEAR-3.FEB
  etc.

 Thanks.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] expression set (Bioconductor) problem

2011-10-06 Thread Clayton K Collings
Hello R people,

dim(exprs(estrogenrma)

I have an expressionSet with 8 samples and 12695 features (genes)


 estrogenrma$estrogen
  present present absent absent present present absent absent
 estrogenrma$time.h
  10 10 10 10 48 48 48 48

present - grep(present, as.character(estrogenrma$estrogen))
absent  - grep(absent, as.character(estrogenrma$estrogen))
ten - grep(10, as.character(estrogenrma$time.h))
fortyeight  - grep(48, as.character(estrogenrma$time.h))

present.10 - estrogenrma[, intersect(present, ten)]
present.48 - estrogenrma[, intersect(present, fortyeight)]
absent.10 - estrogenrma[, intersect(absent, ten)]
absent.48 - estrogenrma[, intersect(absent, fortyeight)]


present.10, present.48, absent.10, and absent.48 are four expression sets
with two samples and 12695 features.

How can I make a new 2 new expressionsets, each have 12695 features and one 
sample
where 
expressionset1 = (present.10 + present.48) / 2
expressionset2 = (absent.10 + absent.48) / 2
?

Thanks,
Clayton

- Original Message -
From: Tal Galili tal.gal...@gmail.com
To: SML paral...@lafn.org
Cc: r-help@r-project.org
Sent: Thursday, October 6, 2011 4:09:43 AM
Subject: Re: [R] Mean(s) from values in different row?

One way for doing it would be to combine the columns using paste and then
use tapply to get the means.

For example:

set.seed(32341)
a1 = sample(c(a,b), 100,replace = T)
a2 = sample(c(a,b), 100,replace = T)
y = rnorm(100)
tapply(y,paste(a1,a2), mean)



Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Thu, Oct 6, 2011 at 8:40 AM, SML paral...@lafn.org wrote:

 Hello:

 Is there a way to get a mean from values stored in different rows?

 The data looks like this:
  YEAR-1, JAN, FEB, ..., DEC
  YEAR-2, JAN, FEB, ..., DEC
  YEAR-3, JAN, FEB, ..., DEC

 What I want is the mean(s) for just the consecutive winter months:
  YEAR-1.DEC, YEAR-2.JAN, YEAR-2.FEB
  YEAR-2.DEC, YEAR-3.JAN, YEAR-3.FEB
  etc.

 Thanks.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fitting parabolic function to data

2011-10-06 Thread Henri Mone
Dear R users and experts,

I want to fit a shifted parabolic function with the following
functional form to my data:

f(x)=a0*(x+a1)^2+a2

(a0, a1 and a2 are scaling factors.)
What is standard approach to do this in R? I tried the lm function
in R but I got problems getting the above functional form.

Any help is welcome :) .

Greetings,
Henri

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R CMD check

2011-10-06 Thread Uwe Ligges



On 05.10.2011 23:56, Jeff Breiwick wrote:

Richard M. Heibergerrmhat  temple.edu  writes:



The next thing to check is this item from doc/manual/R-exts.html

 Quoted strings within R-like text are handled specially...

My guess is that the problem is occuring in the .Rd file, not in the .R
file.

Remove the line, or double the \ characters.

Rich


Yes, the error appears to be in the .Rd file so I will modify that. Thanks.


Then you have not deleted the function from its example section where 
backslashes have to be escaped once more.


Uwe Ligges





Jeff

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Does it exist a function for this?

2011-10-06 Thread Jim Lemon

On 10/05/2011 07:42 PM, lunarossa wrote:

I have this kind of matrix, with thousands of cases.

A 2 apple
A 2 peach
A 3 peach
B 1 pear
B 4 peach
B 4 beef
B 7 beef
C 1 peach
D 2 apple
D 5 peach

I have to distinguish, from the other rows, the rows with peach and this
is not a problem.

I also have to discriminate the rows with peach like the second one
(associated with the same two cells A and 2 to apple, see first row)
from the row like the 3rd or the 8th ones, when the first two cells are
unique (A and 3 or C and 1).


Hi lunarossa,
I may be on the wrong track, but you could just stick the three 
components together:


alphanumfruit[,4]-paste(alphanumfruit[,1],
 alphanumfruit[,2],alphanumfruit[,3],sep=)

and the fourth column of your object (which I suspect is a data frame) 
will have elements that can be tested for matching or non-matching. More 
complicated conditions can be accommodated by pasting different 
combinations of the columns together.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting parabolic function to data

2011-10-06 Thread Duncan Murdoch

On 11-10-06 4:59 AM, Henri Mone wrote:

Dear R users and experts,

I want to fit a shifted parabolic function with the following
functional form to my data:

f(x)=a0*(x+a1)^2+a2

(a0, a1 and a2 are scaling factors.)
What is standard approach to do this in R? I tried the lm function
in R but I got problems getting the above functional form.

Any help is welcome :) .


That can be expanded into a regular quadratic:

(a0*a1^2 + a2) + 2*a0*a1*x + a0*x^2

So fit a regular quadratic, and then solve for a0, a1, a2 from the 
resulting coefficients.  The only tricky bit will be computing errors on 
the a's.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] break.axis all range of data

2011-10-06 Thread Jim Lemon

On 10/06/2011 12:31 AM, Heverkuhn Heverkuhn wrote:

..all the point from 8 to 13.

On Wed, Oct 5, 2011 at 8:28 AM, Heverkuhn Heverkuhnheverk...@gmail.comwrote:


The problem with that function is that it does not really separate the
2parts of the graph but it inserts , when style is gap, a blank strip that
cover axis and points. So for example if a insert it at 8 and I set the gap
of length 5 , it would cancel al the point from 8 to 10.

 ...
 I would like to increase the distance between x tick-marks 8 and 9, 
 and not connect the points  x=8 and x=9.


Ah, I think I see what you want. You want an axis like this:

axis(1,at=c(1:8,10:37),labels=1:36)

and to get your points right, you would have to do something like:

plot(c(1:8,10:37),1:36,xaxt=n)
lines(1:8,1:8)
lines(10:37,9:36)

first. This is more or less the reverse of the gap.* functions in the 
plotrix package.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] apply and functions with many arguments

2011-10-06 Thread Alaios
Hello Paul
I have also tried this one
distrList-c(norm,lnorm,pois,exp,gamma,nbinom,geom,logis); 
    return (lapply(distrList, function(distrList) { fitdist(x1,distrList)}));

which seems to work.
I am not sure though if there is a strict performance penalty.

B.R
Alex




From: Paul Hiemstra paul.hiems...@knmi.nl

Cc: R-help@r-project.org R-help@r-project.org
Sent: Thursday, October 6, 2011 10:07 AM
Subject: Re: [R] apply and functions with many arguments


On 10/06/2011 08:03 AM, Alaios wrote: 
Dear all,
I would like to use the following function fitdist(data, distr, method=c(mle, 
mme, qme, mge),
start=NULL, fix.arg=NULL, ...) for many different distr values like 
distr=c(norm,lnorm,pois) (just a small example)
and take back into a list the parameter name which is what is inside distr plus 
what the function fitdist returns (another list). How can I do that ?   B.R
Alex [[alternative HTML version deleted]] 

__ R-help@r-project.org mailing 
list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting 
guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code. 
Hi,

Take a look at 'expand.grid' and 'ddply' from the plyr package or
foreach from the foreach package.

cheers,
Paul


-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova.rq {quantreg) - Why do different level of nesting changes the P values?!

2011-10-06 Thread Eik Vettorazzi
Hi Tal,
you are comparing different things. The Details-section of ?anova.qr
states test the hypothesis that smaller models are adequate relative to
the largest specified model. So in your first anova you compare fit2
with fit1 and fit0, in your second attempt fit1 with fit0, so you have
different base-models to compare with, and consequently different
p-values.

hth.

Am 06.10.2011 10:05, schrieb Tal Galili:
 Hello dear R help members.
 
 I am trying to understand the anova.rq, and I am finding something which I
 can not explain (is it a bug?!):
 
 The example is for when we have 3 nested models.  I run the anova once on
 the two models, and again on the three models.  I expect that the p.value
 for the comparison of model 1 and model 2 would remain the same, whether or
 not I add a third model to be compared with.
 However, the P values change, and I do not understand why.
 
 Here is an example code (following with it's input):
 
 data(barro)
 fit0 - rq(y.net ~  lgdp2 + fse2 , data = barro)
 fit1 - rq(y.net ~  lgdp2 + fse2 + gedy2 , data = barro)
 fit2 - rq(y.net ~  lgdp2 + fse2 + gedy2 + Iy2 , data = barro)
 anova(fit0,fit1,fit2, R = 1000)
 anova(fit0,fit1, R = 1000)
 
 
 Output:
 
 data(barro)
 fit0 - rq(y.net ~  lgdp2 + fse2 , data = barro)
 fit1 - rq(y.net ~  lgdp2 + fse2 + gedy2 , data = barro)
 fit2 - rq(y.net ~  lgdp2 + fse2 + gedy2 + Iy2 , data = barro)
 anova(fit0,fit1,fit2, R = 1000)
 Quantile Regression Analysis of Deviance Table
 
 Model 1: y.net ~ lgdp2 + fse2 + gedy2 + Iy2
 Model 2: y.net ~ lgdp2 + fse2 + gedy2
 Model 3: y.net ~ lgdp2 + fse2
   Df Resid Df F valuePr(F)
 1  1  156  29.494 2.110e-07 ***
 2  2  156  18.194 7.901e-08 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 anova(fit0,fit1, R = 1000)
 Quantile Regression Analysis of Deviance Table
 
 Model 1: y.net ~ lgdp2 + fse2 + gedy2
 Model 2: y.net ~ lgdp2 + fse2
   Df Resid Df F value  Pr(F)
 1  1  157  3.9532 0.04852 *
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 sessionInfo()
 R version 2.13.1 (2011-07-08)
 Platform: i386-pc-mingw32/i386 (32-bit)
 
 locale:
 [1] LC_COLLATE=Hebrew_Israel.1255  LC_CTYPE=Hebrew_Israel.1255
 [3] LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C
 [5] LC_TIME=Hebrew_Israel.1255
 
 attached base packages:
 [1] splines   stats graphics  grDevices utils datasets  methods
 base
 
 other attached packages:
  [1] rms_3.3-1Hmisc_3.8-3
  survival_2.36-9
  [4] colorspace_1.1-0 quantreg_4.71SparseM_0.89
 
  [7] PerformanceAnalytics_1.0.3.2 xts_0.8-2zoo_1.7-4
 
 [10] reporttools_1.0.6xtable_1.5-6
 
 loaded via a namespace (and not attached):
 [1] cluster_1.14.0  grid_2.13.1 lattice_0.19-33 tools_2.13.1
 
 
 
 
 
 
 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |  972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
 www.r-statistics.com (English)
 --
 
   [[alternative HTML version deleted]]
 
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Eik Vettorazzi

Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790

--
Pflichtangaben gemäß Gesetz über elektronische Handelsregister und 
Genossenschaftsregister sowie das Unternehmensregister (EHUG):

Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; 
Gerichtsstand: Hamburg

Vorstandsmitglieder: Prof. Dr. Guido Sauter (Vertreter des Vorsitzenden), Dr. 
Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr. Uwe Koch-Gromus 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subsetting a data frame with multiple values and exclusions.

2011-10-06 Thread natalie.vanzuydam
Thanks.  Such a short and sweet answer that does what it should.

-
Natalie Van Zuydam

PhD Student
University of Dundee
nvanzuy...@dundee.ac.uk
--
View this message in context: 
http://r.789695.n4.nabble.com/Subsetting-a-data-frame-with-multiple-values-and-exclusions-tp3874967p3877472.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] The Sets package and element equality test

2011-10-06 Thread Johnny Paulo
Hi,

I tried to use the sets package yesterday, which seems to be a very powerful
package: thanks to the developers. I wanted to define sets of lists where 2
lists are considered equal if they have the same length.
So, I implemented:

match.2.list - function(l1,l2){
  length(l1)==length(l2)
}

and then defined my cset as:

s - set(list(1,2),list(3,4))
lset - cset(s,matchfun(match.2.list))

so if I now do:
y - list(3,4)
y %e% lset

I get the correct answer, which is TRUE.

But if I do:
x - list(1,8)
x %e% lset

I now get FALSE, even though x is a list of length 2, and should thus match
any of the 2 lists in lset.

I must be doing something wrong; I checked with the doc, but I don't
understand.

Thanks to anyone who can help.

Regards

Johnny

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with .C

2011-10-06 Thread Grigory Alexandrovich

Hello,

first thank you for your answers.

I did not read the whole pdf Writing R Extension, but I read this 
strongly shortened introduction to this subject:


http://www.math.kit.edu/stoch/~lindner/media/.c.call%20extensions.pdf

I get the same error with this C-function:

void test(double * b, int l)
{
 int i;
 for(i=0; i  l ; i++) b[i] +=i;
}



I call it from R like this:

parameter = c(0,0,1,1,1,0,1.5,0.7,0,1.2,0.3);
.C(test, as.double(parameter), as.integer(11))

The programm crashes even in this simple case.
Where can be the error?

Thanks
Grigory Alexandrovich







Answer 1

Without knowing that C code, we cannot know. Have you read Writing R Extensions 
carefully? I.e. take care with memory allocation and printing as mentioned in 
the manual.

Uwe Ligges



Answer 2

This looks like a classic case of not reading the manual, and then compounding it by not 
reading the posting guide. The manual would be the Writing R Extensions pdf 
that comes with R or you can google it. The posting guide is referenced at the bottom of 
this and every other posting on this mailing list.
There are nearly an infinite variety of errors that can lead to a crash, so 
it is really unreasonable of you to pose this question this way and expect constructive 
assistance.
---
Jeff Newmiller The . . Go Live...
DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
---
Sent from my phone. Please excuse my brevity.



Answer 3


It's impossible to say, with such minimal information, but a reasonable
guess is that there is a problem with the declaration of x and y in
foo.c.  These would (I think) need to be declared as double *, not double,
when foo is called from .C().

cheers,

Rolf Turner



Answer 4


Hi,

As other have said, it's very difficult to help you without an example
+ code to know what you are talking about.

That having been said, it seems as if you are just getting your feet
wet in this R -- C bridge, and I'd recommend you checkout the Rcpp
and inline package to help make your life a lot easier ...

-steve










On 04.10.2011 14:04, Grigory Alexandrovich wrote:

Hello,

I wrote a function in C, which works fine if called from the
main-function in C.

But as soon as I try to call this function from R like .C('foo',
as.double(x), as.integer(y)), the programm crashes.

I created a dll with the cmd command R --arch x64 CMD SHLIB foo.c and
loaded it into R with dyn.load().

What can be the cause of such behaviour?
Again, the C-funcion itself works, but not if called from R.

Thanks
Grigory Alexandrovich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] non-cumulative hazard in Cox model with time-dependent covariates

2011-10-06 Thread koshihaku
Dear all,
Is there a way to calculate the non-cumulative hazard (instantaneous
hazard), which is the product of baseline hazard and exp{beta*covariate} ?
I knew in survfit, we can get the estimator of cumulative baseline hazard,
but how can we get the non-cumulative one?

Thank you very much!

Koshihaku

--
View this message in context: 
http://r.789695.n4.nabble.com/non-cumulative-hazard-in-Cox-model-with-time-dependent-covariates-tp3877744p3877744.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About stepwise regression problem

2011-10-06 Thread pigpigmeow
If I used mixed model ( linear + smoothing function)

How can I used p-value to perform stepwise regression???
Is it similar of Pr(|t|) of linear term and P-value of smoothing term but T
value of linear and F  of smoothing term? 
please help!


--
View this message in context: 
http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3877259.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] error messages

2011-10-06 Thread Joyce Flynt
Im doing a project using Rcommander. I have a dataset with 5 variables, BMI 
being one of them. I would like to find the variance, standard deviation, 
range, median, etc of BMI. However, everytime I type a command into the script 
window, (such as var(BMI$BMI,na.rm=TRUE) )an error message pops up saying 
Object BMI not found. I can graph it, work it, do t-test, etc,. It's listed 
as one of the variables in the data set. So I don't understand why it cant find 
BMI. Thanks
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About stepwise regression problem

2011-10-06 Thread Chris Mcowen
Why not use AIC / BIC based model selection which in this case makes more sense 
than artificial P-Values, and eliminates the problems of stepwise regression. 
For background read Burnham and anderson - multimodel selection.

Chris.


On 6 Oct 2011, at 08:14, pigpigmeow wrote:

If I used mixed model ( linear + smoothing function)

How can I used p-value to perform stepwise regression???
Is it similar of Pr(|t|) of linear term and P-value of smoothing term but T
value of linear and F  of smoothing term? 
please help!


--
View this message in context: 
http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3877259.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Concecutive zeros and ones

2011-10-06 Thread Alaios
Great :)




From: Dimitris Rizopoulos d.rizopou...@erasmusmc.nl

Cc: R-help@r-project.org R-help@r-project.org
Sent: Thursday, October 6, 2011 9:22 AM
Subject: Re: [R] Concecutive zeros and ones

Check function rle().


I hope it helps.

Best,
Dimitris



On 10/6/2011 9:18 AM, Alaios wrote:
 Dear all,
 I have  a data series (might be vector or matrix) which is composed only from 
 zeros and ones like the following example

 0 0 0 1 1 0 1 0 0 1 1 1 1 0 0 0

 I want to be able to return back the length of concecutive zeros and the 
 length of concecutive ones.

 For that I want to have something like that returned:


 zeros= [3 1 2 3];
 ones=[2 1 4];

 How I can do that simply in R?

 I would like to thank you in advance for your help


 B.R
 Alex

     [[alternative HTML version deleted]]




 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reg : read missing values from database using RJDBC

2011-10-06 Thread Uwe Ligges



On 05.10.2011 17:10, Raji wrote:

Hi All,

  This seems to be a bug with RJDBC package and it has been fixed in the
latest RJDBC_0.1-6 version. I would like to try out RJDBC_0.1-6. can you
please guide me to a link where i can find the 64-bit RJDBC_0.1-6.zip . I
could find the 32-bit version at
http://cran.sixsigmaonline.org/bin/windows/contrib/2.11/ ?


RJDBC_0.2-0 seesm to be recent and binaries are available for recent 
versions of R on all CRAN mirrors. If you need a binary for an ancient 
version of R, you have to compile it yourself.


Uwe Ligges






Thanks in advance.

--
View this message in context: 
http://r.789695.n4.nabble.com/Reg-read-missing-values-from-database-using-RJDBC-tp3257766p3874809.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova.rq {quantreg) - Why do different level of nesting changes the P values?!

2011-10-06 Thread Tal Galili
Thank you Eik - this makes much more sense now.



Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Thu, Oct 6, 2011 at 1:21 PM, Eik Vettorazzi e.vettora...@uke.de wrote:

 Hi Tal,
 you are comparing different things. The Details-section of ?anova.qr
 states test the hypothesis that smaller models are adequate relative to
 the largest specified model. So in your first anova you compare fit2
 with fit1 and fit0, in your second attempt fit1 with fit0, so you have
 different base-models to compare with, and consequently different
 p-values.

 hth.

 Am 06.10.2011 10:05, schrieb Tal Galili:
  Hello dear R help members.
 
  I am trying to understand the anova.rq, and I am finding something which
 I
  can not explain (is it a bug?!):
 
  The example is for when we have 3 nested models.  I run the anova once on
  the two models, and again on the three models.  I expect that the p.value
  for the comparison of model 1 and model 2 would remain the same, whether
 or
  not I add a third model to be compared with.
  However, the P values change, and I do not understand why.
 
  Here is an example code (following with it's input):
 
  data(barro)
  fit0 - rq(y.net ~  lgdp2 + fse2 , data = barro)
  fit1 - rq(y.net ~  lgdp2 + fse2 + gedy2 , data = barro)
  fit2 - rq(y.net ~  lgdp2 + fse2 + gedy2 + Iy2 , data = barro)
  anova(fit0,fit1,fit2, R = 1000)
  anova(fit0,fit1, R = 1000)
 
 
  Output:
 
  data(barro)
  fit0 - rq(y.net ~  lgdp2 + fse2 , data = barro)
  fit1 - rq(y.net ~  lgdp2 + fse2 + gedy2 , data = barro)
  fit2 - rq(y.net ~  lgdp2 + fse2 + gedy2 + Iy2 , data = barro)
  anova(fit0,fit1,fit2, R = 1000)
  Quantile Regression Analysis of Deviance Table
 
  Model 1: y.net ~ lgdp2 + fse2 + gedy2 + Iy2
  Model 2: y.net ~ lgdp2 + fse2 + gedy2
  Model 3: y.net ~ lgdp2 + fse2
Df Resid Df F valuePr(F)
  1  1  156  29.494 2.110e-07 ***
  2  2  156  18.194 7.901e-08 ***
  ---
  Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  anova(fit0,fit1, R = 1000)
  Quantile Regression Analysis of Deviance Table
 
  Model 1: y.net ~ lgdp2 + fse2 + gedy2
  Model 2: y.net ~ lgdp2 + fse2
Df Resid Df F value  Pr(F)
  1  1  157  3.9532 0.04852 *
  ---
  Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  sessionInfo()
  R version 2.13.1 (2011-07-08)
  Platform: i386-pc-mingw32/i386 (32-bit)
 
  locale:
  [1] LC_COLLATE=Hebrew_Israel.1255  LC_CTYPE=Hebrew_Israel.1255
  [3] LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C
  [5] LC_TIME=Hebrew_Israel.1255
 
  attached base packages:
  [1] splines   stats graphics  grDevices utils datasets  methods
  base
 
  other attached packages:
   [1] rms_3.3-1Hmisc_3.8-3
   survival_2.36-9
   [4] colorspace_1.1-0 quantreg_4.71
  SparseM_0.89
 
   [7] PerformanceAnalytics_1.0.3.2 xts_0.8-2zoo_1.7-4
 
  [10] reporttools_1.0.6xtable_1.5-6
 
  loaded via a namespace (and not attached):
  [1] cluster_1.14.0  grid_2.13.1 lattice_0.19-33 tools_2.13.1
 
 
 
 
 
 
  Contact
  Details:---
  Contact me: tal.gal...@gmail.com |  972-52-7275845
  Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
  www.r-statistics.com (English)
 
 --
 
[[alternative HTML version deleted]]
 
 
 
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.


 --
 Eik Vettorazzi

 Department of Medical Biometry and Epidemiology
 University Medical Center Hamburg-Eppendorf

 Martinistr. 52
 20246 Hamburg

 T ++49/40/7410-58243
 F ++49/40/7410-57790

 --
 Pflichtangaben gemäß Gesetz über elektronische Handelsregister und
 Genossenschaftsregister sowie das Unternehmensregister (EHUG):

 Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen
 Rechts; Gerichtsstand: Hamburg

 Vorstandsmitglieder: Prof. Dr. Guido Sauter (Vertreter des Vorsitzenden),
 Dr. Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr. Uwe Koch-Gromus



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Titles changing when a plot is redrawn

2011-10-06 Thread John Nolan
Thank you for telling me a fix.  

But I still don't know if this behavior is what is intended.  I used 
bquote(...) because the plotmath(...) help page refers to bquote and gives an 
example like this.  I suspect most users will be baffled by this kind of 
behavior, especially since it does not occur when there is one plot.  By this I 
mean that I can draw one plot and title it with the same string using bquote( 
).  If I change the value of i, and redraw the graph, the redrawn graph has the 
original value of i in the title, not the updated value. So in this case, an 
unevaluated expression is not re-evaluated at draw time?

John 


-xieyi...@gmail.com wrote: -
To: John Nolan jpno...@american.edu
From: Yihui Xie 
Sent by: xieyi...@gmail.com
Date: 10/05/2011 11:49PM
Cc: r-help@r-project.org
Subject: Re: [R] Titles changing when a plot is redrawn

I think the problem is your str1 is an unevaluated expression and will
change with the value of i. You should be able to get a fixed title by
this:

par(mfrow = c(2, 1))
for (i in 1:2) {
x - 1:100
rmse - sin(x/5)  # fake data
plot(x, rmse, main = substitute(list(RMSE(theta), i == z), list(z = i)))
}

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



On Wed, Oct 5, 2011 at 10:01 PM, John Nolan jpno...@american.edu wrote:
 I ran into a problem with titles on graphs.  I wanted a graph with
 multiple subplots, with each having a title that involved both
 a Greek letter and an identifier for each graph.  Below is a
 simplified version of code to do this.  The graph appears fine,
 with the first graph having i=1 in the title, and the second
 graph having i=2 in the title.  However, when I resize the graph,
 the plot titles change, with both showing i=2. The titles also
 change when I save the plot to a file using the File menu,
 then Save as in Windows.  Is this what should happen?  I
 always thought that titles are static once the graph is
 drawn, and couldn't change.

 The problem occurs on some version of R, but not on others.
 It does occur with the latest version of R:
 str(R.Version())
 List of 13
  $ platform  : chr i386-pc-mingw32
  $ arch  : chr i386
  $ os: chr mingw32
  $ system: chr i386, mingw32
  $ status: chr 
  $ major : chr 2
  $ minor : chr 13.2
  $ year  : chr 2011
  $ month : chr 09
  $ day   : chr 30
  $ svn rev   : chr 57111
  $ language  : chr R
  $ version.string: chr R version 2.13.2 (2011-09-30)

 The problem also occurs on:  R 2.13.0 on Win32
  and Mac (R 2.12.0, x86_64-apple-darwin9.8.0)
 The problem DOES NOT occur under R 2.10.0 on Win32.

 If the code below is bracketed with pdf(test.pdf)
 and dev.off(), the correct labels appear in the file.
 This behavior doesn't seem to appear if there is only
 one plot.

 My guess is that the titles are being reevaluated when
 the plot is redrawn, and since the value of i is 2 when
 the redraw occurs, both labels get set to i=2.  I guess
 Save as forces a redraw because a dialog box pops up?

 If could be that this behavior is what is intended, and that
 somewhere between R 2.10.0 and R 2.13.2 an old bug was fixed.
 Or this behavior is not what was intended, and a bug was
 introduced.  If the former, this should be explained to the user
 somewhere.  If the latter, can someone track it down and fix?

 John Nolan

 #-
 par(mfrow=c(2,1))
 for (i in 1:2) {
  x - 1:100
  rmse - sin(x/5)  # fake data
  plot(x,rmse)
  str1 - bquote( paste(RMSE(,theta,), ,i==.(i)  ))
  title( str1 )
 }
 #-


  ...

  John P. Nolan
  Math/Stat Department
  227 Gray Hall
  American University
  4400 Massachusetts Avenue, NW
  Washington, DC 20016-8050

  jpno...@american.edu
  202.885.3140 voice
  202.885.3155 fax
  http://academic2.american.edu/~jpnolan
  ...
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with .C

2011-10-06 Thread Jan van der Laan
An obvious reason might be that your second argument should be a  
pointer to int.


As others have mentioned, you might want to have a look at Rccp and/or  
inline. The documentation is good and I find it much easier to work  
with.


For example, your example could be written as:

library(Rcpp)
library(inline)

test - cxxfunction(signature(x = numeric ) , '
Rcpp::NumericVector v(x);
Rcpp::NumericVector result(v.length());
for (int i = 0; i  v.length(); ++i) {
result[i] = v[i] + i;
}
return(result);
', plugin = Rcpp )


HTH,

Jan


Quoting Grigory Alexandrovich alexandrov...@mathematik.uni-marburg.de:


Hello,

first thank you for your answers.

I did not read the whole pdf Writing R Extension, but I read this
strongly shortened introduction to this subject:

http://www.math.kit.edu/stoch/~lindner/media/.c.call%20extensions.pdf

I get the same error with this C-function:

void test(double * b, int l)
{
 int i;
 for(i=0; i  l ; i++) b[i] +=i;
}



I call it from R like this:

parameter = c(0,0,1,1,1,0,1.5,0.7,0,1.2,0.3);
.C(test, as.double(parameter), as.integer(11))

The programm crashes even in this simple case.
Where can be the error?

Thanks
Grigory Alexandrovich







Answer 1
Without knowing that C code, we cannot know. Have you read Writing   
R Extensions carefully? I.e. take care with memory allocation and   
printing as mentioned in the manual.


Uwe Ligges



Answer 2
This looks like a classic case of not reading the manual, and then   
compounding it by not reading the posting guide. The manual would   
be the Writing R Extensions pdf that comes with R or you can   
google it. The posting guide is referenced at the bottom of this   
and every other posting on this mailing list.
There are nearly an infinite variety of errors that can lead to a   
crash, so it is really unreasonable of you to pose this question   
this way and expect constructive assistance.

---
Jeff Newmiller The . . Go Live...
DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
---
Sent from my phone. Please excuse my brevity.



Answer 3


It's impossible to say, with such minimal information, but a reasonable
guess is that there is a problem with the declaration of x and y in
foo.c.  These would (I think) need to be declared as double *, not double,
when foo is called from .C().

   cheers,

   Rolf Turner



Answer 4


Hi,

As other have said, it's very difficult to help you without an example
+ code to know what you are talking about.

That having been said, it seems as if you are just getting your feet
wet in this R -- C bridge, and I'd recommend you checkout the Rcpp
and inline package to help make your life a lot easier ...

-steve










On 04.10.2011 14:04, Grigory Alexandrovich wrote:

Hello,

I wrote a function in C, which works fine if called from the
main-function in C.

But as soon as I try to call this function from R like .C('foo',
as.double(x), as.integer(y)), the programm crashes.

I created a dll with the cmd command R --arch x64 CMD SHLIB foo.c and
loaded it into R with dyn.load().

What can be the cause of such behaviour?
Again, the C-funcion itself works, but not if called from R.

Thanks
Grigory Alexandrovich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Running *slow*

2011-10-06 Thread Thomas
Anyone got any hints on how to make this code more efficient? An early  
version (which to be fair did more than this one is) ran for 330 hours  
and produced no output.


I have a two column table, Dat, with 12,000,000 rows and I want to  
produce a lookup table, ltable, in a 1 dimensional matrix with one  
copy of each of the values in Dat:


for (i in 1:nrow(Dat))
{
for (j in 1:2)
{
#If next value is already in ltable, do nothing
if (is.na(match(Dat[i,j], ltable))){ltable - rbind(ltable,Dat[i,j])}
}
}

but it takes forever to produce anything.

Any advice gratefully received.

Thomas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fwd: Re: Party extract BinaryTree from cforest?

2011-10-06 Thread Torsten Hothorn



-- Forwarded message --
Date: Wed, 5 Oct 2011 21:09:41 +
From: Chris christopher.a.h...@gmail.com
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] Party extract BinaryTree from cforest?

I found an internal workaround to this to support printing and plot type 
simple,


tt-party:::prettytree(cf at ensemble[[1]], names(cf at data at 
get(input)))

npt - new(BinaryTree)
npt@tree-tt
plot(npt)

Error in terminal_panel(S4 object of class BinaryTree) :
 âctreeobjâ is not a regression tree


library(party)
cf - cforest(Species ~ ., data = iris)
pt - party:::prettytree(cf@ensemble[[1]], names(cf@data@get(input)))
pt
nt - new(BinaryTree)
nt@tree - pt
nt@data - cf@data
nt@responses - cf@responses
nt
plot(nt)

will do.

Torsten


plot(npt, type=simple)


Any additional help is appreciated.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error messages

2011-10-06 Thread Ben Bolker
Joyce Flynt dr.joyceflynt at yahoo.com writes:

 
 Im doing a project using Rcommander. I have a dataset with 5 variables, 
 BMI being one of them. I would like to
 find the variance, standard deviation, range, median, etc of BMI. 
 However, everytime I type a command
 into the script window, (such as var(BMI$BMI,na.rm=TRUE) )
 an error message pops up saying Object BMI
 not found. I can graph it, work it, do t-test, etc,.
  It's listed as one of the variables in the data set.
  So I don't understand why it cant find BMI. Thanks


 It's hard to solve without a *reproducible* example.
Is your data set really called 'BMI'?  (i.e. should you
be saying BMI$BMI, or just BMI (if your data frame is
attached), or mydata$BMI? If your data set *is* called
BMI, does it make a difference if you rename it to e.g. BMIdat?

  What are the results of

ls()

?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with .C

2011-10-06 Thread Uwe Ligges

On 06.10.2011 14:51, Jan van der Laan wrote:

An obvious reason might be that your second argument should be a pointer
to int.

As others have mentioned, you might want to have a look at Rccp and/or
inline. The documentation is good and I find it much easier to work with.

For example, your example could be written as:

library(Rcpp)
library(inline)

test - cxxfunction(signature(x = numeric ) , '
Rcpp::NumericVector v(x);
Rcpp::NumericVector result(v.length());
for (int i = 0; i  v.length(); ++i) {
result[i] = v[i] + i;
}
return(result);
', plugin = Rcpp )




Oh, come on, this is now really too much of overkill.

Just make the original source


void test(double *b, int *l)
{
 int i;
 for(i=0; i  *l ; i++) b[i] += i;
}


which you would have know after reading the Wriiting R Extensions manual.

Best,
Uwe Ligges





HTH,

Jan


Quoting Grigory Alexandrovich alexandrov...@mathematik.uni-marburg.de:


Hello,

first thank you for your answers.

I did not read the whole pdf Writing R Extension, but I read this
strongly shortened introduction to this subject:

http://www.math.kit.edu/stoch/~lindner/media/.c.call%20extensions.pdf

I get the same error with this C-function:

void test(double * b, int l)
{
int i;
for(i=0; i  l ; i++) b[i] +=i;
}



I call it from R like this:

parameter = c(0,0,1,1,1,0,1.5,0.7,0,1.2,0.3);
.C(test, as.double(parameter), as.integer(11))

The programm crashes even in this simple case.
Where can be the error?

Thanks
Grigory Alexandrovich







Answer 1

Without knowing that C code, we cannot know. Have you read Writing R
Extensions carefully? I.e. take care with memory allocation and
printing as mentioned in the manual.

Uwe Ligges



Answer 2

This looks like a classic case of not reading the manual, and then
compounding it by not reading the posting guide. The manual would be
the Writing R Extensions pdf that comes with R or you can google
it. The posting guide is referenced at the bottom of this and every
other posting on this mailing list.
There are nearly an infinite variety of errors that can lead to a
crash, so it is really unreasonable of you to pose this question
this way and expect constructive assistance.
---

Jeff Newmiller The . . Go Live...
DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
---

Sent from my phone. Please excuse my brevity.



Answer 3


It's impossible to say, with such minimal information, but a reasonable
guess is that there is a problem with the declaration of x and y in
foo.c. These would (I think) need to be declared as double *, not
double,
when foo is called from .C().

cheers,

Rolf Turner



Answer 4


Hi,

As other have said, it's very difficult to help you without an example
+ code to know what you are talking about.

That having been said, it seems as if you are just getting your feet
wet in this R -- C bridge, and I'd recommend you checkout the Rcpp
and inline package to help make your life a lot easier ...

-steve










On 04.10.2011 14:04, Grigory Alexandrovich wrote:

Hello,

I wrote a function in C, which works fine if called from the
main-function in C.

But as soon as I try to call this function from R like .C('foo',
as.double(x), as.integer(y)), the programm crashes.

I created a dll with the cmd command R --arch x64 CMD SHLIB foo.c and
loaded it into R with dyn.load().

What can be the cause of such behaviour?
Again, the C-funcion itself works, but not if called from R.

Thanks
Grigory Alexandrovich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to install a locally built package

2011-10-06 Thread Bond, Stephen
Is there a way to put all R code in a single file? 

I have too many small files now, and it is inconvenient to edit (I still have 
to put everything in one buffer) and when I add just one new func I have to go 
through the process of manually editing all help files one by one. When I put 
all the code in one file only the first func is loaded.

Thanks everybody

Stephen B

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] 
Sent: Friday, September 16, 2011 3:43 AM
To: Bond, Stephen
Cc: r-help@r-project.org
Subject: Re: [R] how to install a locally built package



On 15.09.2011 20:38, Bond, Stephen wrote:
 I got it working by typing a string in getdfv.Rd after \title{

 \title{ getdf
 %%  ~~function to do ... ~~
 }

 Strange why the skeleton would not do that given it did \name{getdfv} 
 \alias{getdfv}

One reason is that we try to force people to deal with the documentation which 
we feel is very important, even if the stuff is just used by yourself.

Best,
Uwe Ligges



 Anyway, I'm happy now.

 Stephen Bond

 -Original Message-
 From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
 Sent: Thursday, September 15, 2011 12:15 PM
 To: Bond, Stephen
 Cc: r-help@r-project.org
 Subject: Re: [R] how to install a locally built package



 On 15.09.2011 17:47, Bond, Stephen wrote:
 Uwe,

 That gave me the same error like CMD install
 install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, 
 type=source)

 install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, 
 type=source) Loading required package: stats Loading required 
 package: utils Loading required package: graphics Loading required 
 package: splines
 * installing *source* package 'mypack' ...
 ** R
 ** preparing package for lazy loading
 ** help
 Warning: 
 C:/DOCUME~1/BondStep/LOCALS~1/Temp/Rtmpw5N7dm/R.INSTALL6ccc14e6/mypac
 k/man/mypack-package.Rd:33: All text must be in a section
 *** installing help indices
 Error in Rd_info(db[[i]]) : Rd files must have a non-empty \title.
 See chapter 'Writing R documentation' in manual 'Writing R Extensions'.
 * removing 'c:/PROGRA~1/R/R-212~1.1/library/mypack'
 * restoring previous 'c:/PROGRA~1/R/R-212~1.1/library/mypack'
 Warning message:
 In install.packages(C:/Temp/mypack_1.0.tar.gz, repos = NULL, type = 
 source) :
 installation of package 'C:/Temp/mypack_1.0.tar.gz' had non-zero 
 exit status

 is there a way to skip the Rd part? This is for private use only and there 
 is no help or data files.


 Then you have to delete the Rd file that has been generated by 
 package.skeleton.
 Please read the manuals Writing R Extensions and R Installation and 
 Administration.

 Best,
 Uwe Ligges



 Thank you.

 Stephen Bond

 -Original Message-
 From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
 Sent: Thursday, September 15, 2011 11:43 AM
 To: Bond, Stephen
 Cc: r-help@r-project.org
 Subject: Re: [R] how to install a locally built package



 On 15.09.2011 17:32, Bond, Stephen wrote:
 Hello useRs,

 I am trying to put my funcs in a package to avoid clutter in my workspace 
 as the funcs are now in .Rprofile.
 All plain R code no compilations. Using win XP with a full cygwin 
 install and R2.12 I first did

 package.skeleton(mypack,list=getdfv, namespace=T) # just a 
 single func

 which created a folder with the required stuff in it.
 Calling R CMD build creates a tar.gz file with warnings about DOS paths.

 Trying to install from the R GUI complains
 install.packages(mypack,repos=C:/Temp/mypack_1.0.tar.gz)
 Warning: unable to access index for repository 
 C:/Temp/mypack_1.0.tar.gz/bin/windows/contrib/2.12
 Warning message:
 In getDependencies(pkgs, dependencies, available, lib) :
  package 'mypack' is not available


 install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, 
 type=source)

 seems to be what you want.

 Uwe Ligges


 putting the mypack folder in zip and then trying to install from the 
 zip produces no error message but then

 library(mypack)
 Error in library(mypack) : 'mypack' is not a valid installed 
 package

 Just putting the folder in the library folder of R did not work 
 either

 Finally
 R CMD Install complains about missing title in the Rd file. I do not have a 
 Rd file as the skeleton did not create it.

 Please, advise how to make simple R code available to be called 
 without showing in ls()

 Thank you.
 Stephen Bond


 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to install a locally built package

2011-10-06 Thread Uwe Ligges



On 06.10.2011 15:10, Bond, Stephen wrote:

Is there a way to put all R code in a single file?

I have too many small files now, and it is inconvenient to edit (I still have 
to put everything in one buffer) and when I add just one new func I have to go 
through the process of manually editing all help files one by one. When I put 
all the code in one file only the first func is loaded.


He? You really do not need to edit anything when you add a function.
Just don't use package skeleton but just add the function.

Uwe Ligges





Thanks everybody

Stephen B

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Friday, September 16, 2011 3:43 AM
To: Bond, Stephen
Cc: r-help@r-project.org
Subject: Re: [R] how to install a locally built package



On 15.09.2011 20:38, Bond, Stephen wrote:

I got it working by typing a string in getdfv.Rd after \title{

\title{ getdf
%%  ~~function to do ... ~~
}

Strange why the skeleton would not do that given it did \name{getdfv}
\alias{getdfv}


One reason is that we try to force people to deal with the documentation which 
we feel is very important, even if the stuff is just used by yourself.

Best,
Uwe Ligges




Anyway, I'm happy now.

Stephen Bond

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Thursday, September 15, 2011 12:15 PM
To: Bond, Stephen
Cc: r-help@r-project.org
Subject: Re: [R] how to install a locally built package



On 15.09.2011 17:47, Bond, Stephen wrote:

Uwe,

That gave me the same error like CMD install

install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL,
type=source)


install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL,
type=source) Loading required package: stats Loading required
package: utils Loading required package: graphics Loading required
package: splines
* installing *source* package 'mypack' ...
** R
** preparing package for lazy loading
** help
Warning:
C:/DOCUME~1/BondStep/LOCALS~1/Temp/Rtmpw5N7dm/R.INSTALL6ccc14e6/mypac
k/man/mypack-package.Rd:33: All text must be in a section
*** installing help indices
Error in Rd_info(db[[i]]) : Rd files must have a non-empty \title.
See chapter 'Writing R documentation' in manual 'Writing R Extensions'.
* removing 'c:/PROGRA~1/R/R-212~1.1/library/mypack'
* restoring previous 'c:/PROGRA~1/R/R-212~1.1/library/mypack'
Warning message:
In install.packages(C:/Temp/mypack_1.0.tar.gz, repos = NULL, type = source) 
:
 installation of package 'C:/Temp/mypack_1.0.tar.gz' had non-zero
exit status

is there a way to skip the Rd part? This is for private use only and there is 
no help or data files.



Then you have to delete the Rd file that has been generated by
package.skeleton.
Please read the manuals Writing R Extensions and R Installation and
Administration.

Best,
Uwe Ligges




Thank you.

Stephen Bond

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Thursday, September 15, 2011 11:43 AM
To: Bond, Stephen
Cc: r-help@r-project.org
Subject: Re: [R] how to install a locally built package



On 15.09.2011 17:32, Bond, Stephen wrote:

Hello useRs,

I am trying to put my funcs in a package to avoid clutter in my workspace as 
the funcs are now in .Rprofile.
All plain R code no compilations. Using win XP with a full cygwin
install and R2.12 I first did

package.skeleton(mypack,list=getdfv, namespace=T) # just a
single func

which created a folder with the required stuff in it.
Calling R CMD build creates a tar.gz file with warnings about DOS paths.

Trying to install from the R GUI complains

install.packages(mypack,repos=C:/Temp/mypack_1.0.tar.gz)

Warning: unable to access index for repository
C:/Temp/mypack_1.0.tar.gz/bin/windows/contrib/2.12
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package 'mypack' is not available



install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL,
type=source)

seems to be what you want.

Uwe Ligges



putting the mypack folder in zip and then trying to install from the
zip produces no error message but then


library(mypack)

Error in library(mypack) : 'mypack' is not a valid installed
package

Just putting the folder in the library folder of R did not work
either

Finally
R CMD Install complains about missing title in the Rd file. I do not have a Rd 
file as the skeleton did not create it.

Please, advise how to make simple R code available to be called
without showing in ls()

Thank you.
Stephen Bond


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 

Re: [R] Running *slow*

2011-10-06 Thread R. Michael Weylandt
?unique

x - matrix(c(1:6, 6:1),ncol=2)

x.temp - x
dim(x.temp) - NULL
unique(x.temp)

Michael


On Thu, Oct 6, 2011 at 8:37 AM, Thomas chesney@gmail.com wrote:
 Anyone got any hints on how to make this code more efficient? An early
 version (which to be fair did more than this one is) ran for 330 hours and
 produced no output.

 I have a two column table, Dat, with 12,000,000 rows and I want to produce a
 lookup table, ltable, in a 1 dimensional matrix with one copy of each of the
 values in Dat:

 for (i in 1:nrow(Dat))
 {
 for (j in 1:2)
 {
 #If next value is already in ltable, do nothing
 if (is.na(match(Dat[i,j], ltable))){ltable - rbind(ltable,Dat[i,j])}
 }
 }

 but it takes forever to produce anything.

 Any advice gratefully received.

 Thomas

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] break.axis all range of data

2011-10-06 Thread Heverkuhn Heverkuhn
Thank you Jim
It seems exactly what I was looking for :)

Claudio

On Thu, Oct 6, 2011 at 5:11 AM, Jim Lemon j...@bitwrit.com.au wrote:

 On 10/06/2011 12:31 AM, Heverkuhn Heverkuhn wrote:

 ..all the point from 8 to 13.

 On Wed, Oct 5, 2011 at 8:28 AM, Heverkuhn Heverkuhnheverk...@gmail.com*
 *wrote:

  The problem with that function is that it does not really separate the
 2parts of the graph but it inserts , when style is gap, a blank strip
 that
 cover axis and points. So for example if a insert it at 8 and I set the
 gap
 of length 5 , it would cancel al the point from 8 to 10.

  ...

  I would like to increase the distance between x tick-marks 8 and 9, 
 and not connect the points  x=8 and x=9.

 Ah, I think I see what you want. You want an axis like this:

 axis(1,at=c(1:8,10:37),labels=**1:36)

 and to get your points right, you would have to do something like:

 plot(c(1:8,10:37),1:36,xaxt=**n)
 lines(1:8,1:8)
 lines(10:37,9:36)

 first. This is more or less the reverse of the gap.* functions in the
 plotrix package.

 Jim


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Usng MCMCpack, error is initial value in vmmin is not finite

2011-10-06 Thread Uwe Ligges



On 05.10.2011 20:38, yiy83102 wrote:


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gamm: problems with corCAR1()

2011-10-06 Thread Ben Bolker
Karel V karel.viaene at ugent.be writes:

 
 Dear all,
 
 I’m analyzing this dataset containing biodiversity indices, measured over
 time (Week), and at various contaminant concentrations (Treatment). We have
 two replicates (Replicate) per treatment. 
 I’m looking for the effects of time (Week) and contaminant concentration
 (Treatment) on diversity indices (e.g. richness).


 [snip snip snip snip]

  You might do better with this question on the r-sig-mixed-models
mailing list (r-sig-mixed-mod...@r-project.org)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-SIG-Finance] AsOf join in R

2011-10-06 Thread Gabor Grothendieck
On Thu, Oct 6, 2011 at 2:15 AM, Ulrich Staudinger ustaudin...@gmail.com wrote:
 A bit late,  but here is what I always do:

 m = merge(bid, ask, tick)
 m- interpNA(m, method=before)

 intrepNA can also interpolate NAs in different ways, for example linearly.


Its not clear precisely what sort of objects bid, ask and tick were
intended to be in your example but none of the tries below produce 1.1
for Sep 9th as asked for:

 library(xts)
 library(timeSeries)
 A - xts(c(10,15,20,25),
+ order.by=as.POSIXct(c(2011-09-01,2011-09-09,2011-09-10,2011-09-15)))

 B - xts(c(1.1,1.5,1.3,1.7),
+ order.by=as.POSIXct(c(2011-08-31,2011-09-09,2011-09-11,2011-09-12)))

 #1
 interpNA(merge(A, B), method = before)
A   B
2011-08-31 NA 1.1
2011-09-01 10 1.1
2011-09-09 15 1.5
2011-09-10 20 1.5
2011-09-11 20 1.3
2011-09-12 20 1.7
2011-09-15 25  NA

 #2
 interpNA(merge(as.timeSeries(A), as.timeSeries(B)), method = before)
GMT
   TS.1
2011-08-31  1.1
2011-09-01 10.0
2011-09-09  1.5
2011-09-09 15.0
2011-09-10 20.0
2011-09-11  1.3
2011-09-12  1.7
2011-09-15 25.0

 #3
 interpNA(as.timeSeries(merge(A, B)), method = before)
GMT
A   B
2011-08-31 NA 1.1
2011-09-01 10 1.1
2011-09-09 15 1.5
2011-09-10 20 1.5
2011-09-11 20 1.3
2011-09-12 20 1.7
2011-09-15 25  NA

interpNA seems to use approx underneath so its functionality is
similar to na.approx in zoo/xts but with different defaults.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Wilcox Test / Mann Whitney U Test

2011-10-06 Thread Sam Stewart
Hello List,

I'm trying to prepare some lecture notes on non parametric methods,
and I can't manually reproduce the results of the wilcox.test function
for ordinal data.

The data I'm using are from David Howell's website, available here

http://www.uvm.edu/~dhowell/StatPages/More_Stuff/OrdinalChisq/OrdinalChiSq.html

If I run the wilcox.test function on the data I get a p-value of
.0407, but when I do it myself I get a p-value of 0.0530.  It's not so
much the jump across 0.05, but the fact that I thought I knew what the
function was doing.

I know from the R help page that there is some controversy about how
exactly to calculate the test statistic, but that's not what is
causing the problem, as I can get the same W value.  Am I calculating
the test statistic incorrectly?

Thanks, sample code below
Sam Stewart

#Ordinal example
dropouts = c(rep(0,25),rep(3,10),rep(2,9),rep(1,13),rep(4,6))
remain = c(rep(0,31),rep(3,2),rep(2,6),rep(1,21),rep(4,3))
tab2 = rbind(table(dropouts),table(remain))
ordTest = wilcox.test(x=dropouts,y=remain,correct=FALSE,exact=FALSE)
cumsum(colSums(tab2))
W = 
max(c(sum(rank(cbind(dropouts,remain))[1:length(dropouts)]),sum(rank(cbind(dropouts,remain))[-(1:length(dropouts))])))
n1 = length(dropouts)
n2 = length(remain)
testStat = (S-n1*(n1+n2+1)/2)/(sqrt(n1*n2*(n1+n2+1)/12))
2*(1-pnorm(testStat))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with .C

2011-10-06 Thread Steve Lianoglou
Hi,

2011/10/6 Uwe Ligges lig...@statistik.tu-dortmund.de:
 On 06.10.2011 14:51, Jan van der Laan wrote:

 An obvious reason might be that your second argument should be a pointer
 to int.

 As others have mentioned, you might want to have a look at Rccp and/or
 inline. The documentation is good and I find it much easier to work with.

 For example, your example could be written as:

 library(Rcpp)
 library(inline)

 test - cxxfunction(signature(x = numeric ) , '
 Rcpp::NumericVector v(x);
 Rcpp::NumericVector result(v.length());
 for (int i = 0; i  v.length(); ++i) {
 result[i] = v[i] + i;
 }
 return(result);
 ', plugin = Rcpp )



 Oh, come on, this is now really too much of overkill.

I don't agree that it's overkill -- you get to sidestep the whole `R
CMD SHLIB ...` and `dyn.load` dance this way while you experiment with
C(++) code 'live using the inline package.

It's really handy.

 Just make the original source


 void test(double *b, int *l)
 {
     int i;
     for(i=0; i  *l ; i++) b[i] += i;
 }


 which you would have know after reading the Wriiting R Extensions manual.

I agree that this step is unavoidable no matter which avenue (Rcpp or
otherwise) one decides to take.

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to install a locally built package

2011-10-06 Thread Bond, Stephen
Uwe,

Are u saying 
1) I can add the new func in one of the existing .R files in the R dir?? 
Or 
2) add a new .R file in the same dir and ignore the lack of a matching .Rd 
file? 

Thank you.
Stephen Bond | Senior Analyst | Treasury Analytics | 416-956-3092

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] 
Sent: Thursday, October 06, 2011 9:12 AM
To: Bond, Stephen
Cc: r-help@r-project.org
Subject: Re: [R] how to install a locally built package



On 06.10.2011 15:10, Bond, Stephen wrote:
 Is there a way to put all R code in a single file?

 I have too many small files now, and it is inconvenient to edit (I still have 
 to put everything in one buffer) and when I add just one new func I have to 
 go through the process of manually editing all help files one by one. When I 
 put all the code in one file only the first func is loaded.

He? You really do not need to edit anything when you add a function.
Just don't use package skeleton but just add the function.

Uwe Ligges




 Thanks everybody

 Stephen B

 -Original Message-
 From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
 Sent: Friday, September 16, 2011 3:43 AM
 To: Bond, Stephen
 Cc: r-help@r-project.org
 Subject: Re: [R] how to install a locally built package



 On 15.09.2011 20:38, Bond, Stephen wrote:
 I got it working by typing a string in getdfv.Rd after \title{

 \title{ getdf
 %%  ~~function to do ... ~~
 }

 Strange why the skeleton would not do that given it did \name{getdfv} 
 \alias{getdfv}

 One reason is that we try to force people to deal with the documentation 
 which we feel is very important, even if the stuff is just used by yourself.

 Best,
 Uwe Ligges



 Anyway, I'm happy now.

 Stephen Bond

 -Original Message-
 From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
 Sent: Thursday, September 15, 2011 12:15 PM
 To: Bond, Stephen
 Cc: r-help@r-project.org
 Subject: Re: [R] how to install a locally built package



 On 15.09.2011 17:47, Bond, Stephen wrote:
 Uwe,

 That gave me the same error like CMD install
 install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL,
 type=source)

 install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL,
 type=source) Loading required package: stats Loading required
 package: utils Loading required package: graphics Loading required
 package: splines
 * installing *source* package 'mypack' ...
 ** R
 ** preparing package for lazy loading
 ** help
 Warning:
 C:/DOCUME~1/BondStep/LOCALS~1/Temp/Rtmpw5N7dm/R.INSTALL6ccc14e6/mypa
 c
 k/man/mypack-package.Rd:33: All text must be in a section
 *** installing help indices
 Error in Rd_info(db[[i]]) : Rd files must have a non-empty \title.
 See chapter 'Writing R documentation' in manual 'Writing R Extensions'.
 * removing 'c:/PROGRA~1/R/R-212~1.1/library/mypack'
 * restoring previous 'c:/PROGRA~1/R/R-212~1.1/library/mypack'
 Warning message:
 In install.packages(C:/Temp/mypack_1.0.tar.gz, repos = NULL, type = 
 source) :
  installation of package 'C:/Temp/mypack_1.0.tar.gz' had 
 non-zero exit status

 is there a way to skip the Rd part? This is for private use only and there 
 is no help or data files.


 Then you have to delete the Rd file that has been generated by 
 package.skeleton.
 Please read the manuals Writing R Extensions and R Installation 
 and Administration.

 Best,
 Uwe Ligges



 Thank you.

 Stephen Bond

 -Original Message-
 From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
 Sent: Thursday, September 15, 2011 11:43 AM
 To: Bond, Stephen
 Cc: r-help@r-project.org
 Subject: Re: [R] how to install a locally built package



 On 15.09.2011 17:32, Bond, Stephen wrote:
 Hello useRs,

 I am trying to put my funcs in a package to avoid clutter in my workspace 
 as the funcs are now in .Rprofile.
 All plain R code no compilations. Using win XP with a full cygwin 
 install and R2.12 I first did

 package.skeleton(mypack,list=getdfv, namespace=T) # just a 
 single func

 which created a folder with the required stuff in it.
 Calling R CMD build creates a tar.gz file with warnings about DOS paths.

 Trying to install from the R GUI complains
 install.packages(mypack,repos=C:/Temp/mypack_1.0.tar.gz)
 Warning: unable to access index for repository
 C:/Temp/mypack_1.0.tar.gz/bin/windows/contrib/2.12
 Warning message:
 In getDependencies(pkgs, dependencies, available, lib) :
   package 'mypack' is not available


 install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL,
 type=source)

 seems to be what you want.

 Uwe Ligges


 putting the mypack folder in zip and then trying to install from 
 the zip produces no error message but then

 library(mypack)
 Error in library(mypack) : 'mypack' is not a valid installed 
 package

 Just putting the folder in the library folder of R did not work 
 either

 Finally
 R CMD Install complains about missing title in the Rd file. I do not have 
 a Rd file as the skeleton did not create it.

 Please, advise how to make simple R 

Re: [R] how to install a locally built package

2011-10-06 Thread Uwe Ligges



On 06.10.2011 15:45, Bond, Stephen wrote:

Uwe,

Are u saying
1) I can add the new func in one of the existing .R files in the R dir??


Yes, sure, or add an additional file.


Or
2) add a new .R file in the same dir and ignore the lack of a matching .Rd file?


Yes. You can also generate a single Rd file using prompt() on the function.


Uwe Ligges




Thank you.
Stephen Bond | Senior Analyst | Treasury Analytics | 416-956-3092

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Thursday, October 06, 2011 9:12 AM
To: Bond, Stephen
Cc: r-help@r-project.org
Subject: Re: [R] how to install a locally built package



On 06.10.2011 15:10, Bond, Stephen wrote:

Is there a way to put all R code in a single file?

I have too many small files now, and it is inconvenient to edit (I still have 
to put everything in one buffer) and when I add just one new func I have to go 
through the process of manually editing all help files one by one. When I put 
all the code in one file only the first func is loaded.


He? You really do not need to edit anything when you add a function.
Just don't use package skeleton but just add the function.

Uwe Ligges





Thanks everybody

Stephen B

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Friday, September 16, 2011 3:43 AM
To: Bond, Stephen
Cc: r-help@r-project.org
Subject: Re: [R] how to install a locally built package



On 15.09.2011 20:38, Bond, Stephen wrote:

I got it working by typing a string in getdfv.Rd after \title{

\title{ getdf
%%  ~~function to do ... ~~
}

Strange why the skeleton would not do that given it did \name{getdfv}
\alias{getdfv}


One reason is that we try to force people to deal with the documentation which 
we feel is very important, even if the stuff is just used by yourself.

Best,
Uwe Ligges




Anyway, I'm happy now.

Stephen Bond

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Thursday, September 15, 2011 12:15 PM
To: Bond, Stephen
Cc: r-help@r-project.org
Subject: Re: [R] how to install a locally built package



On 15.09.2011 17:47, Bond, Stephen wrote:

Uwe,

That gave me the same error like CMD install

install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL,
type=source)


install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL,
type=source) Loading required package: stats Loading required
package: utils Loading required package: graphics Loading required
package: splines
* installing *source* package 'mypack' ...
** R
** preparing package for lazy loading
** help
Warning:
C:/DOCUME~1/BondStep/LOCALS~1/Temp/Rtmpw5N7dm/R.INSTALL6ccc14e6/mypa
c
k/man/mypack-package.Rd:33: All text must be in a section
*** installing help indices
Error in Rd_info(db[[i]]) : Rd files must have a non-empty \title.
See chapter 'Writing R documentation' in manual 'Writing R Extensions'.
* removing 'c:/PROGRA~1/R/R-212~1.1/library/mypack'
* restoring previous 'c:/PROGRA~1/R/R-212~1.1/library/mypack'
Warning message:
In install.packages(C:/Temp/mypack_1.0.tar.gz, repos = NULL, type = source) 
:
  installation of package 'C:/Temp/mypack_1.0.tar.gz' had
non-zero exit status

is there a way to skip the Rd part? This is for private use only and there is 
no help or data files.



Then you have to delete the Rd file that has been generated by
package.skeleton.
Please read the manuals Writing R Extensions and R Installation
and Administration.

Best,
Uwe Ligges




Thank you.

Stephen Bond

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Thursday, September 15, 2011 11:43 AM
To: Bond, Stephen
Cc: r-help@r-project.org
Subject: Re: [R] how to install a locally built package



On 15.09.2011 17:32, Bond, Stephen wrote:

Hello useRs,

I am trying to put my funcs in a package to avoid clutter in my workspace as 
the funcs are now in .Rprofile.
All plain R code no compilations. Using win XP with a full cygwin
install and R2.12 I first did

package.skeleton(mypack,list=getdfv, namespace=T) # just a
single func

which created a folder with the required stuff in it.
Calling R CMD build creates a tar.gz file with warnings about DOS paths.

Trying to install from the R GUI complains

install.packages(mypack,repos=C:/Temp/mypack_1.0.tar.gz)

Warning: unable to access index for repository
C:/Temp/mypack_1.0.tar.gz/bin/windows/contrib/2.12
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
   package 'mypack' is not available



install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL,
type=source)

seems to be what you want.

Uwe Ligges



putting the mypack folder in zip and then trying to install from
the zip produces no error message but then


library(mypack)

Error in library(mypack) : 'mypack' is not a valid installed
package

Just putting the folder in the library folder of R did not work
either

Finally
R CMD Install complains about missing title in the Rd file. I do not have a Rd 

Re: [R] Problem with .C

2011-10-06 Thread Uwe Ligges



On 06.10.2011 15:41, Steve Lianoglou wrote:

Hi,

2011/10/6 Uwe Liggeslig...@statistik.tu-dortmund.de:

On 06.10.2011 14:51, Jan van der Laan wrote:


An obvious reason might be that your second argument should be a pointer
to int.

As others have mentioned, you might want to have a look at Rccp and/or
inline. The documentation is good and I find it much easier to work with.

For example, your example could be written as:

library(Rcpp)
library(inline)

test- cxxfunction(signature(x = numeric ) , '
Rcpp::NumericVector v(x);
Rcpp::NumericVector result(v.length());
for (int i = 0; i  v.length(); ++i) {
result[i] = v[i] + i;
}
return(result);
', plugin = Rcpp )




Oh, come on, this is now really too much of overkill.


I don't agree that it's overkill -- you get to sidestep the whole `R
CMD SHLIB ...` and `dyn.load` dance this way while you experiment with
C(++) code 'live using the inline package.



You need two additional packages now where you have to rely on the fact 
those are available. Moreover, you have to get used to that syntax, and 
part of it seems to be C++ now? At least I do not know why the above 
should work at all, while I know the simple C function does.


Uwe



It's really handy.


Just make the original source


void test(double *b, int *l)
{
 int i;
 for(i=0; i  *l ; i++) b[i] += i;
}


which you would have know after reading the Wriiting R Extensions manual.


I agree that this step is unavoidable no matter which avenue (Rcpp or
otherwise) one decides to take.

-steve



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Wilcox Test / Mann Whitney U Test

2011-10-06 Thread Sam Stewart
So I checked it with the wilcox_test in the coin library, and got the
same result.  That makes me more confident that I made a mistake, but
still doesn't help me find it

d = 
data.frame(value=c(dropouts,remain),group=c(rep(dropout,length(dropouts)),rep(remain,length(remain
wilcox_test(value~group,data=d)

Sam

On Thu, Oct 6, 2011 at 10:35 AM, Sam Stewart rhelp.st...@gmail.com wrote:
 Hello List,

 I'm trying to prepare some lecture notes on non parametric methods,
 and I can't manually reproduce the results of the wilcox.test function
 for ordinal data.

 The data I'm using are from David Howell's website, available here

 http://www.uvm.edu/~dhowell/StatPages/More_Stuff/OrdinalChisq/OrdinalChiSq.html

 If I run the wilcox.test function on the data I get a p-value of
 .0407, but when I do it myself I get a p-value of 0.0530.  It's not so
 much the jump across 0.05, but the fact that I thought I knew what the
 function was doing.

 I know from the R help page that there is some controversy about how
 exactly to calculate the test statistic, but that's not what is
 causing the problem, as I can get the same W value.  Am I calculating
 the test statistic incorrectly?

 Thanks, sample code below
 Sam Stewart

 #Ordinal example
 dropouts = c(rep(0,25),rep(3,10),rep(2,9),rep(1,13),rep(4,6))
 remain = c(rep(0,31),rep(3,2),rep(2,6),rep(1,21),rep(4,3))
 tab2 = rbind(table(dropouts),table(remain))
 ordTest = wilcox.test(x=dropouts,y=remain,correct=FALSE,exact=FALSE)
 cumsum(colSums(tab2))
 W = 
 max(c(sum(rank(cbind(dropouts,remain))[1:length(dropouts)]),sum(rank(cbind(dropouts,remain))[-(1:length(dropouts))])))
 n1 = length(dropouts)
 n2 = length(remain)
 testStat = (S-n1*(n1+n2+1)/2)/(sqrt(n1*n2*(n1+n2+1)/12))
 2*(1-pnorm(testStat))


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: break.axis all range of data

2011-10-06 Thread Heverkuhn Heverkuhn
But in this way for moving also the point have to add 2 at all the x point
after 8.

axis(1,at=c(1:8,12:39),labels=1:36)
plot(1:36,1:36,xaxt=n)
I increased the gap, and if I don increase the x for the points at X 9, 10
they will appear in the gap.

HC



On Thu, Oct 6, 2011 at 8:20 AM, Heverkuhn Heverkuhn heverk...@gmail.comwrote:

 Thank you Jim
 It seems exactly what I was looking for :)

 Claudio


 On Thu, Oct 6, 2011 at 5:11 AM, Jim Lemon j...@bitwrit.com.au wrote:

 On 10/06/2011 12:31 AM, Heverkuhn Heverkuhn wrote:

 ..all the point from 8 to 13.

 On Wed, Oct 5, 2011 at 8:28 AM, Heverkuhn Heverkuhnheverk...@gmail.com
 **wrote:

  The problem with that function is that it does not really separate the
 2parts of the graph but it inserts , when style is gap, a blank strip
 that
 cover axis and points. So for example if a insert it at 8 and I set the
 gap
 of length 5 , it would cancel al the point from 8 to 10.

  ...

  I would like to increase the distance between x tick-marks 8 and 9, 
 and not connect the points  x=8 and x=9.

 Ah, I think I see what you want. You want an axis like this:

 axis(1,at=c(1:8,10:37),labels=**1:36)

 and to get your points right, you would have to do something like:

 plot(c(1:8,10:37),1:36,xaxt=**n)
 lines(1:8,1:8)
 lines(10:37,9:36)

 first. This is more or less the reverse of the gap.* functions in the
 plotrix package.

 Jim




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] unique possible bug

2011-10-06 Thread Uwe Ligges



On 05.10.2011 22:15, Patrick McCann wrote:

Hi,

I am trying to read in a rather large list of transactions using the
arules library.


You mean the arules package?



It seems in the coerce method into the dgCmatrix, it
somewhere calls unique. Unique.c throws an error when  n  536870912;
however, when 4*n was modified to 2*n in 2004, the overflow protection
should have changed from 2^29 to 2^30, right? If so, how would I
change it in my copy? Do I have to recompile everything?


Yes.

There is also the way to ask the maintainer to improve it, but it won't 
work without reinstallation of the changed package sources.


Uwe Ligges




Thanks,
Patrick McCann


Here is a simple to reproduce example:

runif(2^29+5)-a
sum(unique(a))-b

Error in unique.default(a) : length 536870917 is too large for hashing

traceback()

3: unique.default(a)
2: unique(a)
1: unique(a)

unique.default

function (x, incomparables = FALSE, fromLast = FALSE, ...)
{
 z- .Internal(unique(x, incomparables, fromLast))
 if (is.factor(x))
 factor(z, levels = seq_len(nlevels(x)), labels = levels(x),
 ordered = is.ordered(x))
 else if (inherits(x, POSIXct))
 structure(z, class = class(x), tzone = attr(x, tzone))
 else if (inherits(x, Date))
 structure(z, class = class(x))
 else z
}
environment: namespace:base


From http://svn.r-project.org/R/trunk/src/main/unique.c I see:



/*
  Choose M to be the smallest power of 2
  not less than 2*n and set K = log2(M).
  Need K= 1 and hence M= 2, and 2^M= 2^31 -1, hence n= 2^29.

  Dec 2004: modified from 4*n to 2*n, since in the worst case we have
  a 50% full table, and that is still rather efficient -- see
  R. Sedgewick (1998) Algorithms in C++ 3rd edition p.606.
*/
static void MKsetup(int n, HashData *d)
{
int n4 = 2 * n;

if(n  0 || n  536870912) /* protect against overflow to -ve */
error(_(length %d is too large for hashing), n);
d-M = 2;
d-K = 1;
while (d-M  n4) {
d-M *= 2;
d-K += 1;
}
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Wilcox Test / Mann Whitney U Test

2011-10-06 Thread Sam Stewart
And I figured it out, sorry to bother the list.

The normal approximation I was using is not accurate in the presence of ties.

Sam

On Thu, Oct 6, 2011 at 10:56 AM, Sam Stewart rhelp.st...@gmail.com wrote:
 So I checked it with the wilcox_test in the coin library, and got the
 same result.  That makes me more confident that I made a mistake, but
 still doesn't help me find it

 d = 
 data.frame(value=c(dropouts,remain),group=c(rep(dropout,length(dropouts)),rep(remain,length(remain
 wilcox_test(value~group,data=d)

 Sam

 On Thu, Oct 6, 2011 at 10:35 AM, Sam Stewart rhelp.st...@gmail.com wrote:
 Hello List,

 I'm trying to prepare some lecture notes on non parametric methods,
 and I can't manually reproduce the results of the wilcox.test function
 for ordinal data.

 The data I'm using are from David Howell's website, available here

 http://www.uvm.edu/~dhowell/StatPages/More_Stuff/OrdinalChisq/OrdinalChiSq.html

 If I run the wilcox.test function on the data I get a p-value of
 .0407, but when I do it myself I get a p-value of 0.0530.  It's not so
 much the jump across 0.05, but the fact that I thought I knew what the
 function was doing.

 I know from the R help page that there is some controversy about how
 exactly to calculate the test statistic, but that's not what is
 causing the problem, as I can get the same W value.  Am I calculating
 the test statistic incorrectly?

 Thanks, sample code below
 Sam Stewart

 #Ordinal example
 dropouts = c(rep(0,25),rep(3,10),rep(2,9),rep(1,13),rep(4,6))
 remain = c(rep(0,31),rep(3,2),rep(2,6),rep(1,21),rep(4,3))
 tab2 = rbind(table(dropouts),table(remain))
 ordTest = wilcox.test(x=dropouts,y=remain,correct=FALSE,exact=FALSE)
 cumsum(colSums(tab2))
 W = 
 max(c(sum(rank(cbind(dropouts,remain))[1:length(dropouts)]),sum(rank(cbind(dropouts,remain))[-(1:length(dropouts))])))
 n1 = length(dropouts)
 n2 = length(remain)
 testStat = (S-n1*(n1+n2+1)/2)/(sqrt(n1*n2*(n1+n2+1)/12))
 2*(1-pnorm(testStat))



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with wireframe graphics problem (newbie)

2011-10-06 Thread Uwe Ligges



On 06.10.2011 00:50, David Wiley wrote:

All,

I've read several tutorials re: generating wireframes, but am clearly
missing something. I have data along the lines of:


tbl [1:10,]

Visits Activity Course.Grade
1  17218.31
2   7   1120.67
3   9   1724.69
4  28   7138.72
5  43  10745.46
6  14547.77
7  25   5157.81
8  32   2759.46
9  60   4364.39
10 31   4666.26



Looks like you are after a 3D scatterplot rather than a wireframe - the 
latter only works on a regular grid of Vistits and Activity.


Uwe ligges



When I run the command:

wireframe(Course.Grade ~ Visits * Activity, data = tbl)

I get an appropriately labeled, but empty, cube. Can anyone tell me
why there's no data in my cube?

Thanks in advance,

David

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SPlus to R

2011-10-06 Thread Scott Raynaud
I did some testing and I believe the program is operating properly.  It takes 
some time to finish, especially as sample sizes get larger, but I seem to be 
able to reproduce the results from the original paper.  Right now I'm most 
interested in method 3.  I set nc=40 and d=.2 as in the paper.  The results are 
below.  The program iterates to find a solution and the more subjects (nc), the 
more iterations it needs.  The initial pass uses estimates from the randomized 
method as starting values and this is plotted in black on a graph.  The next 
pass uses the algorithm to generate sample size estimates which are plotted in 
gray on the first pass.  Each new estimate is plotted in dark blue as the 
previous one turns black.  This reults in several curves being plotted but one 
can get an idea of the relevant one by looking for the blue plot and examining 
the $ne vector below.
 
Reading the graph is more troublesome as nc increases.  It takes many more 
iterations to complete and the graph gets quite crowded.  I'm not sure what the 
best solution is.  Which would be easiest?
 
1) Plot ne against pc for the final iteration.  The problem here is pc is not 
saved in a vector.
2) Cut and paste $ne into Excel and re-create the corresponding pc's.
 
It's not clear to me how pc is incremented.  Seems like it's in the code block 
below:
 
### for method 3
if (method==3) {
if (tol1  tol2/10) tol1-tol2/10
ncstar-(1-d)*nc
pc-(0:ncstar)/nc
ne-rep(NA,ncstar + 1)
for (i in (0:ncstar))
{ ne[i+1]-ss.rand(i,nc,d,alpha=.05,power=.8,tol=.01)
}
plot(pc,ne,type='l',ylim=c(0,max(ne)*1.5))
ans-c.searchd(nc, d, ne, alpha, power, cc, tol1)

 
 sshc()
 [1] 0.8000 0.7889 0.7825 0.7786 0.7763 0.7751 0.7747 0.7750 0.7759 0.7772
[11] 0.7790 0.7815 0.7852 0.7899 0.7943 0.7968 0.7979 0.7989 0.8002 0.8021
[21] 0.8045 0.8072 0.8100 0.8123 0.8137 0.8137 0.8124 0.8105 0.8095 0.8103
[31] 0.8134 0.8180 0.8244
 [1]  11.64  19.62  27.50  36.44  46.54  58.19  71.81  87.93 107.30 130.38
[11] 158.47 192.34 235.47 291.93 370.48 471.26 569.17 626.36 626.18 588.94
[21] 529.27 459.67 389.30 323.70 263.50 210.58 164.26 124.25  90.49  63.23
[31]  42.13  26.23   8.13
[1]  old.abs.dev= 0.434571909427754
[1]  abs.dev= 0.259398042835904
 [1] 0.8000 0.8031 0.8063 0.8094 0.8120 0.8140 0.8152 0.8156 0.8152 0.8141
[11] 0.8122 0.8097 0.8073 0.8055 0.8045 0.8037 0.8025 0.8015 0.8006 0.7995
[21] 0.7982 0.7969 0.7956 0.7946 0.7939 0.7930 0.7916 0.7897 0.7879 0.7872
[31] 0.7876 0.7886 0.7887
 [1]  11.64  20.91  30.51  42.13  56.19  73.71  95.59 123.29 159.20 204.83
[11] 263.66 337.90 424.41 520.76 623.55 727.31 804.98 810.60 733.27 621.58
[21] 504.83 397.60 310.92 242.78 190.93 151.72 120.81  95.24  72.67  52.84
[31]  36.01  22.42   6.32
[1]  old.abs.dev= 0.259398042835904
[1]  abs.dev= 0.160699574601494
 [1] 0.8000 0.7985 0.7972 0.7958 0.7944 0.7930 0.7917 0.7907 0.7903 0.7907
[11] 0.7917 0.7929 0.7941 0.7951 0.7964 0.7977 0.7982 0.7984 0.7989 0.7995
[21] 0.8000 0.8006 0.8013 0.8023 0.8036 0.8048 0.8054 0.8055 0.8055 0.8063
[31] 0.8081 0.8103 0.8125
 [1]  11.64  20.56  29.46  39.64  51.27  64.86  80.67  99.46 122.30 149.90
[11] 184.62 230.35 286.04 352.93 429.63 508.43 588.14 635.70 616.86 561.50
[21] 489.16 408.17 333.63 269.21 215.86 173.66 139.24 110.43  84.50  61.28
[31]  41.38  25.49   7.42
[1]  old.abs.dev= 0.160699574601494
[1]  abs.dev= 0.0893104146067729
 [1] 0.8000 0.8006 0.8013 0.8021 0.8029 0.8036 0.8043 0.8048 0.8050 0.8051
[11] 0.8049 0.8045 0.8037 0.8024 0.8019 0.8019 0.8015 0.8012 0.8010 0.8009
[21] 0.8007 0.8002 0.7995 0.7990 0.7987 0.7985 0.7980 0.7970 0.7958 0.7950
[31] 0.7950 0.7950 0.7939
 [1]  11.64  20.72  29.92  40.75  53.42  68.99  87.79 111.53 141.92 179.12
[11] 226.54 290.16 365.88 456.60 556.96 640.04 722.83 757.33 711.48 624.30
[21] 522.26 423.34 335.91 264.10 206.62 162.86 128.85 101.60  77.86  56.72
[31]  38.30  23.41   6.49
[1]  old.abs.dev= 0.0893104146067729
[1]  abs.dev= 0.0506515395975703
 [1] 0.8000 0.7996 0.7994 0.7990 0.7986 0.7982 0.7978 0.7974 0.7970 0.7969
[11] 0.7970 0.7975 0.7981 0.7979 0.7982 0.7990 0.7991 0.7990 0.7990 0.7992
[21] 0.7994 0.7996 0.7996 0.7998 0.8002 0.8009 0.8015 0.8015 0.8014 0.8016
[31] 0.8025 0.8035 0.8041
 [1]  11.64  20.65  29.71  40.22  52.31  66.93  84.02 104.94 131.25 161.98
[11] 200.06 251.89 310.71 386.59 476.45 545.33 622.06 664.45 636.92 570.09
[21] 486.12 402.82 326.45 262.37 208.03 165.78 131.88 104.87  81.05  59.38
[31]  40.18  24.54   6.99
$ne
 [1]  11.637555  20.651336  29.710258  40.223468  52.314568  66.928887
 [7]  84.022673 104.936774 131.254491 161.981416 200.060785 251.890694
[13] 310.710449 386.591379 476.449601 545.327456 622.056417 664.447078
[19] 636.923065 570.091413 486.117837 402.817857 326.451672 262.370265
[25] 208.029592 165.777369 131.875077 104.873469  81.051410  59.379349
[31]  40.182140  24.544626   6.990464
$Ep
 [1] 0.741 0.7996387 0.7993605 0.7990384 0.7986411 0.7982104 0.7978227
 [8] 0.7974117 0.7970255 0.7969174 0.7970263 0.7974569 0.7980592 

Re: [R] distance coefficient for amatrix with ngative valus

2011-10-06 Thread R. Michael Weylandt
Did you read any of the comments I made regarding working examples,
meaningful question asking, or replying to the entire list?

If you look at the code, you'll see pco is just a very elementary
wrapper for cmdscale, the author of which is active on this list and
could have seen your question and replied to it with a hundredfold
more speed and knowledge than myself, ... had you replied to the
entire list.

Looking further into cmdscale, you can see that it is not designed to
return variable loadings directly (to be honest, I'm not particularly
familiar with PCO and I'm not sure that the method provides such
loadings but I'm assuming you have reason to at least think they
exist) so you'll have to calculate them directly. Read the
documentation of cmdscale to find understand what the various things
being returned are. If another function in the labsdv package seems to
calculate them, perhaps you can pilfer some code from there.

Michael

On Tue, Oct 4, 2011 at 1:58 PM, dilshan benaragama
benaraga...@yahoo.com wrote:
 Hi,
 As you mentioned I was able to run the pco  ignoring the warning massege of
 negative values. The nest problem I have is how to get the loadings for each
 variable as it will not give the summary out put or loadings as we get for
 pca.
 Thanks.
 From: R. Michael Weylandt michael.weyla...@gmail.com
 To: dilshan benaragama benaraga...@yahoo.com; r-help
 r-help@r-project.org
 Sent: Monday, October 3, 2011 11:05:19 PM
 Subject: Re: [R] distance coefficient for amatrix with ngative valus

 Comments inline:

 On Mon, Oct 3, 2011 at 11:27 PM, dilshan benaragama
 benaraga...@yahoo.com wrote:
 Yes I think you did not get my problem.

 No, you did not state your problem. I have replied to everything you
 have actually included to this point. Admittedly, I have failed to
 reply to things you did not say...

  Actualy I want run PCO with
 (labdsv). To do that I I am trying to get the distance metrix using
 following fuctions with library (vegan).

 This is now the 7th email in this chain. You should mention the
 packages and functions you are using in the FIRST email of the chain.
 This is mentioned in the posting guide which you apparently have still
 not yet read.


 pca.gower- vegdist(envt[,2:9],method=gower)
 pca.eucl-vegdist(envt[,2:9],method=euclidean)
 pca.chi-vegdist(envt[,2:9],method=chi.square)
 pca.mahal-vegdist(envt[,2:9],method=mahal)
 pca.bray-vegdist(envt,method=bray)

 However none of the functions work

 They all work for any data I put in. This is perhaps when that minimal
 working example, which you also should have included, is necessary.
 The append at the end of each of the 7 emails in this chain that tells
 you to read the posting guide also asks for this, as did I explicitly.

 (gives an error saying that is not
 working due to negatve values)

 No, they each give warnings. Warnings are not errors. They are
 warnings and they say warning. Perhaps unsurprisingly, errors say
 error. If you are using an old version of vegan that throws an
 error, you should always update before seeking help.Not surprisingly,
 a certain document suggests this.

 except euclidean distance for the raw data
 set as the raw data has negative values for some variables. It is no point
 of using euclidean metrix with PCO as we can do the same thing from PCA.
 So
 I need to find a way I can run PCO with a different dissimilarity metrix
 for this data. It will be a great help if you can help me on this

 Actually read the warning message: it warns you that you have given
 negative data to an ecological function and suggests this might be a
 point you look into as this usually suggests a user-end problem. It
 does not fail to work in any sense of the word as evidence by the
 output of distances. If  negative data is nonsense, you should heed
 this warning; if you know its not, disregard it.

 More importantly, as I said in my initial response, any distance
 metric worth its salt is translation invariant. To wit,

 x - matrix(rnorm(50),5)

 d1 = vegdist(x, method=gower)
 d2 = vegdist(x + abs(min(x))*3, method=gower)

 all.equal(as.numeric(d1), as.numeric(d2))
 TRUE

 In fairness, I'll admit this does not seem to work for the bray
 distance. I am not an ecologist and I do not know why this would be --
 it does leave me somewhat confused as to what sort of space motivates
 the bray metric, but that's a discussion for another time and place --
 but the function still returns a valid dist object for both d1 and d2.


 Thanks,
 From: R. Michael Weylandt michael.weyla...@gmail.com
 To: dilshan benaragama benaraga...@yahoo.com; r-help
 r-help@r-project.org

 You will note that I include the r-help list on each email on this
 chain while you have not; this is mentioned in the posting guide.

 Sent: Monday, October 3, 2011 10:00:53 PM
 Subject: Re: [R] distance coefficient for amatrix with ngative valus

 You still haven't explained what's wrong with *almost every metric
 there is*, but if you want other distance 

Re: [R] Dealing with proportions

2011-10-06 Thread David Winsemius


On Oct 5, 2011, at 4:08 PM, Sam wrote:


Dear list,

I have very little experience in dealing with proportions, i am sure  
this is a very simple question


Sometimes making that claim in a group of statisticians can provoke an  
extended discussion to which there will be no eventual agreement.  
Discussions about testing proportions in groups with small numbers is  
one such excellent example.


but i could find no suitable answer beyond doing a chi-sq test  and  
then using the Marascuilo procedure as a post-hoc analysis.


I am simply wanting to know if the proportions ( i.e the number of  
Yes / No) significantly differ between the cases and if so which  
cases are significantly high or low?



proportion - structure(list(Case = structure(1:11, .Label = c(A,  
B, C,
D, E, F, G, H, I, J, K), class = factor), Yes =  
c(18L,

2L, 1L, 2L, 44L, 27L, 2L, 15L, 13L, 3L, 34L), No = c(171L, 11L,
5L, 8L, 146L, 80L, 5L, 30L, 22L, 5L, 42L), Num = c(189L, 13L,
6L, 10L, 190L, 107L, 7L, 45L, 35L, 8L, 76L)), .Names = c(Case,
Yes, No, Num), class = data.frame, row.names = c(NA,
-11L))



I would probably have turned to logistic regression and used p.adjust  
to deal with the multiple comparisons issues. By default, the  
Intercept p-value estimate for a simple model with a factor variable  
as the predictor tests for the Intercept = 0 tests for of a   
proportion  to 0.50 (since that is the proportion at which the  
log(odds) = 0). It then tests for equality to that odds for all the  
other levels (the same as odds ratios being 1 or their  
coefficients=0). (The first way I set this up I ended up with your A- 
level being the reference and the hypotheses being tested didn't seem  
reasonable or interesting namely that the lowest proportion was  
equal to 50% and that all the rest of the proportions were equal to  
that lowest proportion.)


as.matrix(proportion$Yes/proportion$Num)
   [,1]
 [1,] 0.0952381
 [2,] 0.1538462
 [3,] 0.167
 [4,] 0.200
 [5,] 0.2315789
 [6,] 0.2523364
 [7,] 0.2857143
 [8,] 0.333
 [9,] 0.3714286
[10,] 0.375
[11,] 0.4473684

So I took your data and made your K level the reference value since  
its value (0.44) is the one closest to 50% and then looked at the p- 
values from the others as information about their similarity to the K  
level.


proportion$Case - relevel(proportion$Case, K)
casefit - glm(cbind(Yes, Num) ~ Case, data= proportion,  
family=binomial)

summary(casefit)
summary(casefit)$coefficients
p.adjust( summary(casefit)$coefficients[ , 4] )

That set of hypotheses gives very few significant results, but most  
of your groups are pretty sparse so that should be expected. Another  
(and I think better) approach resulting in even fewer significant  
results is to use logistic regression with an offset term equal to the  
log(odds) for the entire group:


log(  with(proportion, sum(Yes)/sum(No) ) )   # returns [1] -1.181994
This corresponds to an expected proportion of 0.2346939

casefit - glm(cbind(Yes, Num) ~ 0+ Case, data= proportion,
   offset=rep( log( with(proportion, sum(Yes)/ 
sum(No) ) ), 11),   # each level gets the same offset.

   family=binomial)

I used the 0+ on the RHS of the formula to force a labeled estimate  
for each level, since the p-values are now in reference to the overall  
proportion. Since only one level has a significant p-value and it is  
highly significant (due to the combination of higher numbers of  
events and distance from the NULL estimate) you will not see much  
change if you adjust its p-value.


cbind( round( summary(casefit)$coefficients, 3),
   Prpns =round( (proportion$Yes/proportion$Num)[c(11,1:10)], 3),
   N_Yes=proportion$Yes[c(11,1:10)] )

  Estimate Std. Error z value Pr(|z|) Prpns N_Yes
CaseK0.378  0.206   1.8300.067 0.44734
CaseA   -1.169  0.247  -4.7410.000 0.09518
CaseB   -0.690  0.760  -0.9080.364 0.154 2
CaseC   -0.610  1.080  -0.5650.572 0.167 1
CaseD   -0.427  0.775  -0.5520.581 0.200 2
CaseE   -0.281  0.167  -1.6790.093 0.23244
CaseF   -0.195  0.215  -0.9050.365 0.25227
CaseG   -0.071  0.802  -0.0880.930 0.286 2
CaseH0.083  0.298   0.2800.780 0.33315
CaseI0.192  0.325   0.5900.555 0.37113
CaseJ0.201  0.677   0.2970.766 0.375 3

Comparing the numbers and the observed proportions to the overall  
proportion (0.2346939) shows not many (only one actually) groups with  
strong evidence of being different. I think you need larger numbers.


--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Limitations of R

2011-10-06 Thread Alaios
Dear all,
I have a few binary files like 9Gb or even of 50Gb.
I would like to ask you what are the known limits of the R for the data 
processing part, I have a system with a lot of RAM (500Gb) but I am not sure 
about the internal limitations of the R. How long for example a vector can 
be? 


Could you please inform me for these internal R limitations?

I would like to thank you in advance for your help

B.R
Alex

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dealing with proportions

2011-10-06 Thread David Winsemius


On Oct 6, 2011, at 10:33 AM, David Winsemius wrote:



On Oct 5, 2011, at 4:08 PM, Sam wrote:


Dear list,

I have very little experience in dealing with proportions, i am  
sure this is a very simple question


Sometimes making that claim in a group of statisticians can provoke  
an extended discussion to which there will be no eventual agreement.  
Discussions about testing proportions in groups with small numbers  
is one such excellent example.


but i could find no suitable answer beyond doing a chi-sq test  and  
then using the Marascuilo procedure as a post-hoc analysis.


I am simply wanting to know if the proportions ( i.e the number of  
Yes / No) significantly differ between the cases and if so which  
cases are significantly high or low?



proportion - structure(list(Case = structure(1:11, .Label = c(A,  
B, C,
D, E, F, G, H, I, J, K), class = factor), Yes =  
c(18L,

2L, 1L, 2L, 44L, 27L, 2L, 15L, 13L, 3L, 34L), No = c(171L, 11L,
5L, 8L, 146L, 80L, 5L, 30L, 22L, 5L, 42L), Num = c(189L, 13L,
6L, 10L, 190L, 107L, 7L, 45L, 35L, 8L, 76L)), .Names = c(Case,
Yes, No, Num), class = data.frame, row.names = c(NA,
-11L))



I would probably have turned to logistic regression and used  
p.adjust to deal with the multiple comparisons issues. By default,  
the Intercept p-value estimate for a simple model with a factor  
variable as the predictor tests for the Intercept = 0 tests for of  
a  proportion  to 0.50 (since that is the proportion at which the  
log(odds) = 0). It then tests for equality to that odds for all the  
other levels (the same as odds ratios being 1 or their  
coefficients=0). (The first way I set this up I ended up with your A- 
level being the reference and the hypotheses being tested didn't  
seem reasonable or interesting namely that the lowest  
proportion was equal to 50% and that all the rest of the proportions  
were equal to that lowest proportion.)


as.matrix(proportion$Yes/proportion$Num)
  [,1]
[1,] 0.0952381
[2,] 0.1538462
[3,] 0.167
[4,] 0.200
[5,] 0.2315789
[6,] 0.2523364
[7,] 0.2857143
[8,] 0.333
[9,] 0.3714286
[10,] 0.375
[11,] 0.4473684

So I took your data and made your K level the reference value  
since its value (0.44) is the one closest to 50% and then looked at  
the p-values from the others as information about their similarity  
to the K level.


proportion$Case - relevel(proportion$Case, K)
casefit - glm(cbind(Yes, Num) ~ Case, data= proportion,  
family=binomial)


Should have been :

casefit - glm(cbind(Yes, No) ~ Case, data= proportion,  
family=binomial)



summary(casefit)
summary(casefit)$coefficients
p.adjust( summary(casefit)$coefficients[ , 4] )

That set of hypotheses gives very few significant results, but  
most of your groups are pretty sparse so that should be expected.  
Another (and I think better) approach resulting in even fewer  
significant results is to use logistic regression with an offset  
term equal to the log(odds) for the entire group:


log(  with(proportion, sum(Yes)/sum(No) ) )   # returns [1] -1.181994
This corresponds to an expected proportion of 0.2346939

casefit - glm(cbind(Yes, Num) ~ 0+ Case, data= proportion,
  offset=rep( log( with(proportion, sum(Yes)/ 
sum(No) ) ), 11),   # each level gets the same offset.

  family=binomial)


This should have been:

casefit - glm(cbind(Yes, No) ~ 0+ Case, data= proportion,  
offset=rep( log( with(proportion, sum(Yes)/sum(No) ) ), 11),  
family=binomial)




I used the 0+ on the RHS of the formula to force a labeled estimate  
for each level, since the p-values are now in reference to the  
overall proportion. Since only one level has a significant p-value  
and it is highly significant (due to the combination of higher  
numbers of events and distance from the NULL estimate) you will  
not see much change if you adjust its p-value.




New version of table :

  cbind(round( summary(casefit)$coefficients, 3), Prpns  
=round( (proportion$Yes/proportion$Num)[c(11,1:10)], 3),  
N_Yes=proportion$Yes[c(11,1:10)] )

  Estimate Std. Error z value Pr(|z|) Prpns N_Yes
CaseK0.971  0.231   4.2080.000 0.44734
CaseA   -1.069  0.248  -4.3150.000 0.09518
CaseB   -0.523  0.769  -0.6800.496 0.154 2
CaseC   -0.427  1.095  -0.3900.696 0.167 1
CaseD   -0.204  0.791  -0.2580.796 0.200 2
CaseE   -0.017  0.172  -0.1010.919 0.23244
CaseF0.096  0.223   0.4300.667 0.25227
CaseG0.266  0.837   0.3180.751 0.286 2
CaseH0.489  0.316   1.5460.122 0.33315
CaseI0.656  0.350   1.8750.061 0.37113
CaseJ0.671  0.730   0.9190.358 0.375 3


Comparing the numbers and the observed proportions to the overall  
proportion (0.2346939) shows not many (only one actually) groups  
with strong evidence of being different. I think you need larger  
numbers.




Now have two groups with significant differences.
 p.adjust( 

Re: [R] unique possible bug

2011-10-06 Thread Uwe Ligges

I see.  For now: Yes, you need to change and recompile.
I will take a look what was actually changed and will run some test cases.

Best,
Uwe


On 06.10.2011 16:50, Patrick McCann wrote:

The error I am referring to is in unique.c in Base R, it cannot
accomodate greater than 2^29 values, even though it appears the
overflow protection should be 2^30. The only relevance of the arules
package is I was using it while I discovered this issue.

Thanks,
Patrick



2011/10/6 Uwe Liggeslig...@statistik.tu-dortmund.de:



On 05.10.2011 22:15, Patrick McCann wrote:


Hi,

I am trying to read in a rather large list of transactions using the
arules library.


You mean the arules package?



It seems in the coerce method into the dgCmatrix, it
somewhere calls unique. Unique.c throws an error when  n536870912;
however, when 4*n was modified to 2*n in 2004, the overflow protection
should have changed from 2^29 to 2^30, right? If so, how would I
change it in my copy? Do I have to recompile everything?


Yes.

There is also the way to ask the maintainer to improve it, but it won't work
without reinstallation of the changed package sources.

Uwe Ligges




Thanks,
Patrick McCann


Here is a simple to reproduce example:


runif(2^29+5)-a
sum(unique(a))-b


Error in unique.default(a) : length 536870917 is too large for hashing


traceback()


3: unique.default(a)
2: unique(a)
1: unique(a)


unique.default


function (x, incomparables = FALSE, fromLast = FALSE, ...)
{
 z- .Internal(unique(x, incomparables, fromLast))
 if (is.factor(x))
 factor(z, levels = seq_len(nlevels(x)), labels = levels(x),
 ordered = is.ordered(x))
 else if (inherits(x, POSIXct))
 structure(z, class = class(x), tzone = attr(x, tzone))
 else if (inherits(x, Date))
 structure(z, class = class(x))
 else z
}
environment: namespace:base


 From http://svn.r-project.org/R/trunk/src/main/unique.c I see:



/*
  Choose M to be the smallest power of 2
  not less than 2*n and set K = log2(M).
  Need K= 1 and hence M= 2, and 2^M= 2^31 -1, hence n= 2^29.

  Dec 2004: modified from 4*n to 2*n, since in the worst case we have
  a 50% full table, and that is still rather efficient -- see
  R. Sedgewick (1998) Algorithms in C++ 3rd edition p.606.
*/
static void MKsetup(int n, HashData *d)
{
int n4 = 2 * n;

if(n0 || n536870912) /* protect against overflow to -ve */
error(_(length %d is too large for hashing), n);
d-M = 2;
d-K = 1;
while (d-Mn4) {
d-M *= 2;
d-K += 1;
}
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] KS test and theoretical distribution

2011-10-06 Thread francogrex
 x - runif(100)
 y - runif(100)
 ks.test(x,y)

   Two-sample Kolmogorov-Smirnov test

data:  x and y 
D = 0.11, p-value = 0.5806
alternative hypothesis: two-sided 

ok I expected that, but:

 ks.test(runif(100), runif)

One-sample Kolmogorov-Smirnov test

data:  runif(100) 
D = 0.9106, p-value  2.2e-16
alternative hypothesis: two-sided 

How come? 

--
View this message in context: 
http://r.789695.n4.nabble.com/KS-test-and-theoretical-distribution-tp3878640p3878640.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] KS test and theoretical distribution

2011-10-06 Thread Achim Zeileis

On Thu, 6 Oct 2011, francogrex wrote:


x - runif(100)
y - runif(100)
ks.test(x,y)


  Two-sample Kolmogorov-Smirnov test

data:  x and y
D = 0.11, p-value = 0.5806
alternative hypothesis: two-sided

ok I expected that, but:


ks.test(runif(100), runif)


   One-sample Kolmogorov-Smirnov test

data:  runif(100)
D = 0.9106, p-value  2.2e-16
alternative hypothesis: two-sided

How come?


Because you didn't read the docs, or didn't follow them, or mistyped. The 
docs say


   y: either a numeric vector of data values, or a character string
  naming a cumulative distribution function or an actual
  cumulative distribution function such as 'pnorm'.  Only
  continuous CDFs are valid.

So you would want

ks.test(runif(100), punif)

Best,
Z


--
View this message in context: 
http://r.789695.n4.nabble.com/KS-test-and-theoretical-distribution-tp3878640p3878640.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SPlus to R

2011-10-06 Thread Scott Raynaud
I'm re-posting this since it did not appear at the end of the thread.  Sorry 
for the inconvenience.  Not sure why it's giving the message: An embedded and 
charset-unspecified text was scrubbed...  As far as I know my replies are set 
up as text and not html.
 I did some testing and I believe the program is operating properly.  It takes 
some time to finish, especially as sample sizes get larger, but I seem to be 
able to reproduce the results from the original paper.  Right now I'm most 
interested in method 3.  I set nc=40 and d=.2 as in the paper.  The results are 
below.  The program iterates to find a solution and the more subjects (nc), the 
more iterations it needs.  The initial pass uses estimates from the randomized 
method as starting values and this is plotted in black on a graph.  The next 
pass uses the algorithm to generate sample size estimates which are plotted in 
gray on the first pass.  Each new estimate is plotted in dark blue as the 
previous one turns black.  This reults in several curves being plotted but one 
can get an idea of the relevant one by looking for the blue plot and examining 
the $ne vector below.
 
Reading the graph is more troublesome as nc increases.  It takes many more 
iterations to complete and the graph gets quite crowded.  I'm not sure what the 
best solution is.  Which would be easiest?
 
1) Plot ne against pc for the final iteration.  The problem here is pc is not 
saved in a vector.
2) Cut and paste $ne into Excel and re-create the corresponding pc's.
 
It's not clear to me how pc is incremented.  Seems like it's in the code block 
below:
 
### for method 3
if (method==3) {
if (tol1  tol2/10) tol1-tol2/10
ncstar-(1-d)*nc
pc-(0:ncstar)/nc
ne-rep(NA,ncstar + 1)
for (i in (0:ncstar))
{ ne[i+1]-ss.rand(i,nc,d,alpha=.05,power=.8,tol=.01)
}
plot(pc,ne,type='l',ylim=c(0,max(ne)*1.5))
ans-c.searchd(nc, d, ne, alpha, power, cc, tol1)

 
 sshc()
 [1] 0.8000 0.7889 0.7825 0.7786 0.7763 0.7751 0.7747 0.7750 0.7759 0.7772
[11] 0.7790 0.7815 0.7852 0.7899 0.7943 0.7968 0.7979 0.7989 0.8002 0.8021
[21] 0.8045 0.8072 0.8100 0.8123 0.8137 0.8137 0.8124 0.8105 0.8095 0.8103
[31] 0.8134 0.8180 0.8244
 [1]  11.64  19.62  27.50  36.44  46.54  58.19  71.81  87.93 107.30 130.38
[11] 158.47 192.34 235.47 291.93 370.48 471.26 569.17 626.36 626.18 588.94
[21] 529.27 459.67 389.30 323.70 263.50 210.58 164.26 124.25  90.49  63.23
[31]  42.13  26.23   8.13
[1]  old.abs.dev= 0.434571909427754
[1]  abs.dev= 0.259398042835904
 [1] 0.8000 0.8031 0.8063 0.8094 0.8120 0.8140 0.8152 0.8156 0.8152 0.8141
[11] 0.8122 0.8097 0.8073 0.8055 0.8045 0.8037 0.8025 0.8015 0.8006 0.7995
[21] 0.7982 0.7969 0.7956 0.7946 0.7939 0.7930 0.7916 0.7897 0.7879 0.7872
[31] 0.7876 0.7886 0.7887
 [1]  11.64  20.91  30.51  42.13  56.19  73.71  95.59 123.29 159.20 204.83
[11] 263.66 337.90 424.41 520.76 623.55 727.31 804.98 810.60 733.27 621.58
[21] 504.83 397.60 310.92 242.78 190.93 151.72 120.81  95.24  72.67  52.84
[31]  36.01  22.42   6.32
[1]  old.abs.dev= 0.259398042835904
[1]  abs.dev= 0.160699574601494
 [1] 0.8000 0.7985 0.7972 0.7958 0.7944 0.7930 0.7917 0.7907 0.7903 0.7907
[11] 0.7917 0.7929 0.7941 0.7951 0.7964 0.7977 0.7982 0.7984 0.7989 0.7995
[21] 0.8000 0.8006 0.8013 0.8023 0.8036 0.8048 0.8054 0.8055 0.8055 0.8063
[31] 0.8081 0.8103 0.8125
 [1]  11.64  20.56  29.46  39.64  51.27  64.86  80.67  99.46 122.30 149.90
[11] 184.62 230.35 286.04 352.93 429.63 508.43 588.14 635.70 616.86 561.50
[21] 489.16 408.17 333.63 269.21 215.86 173.66 139.24 110.43  84.50  61.28
[31]  41.38  25.49   7.42
[1]  old.abs.dev= 0.160699574601494
[1]  abs.dev= 0.0893104146067729
 [1] 0.8000 0.8006 0.8013 0.8021 0.8029 0.8036 0.8043 0.8048 0.8050 0.8051
[11] 0.8049 0.8045 0.8037 0.8024 0.8019 0.8019 0.8015 0.8012 0.8010 0.8009
[21] 0.8007 0.8002 0.7995 0.7990 0.7987 0.7985 0.7980 0.7970 0.7958 0.7950
[31] 0.7950 0.7950 0.7939
 [1]  11.64  20.72  29.92  40.75  53.42  68.99  87.79 111.53 141.92 179.12
[11] 226.54 290.16 365.88 456.60 556.96 640.04 722.83 757.33 711.48 624.30
[21] 522.26 423.34 335.91 264.10 206.62 162.86 128.85 101.60  77.86  56.72
[31]  38.30  23.41   6.49
[1]  old.abs.dev= 0.0893104146067729
[1]  abs.dev= 0.0506515395975703
 [1] 0.8000 0.7996 0.7994 0.7990 0.7986 0.7982 0.7978 0.7974 0.7970 0.7969
[11] 0.7970 0.7975 0.7981 0.7979 0.7982 0.7990 0.7991 0.7990 0.7990 0.7992
[21] 0.7994 0.7996 0.7996 0.7998 0.8002 0.8009 0.8015 0.8015 0.8014 0.8016
[31] 0.8025 0.8035 0.8041
 [1]  11.64  20.65  29.71  40.22  52.31  66.93  84.02 104.94 131.25 161.98
[11] 200.06 251.89 310.71 386.59 476.45 545.33 622.06 664.45 636.92 570.09
[21] 486.12 402.82 326.45 262.37 208.03 165.78 131.88 104.87  81.05  59.38
[31]  40.18  24.54   6.99
$ne
 [1]  11.637555  20.651336  29.710258  40.223468  52.314568  66.928887
 [7]  84.022673 104.936774 131.254491 161.981416 200.060785 251.890694
[13] 310.710449 386.591379 476.449601 545.327456 622.056417 664.447078
[19] 636.923065 570.091413 486.117837 402.817857 326.451672 

Re: [R] KS test and theoretical distribution

2011-10-06 Thread Prof Brian Ripley

On Thu, 6 Oct 2011, francogrex wrote:


x - runif(100)
y - runif(100)
ks.test(x,y)


  Two-sample Kolmogorov-Smirnov test

data:  x and y
D = 0.11, p-value = 0.5806
alternative hypothesis: two-sided

ok I expected that, but:


ks.test(runif(100), runif)


   One-sample Kolmogorov-Smirnov test

data:  runif(100)
D = 0.9106, p-value  2.2e-16
alternative hypothesis: two-sided

How come?


Pilot error.  The help says

   y: either a numeric vector of data values, or a character string
  naming a cumulative distribution function or an actual
  cumulative distribution function such as ‘pnorm’.  Only
  continuous CDFs are valid.

runif is not 'an actual cumulative distribution function' 




--
View this message in context: 
http://r.789695.n4.nabble.com/KS-test-and-theoretical-distribution-tp3878640p3878640.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with .C

2011-10-06 Thread Jan van der Laan

Quoting Uwe Ligges lig...@statistik.tu-dortmund.de:




I don't agree that it's overkill -- you get to sidestep the whole `R
CMD SHLIB ...` and `dyn.load` dance this way while you experiment with
C(++) code 'live using the inline package.



You need two additional packages now where you have to rely on the fact
those are available. Moreover, you have to get used to that syntax, and
part of it seems to be C++ now? At least I do not know why the above
should work at all, while I know the simple C function does.


OK, I agree that switching to Rcpp/C++ might be a bit of overkill in
this example although in a lot of other example I find the Rcpp syntax
much more readable than the c-code when dealing with .Call .

The example could also have been writen in C using inline removing the
need of Rcpp and looking more like the original example:

library(inline)

test - cfunction(signature(b = numeric, l = integer) , '
 for(int i=0; i  *l; i++) b[i] += i;
 ', convention=.C)

I find that the advantage of using inline (especially in case of
simple functions like this) is that
1. I no long need to compile and load the shared library manually,
which can sometimes be frustrating when windows locks the dll.
2. Inline performs typechecking and casts variables to the right type.  
You can now type test(1:10,10) without needing as.numeric or  
as.integer. Reducing the amount of r code and the probabiliry of  
screwing things up by passing the wrong type.



Jan




Uwe



It's really handy.


Just make the original source


void test(double *b, int *l)
{
int i;
for(i=0; i  *l ; i++) b[i] += i;
}


which you would have know after reading the Wriiting R Extensions manual.


I agree that this step is unavoidable no matter which avenue (Rcpp or
otherwise) one decides to take.

-steve



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Limitations of R

2011-10-06 Thread Uwe Ligges



On 06.10.2011 16:49, Alaios wrote:

Dear all,
I have a few binary files like 9Gb or even of 50Gb.
I would like to ask you what are the known limits of the R for the data 
processing part, I have a system with a lot of RAM (500Gb)


Really accessible from one core? Amazing.


but I am not sure about the internal limitations of the R. How long for 
example a vector can be?


See
?Memory-limits

Uwe Ligges






Could you please inform me for these internal R limitations?

I would like to thank you in advance for your help

B.R
Alex

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] converting commas to points

2011-10-06 Thread Anna Lee
Hello everyone!

I work with a german excell version which uses commas instead of
points for seperating decimal places. R work with points so in order
to be able to save my excell tables without changing the commas to
points, whenever I load a table I type in: read.table(..., dec = ,)
only R puts the points into the wron places. For excample excell has a
cell with the number: 0,09 so what R does, it writes the number as 0.9
which is wrong and makes my calculations become useless.

Does anyone know this problem? Maby I made a mistake somewhere?

I would be glad about your answers!

Cheers, Anna

-- 



Der Inhalt dieser E-Mail ist vertraulich. Sollte Ihnen die E-Mail
irrtümlich zugesandt worden sein, bitte ich Sie, mich unverzüglich zu
benachrichtigen und die E-Mail zu löschen.

This e-mail is confidential. If you have received it in error, please
notify me immediately and delete it from your system.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Urgent help needed for honours project - breaks between races in one year

2011-10-06 Thread Jana.K
Hi to anyone who is willing to help,

I have a csv. file which has 1999 horses as the rows and the age(in years)
of the horse at each race as columns. Ive read this file into R and called
it 'horses'.
Im trying to find the longest break between each race in the horse's first
year of racing. I already have a numeric called 'age.first' which is the age
at which each horse had its first race. I understand if i want to look at
the breaks a year from the first race i have to add +1. But i am not sure
how to write the code into R. Can anyone help me get the longest break in
the first year of racing for each horse?

Jana.

--
View this message in context: 
http://r.789695.n4.nabble.com/Urgent-help-needed-for-honours-project-breaks-between-races-in-one-year-tp3878187p3878187.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sum of functions

2011-10-06 Thread Dimitris.Kapetanakis
Dear all,

I would like to create a code for semiparametric Klein and Spady's
estimator. For that I created a function that provides the log-likelihood
function for each observation (so it is a function of betas and i, where i
denotes the observation). Now, in order to maximize the log-likelihood
function, I have to sum these log-likelihood functions for each i and so to
get another function that is a function only of betas and so to maximize it
through maxLik for instance. Is that possible? 

In order to be more clear I give an example of how it could be:

Prob1- function(b, i)
g.yj(b,y=1,h.np,i)/(g.yj(b,y=1,h.np,i)+g.yj(b,y=0,h.np,i))
loglik.i- function(b, i) Y[i,]*log(Prob1(b,i))+(1-Y[i,])*log(1-Prob1(b,i))

where b denotes the betas, i the observations, Y is the response vector and
g.yj(b,1,h.np,i) a function that I created previously, Prob1(b,i) is a
function that gives the conditional probability for observation i and
loglik.i(b,i) is function that gives the log-likelihood for observation i.

How can I sum the loglik.i(b,i) for each i and remain as a function of b
ONLY in order to maximize it???

For exemple this could be done manually by
loglik- function(b)
loglik.i(b,1)+loglik.i(b,2)+loglik.i(b,3)+….+loglik.i(b,N)

but how can I do it automatically for all observations?

Thank you

Dimitris



--
View this message in context: 
http://r.789695.n4.nabble.com/sum-of-functions-tp3878448p3878448.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About stepwise regression problem

2011-10-06 Thread pigpigmeow
using AIC/BIC, I'm not know too much about this. I just know using p-value to
perform stepwise regression

if I used p-value to perform multimodel stepwise regression, is it correct
in the first message box?

--
View this message in context: 
http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3878297.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] setting up the RPostgreSQL_0.2-0.tar

2011-10-06 Thread Elinor Hartman
I am a new R user, and am looking to call Postgres through R.  I am not so
technical and need help in setting up.
I have downloaded the tar file but cannot get passed this step.  Is there
documentation for dummies to get working with this?
I don't even know where I should save this file?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] initial value in ComboBox tkwidget

2011-10-06 Thread Alexander
Hello,
I took me very long to find out how to set the initial value in the combobox
widget in tk (see the example below). My question is now : Why does
textvariable has to be a tclVar while values can be a normal vector? My
next question is: How could I have known this much earlier? Is there a
documentation for the tcl/tk usage in R? I know a lot of website with
example. But the documentation is not very complet. Any tipps?

require(tcltk)
tclRequire(BWidget)
tt - tktoplevel()
tkgrid(tklabel(tt,text=What's your favorite fruit?))

fruits - c(Apple,Orange,Banana,Pear)
comboBox -
tkwidget(tt,ComboBox,editable=TRUE,values=fruits,textvariable=tclVar(Banana))
tkgrid(comboBox)

OnOK - function()
{
fruitChoice - fruits[as.numeric(tclvalue(tcl(comboBox,getvalue)))]
tkdestroy(tt)
msg - paste(Good choice! ,fruitChoice,s are delicious!,sep=)
tkmessageBox(title=Fruit Choice,message=msg)

}
OK.but -tkbutton(tt,text=   OK   ,command=OnOK)
tkgrid(OK.but)
tkfocus(tt)




--
View this message in context: 
http://r.789695.n4.nabble.com/initial-value-in-ComboBox-tkwidget-tp3878164p3878164.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting commas to points

2011-10-06 Thread Rainer Schuermann
From ?read.csv:

   read.csv2( file, header = TRUE, sep = ;, quote=\, dec=,,
   fill = TRUE, comment.char=, ...)

I think this is specifically set up for German decimal commas.

Rgds,
Rainer





On Thursday 06 October 2011 17:39:46 Anna Lee wrote:
 Hello everyone!
 
 I work with a german excell version which uses commas instead of
 points for seperating decimal places. R work with points so in order
 to be able to save my excell tables without changing the commas to
 points, whenever I load a table I type in: read.table(..., dec = ,)
 only R puts the points into the wron places. For excample excell has a
 cell with the number: 0,09 so what R does, it writes the number as 0.9
 which is wrong and makes my calculations become useless.
 
 Does anyone know this problem? Maby I made a mistake somewhere?
 
 I would be glad about your answers!
 
 Cheers, Anna
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] unique possible bug

2011-10-06 Thread Patrick McCann
The error I am referring to is in unique.c in Base R, it cannot
accomodate greater than 2^29 values, even though it appears the
overflow protection should be 2^30. The only relevance of the arules
package is I was using it while I discovered this issue.

Thanks,
Patrick



2011/10/6 Uwe Ligges lig...@statistik.tu-dortmund.de:


 On 05.10.2011 22:15, Patrick McCann wrote:

 Hi,

 I am trying to read in a rather large list of transactions using the
 arules library.

 You mean the arules package?


 It seems in the coerce method into the dgCmatrix, it
 somewhere calls unique. Unique.c throws an error when  n  536870912;
 however, when 4*n was modified to 2*n in 2004, the overflow protection
 should have changed from 2^29 to 2^30, right? If so, how would I
 change it in my copy? Do I have to recompile everything?

 Yes.

 There is also the way to ask the maintainer to improve it, but it won't work
 without reinstallation of the changed package sources.

 Uwe Ligges



 Thanks,
 Patrick McCann


 Here is a simple to reproduce example:

 runif(2^29+5)-a
 sum(unique(a))-b

 Error in unique.default(a) : length 536870917 is too large for hashing

 traceback()

 3: unique.default(a)
 2: unique(a)
 1: unique(a)

 unique.default

 function (x, incomparables = FALSE, fromLast = FALSE, ...)
 {
     z- .Internal(unique(x, incomparables, fromLast))
     if (is.factor(x))
         factor(z, levels = seq_len(nlevels(x)), labels = levels(x),
             ordered = is.ordered(x))
     else if (inherits(x, POSIXct))
         structure(z, class = class(x), tzone = attr(x, tzone))
     else if (inherits(x, Date))
         structure(z, class = class(x))
     else z
 }
 environment: namespace:base

 From http://svn.r-project.org/R/trunk/src/main/unique.c I see:


 /*
  Choose M to be the smallest power of 2
  not less than 2*n and set K = log2(M).
  Need K= 1 and hence M= 2, and 2^M= 2^31 -1, hence n= 2^29.

  Dec 2004: modified from 4*n to 2*n, since in the worst case we have
  a 50% full table, and that is still rather efficient -- see
  R. Sedgewick (1998) Algorithms in C++ 3rd edition p.606.
 */
 static void MKsetup(int n, HashData *d)
 {
    int n4 = 2 * n;

    if(n  0 || n  536870912) /* protect against overflow to -ve */
        error(_(length %d is too large for hashing), n);
    d-M = 2;
    d-K = 1;
    while (d-M  n4) {
        d-M *= 2;
        d-K += 1;
    }
 }

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting commas to points

2011-10-06 Thread Uwe Ligges



On 06.10.2011 17:39, Anna Lee wrote:

Hello everyone!

I work with a german excell version which uses commas instead of
points for seperating decimal places. R work with points so in order
to be able to save my excell tables without changing the commas to
points, whenever I load a table I type in: read.table(..., dec = ,)
only R puts the points into the wron places. For excample excell has a
cell with the number: 0,09 so what R does, it writes the number as 0.9


Please specify an example with the excel output and what R reads. You 
will find that such an example does not exist and hence you made some 
mistake and compared different numbers or so.


Uwe Ligges



which is wrong and makes my calculations become useless.

Does anyone know this problem? Maby I made a mistake somewhere?

I would be glad about your answers!

Cheers, Anna



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting commas to points

2011-10-06 Thread Anna Lee
Sorry guys, I allready found the solution. Excell showed some of the
numbers in the format: 1,90053-E05 and R interpreted it as 1.9... I
changed that in Excel

Cheers, Anna

Am 6. Oktober 2011 17:48 schrieb Uwe Ligges lig...@statistik.tu-dortmund.de:


 On 06.10.2011 17:39, Anna Lee wrote:

 Hello everyone!

 I work with a german excell version which uses commas instead of
 points for seperating decimal places. R work with points so in order
 to be able to save my excell tables without changing the commas to
 points, whenever I load a table I type in: read.table(..., dec = ,)
 only R puts the points into the wron places. For excample excell has a
 cell with the number: 0,09 so what R does, it writes the number as 0.9

 Please specify an example with the excel output and what R reads. You will
 find that such an example does not exist and hence you made some mistake and
 compared different numbers or so.

 Uwe Ligges


 which is wrong and makes my calculations become useless.

 Does anyone know this problem? Maby I made a mistake somewhere?

 I would be glad about your answers!

 Cheers, Anna





-- 



Der Inhalt dieser E-Mail ist vertraulich. Sollte Ihnen die E-Mail
irrtümlich zugesandt worden sein, bitte ich Sie, mich unverzüglich zu
benachrichtigen und die E-Mail zu löschen.

This e-mail is confidential. If you have received it in error, please
notify me immediately and delete it from your system.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] undefined slot classes in package code

2011-10-06 Thread Pascal A. Niklaus

Dear all,

I am facing a problem (warning message) when building a package that I 
am unable to fix:



* checking whether package ‘myRcppTest’ can be installed ... WARNING
Found the following significant warnings:
  Warning: undefined slot classes in definition of S4test: 
Rcpptest(class Rcpp_test)



The package (originally generated with Rcpp.package.skeleton) contains a 
class named test, in module test (see the *.cpp file below). This 
class works fine (I can generate and use objects created with 
new(test$test)).


To extend the functionality of this class, I want to wrap a S4 class 
(named S4test in the minimal example below) around the Rcpp class. The 
idea is to store an instance of the Rcpp-class in a slot (see R file 
below), and provide S4 methods that make use of this internal object.


The problem seems to be that during the check/install process 
test$test is not known when this file is processed. However, I could 
not figure out so far how to make test$test known within the context 
of this file...


Thanks for your help

Pascal


--- cpp file 

#include Rcpp.h
using namespace Rcpp ;

RCPP_MODULE(test)
{
  class_test(test)
.constructor()
.field(value, test::val)
  ;
}

test::test() {}
test::~test() {}

-- R file -

setClass(S4test,
 representation(
  Rcpptest=Rcpp_test)
);

setMethod(f=initialize,
  signature=S4test,
  definition=function(.Object) {
 .Object@Rcpptest - new(test$test);
 .Object;
  }
);

newS4test - function()
{
  new(S4test);
}

- zzz.R -

.NAMESPACE - environment()
test - new( Module )

.onLoad - function(libname, pkgname)
{
  unlockBinding( test , .NAMESPACE )
  assign( test,  Module( test ), .NAMESPACE )
  lockBinding( test, .NAMESPACE )
}

.onAttach - function(libname, pkgname) {}
.onUnload - function(libname, pkgname) {}

 NAMESPACE 

useDynLib(myRcppTest)
exportPattern(^[[:alpha:]]+)
import( Rcpp )



--

Pascal A. Niklaus
Institute of Evolutionary Biology and Environmental Studies
University of Zurich
CH-8057 Zurich / Switzerland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Limitations of R

2011-10-06 Thread Martin Morgan

On 10/06/2011 08:33 AM, Uwe Ligges wrote:



On 06.10.2011 16:49, Alaios wrote:

Dear all,
I have a few binary files like 9Gb or even of 50Gb.


Hi Alaios --

Maybe you have a particular domain you are interested in (e.g., 
high-throughput sequence analysis) and there are packages (e.g., at 
http://bioconductor.org) that make it easier to work with this size and 
format of data.


Martin


I would like to ask you what are the known limits of the R for the
data processing part, I have a system with a lot of RAM (500Gb)


Really accessible from one core? Amazing.


but I am not sure about the internal limitations of the R. How long
for example a vector can be?


See
?Memory-limits

Uwe Ligges






Could you please inform me for these internal R limitations?

I would like to thank you in advance for your help

B.R
Alex

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting commas to points

2011-10-06 Thread Eik Vettorazzi
Hi Anna,
have a look at ?write.csv2, which deals with the Excel conventions for
CSV in German locale.

cheers


Am 06.10.2011 17:39, schrieb Anna Lee:
 Hello everyone!
 
 I work with a german excell version which uses commas instead of
 points for seperating decimal places. R work with points so in order
 to be able to save my excell tables without changing the commas to
 points, whenever I load a table I type in: read.table(..., dec = ,)
 only R puts the points into the wron places. For excample excell has a
 cell with the number: 0,09 so what R does, it writes the number as 0.9
 which is wrong and makes my calculations become useless.
 
 Does anyone know this problem? Maby I made a mistake somewhere?
 
 I would be glad about your answers!
 
 Cheers, Anna
 


-- 
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790

--
Pflichtangaben gemäß Gesetz über elektronische Handelsregister und 
Genossenschaftsregister sowie das Unternehmensregister (EHUG):

Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; 
Gerichtsstand: Hamburg

Vorstandsmitglieder: Prof. Dr. Guido Sauter (Vertreter des Vorsitzenden), Dr. 
Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr. Uwe Koch-Gromus 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] setting up the RPostgreSQL_0.2-0.tar

2011-10-06 Thread Uwe Ligges
I got this message in a private message before and asked you to read the 
posting guide before posting here! I also asked you to specify R version 
and operating system assuming install.packages() does not work. And I 
still do not know what the error was you got.


Uwe Ligges


On 06.10.2011 17:13, Elinor Hartman wrote:

I am a new R user, and am looking to call Postgres through R.  I am not so
technical and need help in setting up.
I have downloaded the tar file but cannot get passed this step.  Is there
documentation for dummies to get working with this?
I don't even know where I should save this file?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Running *slow*

2011-10-06 Thread Patrick Burns

Probably most of the time you're waiting
for this you are in Circle 2 of 'The R
Inferno'.  If the values are numbers,
you might also be in Circle 1.

On 06/10/2011 13:37, Thomas wrote:

Anyone got any hints on how to make this code more efficient? An early
version (which to be fair did more than this one is) ran for 330 hours
and produced no output.

I have a two column table, Dat, with 12,000,000 rows and I want to
produce a lookup table, ltable, in a 1 dimensional matrix with one copy
of each of the values in Dat:

for (i in 1:nrow(Dat))
{
for (j in 1:2)
{
#If next value is already in ltable, do nothing
if (is.na(match(Dat[i,j], ltable))){ltable - rbind(ltable,Dat[i,j])}
}
}

but it takes forever to produce anything.

Any advice gratefully received.

Thomas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting parabolic function to data

2011-10-06 Thread Henri Mone
Hi Duncan,

Thanks for your reply. You saved my Day :)

Thanks,
Henri



On Thu, Oct 6, 2011 at 12:07 PM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
 On 11-10-06 4:59 AM, Henri Mone wrote:

 Dear R users and experts,

 I want to fit a shifted parabolic function with the following
 functional form to my data:

 f(x)=a0*(x+a1)^2+a2

 (a0, a1 and a2 are scaling factors.)
 What is standard approach to do this in R? I tried the lm function
 in R but I got problems getting the above functional form.

 Any help is welcome :) .

 That can be expanded into a regular quadratic:

 (a0*a1^2 + a2) + 2*a0*a1*x + a0*x^2

 So fit a regular quadratic, and then solve for a0, a1, a2 from the resulting
 coefficients.  The only tricky bit will be computing errors on the a's.

 Duncan Murdoch


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Running *slow*

2011-10-06 Thread R. Michael Weylandt
Patrick is right, most of the time is probably taken up for the
reasons documented in the (masterful) R Inferno, namely the rbind()
calls.

There is another problem though and it gets at the very core of R, and
for that matter, all interpreted languages that I'm familiar with.
I'll give a fairly elementary explanation and gloss over many of the
subtleties that R core worries about so we mere mortals don't have to.

At the end of the day, everything is looped, there's no way to get
around it. However, from a code perspective we have a choice of
looping in C or R. Whenever possible it is better to loop in C than R
and most of the key built-in functions, like unique(), are designed to
do just that. The reason for it is pretty straightforward: consider
what has to happen to run a loop in R:

Iterator is defined: a sequence of C calls start this
first line of loop is hit - interpreted by R - sent to C code -
executed - changed back into an R result - passed to the next line
of the loop
iterator is increased: C again
second line of loop is hit - interpreted by R - sent to C code -
executed - changed back into an R result - passed to the next line
of the loop
etc.

Complicated and/or multiple lines of code only compound the problem
because you have to go up and down multiple times at each iteration.

Looping on the C level gets rid of all those translations between
C/R, save 2, and thereby mightily increases efficiency. Hence, even if
you are using the same (or heaven forbid a faster!) algorithm on the R
level, it can look super slow because of all the moving up and down
the ladder; I don't know how unique.C is implemented, but my guess is
it's more or less like what you have now, with more efficient memory
usage/preallocation, it just looks *much* faster because of the C
architecture.

DISCLAIMER: there are quite a few inaccuracies, most small, maybe a
few large, in here, and I probably only am aware of a small fraction
thereof, but this wasn't intended to be a super accurate explanation.

On another note, I should explain my solution a little more clearly.

A straight call to unique() would check for unique ROWS not values of
x. I take x, make a copy so as not to harm the original object, strip
if of its dimensionality (thereby converting it to a vector
efficiently), and then apply unique() which will now find unique
values. It's not a huge thing, but not immediately apparent from what
I did.

Hope this helps,

Michael


On Thu, Oct 6, 2011 at 11:59 AM, Patrick Burns pbu...@pburns.seanet.com wrote:
 Probably most of the time you're waiting
 for this you are in Circle 2 of 'The R
 Inferno'.  If the values are numbers,
 you might also be in Circle 1.

 On 06/10/2011 13:37, Thomas wrote:

 Anyone got any hints on how to make this code more efficient? An early
 version (which to be fair did more than this one is) ran for 330 hours
 and produced no output.

 I have a two column table, Dat, with 12,000,000 rows and I want to
 produce a lookup table, ltable, in a 1 dimensional matrix with one copy
 of each of the values in Dat:

 for (i in 1:nrow(Dat))
 {
 for (j in 1:2)
 {
 #If next value is already in ltable, do nothing
 if (is.na(match(Dat[i,j], ltable))){ltable - rbind(ltable,Dat[i,j])}
 }
 }

 but it takes forever to produce anything.

 Any advice gratefully received.

 Thomas

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 --
 Patrick Burns
 pbu...@pburns.seanet.com
 twitter: @portfolioprobe
 http://www.portfolioprobe.com/blog
 http://www.burns-stat.com
 (home of 'Some hints for the R beginner'
 and 'The R Inferno')

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] any way to convert back to DateTime class when accidental conversion to numeric?

2011-10-06 Thread Mike Williamson
Hi Dr. Ripley, All,

Thanks for the succinct advice!  Perfectly what I needed!

Jeff,

Absolutely I agree that this is a dangerous path, and I would never
consider doing it for something that needs to be robust.  But in 'R' type
casting is a bit messed up, so I've come to accept that sometimes something
I called a string might become a factor, a date became a numeric, etc.  Then
I stick in enough catches throughout my functions to deal with it, worst
case.  It is ugly, but I cannot figure out a way to have tight types and
still use 'R'.  Yet, 'R' has so many cool functions (and more added every
day); I'd be silly to throw the baby out with the bathwater.
So my typical best case or robust solution is to write a parent script
in python (I simply know python best) that handles all data typing, etc.,
then call 'R' once I know that everything is clean.  In this particular case
above where I was asking, this is really for exploratory work.  Once I get a
solution, I will likely handle typing outside of 'R'.

Thanks for the advice!

  Regards,
 Mike


---
XKCD http://www.xkcd.com



On Wed, Oct 5, 2011 at 11:41 PM, Prof Brian Ripley rip...@stats.ox.ac.ukwrote:

 A more portable way (that function only works in some versions of R) is

 as.POSIXct(1317857320, origin=1970-01-01)

 possibly with a 'tz' argument if you need to restore the timezone.


 On Wed, 5 Oct 2011, jim holtman wrote:

  Here is what I use:

 unix2POSIXct(1317857320)
 [1] 2011-10-05 19:28:40 EDT


 unix2POSIXct  -  function (time) structure(time, class = c(POSIXt,
 POSIXct))


 On Wed, Oct 5, 2011 at 7:38 PM, Mike Williamson this.is@gmail.com
 wrote:

 Hi,

In short, I would like to know if there is any way to convert a
 numeric
 into a date, similar to how strptime() can convert a string to a date
 time
 class?

There are some functions, etc. which don't work well with dates, and
 tend to force them into numerics.  I understand that the number it spits
 back is the number of seconds since the beginning of 1970 (see the first
 few
 sentences of the Details portion of ?DateTimeClasses).
However, it's a bit of a hassle to convert that by hand.  I can create
 a
 function to do this, and it isn't so hard, but I found it hard to believe
 such a function didn't already exist, so I wanted to ask the community.

As an example, today (Oct 5th 2011 at approximately 4:30pm, Pacific
 time) is approximately 1317857320 as a numeric, but I would like to know
 how
 to go from that number back to the 2011-10-05 16:28:39 PDT date time
 class
 which originally generated it.

Thanks!
  Mike

 ---
 XKCD http://www.xkcd.com

[[alternative HTML version deleted]]

 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Data Munger Guru

 What is the problem that you are trying to solve?

 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 --
 Brian D. Ripley,  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  
 http://www.stats.ox.ac.uk/~**ripley/http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SPlus to R

2011-10-06 Thread David Winsemius


On Oct 6, 2011, at 11:17 AM, Scott Raynaud wrote:

I'm re-posting this since it did not appear at the end of the  
thread.  Sorry for the inconvenience.  Not sure why it's giving the  
message: An embedded and charset-unspecified text was scrubbed...   
As far as I know my replies are set up as text and not html.


As far as I can tell but have not been able to find this well- 
documented, the test being applied is not to the content of an  
attachment but rather to the extension of the attachment's name. If it  
is not something.txt or something_else.pdf, it's probably  
going to fail the server test and be scrubbed. Files with  
extensions .csv or .dat will generally fail, despite being text  
formats.


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About stepwise regression problem

2011-10-06 Thread Dennis Murphy
Hi:

Please read this:
http://www.stata.com/support/faqs/stat/stepwise.html

Dennis

On Thu, Oct 6, 2011 at 6:41 AM, pigpigmeow gloryk...@hotmail.com wrote:
 using AIC/BIC, I'm not know too much about this. I just know using p-value to
 perform stepwise regression

 if I used p-value to perform multimodel stepwise regression, is it correct
 in the first message box?

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3878297.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with regexp

2011-10-06 Thread Jannis
Thanks to all who replied! With all these possible solutions it will be hard to 
find the best one :-).

--- Gabor Grothendieck ggrothendi...@gmail.com schrieb am Mi, 5.10.2011:

 Von: Gabor Grothendieck ggrothendi...@gmail.com
 Betreff: Re: [R] help with regexp
 An: Jannis bt_jan...@yahoo.de
 CC: r-h...@stat.math.ethz.ch
 Datum: Mittwoch, 5. Oktober, 2011 15:13 Uhr
 On Wed, Oct 5, 2011 at 7:56 AM,
 Jannis bt_jan...@yahoo.de
 wrote:
  Dear list memebers,
 
 
  I am stuck with using regular expressions.
 
 
  Imagine I have a vector of character strings like:
 
  test - c('filename_1_def.pdf',
 'filename_2_abc.pdf')
 
  How could I use regexpressions to extract only the
 'def'/'abc' parts of these strings?
 
 
  Some try from my side yielded no results:
 
  testresults -
 grep('(?=filename_[[:digit:]]_).{1,3}(?=.pdf)', perl =
 TRUE, value = TRUE)
 
  Somehow I seem to miss some important concept here.
 Until now I always used nested sub expressions like:
 
  testresults - sub('.pdf$', '',
 sub('^filename_[[:digit:]]_', '' , test))
 
 
  but this tends to become cumbersome and I was
 wondering whether there is a more elegant way to do this?
 
 
 Here are a couple of solutions:
 
 # remove everything up to _b as well as everything from .
 onwards
 gsub(.*_|[.].*, , test)
 
 # extract everything that is not a _ provided it is
 immediately followed by .
 library(gsubfn)
 strapply(test, ([^_]+)[.], simplify = TRUE)
 
 -- 
 Statistics  Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Advice in model construction

2011-10-06 Thread Weidong Gu
Hi Chris,

Linear regression model of categorical variables is equivalent to
anova. If one of estimates of coefficients is significant in lm, it is
interpreted as samples of Family did not come from the same population
and the signficant factor level is compared to the reference level. If
you want pair-wise comparison, you need post-hoc testing methods
(?TukeyHSD).

HTH

Weidong Gu

On Wed, Oct 5, 2011 at 1:55 PM, Chris Mcowen chrismco...@gmail.com wrote:
 Dear list,

 I am unsure how to structure my model, i have tried something and it makes 
 sense but i am unsure if i am interpreting it correctly?

 i have a continuous response variable - the observed quantity of evolutionary 
 history - EH

 Then i have a number of species which have a hierarchical structure ~ Genus, 
 Family etc

 My research question is do certain families have significantly higher ( or 
 lower) EH values than the others.

 Reproducible example:

 example - structure(list(Family = structure(c(2L, 1L, 1L, 5L, 7L, 7L, 3L,
 4L, 6L, 6L, 1L, 3L), .Label = c(Araceae, Asphodelaceae, Bromeliaceae,
 Cyperaceae, Orchidaceae, Poaceae, Zingiberaceae), class = factor),
    Genus = structure(c(3L, 4L, 4L, 1L, 6L, 6L, 2L, 5L, 8L, 9L,
    4L, 7L), .Label = c(Acianthera, Aechmea, Aloe, Anthurium,
    Bulbostylis, Hedychium, Lindmania, Psathyrostachys,
    Sesleria), class = factor), Species = structure(c(9L,
    1L, 10L, 11L, 7L, 4L, 3L, 5L, 8L, 2L, 6L, 12L), .Label = c(bonplandii,
    coerulans, cymosopaniculata, elatum, emmerichiae,
    gehrigeri, glabrum, juncea, pubescens, sagittatum,
    scalpricaulis, sessilis), class = factor), EH = c(8.746525,
    24.462699, 33.03942, 32.719489, 13.598201, 13.598201, 13.164928,
    9.339228, 9.69705, 13.478372, 37.497137, 59.562911)), .Names = c(Family,
 Genus, Species, EH), class = data.frame, row.names = c(NA,
 -12L))

 #My model

 test - lm(EH~Family, data = example)

 #in this small example no families are significant but if one was - would 
 that mean they have significantly more EH than the others?

 Thanks

 Chris



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About stepwise regression problem

2011-10-06 Thread Chris Mcowen
I hadn't seen that page Dennis, that makes the case much more succinctly than 
my anti stepwise ramblings!

Furthermore, pigpigmeow if you are using a random effects model i.e lmer - 
where are you getting your p-values from? And what do they mean in this 
context? I would strongly advise using information criteria in these scenarios.

Chris

P.S - Doug Bates discusses P-values in relation to mixed effect models here : 
https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html


On 6 Oct 2011, at 18:17, Dennis Murphy wrote:

Hi:

Please read this:
http://www.stata.com/support/faqs/stat/stepwise.html

Dennis

On Thu, Oct 6, 2011 at 6:41 AM, pigpigmeow gloryk...@hotmail.com wrote:
 using AIC/BIC, I'm not know too much about this. I just know using p-value to
 perform stepwise regression
 
 if I used p-value to perform multimodel stepwise regression, is it correct
 in the first message box?
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3878297.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a loop for a function

2011-10-06 Thread R. Michael Weylandt
Well you said you wanted statistics from the test, but you didn't say
which statistics you wanted: any of the following would work:

sapply(1:10, function(i) Box.test (lfut, lag = i, type=Ljung)$statistic)
sapply(1:10, function(i) Box.test (lfut, lag = i, type=Ljung)$parameter)
sapply(1:10, function(i) Box.test (lfut, lag = i, type=Ljung)$p.value)

extractor wasn't quite the right word, but more or less that's the
idea (if you happen to know it from other programming contexts).

Michael


On Thu, Oct 6, 2011 at 1:43 AM, upananda pani upananda.p...@gmail.com wrote:
 Respected Sir,
 I am grateful to you for your reply. This is working perfectly fine. I do
 not know how to add extractors. Please give me one example with this
 function.
 With regards,
 Upananda

 On Thu, Oct 6, 2011 at 1:04 AM, R. Michael Weylandt
 michael.weyla...@gmail.com wrote:

 lapply(1:10, function(i) Box.test (lfut, lag = i, type=Ljung))

 Add extractors to get statistics as desired.

 Michael Weylandt

 On Wed, Oct 5, 2011 at 1:09 PM, upananda pani upananda.p...@gmail.com
 wrote:
  Dear All,
 
  I want to create a loop within a function r. The example follows:
 
  Box.test (lfut, lag = 1, type=Ljung)
 
  if i want to compute the Box.test for lag 1 to 10, I have to write
  manually
   change each time for different lag. So i wan to write a loop for the
  lag 1
  to 10 and return the statistics for each lag. Is there any method to do
  this
  ?
 
  With regards,
  Upananda
 
  --
 
 
  You may delay, but time will not.
 
 
  Research Scholar
  alternative mail id: up...@iitkgp.ac.in
  Department of HSS, IIT KGP
  KGP
 
         [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 --


 You may delay, but time will not.


 Research Scholar
 alternative mail id: up...@iitkgp.ac.in
 Department of HSS, IIT KGP
 KGP



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gefp() boundaries?

2011-10-06 Thread bonda
Thank you very much for the answer. If I take Poisson model and follow
Generalized M-fluctuation tests for parameter instability, A. Zeileis and
K. Hornik, Statistica Neerlandica (2007) Vol. 61, N. 4, p. 500-501 (section
4.3):

data(Boston)
n - 506;
my.X - as.matrix(cbind(1, Boston[crim], Boston[age])); 
my.model - glm(tax ~ crim + age, family = poisson, data = Boston);
my.psi - estfun(my.model);
my.mu - fitted(my.model);
J - sum(my.mu*my.X%*%t(my.X))/n;
my.process - apply(as.matrix(my.psi), 2, cumsum)/sqrt(J*n);

gprocess - gefp(tax ~ crim + age, family = poisson, data=Boston);

then my.process and gprocess$process have to be the same? 

Best regards,
Julia

--
View this message in context: 
http://r.789695.n4.nabble.com/gefp-boundaries-tp3872529p3878886.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] factors in probit regression

2011-10-06 Thread garciap
Hi to all of you,

I'm fitting an full factorial probit model from an experiment, and I've the
independent variables as factors. The model is as follows:


fit16-glm(Sube ~ as.factor(CE)*as.factor(CEBO)*as.factor(Luz),
family=binomial(link=probit), data=experimento)

but, when I took a look to the results I've obtained the following:

glm(formula = Sube ~ CE * CEBO * Luz, family = binomial(link = probit), 
data = experimento)

Deviance Residuals: 
   Min  1Q  Median  3Q Max  
-1.651e-06  -1.651e-06   1.651e-06   1.651e-06   1.651e-06  

Coefficients: (3 not defined because of singularities)
Estimate Std. Error z value
Pr(|z|)
(Intercept)6.991e+00  3.699e+04   0   
1
CEexperimental 5.357e-09  4.775e+04   0   
1
CENO  -1.398e+01  4.320e+04   0   
1
CEBOcombinado  4.948e-26  4.637e+04   0   
1
CEBOolor   1.183e-25  4.446e+04   0   
1
CEBOvisual 7.842e-26  5.650e+04   0   
1
Luzoscuridad   3.383e-26  4.637e+04   0   
1
CEexperimental:CEBOcombinado  -6.227e-26  6.656e+04   0   
1
CENO:CEBOcombinado-3.758e-26  5.540e+04   0   
1
CEexperimental:CEBOolor   -2.611e-25  6.865e+04   0   
1
CENO:CEBOolor -5.252e-26  5.620e+04   0   
1
CEexperimental:CEBOvisual -2.786e-09  7.700e+04   0   
1
CENO:CEBOvisual8.169e-15  6.334e+04   0   
1
CEexperimental:Luzoscuridad   -1.703e-25  6.304e+04   0   
1
CENO:Luzoscuridad -1.672e-28  6.117e+04   0   
1
CEBOcombinado:Luzoscuridad 1.028e-26  5.950e+04   0   
1
CEBOolor:Luzoscuridad  9.212e-27  6.207e+04   0   
1
CEBOvisual:Luzoscuridad   NA NA  NA  
NA
CEexperimental:CEBOcombinado:Luzoscuridad  9.783e-26  8.744e+04   0   
1
CENO:CEBOcombinado:Luzoscuridad   -2.948e-26  7.959e+04   0   
1
CEexperimental:CEBOolor:Luzoscuridad   1.573e-25  9.005e+04   0   
1
CENO:CEBOolor:Luzoscuridad-2.111e-26  8.208e+04   0   
1
CEexperimental:CEBOvisual:LuzoscuridadNA NA  NA  
NA
CENO:CEBOvisual:Luzoscuridad  NA NA  NA  
NA

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 2.0853e+02  on 150  degrees of freedom
Residual deviance: 4.1146e-10  on 130  degrees of freedom
AIC: 42


Well, there are too many levels of the original factors lacking in this
table. As an example, the factor CE has three levels (Undefined, Control,
Experimental), but in the table there are only two of them (NO=undefined,
Experimental=Experimental). I need to check the complete result, how can I
obtain the effects for the remaining levels of the factors?

Thanks,

Pablo

--
View this message in context: 
http://r.789695.n4.nabble.com/factors-in-probit-regression-tp3879176p3879176.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sweave and bibliographies

2011-10-06 Thread Terry Therneau
 I added a bibliographic reference to one of my .Rnw documents, to wit:

\begin{thebibliography}{9}
  \bibitem{Jose} Jos{\'{e}} C. Pinheiro and Douglas M. Bates,
\emph{Mixed-Effects Models in S and S-PLUS},
Springer, 2000.
\end{thebibliography}

just before the \end{document}.
When I run Sweave on the result these lines disappear without a trace.

Is this intentional?  

Subquestion: is there a link to some Sweave documentation from the main CRAN
or R page?

Terry T.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Ignore my last message

2011-10-06 Thread Terry Therneau
  It was an editor interaction -- the changes weren't being saved.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Titles changing when a plot is redrawn

2011-10-06 Thread David Winsemius


On Oct 6, 2011, at 2:29 PM, David Winsemius wrote:



On Oct 6, 2011, at 8:25 AM, John Nolan wrote:


Thank you for telling me a fix.

But I still don't know if this behavior is what is intended.  I  
used bquote(...) because the plotmath(...) help page refers to  
bquote and gives an example like this.  I suspect most users will  
be baffled by this kind of behavior, especially since it does not  
occur when there is one plot.  By this I mean that I can draw one  
plot and title it with the same string using bquote( ).


It seems to be an infelicity that is not reproducible on Macs:

RMSE.titles.pdf


Apologies, I take it back. I read you original post incorrectly and  
thought you were having problems with the original plot appearing  
incorrectly. When I do the the resizing I do see the same unexpected  
change to both titles having  i = 2 



If I change the value of i, and redraw the graph, the redrawn graph  
has the original value of i in the title, not the updated value. So  
in this case, an unevaluated expression is not re-evaluated at draw  
time?


I certainly would not have expected redrawing to call any R code  
again. I would have expected the graphics device to do the  
recalculations.


As I said. I would not have expected this, either. I seem to remember  
this being brought up before, but I was unable to find it on a search.


--
David.



John


-xieyi...@gmail.com wrote: -
To: John Nolan jpno...@american.edu
From: Yihui Xie
Sent by: xieyi...@gmail.com
Date: 10/05/2011 11:49PM
Cc: r-help@r-project.org
Subject: Re: [R] Titles changing when a plot is redrawn

I think the problem is your str1 is an unevaluated expression and  
will
change with the value of i. You should be able to get a fixed title  
by

this:

par(mfrow = c(2, 1))
for (i in 1:2) {
  x - 1:100
  rmse - sin(x/5)  # fake data
  plot(x, rmse, main = substitute(list(RMSE(theta), i == z), list(z  
= i)))

}

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



On Wed, Oct 5, 2011 at 10:01 PM, John Nolan jpno...@american.edu  
wrote:

I ran into a problem with titles on graphs.  I wanted a graph with
multiple subplots, with each having a title that involved both
a Greek letter and an identifier for each graph.  Below is a
simplified version of code to do this.  The graph appears fine,
with the first graph having i=1 in the title, and the second
graph having i=2 in the title.  However, when I resize the graph,
the plot titles change, with both showing i=2. The titles also
change when I save the plot to a file using the File menu,
then Save as in Windows.  Is this what should happen?  I
always thought that titles are static once the graph is
drawn, and couldn't change.

The problem occurs on some version of R, but not on others.
It does occur with the latest version of R:

str(R.Version())

List of 13
$ platform  : chr i386-pc-mingw32
$ arch  : chr i386
$ os: chr mingw32
$ system: chr i386, mingw32
$ status: chr 
$ major : chr 2
$ minor : chr 13.2
$ year  : chr 2011
$ month : chr 09
$ day   : chr 30
$ svn rev   : chr 57111
$ language  : chr R
$ version.string: chr R version 2.13.2 (2011-09-30)

The problem also occurs on:  R 2.13.0 on Win32
and Mac (R 2.12.0, x86_64-apple-darwin9.8.0)
The problem DOES NOT occur under R 2.10.0 on Win32.

If the code below is bracketed with pdf(test.pdf)
and dev.off(), the correct labels appear in the file.
This behavior doesn't seem to appear if there is only
one plot.

My guess is that the titles are being reevaluated when
the plot is redrawn, and since the value of i is 2 when
the redraw occurs, both labels get set to i=2.  I guess
Save as forces a redraw because a dialog box pops up?

If could be that this behavior is what is intended, and that
somewhere between R 2.10.0 and R 2.13.2 an old bug was fixed.
Or this behavior is not what was intended, and a bug was
introduced.  If the former, this should be explained to the user
somewhere.  If the latter, can someone track it down and fix?

John Nolan

#-
par(mfrow=c(2,1))
for (i in 1:2) {
x - 1:100
rmse - sin(x/5)  # fake data
plot(x,rmse)
str1 - bquote( paste(RMSE(,theta,), ,i==.(i)  ))
title( str1 )
}
#-




David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot titles and labels XXXX

2011-10-06 Thread Dan Abner
Hi everyone,

Is it possible to to specify titles and axis labels on 2 separate lines when
using the plot fn? What about tick mark labels?

Example of main title:

 Figure 1: Side-by-side Boxplots
of Y by X

instead of

Figure 1: Side-by-side Boxplots of Y by X

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] TCGA expression data: plotting ....

2011-10-06 Thread baumeist
Hi,

I am new to R.
I am trying to figure out how to graph expression data from the TCGA
database.
If I understand correctly the expression data I have downloaded is from a
microarray using the AgilentG4502A.

I've had trouble reading into R in the level I, level II, and the gene
expression analysis data using 

dat-read.table(C:\\file.txt,  header=T, row.names=1)

for example:

 dat1-read.table(C:\\US82800149_251976011000_S01_GE2_105_Dec08.txt, 
 header=T, row.names=1)

 dat-read.table(C:\\unc.edu__AgilentG4502A_07_3__TCGA-A6-2674-01A-02R-0821-07__gene_expression_analysis.txt,
  
 header=T, row.names=1)

in all cases I get the error
 more columns than column names

I have only been able to read in the level II data with the code:

 dat2-read.table(C:\\US82800149_251976011000_S01_GE2_105_Dec08.txt_lmean.out.logratio.probe.tcga_level2.data.txt,header
 = TRUE, as.is = TRUE, sep=\t)

So this is what I am working with.

I can see that the dimensions of this data are 

 dim(dat2)
[1] 90798 2

When I print dat2 to the screen it looks like this:
I assume that this is one patient with expression (intensity) data for a
large number of genes, but don't know.

49995A_23_P67323  -0.427
49996A_23_P67330 -0.3275
49997A_23_P67332  -0.409
49998A_23_P67339  3.2955
4A_23_P67355   1.205

If I try to plot the data with the following below 

 names(dat2)
[1] Hybridization.REFTCGA.A6.2674.01A.02R.0821.07


 x-c(Hybridization.REF)
 y-(TCGA.A6.2674.01A.02R.0821.07)


 plot(x,y,type='p',xlab='Hybridization.REF',ylab='TCGA.A6.2674.01A.02R.0821.07',main='plot')
   


I get the error:

Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
5: In min(x) : no non-missing arguments to min; returning Inf
6: In max(x) : no non-missing arguments to max; returning -Inf
 

I am really not sure how to plot this data, partly because I'm not sure what
the level II data represents.

Can anyone tell me what the level II data represents and what type of
plotting functions I might use?

Thanks in advance,
MAB

--
View this message in context: 
http://r.789695.n4.nabble.com/TCGA-expression-data-plotting-tp3879484p3879484.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot titles and labels XXXX

2011-10-06 Thread David Winsemius


On Oct 6, 2011, at 2:51 PM, Dan Abner wrote:


Hi everyone,

Is it possible to to specify titles and axis labels on 2 separate  
lines when

using the plot fn? What about tick mark labels?

Example of main title:



?Quotes

try:

title(main=Figure 1: Side-by-side Boxplots\n   of Y by X)




instead of

Figure 1: Side-by-side Boxplots of Y by X

[[alternative HTML version deleted]]

You should learn to post in plain text.

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] counts in quantiles in and from a matrix

2011-10-06 Thread Ben qant
Excellent! Thank you!

ben

On Wed, Oct 5, 2011 at 9:18 PM, Dennis Murphy djmu...@gmail.com wrote:

 Hi:

 Here's one way:

 m - matrix(rpois(100, 8), nrow = 5)
 f - function(x) {
q - quantile(x, c(0.1, 0.9), na.rm = TRUE)
c(sum(x  q[1]), sum(x  q[2]))
}

 t(apply(m, 1, f))

 HTH,
 Dennis

 On Wed, Oct 5, 2011 at 8:11 PM, Ben qant ccqu...@gmail.com wrote:
  Hello,
 
  I'm trying to get the count of values in each row that are above and
 below
  quantile thresholds. Thanks!
 
  Example:
 
  x = matrix(1:30,5,6)
  x
  [,1] [,2] [,3] [,4] [,5] [,6]
  [1,]16   11   16   21   26
  [2,]27   12   17   22   27
  [3,]38   13   18   23   28
  [4,]49   14   19   24   29
  [5,]5   10   15   20   25   30
  qtl = t(apply(x, 1, quantile, probs = c(.1,.9),na.rm=T))
  qtl
  10%  90%
  [1,] 3.5 23.5
  [2,] 4.5 24.5
  [3,] 5.5 25.5
  [4,] 6.5 26.5
  [5,] 7.5 27.5
 
  I would like counts like this for each row:
 
  cnts
 [,1] [,2]
  [1,]   11
  [2,]   11
  [3,]   11
  [4,]   11
  [5,]   11
 
  ...because for the first row (x[1,]) only value 1 is less than 3.5 and
 only
  value 26 is greater 23.5 and so on for the other rows. I'm thinking its a
  apply(x,1,...some FUN here...), but still getting use to apply and I've
 been
  coding for too long...
 
  Also, if anyone knows how to change the background color of the r-Tinn
  editor my eyes would love you!  Off to bed. I look forward to your
 answers!
 
  Thanks!
 
  Ben
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Duplicate elements of a vector

2011-10-06 Thread syrvn
Hi,

let's assume I have the following vector a:

1 5 23

How can I use R to duplicate the elements so that my new vector looks like:

1 1 5 5 23 23


Many thanks,
Syrvn

--
View this message in context: 
http://r.789695.n4.nabble.com/Duplicate-elements-of-a-vector-tp3879561p3879561.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >