date:20101103

On Wed, Nov 3, 2010 at 2:07 AM, Santosh Srinivas
santosh.srini...@gmail.com wrote:
 Dear Group,

 Inside each cell there should be a circle (sphere preferable) with radius of
 mod(data value). The color should be either red or green depending on -ve or
 +ve and the intensity should be based on the value of the datapoint.

 Any help on how to go about this?

 If you really want a sphere then you should look at the rgl package,
which enables the drawing of 3d graphic objects with illumination.
However it does it in its own graphics window and you'll not be able
to use any of the standard R graphics functions. Otherwise you'll have
to find some way of putting a 3d sphere on  a 2d R graphics window, or
faking it with a shaded circle and some highlights. Yuck.

 Also, drawing circles (strictly, a disc) with radius proportional to
data value is usually a bad idea since we interpret areas. A circle
with twice the radius has four times the area, and so looks four times
as big. But the data is only twice as big...

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] boxplot of timeseries with different lengths

2010-11-03 Thread Simone Gabbriellini

Hello List,

I have a time serie of observations representing the activity of some users in 
different time periods, like:

 table(obs1)

 user1  user2  user3 user31 user33  user4  user5  user6  user7  user8 user82 
user83 user85   user9 
 1  1   3   1   1 1 6   1  
11   6  11  7

 table(obs2)

 user1  user2  user3 user31 user33  user4  user5  user6  user7  user8 user82 
user83 user84  user85 user86 user87  user9 
 3   9 29  12  13  142113   
 13   15   20 21  1   11  9 
  427 

I would like to boxplot them, but since they have different length, I don't 
know how to handle the dataset properly. Is it wise to use different arrays, 
one for each observation? or it is better to force the tabled observations to 
the same length, in order to put them into a data frame?

thanks in advance for any advice.

best regards,
Simone Gabbriellini
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] install vegan

2010-11-03 Thread Carolin

Dear all,

I am trying to install Vegan, but I allways get the following error
message:

Warning in install.packages(choose.files(, filters =
Filters[c(zip,  :
  'lib = C:/Programme/R/R-2.12.0/library' is not writable
Error in install.packages(choose.files(, filters =
Filters[c(zip,  :
  unable to install packages
 utils:::menuInstallLocal()

does anybody know what is wrong?

Thanks in advance,
Carolin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Recoding -- test whether number begins with a certain number

2010-11-03 Thread Marcel Gerds


 Dear R community,

I have a question concerning recoding of a variable. I have a data set
in which there is a variable devoted to the ISCO code describing the
occupation of this certain individual
(http://www.ilo.org/public/english/bureau/stat/isco/isco88/major.htm).
Every type of occupation begins with a number and every number added to
this number describes th occupation more detailed.
Now my problem: I want to recode this variable in a way that every value
beginning with a certain number is labeled as the respective category.
For example, that all values of this variable beginning with a 6 is
labeled as agri.
My problem is that I cannot find a test which I can use for that purpose.

I would really appreciate any help on that subject. Thank you.

Best regards
Marcel Gerds

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] save() with 64 bit and 32 bit R

2010-11-03 Thread Nicola Sturaro Sommacal (Quantide srl)


Andrew Collier wrote:

hi,

i have been using a 64 bit desktop machine to process a whole lot of
data which i have then subsequently used save() to store. i am now
wanting to use this data on my laptop machine, which is a 32 bit
install. i suppose that i should not be surprised that the 64 bit data
files do not open on my 32 bit machine! does anyone have a smart idea as
to how these data can be reformatted for 32 bits? unfortunately the data
processing that i did on the 64 bit machine took just under 20 days to
complete, so i am not very keen to just throw away this data and begin
again on the 32 bit machine.

sorry, in retrospect this all seems rather idiotic, but i assumed that
the data stored by save() would be compatible between 64 bit and 32 bit
(there is no warning in the manual).



The data would normally be compatible on all architectures.  However, it 
may need the same version of R (or a newer one), and may need to have 
the same packages installed in order to read it.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] save() with 64 bit and 32 bit R

2010-11-03 Thread Prof Brian Ripley


On Wed, 3 Nov 2010, Andrew Collier wrote:


hi,

i have been using a 64 bit desktop machine to process a whole lot of
data which i have then subsequently used save() to store. i am now
wanting to use this data on my laptop machine, which is a 32 bit
install. i suppose that i should not be surprised that the 64 bit data
files do not open on my 32 bit machine! does anyone have a smart idea as
to how these data can be reformatted for 32 bits? unfortunately the data
processing that i did on the 64 bit machine took just under 20 days to
complete, so i am not very keen to just throw away this data and begin
again on the 32 bit machine.

sorry, in retrospect this all seems rather idiotic, but i assumed that
the data stored by save() would be compatible between 64 bit and 32 bit
(there is no warning in the manual).


It is, and the help says so:

 All R platforms use the XDR (bigendian) representation of C ints
 and doubles in binary save-d files, and these are portable across
 all R platforms. (ASCII saves used to be useful for moving data
 between platforms but are now mainly of historical interest.)

So there is something specific about your save, and you haven't even 
told us the error message (see the posting guide).  One possibility is 
that you saved references to namespaces, when those packages need to 
be installed on the machine used to load() the .RData file (but this 
is fairly unusual).  Another is that you simply don't have enough 
memory on the 32-bit machine, when one remedy is to go back to the 
64-bit machine and save individual objects.




thanks,
andrew.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Tukey's table

2010-11-03 Thread Silvano


Hi,

I'm building Tukey's table using qtukey function.

It happens that I can't get the values of Tukey's one degree 
of freedom and also wanted to eliminate the first column.


The program is:

Trat - c(1:30) # number of treatments
gl - c(1:30, 40, 60, 120) # degree freedom

tukval - matrix(0, nr=length(gl), nc=length(Trat))

for(i in 1:length(gl))
 for(j in 1:length(Trat))
   tukval[i,j] - qtukey(.95, Trat[j], gl[i])

rownames(tukval) - gl
colnames(tukval) - paste(Trat, , sep=)
tukval

require(xtable)
xtable(tukval)


Some suggest?

--
Silvano Cesar da Costa
Departamento de Estatística
Universidade Estadual de Londrina
Fone: 3371-4346

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Colour filling in panel.bwplot from lattice

2010-11-03 Thread Rainer Hurling


Am 03.11.2010 10:23 (UTC+1) schrieb Deepayan Sarkar:

On Wed, Nov 3, 2010 at 4:11 AM, Dennis Murphydjmu...@gmail.com  wrote:

Hi:

I don't know why, but it seems that in

bwplot(voice.part ~ height, data = singer,
main = NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red'
'pink' 'violet' 'brown' 'gold',
fill=c(yellow,blue,green,red,pink,violet,brown,gold))

the assignment of colors is offset by 3:

Levels: Bass 2 Bass 1 Tenor 2 Tenor 1 Alto 2 Alto 1 Soprano 2 Soprano 1
fillcol- c(yellow,blue,green,red,pink,violet,brown,gold)

In the above plot,

yellow -  Bass 2  (1)
blue -  Tenor 1 (4)
green -  Soprano 2  (7)
red -  Bass 1 (10 mod 8 = 2)
pink -  Alto 2 (13 mod 8 = 5)
etc.

It's certainly curious.


Curious indeed. It turns out that because of the way this was
implemented, every 11th color was used, so you end up with the order


sel.cols- c(yellow,blue,green,red,pink,violet,brown,gold)
rep(sel.cols, 100) [ seq(1, by = 11, length.out = 8) ]

[1] yellow redbrown  blue   pink   gold   green  violet

It's easy to fix this so that we get the expected order, and I will do
so for the next release.


Thank you for this proposal. We are looking forward for the next release :-)

We frequently have to colour selected boxes to be able to compare 
special cases over different panels.



Having said that, it should be noted that any vectorization behaviour
in lattice panel functions is a consequence of implementation and not
guaranteed by design (although certainly useful in many situations).
In particular, it is risky to depend on vectorization in multipanel
plots, because the vectorization starts afresh in each panel for
whatever data subset happens to be in that panel, and there may be no
relation between the colors and the original data.


Thank you for the warning.


One alternative is to use panel.superpose with panel.groups=panel.bwplot:

bwplot(voice.part ~ height, data = singer, groups = voice.part, panel
= panel.superpose, panel.groups = panel.bwplot, fill = sel.cols)


This indeed works nice 'as a workaround'.


-Deepayan


Thanks again for this wonderful package,
Rainer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] density() function: differences with S-PLUS

Dear Joshua,

first of all, thank you very much for reply. I hoped that someone who's
familiar with both S+ and R can reply to me, because I spent some hours to
looking for a solution.

If someone else would try, this is the SPLUS code and output, while below
there is the R code. I obtain the same x values, while y values are
differents for both examples.

Thank you very much.

Nicola


### S-PLUS CODE AND OUTPUT ###

 density(1:1000, width = 4)
$x:
 [1]-2.018.5102039.0204159.5306180.04082
100.55102   121.06122
 [8]   141.57143   162.08163   182.59184   203.10204   223.61224
244.12245   264.63265
[15]   285.14286   305.65306   326.16327   346.67347   367.18367
387.69388   408.20408
[22]   428.71429   449.22449   469.73469   490.24490   510.75510
531.26531   551.77551
[29]   572.28571   592.79592   613.30612   633.81633   654.32653
674.83673   695.34694
[36]   715.85714   736.36735   756.87755   777.38776   797.89796
818.40816   838.91837
[43]   859.42857   879.93878   900.44898   920.95918   941.46939
961.97959   982.48980
[50]  1003.0

$y:
 [1] 4.565970e-006 1.31e-003 9.999374e-004 1.31e-003 9.999471e-004
1.31e-003
 [7] 9.999560e-004 1.30e-003 9.999643e-004 1.29e-003 9.999718e-004
1.28e-003
[13] 9.999788e-004 1.26e-003 9.999852e-004 1.24e-003 9.10e-004
1.22e-003
[19] 9.63e-004 1.19e-003 1.01e-003 1.16e-003 1.06e-003
1.13e-003
[25] 1.10e-003 1.10e-003 1.13e-003 1.06e-003 1.16e-003
1.01e-003
[31] 1.19e-003 9.63e-004 1.22e-003 9.10e-004 1.24e-003
9.999852e-004
[37] 1.26e-003 9.999788e-004 1.28e-003 9.999718e-004 1.29e-003
9.999643e-004
[43] 1.30e-003 9.999560e-004 1.31e-003 9.999471e-004 1.31e-003
9.999374e-004
[49] 1.31e-003 4.432131e-006


 exdata = iris[, 1, 1]
 density(exdata, width = 4)
$x:
 [1] 1.30 1.453061 1.606122 1.759184 1.912245 2.065306 2.218367 2.371429
2.524490
[10] 2.677551 2.830612 2.983673 3.136735 3.289796 3.442857 3.595918 3.748980
3.902041
[19] 4.055102 4.208163 4.361224 4.514286 4.667347 4.820408 4.973469 5.126531
5.279592
[28] 5.432653 5.585714 5.738776 5.891837 6.044898 6.197959 6.351020 6.504082
6.657143
[37] 6.810204 6.963265 7.116327 7.269388 7.422449 7.575510 7.728571 7.881633
8.034694
[46] 8.187755 8.340816 8.493878 8.646939 8.80

$y:
 [1] 0.0007849649 0.0013097474 0.0021225491 0.0033616520 0.0052059615
0.0078856717
 [7] 0.0116917555 0.0169685132 0.0241073754 0.0335286785 0.0456521053
0.0608554862
[13] 0.0794235072 0.1014901241 0.1269807991 0.1555625999 0.1866111931
0.2192033788
[19] 0.2521417640 0.2840144993 0.3132881074 0.3384260582 0.3580208688
0.3709241384
[25] 0.3763578665 0.3739920600 0.3639778683 0.3469316232 0.3238721233
0.2961200278
[31] 0.2651731505 0.2325739601 0.1997853985 0.1680884651 0.1385105802
0.1117884914
[37] 0.0883644110 0.0684099972 0.0518702141 0.0385181792 0.0280126487
0.0199513951
[43] 0.0139159044 0.0095050745 0.0063575653 0.0041639082 0.0026680819
0.0016700727
[49] 0.0010169912 0.0005962089


### R CODE ###

# S-PLUS CODE: density(1:1000, width = 4) SAME x BUT DIFFERENT y
density(1:1000, bw = 4, window = g,  n = 50, cut = 0.75)$x
density(1:1000, bw = 4, window = g,  n = 50, cut = 0.75)$y

# S-PLUS CODE: exdata = iris[, 1, 1]; density(exdata, width = 4) SAME x
BUT DIFFERENT y
exdata = iris$Sepal.Length[iris$Species == setosa]
density(exdata, bw = 4, n = 50, cut = 0.75)$x
density(exdata, bw = 4, n = 50, cut = 0.75)$y



2010/11/2 Joshua Wiley jwiley.ps...@gmail.com

 Dear Nicola,

 There are undoubtedly people here who are familiar with both S+ and R,
 but they may not always be around or get to every question.  In that
 case there are (at least) two good options for you:

 1) Say what you want mathematically (something of a universal
 language) or statistically

 2) Rather than just give us S+ code, show sample data (e.g., 1:1000),
 and the values you would like obtained (in this case whatever the
 output from S+ was).  This would let us *try* to figure out what
 happened and duplicate it in R.

 From the arcane step of reading R's documentation for density (?density):

 width: this exists for compatibility with S; if given, and bw is
  not, will set bw to width if this is a character string,
  or to a kernel-dependent multiple of width if this is
  numeric.

 Which makes me wonder if this works for you (in R)?

 density(1:1000, width = 4)


 Cheers,

 Josh


 On Tue, Nov 2, 2010 at 3:04 AM, Nicola Sturaro Sommacal (Quantide srl)
 mailingl...@sturaro.net wrote:
  Hello!
 
  Someone know what are the difference between R and S-PLUS in the
 density()
  function?
 
  For example, I would like to reply this simple S-PLUS code in R, but I
 don't
  understand which parameter I should modify to get the same results.
 
  S-PLUS CODE:
  density(1:1000, width = 4)
 
  R-CODE:
  density(1:1000, bw = 4, window = g,  n = 50, cut = 0.75)
 
  I obtain the same

[R] bad optimization with nnet?

2010-11-03 Thread rynkiewicz


Hy,

I try to give an example of overfitting with multi-layer perceptron.

I have done following small example :

library(nnet)

set.seed(1)
x - matrix(rnorm(20),10,2)
z - matrix(rnorm(10),10,1)
rx - max(x)-min(x)
rz - max(z)-min(z)
x - x/rx
z - z/rz
erreur - 10^9
for(i in 1:100){
  temp.mod - nnet(x=x,y=z,size=10,rang=1,maxit=1000)
  if(temp.mod$valueerreur){
res.mod - temp.mod
erreur - res.mod$value
  }
}

cat(\nFinal error : ,res.mod$value,\n)


Normaly it is easy task for an MLP with 10 hidden units to reduce the 
final error to almost 0 (althougt there is nothing to predict).


But the smallest error that I get is :  0.753895 (very poor result)

Maybe this problem is already known??

Maybe the fault is mine but I don't see where.

Joseph Rynkiewicz

--
Ce message a ete verifie par MailScanner
pour des virus ou des polluriels et rien de
suspect n'a ete trouve.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Multiple imputation for nominal data

2010-11-03 Thread Frank Harrell


The aregImpute function in the Hmisc package can do this through predictive
mean matching and canonical variates (Fisher's optimum scoring algorithm).

Frank


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Multiple-imputation-for-nominal-data-tp3024276p3025181.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Recoding -- test whether number begins with a certain number

On Wed, Nov 3, 2010 at 10:01 AM, Marcel Gerds marcel.ge...@gmx.de wrote:
  Dear R community,

 I have a question concerning recoding of a variable. I have a data set
 in which there is a variable devoted to the ISCO code describing the
 occupation of this certain individual
 (http://www.ilo.org/public/english/bureau/stat/isco/isco88/major.htm).
 Every type of occupation begins with a number and every number added to
 this number describes th occupation more detailed.
 Now my problem: I want to recode this variable in a way that every value
 beginning with a certain number is labeled as the respective category.
 For example, that all values of this variable beginning with a 6 is
 labeled as agri.
 My problem is that I cannot find a test which I can use for that purpose.

 I would really appreciate any help on that subject. Thank you.

 If it's a numeric variable, convert to character with 'as.character'.

 Then check the first character with substr(x,1,1). Then create a
factor and set the levels...

  z=as.integer(runif(10,0,100))
  z
 [1] 26 92 47 99  2 98 15 21 58 82
  zc=factor(substr(as.character(z),1,1))
  zc
 [1] 2 9 4 9 2 9 1 2 5 8
Levels: 1 2 4 5 8 9
  levels(zc)=c(Foo,Bar,Baz,Qux,Quux,Quuux)
  zc
 [1] Bar   Quuux Baz   Quuux Bar   Quuux Foo   Bar   Qux   Quux
Levels: Foo Bar Baz Qux Quux Quuux
  data.frame(z=z,zc=zc)
zzc
1  26   Bar
2  92 Quuux
3  47   Baz
4  99 Quuux
5   2   Bar
6  98 Quuux
7  15   Foo
8  21   Bar
9  58   Qux
10 82  Quux

 Now all the 9-somethings are Quuux, the 2's are Bar etc etc.

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Colour filling in panel.bwplot from lattice

2010-11-03 Thread Deepayan Sarkar

On Wed, Nov 3, 2010 at 4:25 PM, Rainer Hurling rhur...@gwdg.de wrote:
 Am 03.11.2010 10:23 (UTC+1) schrieb Deepayan Sarkar:

 On Wed, Nov 3, 2010 at 4:11 AM, Dennis Murphydjmu...@gmail.com  wrote:

 Hi:

 I don't know why, but it seems that in

 bwplot(voice.part ~ height, data = singer,
 main = NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red'
 'pink' 'violet' 'brown' 'gold',
 fill=c(yellow,blue,green,red,pink,violet,brown,gold))

 the assignment of colors is offset by 3:

 Levels: Bass 2 Bass 1 Tenor 2 Tenor 1 Alto 2 Alto 1 Soprano 2 Soprano 1
 fillcol- c(yellow,blue,green,red,pink,violet,brown,gold)

 In the above plot,

 yellow -  Bass 2  (1)
 blue -  Tenor 1     (4)
 green -  Soprano 2  (7)
 red -  Bass 1 (10 mod 8 = 2)
 pink -  Alto 2 (13 mod 8 = 5)
 etc.

 It's certainly curious.

 Curious indeed. It turns out that because of the way this was
 implemented, every 11th color was used, so you end up with the order

 sel.cols-
 c(yellow,blue,green,red,pink,violet,brown,gold)
 rep(sel.cols, 100) [ seq(1, by = 11, length.out = 8) ]

 [1] yellow red    brown  blue   pink   gold   green
  violet

 It's easy to fix this so that we get the expected order, and I will do
 so for the next release.

 Thank you for this proposal. We are looking forward for the next release :-)

 We frequently have to colour selected boxes to be able to compare special
 cases over different panels.

 Having said that, it should be noted that any vectorization behaviour
 in lattice panel functions is a consequence of implementation and not
 guaranteed by design (although certainly useful in many situations).
 In particular, it is risky to depend on vectorization in multipanel
 plots, because the vectorization starts afresh in each panel for
 whatever data subset happens to be in that panel, and there may be no
 relation between the colors and the original data.

 Thank you for the warning.

 One alternative is to use panel.superpose with panel.groups=panel.bwplot:

 bwplot(voice.part ~ height, data = singer, groups = voice.part, panel
 = panel.superpose, panel.groups = panel.bwplot, fill = sel.cols)

 This indeed works nice 'as a workaround'.

Actually, I would reiterate that this is the right solution and the
it's other fix that qualifies as a quick workaround (especially if you
are considering comparing things across multiple panels).

-Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Drawing circles on a chart

2010-11-03 Thread Santosh Srinivas

Thanks Barry ... actually the intention was to have areas of the circle
depicting the value (radius imputed)

-Original Message-
From: b.rowling...@googlemail.com [mailto:b.rowling...@googlemail.com] On
Behalf Of Barry Rowlingson
Sent: 03 November 2010 15:02
To: Santosh Srinivas
Cc: r-help@r-project.org
Subject: Re: [R] Drawing circles on a chart

On Wed, Nov 3, 2010 at 2:07 AM, Santosh Srinivas
santosh.srini...@gmail.com wrote:
 Dear Group,

 Inside each cell there should be a circle (sphere preferable) with radius
of
 mod(data value). The color should be either red or green depending on -ve
or
 +ve and the intensity should be based on the value of the datapoint.

 Any help on how to go about this?

 If you really want a sphere then you should look at the rgl package,
which enables the drawing of 3d graphic objects with illumination.
However it does it in its own graphics window and you'll not be able
to use any of the standard R graphics functions. Otherwise you'll have
to find some way of putting a 3d sphere on  a 2d R graphics window, or
faking it with a shaded circle and some highlights. Yuck.

 Also, drawing circles (strictly, a disc) with radius proportional to
data value is usually a bad idea since we interpret areas. A circle
with twice the radius has four times the area, and so looks four times
as big. But the data is only twice as big...

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Colour filling in panel.bwplot from lattice

2010-11-03 Thread Rainer Hurling


Am 03.11.2010 12:52 (UTC+1) schrieb Deepayan Sarkar:

On Wed, Nov 3, 2010 at 4:25 PM, Rainer Hurlingrhur...@gwdg.de  wrote:

Am 03.11.2010 10:23 (UTC+1) schrieb Deepayan Sarkar:


On Wed, Nov 3, 2010 at 4:11 AM, Dennis Murphydjmu...@gmail.comwrote:


Hi:

I don't know why, but it seems that in

bwplot(voice.part ~ height, data = singer,
main = NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red'
'pink' 'violet' 'brown' 'gold',
fill=c(yellow,blue,green,red,pink,violet,brown,gold))

the assignment of colors is offset by 3:

Levels: Bass 2 Bass 1 Tenor 2 Tenor 1 Alto 2 Alto 1 Soprano 2 Soprano 1
fillcol- c(yellow,blue,green,red,pink,violet,brown,gold)

In the above plot,

yellow -Bass 2  (1)
blue -Tenor 1 (4)
green -Soprano 2  (7)
red -Bass 1 (10 mod 8 = 2)
pink -Alto 2 (13 mod 8 = 5)
etc.

It's certainly curious.


Curious indeed. It turns out that because of the way this was
implemented, every 11th color was used, so you end up with the order


sel.cols-
c(yellow,blue,green,red,pink,violet,brown,gold)
rep(sel.cols, 100) [ seq(1, by = 11, length.out = 8) ]


[1] yellow redbrown  blue   pink   gold   green
  violet

It's easy to fix this so that we get the expected order, and I will do
so for the next release.


Thank you for this proposal. We are looking forward for the next release :-)

We frequently have to colour selected boxes to be able to compare special
cases over different panels.


Having said that, it should be noted that any vectorization behaviour
in lattice panel functions is a consequence of implementation and not
guaranteed by design (although certainly useful in many situations).
In particular, it is risky to depend on vectorization in multipanel
plots, because the vectorization starts afresh in each panel for
whatever data subset happens to be in that panel, and there may be no
relation between the colors and the original data.


Thank you for the warning.


One alternative is to use panel.superpose with panel.groups=panel.bwplot:

bwplot(voice.part ~ height, data = singer, groups = voice.part, panel
= panel.superpose, panel.groups = panel.bwplot, fill = sel.cols)


This indeed works nice 'as a workaround'.


Actually, I would reiterate that this is the right solution and the
it's other fix that qualifies as a quick workaround (especially if you
are considering comparing things across multiple panels).


Yes, this comparing across multiple panels was our intention.
Rainer


-Deepayan


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] install vegan

2010-11-03 Thread Uwe Ligges




On 03.11.2010 10:39, Carolin wrote:

Dear all,

I am trying to install Vegan, but I allways get the following error
message:

Warning in install.packages(choose.files(, filters =
Filters[c(zip,  :
   'lib = C:/Programme/R/R-2.12.0/library' is not writable
Error in install.packages(choose.files(, filters =
Filters[c(zip,  :
   unable to install packages

utils:::menuInstallLocal()




does anybody know what is wrong?


Yes: you do not have permission to write to 
C:/Programme/R/R-2.12.0/library where you are trying to install the 
package to.


Uwe Ligges


Thanks in advance,
Carolin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] memory allocation problem

2010-11-03 Thread Jonathan P Daily

The optim function is very resource hungry. I have had similar problems in 
the past when dealing with extremely large datasets.

What is perhaps happening is that each 'step' of the optimization 
algorithm stores some info so that it can compare to the next 'step', and 
while the original vector may only be a few Mb of data, over many 
iterations a huge amount memory is allocated to the optimization steps.

Maybe look at the control options under ?optim, particularly stuff like 
trace, fnscale, ndeps, etc. that may cut down on the amount of data being 
stored each step as well as the number of steps needed.

Good luck!
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly



From:
Lorenzo Cattarino l.cattar...@uq.edu.au
To:
David Winsemius dwinsem...@comcast.net, Peter Langfelder 
peter.langfel...@gmail.com
Cc:
r-help@r-project.org
Date:
11/03/2010 03:26 AM
Subject:
Re: [R] memory allocation problem
Sent by:
r-help-boun...@r-project.org



Thanks for all your suggestions,

This is what I get after removing all the other (not useful) objects and
run my code:

 getsizes()
[,1]
org_results 47240832
myfun  11672
getsizes4176
SS  3248
coeff168
NA  NA
NA  NA
NA  NA
NA  NA
NA  NA

 est_coeff - optim(coeff,SS, steps=org_results$no.steps,
Range=org_results$Range, H1=org_results$H1, H2=org_results$H2,
p=org_results$p)
Error: cannot allocate vector of size 5.0 Mb
In addition: Warning messages:
1: In optim(coeff, SS, steps = org_results$no.steps, Range =
org_results$Range,  :
  Reached total allocation of 4055Mb: see help(memory.size)
2: In optim(coeff, SS, steps = org_results$no.steps, Range =
org_results$Range,  :
  Reached total allocation of 4055Mb: see help(memory.size)
3: In optim(coeff, SS, steps = org_results$no.steps, Range =
org_results$Range,  :
  Reached total allocation of 4055Mb: see help(memory.size)
4: In optim(coeff, SS, steps = org_results$no.steps, Range =
org_results$Range,  :
  Reached total allocation of 4055Mb: see help(memory.size)


It seems that R is using all the default availabe memory (4 GB, which is
the RAM of my processor).

 memory.limit()
[1] 4055
 memory.size()
[1] 4049.07


My dataframe has a size of 47240832 bytes, or about 45 Mb. So it should
not be a problem in terms of memory usage?

I do not understand what is going on.

Thanks for your help anyway

Lorenzo

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Wednesday, 3 November 2010 12:48 PM
To: Lorenzo Cattarino
Cc: r-help@r-project.org
Subject: Re: [R] memory allocation problem

Restart your computer. (Yeah, I know that what the help-desk always 
says.)
Start R before doing anything else.

Then run your code in a clean session. Check ls() oafter starte up to 
make sure you don't have a bunch f useless stuff in your .Rdata 
file.   Don't load anything that is not germane to this problem.  Use 
this function to see what sort of space issues you might have after 
loading objects:

  getsizes - function() {z - sapply(ls(envir=globalenv()),
 function(x) object.size(get(x)))
(tmp - as.matrix(rev(sort(z))[1:10]))}

Then run your code.

-- 
David.

On Nov 2, 2010, at 10:13 PM, Lorenzo Cattarino wrote:

 I would also like to include details on my R version



 version  _

 platform   x86_64-pc-mingw32
 arch   x86_64

 os mingw32
 system x86_64, mingw32
 status
 major  2
 minor  11.1
 year   2010
 month  05
 day31
 svn rev52157
 language   R
 version.string R version 2.11.1 (2010-05-31)

 from FAQ 2.9

(http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-b
 e-a-limit-on-the-memory-it-uses_0021

http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-b
 e-a-limit-on-the-memory-it-uses_0021 ) it says that:
 For a 64-bit build, the default is the amount of RAM

 So in my case the amount of RAM would be 4 GB. R should be able to
 allocate a vector of size 5 Mb without me typing any command (either 
 as
 memory.limit() or appended string in the target path), is that right?



 From: Lorenzo Cattarino
 Sent: Wednesday, 3 November 2010 10:55 AM
 To: 'r-help@r-project.org'
 Subject: memory allocation problem



 I forgot to mention that I am using windows 7 (64-bit) and the R 
 version
 2.11.1 (64-bit)



 From: Lorenzo Cattarino

 I am trying to run a non linear parameter optimization using the
 function optim() and I have problems regarding memory allocation.

 My data are in a dataframe with 9 columns. There are 656100 rows.

 head(org_results)

 comb.id   p

[R] optim works on command-line but not inside a function

2010-11-03 Thread Damokun


Dear all, 

I am trying to optimize a logistic function using optim, inside the
following functions: 
#Estimating a and b from thetas and outcomes by ML

IRT.estimate.abFromThetaX - function(t, X, inits, lw=c(-Inf,-Inf),
up=rep(Inf,2)){

  optRes - optim(inits, method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan, 

  gr=IRT.gradZL, 

  lower=lw, upper=up, t=t, X=X)

  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )

}

#Estimating a and b from thetas and outcomes by ML, avoiding 0*log(0)
IRT.estimate.abFromThetaX2 - function(tar, Xes, inits, lw=c(-Inf,-Inf),
up=rep(Inf,2)){

  optRes - optim(inits, method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan, 

  gr=IRT.gradZL, 

  lower=lw, upper=up, t=tar, X=Xes)

  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )

}

The problem is that this does not work: 
 IRT.estimate.abFromThetaX(sx, st, c(0,0))
Error in optim(inits, method = L-BFGS-B, fn = IRT.llZetaLambdaCorrNan,  : 
  L-BFGS-B needs finite values of 'fn'
But If I try the same optim call on the command line, with the same data, it
works fine:
 optRes - optim(c(0,0), method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan, 
+   gr=IRT.gradZL, 
+   lower=c(-Inf, -Inf), upper=c(Inf, Inf), t=st, X=sx)
 optRes
$par
[1] -0.6975157  0.7944972
$convergence
[1] 0
$message
[1] CONVERGENCE: REL_REDUCTION_OF_F = FACTR*EPSMCH

Does anyone have an idea what this could be, and what I could try to avoid
this error? I tried bounding the parameters, with lower=c(-10, -10) and
upper=... but that made no difference. 

Thanks, 
Diederik Roijers
Utrecht University MSc student. 
--
PS: the other functions I am using are: 

#IRT.p is the function that represents the probability 
#of a positive outcome of an item with difficulty b, 
#discriminativity a, in combination with a student with
#competence theta. 

IRT.p - function(theta, a, b){

   epow - exp(-a*(theta-b))

   result - 1/(1+epow)

   result

}


# = IRT.p^-1 ; for usage in the loglikelihood

IRT.oneOverP - function(theta, a, b){

   epow - exp(-a*(theta-b))

   result - (1+epow)

   result

}

# = (1-IRT.p)^-1 ; for usage in the loglikelihood
IRT.oneOverPneg - function(theta, a, b){

   epow - exp(a*(theta-b))

   result - (1+epow)

   result

}


#simulation-based sample generation of thetas and outcomes
#based on a given a and b. (See IRT.p) The sample-size is n

IRT.generateSample - function(a, b, n){

   x-rnorm(n, mean=b, sd=b/2)

   t-IRT.p(x,a,b)

   ch-runif(length(t))

   t[t=ch]=1

   t[tch]=0

   cbind(x,t)

}


#This loglikelihood function is based on the a and be parameters, 
#and requires thetas as input in X, and outcomes in t
#prone to give NaN errors due to 0*log(0)

IRT.logLikelihood2 - function(params, t, X){

   pos- sum(t * log(IRT.p(X,params[1],params[2])))

   neg- sum(  (1-t) * log( (1-IRT.p(X,params[1],params[2])) )  )

   -pos-neg

}

#Avoiding NaN problems due to 0*log(0) 
#otherwise equivalent to IRT.logLikelihood2
IRT.logLikelihood2CorrNan - function(params, t, X){

   pos- sum(t * log(IRT.oneOverP(X,params[1],params[2])))

   neg- sum((1-t) * log(IRT.oneOverPneg(X,params[1],params[2])))

   -pos-neg

}

#IRT.p can also be espressed in terms of z and l 
#where z=-ab and l=a - makes it a standard logit function

IRT.pZL - function(theta, z, l){

   epow - exp(-(z+l*theta))

   result - 1/(1+epow)

   result

}

#as IRT.oneOverP but now for IRT.pZL 
IRT.pZLepos - function(theta, z, l){

   epow - exp(-(z+l*theta))

   result - (1+epow)

   result

}


#as IRT.oneOverPneg but now for IRT.pZL 
IRT.pZLeneg - function(theta, z, l){

   epow - exp(z+l*theta)

   result - (1+epow)

   result

}



#The loglikelihood of IRT, but now expressed in terms of z and l

IRT.llZetaLambda - function(params, t, X){

   pos- sum(t * log(IRT.pZL( X,params[1],params[2]) ))

   neg- sum(  (1-t) * log( (1-IRT.pZL(X,params[1],params[2] )) )  )

   -pos-neg

}

#Same as IRT.logLikelihood2CorrNan but for IRT.llZetaLambda
IRT.llZetaLambdaCorrNan - function(params, t, X){

   pos - sum(t * log(IRT.pZLepos( X,params[1],params[2]) ))

   neg - sum((1-t) * log(IRT.pZLeneg(X,params[1],params[2]) ))

   pos+neg

}


#Gradient of IRT.llZetaLambda

IRT.gradZL - function(params, t, X){

  res-numeric(length(params))

  res[1] - sum(t-IRT.pZL( X,params[1],params[2] ))

  res[2] - sum(X*(t-IRT.pZL( X,params[1],params[2] )))

  -res

}

#And to create the sample: 
s - IRT.generateSample(0.8, 1, 50)
sx - s[,1]
st - s[,2]
IRT.estimate.abFromThetaX(sx, st, c(0,0))


-- 
View this message in context: 
http://r.789695.n4.nabble.com/optim-works-on-command-line-but-not-inside-a-function-tp3025414p3025414.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] optim works on command-line but not inside a function

2010-11-03 Thread Jonathan P Daily

As the error message says, the values of your function must be finite in 
order to run the algorithm.

Some part of your loop is passing arguments (inits maybe... you only tried 
(0,0) in the CLI example) that cause IRT.llZetaLambdaCorrNan to be 
infinite.
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly



From:
Damokun dmroi...@students.cs.uu.nl
To:
r-help@r-project.org
Date:
11/03/2010 10:19 AM
Subject:
[R] optim works on command-line but not inside a function
Sent by:
r-help-boun...@r-project.org




Dear all, 

I am trying to optimize a logistic function using optim, inside the
following functions: 
#Estimating a and b from thetas and outcomes by ML

IRT.estimate.abFromThetaX - function(t, X, inits, lw=c(-Inf,-Inf),
up=rep(Inf,2)){

  optRes - optim(inits, method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan, 

  gr=IRT.gradZL, 

  lower=lw, upper=up, t=t, X=X)

  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )

}

#Estimating a and b from thetas and outcomes by ML, avoiding 0*log(0)
IRT.estimate.abFromThetaX2 - function(tar, Xes, inits, lw=c(-Inf,-Inf),
up=rep(Inf,2)){

  optRes - optim(inits, method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan, 

  gr=IRT.gradZL, 

  lower=lw, upper=up, t=tar, X=Xes)

  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )

}

The problem is that this does not work: 
 IRT.estimate.abFromThetaX(sx, st, c(0,0))
Error in optim(inits, method = L-BFGS-B, fn = IRT.llZetaLambdaCorrNan, : 

  L-BFGS-B needs finite values of 'fn'
But If I try the same optim call on the command line, with the same data, 
it
works fine:
 optRes - optim(c(0,0), method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan, 
+   gr=IRT.gradZL, 
+   lower=c(-Inf, -Inf), upper=c(Inf, Inf), t=st, X=sx)
 optRes
$par
[1] -0.6975157  0.7944972
$convergence
[1] 0
$message
[1] CONVERGENCE: REL_REDUCTION_OF_F = FACTR*EPSMCH

Does anyone have an idea what this could be, and what I could try to avoid
this error? I tried bounding the parameters, with lower=c(-10, -10) and
upper=... but that made no difference. 

Thanks, 
Diederik Roijers
Utrecht University MSc student. 
--
PS: the other functions I am using are: 

#IRT.p is the function that represents the probability 
#of a positive outcome of an item with difficulty b, 
#discriminativity a, in combination with a student with
#competence theta. 

IRT.p - function(theta, a, b){

   epow - exp(-a*(theta-b))

   result - 1/(1+epow)

   result

}


# = IRT.p^-1 ; for usage in the loglikelihood

IRT.oneOverP - function(theta, a, b){

   epow - exp(-a*(theta-b))

   result - (1+epow)

   result

}

# = (1-IRT.p)^-1 ; for usage in the loglikelihood
IRT.oneOverPneg - function(theta, a, b){

   epow - exp(a*(theta-b))

   result - (1+epow)

   result

}


#simulation-based sample generation of thetas and outcomes
#based on a given a and b. (See IRT.p) The sample-size is n

IRT.generateSample - function(a, b, n){

   x-rnorm(n, mean=b, sd=b/2)

   t-IRT.p(x,a,b)

   ch-runif(length(t))

   t[t=ch]=1

   t[tch]=0

   cbind(x,t)

}


#This loglikelihood function is based on the a and be parameters, 
#and requires thetas as input in X, and outcomes in t
#prone to give NaN errors due to 0*log(0)

IRT.logLikelihood2 - function(params, t, X){

   pos- sum(t * log(IRT.p(X,params[1],params[2])))

   neg- sum(  (1-t) * log( (1-IRT.p(X,params[1],params[2])) )  )

   -pos-neg

}

#Avoiding NaN problems due to 0*log(0) 
#otherwise equivalent to IRT.logLikelihood2
IRT.logLikelihood2CorrNan - function(params, t, X){

   pos- sum(t * log(IRT.oneOverP(X,params[1],params[2])))

   neg- sum((1-t) * log(IRT.oneOverPneg(X,params[1],params[2])))

   -pos-neg

}

#IRT.p can also be espressed in terms of z and l 
#where z=-ab and l=a - makes it a standard logit function

IRT.pZL - function(theta, z, l){

   epow - exp(-(z+l*theta))

   result - 1/(1+epow)

   result

}

#as IRT.oneOverP but now for IRT.pZL 
IRT.pZLepos - function(theta, z, l){

   epow - exp(-(z+l*theta))

   result - (1+epow)

   result

}


#as IRT.oneOverPneg but now for IRT.pZL 
IRT.pZLeneg - function(theta, z, l){

   epow - exp(z+l*theta)

   result - (1+epow)

   result

}



#The loglikelihood of IRT, but now expressed in terms of z and l

IRT.llZetaLambda - function(params, t, X){

   pos- sum(t * log(IRT.pZL( X,params[1],params[2]) ))

   neg- sum(  (1-t) * log( (1-IRT.pZL(X,params[1],params[2] )) )  )

   -pos-neg

}

#Same as IRT.logLikelihood2CorrNan but for IRT.llZetaLambda
IRT.llZetaLambdaCorrNan - function(params, t, X){

   pos - sum(t * log(IRT.pZLepos( X,params[1],params[2]) ))

   neg - sum((1-t) * log(IRT.pZLeneg(X,params[1],params[2]) ))

   pos+neg

}


#Gradient of IRT.llZetaLambda

IRT.gradZL - function(params, t, X){

  res-numeric(length(params))

[R] Granger causality with panel data (econometrics question)

2010-11-03 Thread Harun Özkan


Hi folks,

I am trying to perform a Granger causality analysis with panel data. There 
are some packages around for panel data analysis and Granger causality. 
However, I have found neither a package for both panel data and Granger 
causality nor any R procedures (homogenous/heterogenous causality 
hypotheses, related tests such as Wald, unit root tests etc.).


Of course, someone must have encountered this problem before me. Can anyone 
suggest a solution to this case?


Thanks in advance.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] dll problem with C++ function

2010-11-03 Thread Carsten Dormann


Dear fellow R-users,

I have the problem of being unable to repeatedly use a C++-function 
within R unless by dyn.unloading/dyn.loading it after each .C call.


The C++-code (too large to attach) compiles without problem using R CMD 
SHLIB. It loads (using dyn.load(myfun.so)) and executes (via .C(myfun, 
...) ) properly. The function returns no object, only reads files from 
disk, performs calculations and later writes a file to disk.
When I now use the same line of code again to re-run the analysis (again 
via .C), I get an error message claiming a malformed input file. This 
seemingly malformed input file is absolutely correct.


When I now use dyn.unload(myfun.so) and then again 
dyn.load(myfun.so), I can use it as before.


I have absolutely no clue what is going on here. The C++-function 
returns a 1 if run correctly and 0 otherwise. The stand-alone version 
works fine. My feeling is that R cannot deallocate the memory or somehow 
doesn't grasp that the dll should be freed after running.


My impression is there is a very simple reason, but I couldn't find it 
(in the Writing R Extensions or in any of the R help lists, including 
R-sig-mac).


ANY hint greatly appreciated!

Cheers,

Carsten


For what it's worth, here my system details:

R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] de_DE.UTF-8/de_DE.UTF-8/C/C/de_DE.UTF-8/de_DE.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

The C++-functions starts like this:
#include R.h

#include stdio.h
#include iostream
#include fstream
#include string
#include stdlib.h
#include time.h

#include myhelperfunctions.h

using namespace std;

extern C {

... various long C++ functions without any change for inclusion into R 
(apart from renaming main to myfun)


}





--
Dr. Carsten F. Dormann
Department of Computational Landscape Ecology
Helmholtz Centre for Environmental Research-UFZ 
(Department Landschaftsökologie)
(Helmholtz Zentrum für Umweltforschung - UFZ)
Permoserstr. 15 
04318 Leipzig   
Germany

Tel: ++49(0)341 2351946
Fax: ++49(0)341 2351939
Email: carsten.dorm...@ufz.de
internet: http://www.ufz.de/index.php?de=4205

Registered Office/Sitz der Gesellschaft: Leipzig
Commercial Register Number/Registergericht: Amtsgericht Leipzig, 
Handelsregister Nr. B 4703
Chairman of the Supervisory Board/Vorsitzender des Aufsichtsrats: MinR Wilfried 
Kraus
Scientific Managing Director/Wissenschaftlicher Geschäftsführer: Prof. Dr. 
Georg Teutsch
Administrative Managing Director/Administrativer Geschäftsführer: Dr. Andreas 
Schmidt

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] longer object length is not a multiple of shorter object length

2010-11-03 Thread Stephen Liu

Hi folks,

I'm following An Introduction to R
http://cran.r-project.org/doc/manuals/R-intro.html#R-and-statistics

to learn R.

Coming to;
2.2 Vector arithmetic

 v - 2*x + y + 1
Warning message:
In 2 * x + y :
  longer object length is not a multiple of shorter object length

What does it mean?  How to rectify it?  Please help.  TIA

B.R.
Stephen L



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] longer object length is not a multiple of shorter object length



On Nov 3, 2010, at 11:00 AM, Stephen Liu wrote:


Hi folks,

I'm following An Introduction to R
http://cran.r-project.org/doc/manuals/R-intro.html#R-and-statistics

to learn R.

Coming to;
2.2 Vector arithmetic


v - 2*x + y + 1

Warning message:
In 2 * x + y :
 longer object length is not a multiple of shorter object length

What does it mean?  How to rectify it?  Please help.  TIA


What does this return:

c(length(x), length(y))  # ?



David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Lattice plots for images

2010-11-03 Thread Neba Funwi-Gabga

Hello UseRs,
I need help on how to plot several raster images (such as those obtained
from a kernel-smoothed intensity function) in a layout
such as that obtained from the lattice package. I would like to obtain
something such as obtained from using the levelplot or xyplot
in lattice. I currently use:

par(mfrow=c(3,3)

to set the workspace, but the resulting plots leave a lot of blank space
between individual plots. If I can get it to the lattice format,
I think it will save me some white space.

Any help is greatly appreciated.

Neba.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] longer object length is not a multiple of shorter object length

2010-11-03 Thread Stephen Liu

- Original Message 

From: David Winsemius dwinsem...@comcast.net
To: Stephen Liu sati...@yahoo.com
Cc: r-help@r-project.org
Sent: Wed, November 3, 2010 11:03:18 PM
Subject: Re: [R] longer object length is not a multiple of shorter object length

- snip -

 v - 2*x + y + 1
 Warning message:
 In 2 * x + y :
  longer object length is not a multiple of shorter object length

 What does it mean?  How to rectify it?  Please help.  TIA

 What does this return:

 c(length(x), length(y))  # ?

c(length(x), length(y))
[1]  5 11

B.R.
Stephen L

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] longer object length is not a multiple of shorter object length

On Nov 3, 2010, at 11:17 AM, Stephen Liu wrote:

- Original Message 

From: David Winsemius dwinsem...@comcast.net
To: Stephen Liu sati...@yahoo.com
Cc: r-help@r-project.org
Sent: Wed, November 3, 2010 11:03:18 PM
Subject: Re: [R] longer object length is not a multiple of shorter  
object length

- snip -

v - 2*x + y + 1
Warning message:
In 2 * x + y :
longer object length is not a multiple of shorter object length

What does it mean?  How to rectify it?

You were not supposed to rectify it. That example was designed to show  
you what happens in R when two vectors (actually three) are offered to  
the Arithmetic operators. Read the material that is above and below  
that expression again.

Please help.  TIA

What does this return:

c(length(x), length(y))  # ?

c(length(x), length(y))
[1]  5 11

B.R.
Stephen L

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [klaR package] [NaiveBayes] warning message numerical 0 probability

2010-11-03 Thread Fabon Dzogang

Hi,

I run R 2.10.1 under ubuntu 10.04 LTS (Lucid Lynx) and klaR version 0.6-4.

I compute a model over a 2 classes dataset (composed of 700 examples).
To that aim, I use the function NaiveBayes provided in the package
klaR.
When I then use the prediction function : predict(my_model, new_data).
I get the following warning :

In FUN(1:747[[747L]], ...) : Numerical 0 probability with observation 458

As I did not find any documentation or any discussion concerning this
warning message, I looked in the klaR source code and found the
following line in predict.NaiveBayes.R :

warning(Numerical 0 probability with observation , i)

Unfortunately, it is hard to get a clear picture of the whole process
reading the code. I wonder if someone could help me with the meaning
of this warning message.

Sorry I did not provide an example, but I could not simulate the same
message over a small toy example.

Thank you,

Fabon Dzogang.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] dll problem with C++ function

2010-11-03 Thread Peter Langfelder

Just a shot in the dark... Do you properly close the input/output
files at the end of your function? If not and the file remains open,
it may throw an error upon new attempt to read it. It is possible that
dyn.unload, among other things, closes all open connections and hence
upon re-load everything works fine.

Peter

On Wed, Nov 3, 2010 at 6:47 AM, Carsten Dormann carsten.dorm...@ufz.de wrote:
 Dear fellow R-users,

 I have the problem of being unable to repeatedly use a C++-function within R
 unless by dyn.unloading/dyn.loading it after each .C call.

 The C++-code (too large to attach) compiles without problem using R CMD
 SHLIB. It loads (using dyn.load(myfun.so)) and executes (via .C(myfun,
 ...) ) properly. The function returns no object, only reads files from disk,
 performs calculations and later writes a file to disk.
 When I now use the same line of code again to re-run the analysis (again via
 .C), I get an error message claiming a malformed input file. This seemingly
 malformed input file is absolutely correct.

 When I now use dyn.unload(myfun.so) and then again dyn.load(myfun.so), I
 can use it as before.

 I have absolutely no clue what is going on here. The C++-function returns a
 1 if run correctly and 0 otherwise. The stand-alone version works fine. My
 feeling is that R cannot deallocate the memory or somehow doesn't grasp
 that the dll should be freed after running.

 My impression is there is a very simple reason, but I couldn't find it (in
 the Writing R Extensions or in any of the R help lists, including
 R-sig-mac).

 ANY hint greatly appreciated!

 Cheers,

 Carsten


 For what it's worth, here my system details:

 R version 2.12.0 (2010-10-15)
 Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

 locale:
 [1] de_DE.UTF-8/de_DE.UTF-8/C/C/de_DE.UTF-8/de_DE.UTF-8

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 The C++-functions starts like this:
 #include R.h

 #include stdio.h
 #include iostream
 #include fstream
 #include string
 #include stdlib.h
 #include time.h

 #include myhelperfunctions.h

 using namespace std;

 extern C {

 ... various long C++ functions without any change for inclusion into R
 (apart from renaming main to myfun)

 }





 --
 Dr. Carsten F. Dormann
 Department of Computational Landscape Ecology
 Helmholtz Centre for Environmental Research-UFZ
 (Department Landschaftsökologie)
 (Helmholtz Zentrum für Umweltforschung - UFZ)
 Permoserstr. 15
 04318 Leipzig
 Germany

 Tel: ++49(0)341 2351946
 Fax: ++49(0)341 2351939
 Email: carsten.dorm...@ufz.de
 internet: http://www.ufz.de/index.php?de=4205

 Registered Office/Sitz der Gesellschaft: Leipzig
 Commercial Register Number/Registergericht: Amtsgericht Leipzig,
 Handelsregister Nr. B 4703
 Chairman of the Supervisory Board/Vorsitzender des Aufsichtsrats: MinR
 Wilfried Kraus
 Scientific Managing Director/Wissenschaftlicher Geschäftsführer: Prof. Dr.
 Georg Teutsch
 Administrative Managing Director/Administrativer Geschäftsführer: Dr.
 Andreas Schmidt

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lattice plots for images

2010-11-03 Thread Matt Shotwell

Have you tried using the 'mai' argument to par()? Something like:
par(mfrow=c(3,3), mai=c(0,0,0,0))

I've used this in conjunction with image() to plot raster data in a
tight grid. http://biostatmatt.com/archives/727

-Matt

On Wed, 2010-11-03 at 11:13 -0400, Neba Funwi-Gabga wrote:
 Hello UseRs,
 I need help on how to plot several raster images (such as those obtained
 from a kernel-smoothed intensity function) in a layout
 such as that obtained from the lattice package. I would like to obtain
 something such as obtained from using the levelplot or xyplot
 in lattice. I currently use:
 
 par(mfrow=c(3,3)
 
 to set the workspace, but the resulting plots leave a lot of blank space
 between individual plots. If I can get it to the lattice format,
 I think it will save me some white space.
 
 Any help is greatly appreciated.
 
 Neba.
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lattice plots for images



On Nov 3, 2010, at 11:13 AM, Neba Funwi-Gabga wrote:


Hello UseRs,
I need help on how to plot several raster images (such as those  
obtained

from a kernel-smoothed intensity function) in a layout
such as that obtained from the lattice package. I would like to obtain
something such as obtained from using the levelplot or xyplot
in lattice. I currently use:


par(mfrow=c(3,3)


to set the workspace, but the resulting plots leave a lot of blank  
space

between individual plots. If I can get it to the lattice format,
I think it will save me some white space.


(It's not clear what plotting paradigm you are using since you do not  
name a particular function or package, but this assumes you will be  
using lattice. If you are using base graphics, then the answer is  
undoubtedly ?par

)

In the archives are examples you might use to look up in the  
documentation and then to modify to fit you specifications:


http://finzi.psych.upenn.edu/R/Rhelp02/archive/58102.html

trellis.par.set(list(layout.widths = list(left.padding = -1)))
trellis.par.set(list(layout.widths = list(right.padding = -1,
 ylab.axis.padding = -0.5)))

http://finzi.psych.upenn.edu/R/Rhelp02/archive/62912.html

theme.novpadding - list(layout.heights =
list(top.padding = 0,
 main.key.padding = 0,
 key.axis.padding = 0,
 axis.xlab.padding = 0,
 xlab.key.padding = 0,
 key.sub.padding = 0,
 bottom.padding = 0))

Both citations hound with search space between lattice plots

--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Line numbers in Sweave

2010-11-03 Thread Yihui Xie

Well, I know this is true for the default Sweave(), and my problem
actually comes from the pgfSweave package: I used some tricks to
cheat R to parse and deparse the code chunks so that the output can
be automatically formatted (and preserve the comments, [1]). The price
to pay for not being honest is these line numbers since R 2.12.0
(even if there are no errors in my code). As I am unable to figure out
the logic behind Sweave() to generate line numbers, I cannot modify my
code either. So a temporary dirty solution is to turn off the
reporting of line numbers.

I wish the keep.source=FALSE option can preserve comments so that we
don't need to touch utils:::RweaveLatexRuncode ([2]). Formatting the
code chunks is really a nice feature with keep.source=FALSE, but the
price of discarding comments is too high... I think the evaluate
package might be a good place to look at (or perhaps the highlight
package), which performs just like the R terminal (keep the comments
and report the errors without really stopping R, [3]).

Thanks!

[1] 
https://github.com/cameronbracken/pgfSweave/blob/master/R/pgfSweaveDriver.R#L297
[2] http://yihui.name/en/wp-content/uploads/2009/11/Sweave2.pdf
[3] see the error in the last but one example:
http://had.co.nz/ggplot2/stat_smooth.html

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



On Tue, Nov 2, 2010 at 5:09 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
 On 02/11/2010 5:50 PM, Yihui Xie wrote:

 Hi,

 I thumbed through the source code Sweave.R but was unable to figure
 out when (under what conditions) R will insert the line numbers to the
 output. The R 2.12.0 news said:

     • Parsing errors detected during Sweave() processing will now be
       reported referencing their original location in the source file.

 Do we have any options to turn off this reporting? Thanks!

 Sure:  just don't include any syntax errors.

 Duncan Murdoch


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] non-numeric argument to binary operator error while reading ncdf file

2010-11-03 Thread David Pierce

Charles Novaes de Santana wrote:
 Thank you everybody for the help! The solution of my problem is here:

 http://climateaudit.org/2009/10/10/unthreaded-23/

 The mv variable is the designated NA for the variable and it appears that
 somebody screwed that up in the file. This workaround worked for me:

[...]

Charles,

not sure if you got my previous email, but if you send me a copy of the
file that triggers the problem, I can fix it for everyone instead of
requiring that kind of work around.

Regards,

--Dave

---
David W. Pierce
Division of Climate, Atmospheric Science, and Physical Oceanography
Scripps Institution of Oceanography
(858) 534-8276 (voice)  /  (858) 534-8561 (fax)dpie...@ucsd.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Tukey's table

2010-11-03 Thread Dennis Murphy

Hi:

Try this:

Trat - c(2:30) # number of treatments
gl - c(2:30, 40, 60, 120)

# Write a one-line 2D function to get the Tukey distribution quantile:
f - function(x,y) qtukey(0.95, x, y)

outer(Trat, gl, f)

It's slow (takes a few seconds) but it seems to work.

HTH,
Dennis



On Wed, Nov 3, 2010 at 3:52 AM, Silvano silv...@uel.br wrote:

 Hi,

 I'm building Tukey's table using qtukey function.

 It happens that I can't get the values of Tukey's one degree of freedom and
 also wanted to eliminate the first column.


Firstly, one needs at least two treatments to find a studentized range
(which is why you get NaNs across the first row when you set Trat = 1).
Secondly, if you have at least two groups, you need at least two
observations per group to get a variance estimate, which means that the
variance estimate of the difference needs to have at least 2 df. If one
group has only observation in it, the variance of the difference is the
variance of the group with = 2 observations, which doesn't make intuitive
sense. This is why you get NaNs along the first column.

HTH,
Dennis


 The program is:

 Trat - c(1:30) # number of treatments
 gl - c(1:30, 40, 60, 120) # degree freedom

 tukval - matrix(0, nr=length(gl), nc=length(Trat))

 for(i in 1:length(gl))
  for(j in 1:length(Trat))
   tukval[i,j] - qtukey(.95, Trat[j], gl[i])

 rownames(tukval) - gl
 colnames(tukval) - paste(Trat, , sep=)
 tukval

 require(xtable)
 xtable(tukval)


 Some suggest?

 --
 Silvano Cesar da Costa
 Departamento de Estatística
 Universidade Estadual de Londrina
 Fone: 3371-4346

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Line numbers in Sweave

2010-11-03 Thread Nicola Sturaro Sommacal (Quantide srl)


On 03/11/2010 11:52 AM, Yihui Xie wrote:

Well, I know this is true for the default Sweave(), and my problem
actually comes from the pgfSweave package: I used some tricks to
cheat R to parse and deparse the code chunks so that the output can
be automatically formatted (and preserve the comments, [1]). The price
to pay for not being honest is these line numbers since R 2.12.0
(even if there are no errors in my code). As I am unable to figure out
the logic behind Sweave() to generate line numbers, I cannot modify my
code either. So a temporary dirty solution is to turn off the
reporting of line numbers.


Could you be more specific?  I don't understand how the line numbers 
would affect anything if you don't have syntax errors.



I wish the keep.source=FALSE option can preserve comments so that we
don't need to touch utils:::RweaveLatexRuncode ([2]).


This is basically impossible:  with keep.source=FALSE, you are just 
seeing deparsed code.  The comments don't make it into parsed code, so 
the deparser never sees them.


I don't know the evaluate package, but I think I remember that the 
highlight package has its own parser, it doesn't use R's.  Perhaps R's 
parser could import some of the differences, but I think it's 
complicated enough as it is, and would rather not make it more so.




Formatting the
code chunks is really a nice feature with keep.source=FALSE, but the
price of discarding comments is too high... I think the evaluate
package might be a good place to look at (or perhaps the highlight
package), which performs just like the R terminal (keep the comments
and report the errors without really stopping R, [3]).


The error in [3] is a run-time error, not a syntax error.  It should be 
unaffected by the line numbers.


Duncan Murdoch


Thanks!

[1] 
https://github.com/cameronbracken/pgfSweave/blob/master/R/pgfSweaveDriver.R#L297
[2] http://yihui.name/en/wp-content/uploads/2009/11/Sweave2.pdf
[3] see the error in the last but one example:
http://had.co.nz/ggplot2/stat_smooth.html

Regards,
Yihui
--
Yihui Xiexieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



On Tue, Nov 2, 2010 at 5:09 PM, Duncan Murdochmurdoch.dun...@gmail.com  wrote:
  On 02/11/2010 5:50 PM, Yihui Xie wrote:

  Hi,

  I thumbed through the source code Sweave.R but was unable to figure
  out when (under what conditions) R will insert the line numbers to the
  output. The R 2.12.0 news said:

   • Parsing errors detected during Sweave() processing will now be
 reported referencing their original location in the source file.

  Do we have any options to turn off this reporting? Thanks!

  Sure:  just don't include any syntax errors.

  Duncan Murdoch



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to unquote string in R

2010-11-03 Thread lord12



s= Hey
a = Hello
table = rbind(s,a)
write.table(table,paste(blah,.PROPERTIES,sep = ),row.names =
FALSE,col.names = FALSE)

In my table, how do I output only the words and not the words with the
quotations?

-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-unquote-string-in-R-tp3025654p3025654.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to unquote string in R

2010-11-03 Thread Henrique Dallazuanna

Put the quote = FALSE argument in write.table

On Wed, Nov 3, 2010 at 2:13 PM, lord12 trexi...@yahoo.com wrote:



 s= Hey
 a = Hello
 table = rbind(s,a)
 write.table(table,paste(blah,.PROPERTIES,sep = ),row.names =
 FALSE,col.names = FALSE)

 In my table, how do I output only the words and not the words with the
 quotations?

 --
 View this message in context:
 http://r.789695.n4.nabble.com/How-to-unquote-string-in-R-tp3025654p3025654.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] density() function: differences with S-PLUS

Dear William,

thank you very much for your reply. I see it only after my reply to Joshua.

Unfortunately I cannot try until tomorrow, because I don't have S-PLUS on
this machine.

Thanks again.

Nicola



2010/11/3 William Dunlap wdun...@tibco.com

 Did you get my reply (1:31pm PST Tuesday)
 to your request?  It showed how you needed
 to use the from= and to= argument to density
 to get identical x components to the output
 and that the small differences in the y
 component were due to S+ truncating the
 gaussian kernel at +- 4 standard deviations
 from the center while R does not truncate
 the gaussian kernel (it output looks like it
 uses a Fourier transform to do the convolution).


 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com

  -Original Message-
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of Nicola
  Sturaro Sommacal (Quantide srl)
  Sent: Wednesday, November 03, 2010 3:34 AM
  To: Joshua Wiley
  Cc: r-help@r-project.org
  Subject: Re: [R] density() function: differences with S-PLUS
 
  Dear Joshua,
 
  first of all, thank you very much for reply. I hoped that
  someone who's
  familiar with both S+ and R can reply to me, because I spent
  some hours to
  looking for a solution.
 
  If someone else would try, this is the SPLUS code and output,
  while below
  there is the R code. I obtain the same x values, while y values are
  differents for both examples.
 
  Thank you very much.
 
  Nicola
 
 
  ### S-PLUS CODE AND OUTPUT ###
 
   density(1:1000, width = 4)
  $x:
   [1]-2.018.5102039.0204159.5306180.04082
  100.55102   121.06122
   [8]   141.57143   162.08163   182.59184   203.10204   223.61224
  244.12245   264.63265
  [15]   285.14286   305.65306   326.16327   346.67347   367.18367
  387.69388   408.20408
  [22]   428.71429   449.22449   469.73469   490.24490   510.75510
  531.26531   551.77551
  [29]   572.28571   592.79592   613.30612   633.81633   654.32653
  674.83673   695.34694
  [36]   715.85714   736.36735   756.87755   777.38776   797.89796
  818.40816   838.91837
  [43]   859.42857   879.93878   900.44898   920.95918   941.46939
  961.97959   982.48980
  [50]  1003.0
 
  $y:
   [1] 4.565970e-006 1.31e-003 9.999374e-004 1.31e-003
  9.999471e-004
  1.31e-003
   [7] 9.999560e-004 1.30e-003 9.999643e-004 1.29e-003
  9.999718e-004
  1.28e-003
  [13] 9.999788e-004 1.26e-003 9.999852e-004 1.24e-003
  9.10e-004
  1.22e-003
  [19] 9.63e-004 1.19e-003 1.01e-003 1.16e-003
  1.06e-003
  1.13e-003
  [25] 1.10e-003 1.10e-003 1.13e-003 1.06e-003
  1.16e-003
  1.01e-003
  [31] 1.19e-003 9.63e-004 1.22e-003 9.10e-004
  1.24e-003
  9.999852e-004
  [37] 1.26e-003 9.999788e-004 1.28e-003 9.999718e-004
  1.29e-003
  9.999643e-004
  [43] 1.30e-003 9.999560e-004 1.31e-003 9.999471e-004
  1.31e-003
  9.999374e-004
  [49] 1.31e-003 4.432131e-006
 
 
   exdata = iris[, 1, 1]
   density(exdata, width = 4)
  $x:
   [1] 1.30 1.453061 1.606122 1.759184 1.912245 2.065306
  2.218367 2.371429
  2.524490
  [10] 2.677551 2.830612 2.983673 3.136735 3.289796 3.442857
  3.595918 3.748980
  3.902041
  [19] 4.055102 4.208163 4.361224 4.514286 4.667347 4.820408
  4.973469 5.126531
  5.279592
  [28] 5.432653 5.585714 5.738776 5.891837 6.044898 6.197959
  6.351020 6.504082
  6.657143
  [37] 6.810204 6.963265 7.116327 7.269388 7.422449 7.575510
  7.728571 7.881633
  8.034694
  [46] 8.187755 8.340816 8.493878 8.646939 8.80
 
  $y:
   [1] 0.0007849649 0.0013097474 0.0021225491 0.0033616520 0.0052059615
  0.0078856717
   [7] 0.0116917555 0.0169685132 0.0241073754 0.0335286785 0.0456521053
  0.0608554862
  [13] 0.0794235072 0.1014901241 0.1269807991 0.1555625999 0.1866111931
  0.2192033788
  [19] 0.2521417640 0.2840144993 0.3132881074 0.3384260582 0.3580208688
  0.3709241384
  [25] 0.3763578665 0.3739920600 0.3639778683 0.3469316232 0.3238721233
  0.2961200278
  [31] 0.2651731505 0.2325739601 0.1997853985 0.1680884651 0.1385105802
  0.1117884914
  [37] 0.0883644110 0.0684099972 0.0518702141 0.0385181792 0.0280126487
  0.0199513951
  [43] 0.0139159044 0.0095050745 0.0063575653 0.0041639082 0.0026680819
  0.0016700727
  [49] 0.0010169912 0.0005962089
 
 
  ### R CODE ###
 
  # S-PLUS CODE: density(1:1000, width = 4) SAME x BUT DIFFERENT y
  density(1:1000, bw = 4, window = g,  n = 50, cut = 0.75)$x
  density(1:1000, bw = 4, window = g,  n = 50, cut = 0.75)$y
 
  # S-PLUS CODE: exdata = iris[, 1, 1]; density(exdata, width =
  4) SAME x
  BUT DIFFERENT y
  exdata = iris$Sepal.Length[iris$Species == setosa]
  density(exdata, bw = 4, n = 50, cut = 0.75)$x
  density(exdata, bw = 4, n = 50, cut = 0.75)$y
 
 
 
  2010/11/2 Joshua Wiley jwiley.ps...@gmail.com
 
   Dear Nicola,
  
   There are undoubtedly people here who are familiar with
  both S+ and R,
   but they may not always be around or get to

Re: [R] How to unquote string in R

2010-11-03 Thread Nicola Sturaro Sommacal (Quantide srl)




lord12 wrote:


s= Hey
a = Hello
table = rbind(s,a)
write.table(table,paste(blah,.PROPERTIES,sep = ),row.names =
FALSE,col.names = FALSE)

In my table, how do I output only the words and not the words with the
quotations?



You read the help page for the function you're using :).

From ?write.table:

   quote: a logical value (‘TRUE’ or ‘FALSE’) or a numeric vector.  If
  ‘TRUE’, any character or factor columns will be surrounded by
  double quotes.  If a numeric vector, its elements are taken
  as the indices of columns to quote.  In both cases, row and
  column names are quoted if they are written.  If ‘FALSE’,
  nothing is quoted.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] smooth: differences between R and S-PLUS

Hi!

I am studying differences between R and S-PLUS smooth() functions. I know
from the help that they worked differently, so I ask:
 - exist a package that permit to have the same results?
 - alternatively, someone know how can I obtain the same results in R, using
a self made script?

I know that S-PLUS use the 4(3RSR)2H running median smoothing and I try to
implement it with the code below. I obtain some result equal to the S-PLUS
one, so I think the main problem is understand how NA value from moving
median are treated.

The R result is:
 [1] NA NA 4.6250 4.9375 4.7500 4. 3.2500 3.
 [9] 2.8750 2.5000 2.1250

the S-PLUS one is:
 [1]   *   *   *   4.6250 4.9375 4.7500 4. 3.2500 3.**

where * stand for a number different from the R one that I don't remember.
 Unfortunately I cannot give more details about the S-PLUS function now,
because I am working on a machine without this software. If someone can help
me, tomorrow (CET time), I will provide more details.

Thanks in advance.

Nicola


### EXAMPLE
# Comments indicates which step of the 4(3RSR)2H algorithm I try to
replicate.

# Data
x1 - c(4, 1, 3, 6, 6, 4, 1, 6, 2, 4, 2)

# 4
out = NULL
for (i in 1:11) {out[i] = median(x1[i:(i+3)])}
out[is.na(out)] = x1[is.na(out)]
out

# (3RSR)
x2 = smooth(out, 3RSR, twiceit = F)
x2

# 2
out2 = NULL
for (i in 1: 11) {out2[i] = median(x2[i:(i+1)])}
out2[is.na(out2)] = x2[is.na(out2)]
out2

# H
filter(out2, filter = c(1/4, 1/2, 1/4), sides = 2)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Orthogonalization with different inner products

2010-11-03 Thread adet...@uw.edu

Suppose one wanted to consider random variables X_1,...X_n and from each 
subtract off the piece which is correlated with the previous variables in the 
list. i.e. make new variables Z_i so that Z_1=X_1 and 
Z_i=X_i-cov(X_i,Z_1)Z_1/var(Z_1)-...- cov(X_i,Z__{i-1})Z__{i-1}/var(Z_{i-1})  I 
have code to do this but I keep getting a non-conformable array error in the 
line with the covariance.  Does anyone have any suggestions?  Here is my code:

gov=read.table(file.choose(), sep=\t,header=T)

gov1=gov[3:length(gov[1,])]
n_indices=length(names(gov1))

x=data.matrix(gov1)


v=x
R=matrix(rep(0,length(x[,1])*length(x[1,])),length(x[,1]))

for(j in 1:n_indices){
   u=matrix(rep(0,length(v[,1])),length(v[,1]))

for(i in 1:j-1){
   u = u+cov(v[,j],v[,i])*v[,i]/var(v[,i])#(error here)
   }
   v[,j]=v[,j]-u

}

Thanks,
Andrew



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] '=' vs '-'

2010-11-03 Thread km

Hi all,

can we use '=' instead of '-' operator for assignment in R programs?

regards,
KM

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] programming questions

quick programming questions.  I want to turn on more errors.  there
are two traps I occasionally fall into.

* I wonder why R thinks that a variable is always defined in a data frame.

  is.defined(d)
 [1] FALSE
  d= data.frame( x=1:5, y=1:5 )
  is.defined(d$z)
 [1] TRUE
  is.defined(nonexisting$garbage)
 [1] TRUE

this is a bit unfortunate for me, because subsequent errors become
less clear.   right now, I need to do '(is.defined(d) and
!is.null(d$z))' to check that my function inputs are valid.  It would
be nicer if one could just write if (is.defined(d$z).

* is there a way to turn off automatic recycling?  I would rather get
an error than unexpected recycling.  I can force recycling with rep()
when I need to.

regards,

/iaw


Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] programming questions




ivo welch wrote:

quick programming questions.  I want to turn on more errors.  there
are two traps I occasionally fall into.

* I wonder why R thinks that a variable is always defined in a data frame.

  is.defined(d)
 [1] FALSE
  d= data.frame( x=1:5, y=1:5 )
  is.defined(d$z)
 [1] TRUE
  is.defined(nonexisting$garbage)
 [1] TRUE


Which package/version of R is the 'is.defined' function in?

I don't seem to have it here on 2.11.1, which I know is not
the latest version of R.

What does 'defined' mean?



this is a bit unfortunate for me, because subsequent errors become
less clear.   right now, I need to do '(is.defined(d) and
!is.null(d$z))' to check that my function inputs are valid.  It would
be nicer if one could just write if (is.defined(d$z).


z %in% names(d) ?



* is there a way to turn off automatic recycling?  I would rather get
an error than unexpected recycling.  I can force recycling with rep()
when I need to.

regards,

/iaw


Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] programming questions


On 03/11/2010 2:05 PM, ivo welch wrote:

quick programming questions.  I want to turn on more errors.  there
are two traps I occasionally fall into.

* I wonder why R thinks that a variable is always defined in a data frame.

is.defined(d)
  [1] FALSE
d= data.frame( x=1:5, y=1:5 )
is.defined(d$z)
  [1] TRUE
is.defined(nonexisting$garbage)
  [1] TRUE

this is a bit unfortunate for me, because subsequent errors become
less clear.   right now, I need to do '(is.defined(d) and
!is.null(d$z))' to check that my function inputs are valid.  It would
be nicer if one could just write if (is.defined(d$z).

* is there a way to turn off automatic recycling?  I would rather get
an error than unexpected recycling.  I can force recycling with rep()
when I need to.


Where did you find the is.defined() function?  It's not part of R.  The 
R function to do that is exists().


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] programming questions

yikes.  this is all my fault.  it was the first thing that I ever
defined when I started using R.

   is.defined - function(name) exists(as.character(substitute(name)))

I presume there is something much better...

/iaw


On Wed, Nov 3, 2010 at 2:12 PM, Erik Iverson er...@ccbr.umn.edu wrote:


 ivo welch wrote:

 quick programming questions.  I want to turn on more errors.  there
 are two traps I occasionally fall into.

 * I wonder why R thinks that a variable is always defined in a data frame.

      is.defined(d)
     [1] FALSE
      d= data.frame( x=1:5, y=1:5 )
      is.defined(d$z)
     [1] TRUE
      is.defined(nonexisting$garbage)
     [1] TRUE

 Which package/version of R is the 'is.defined' function in?

 I don't seem to have it here on 2.11.1, which I know is not
 the latest version of R.

 What does 'defined' mean?


 this is a bit unfortunate for me, because subsequent errors become
 less clear.   right now, I need to do '(is.defined(d) and
 !is.null(d$z))' to check that my function inputs are valid.  It would
 be nicer if one could just write if (is.defined(d$z).

 z %in% names(d) ?


 * is there a way to turn off automatic recycling?  I would rather get
 an error than unexpected recycling.  I can force recycling with rep()
 when I need to.

 regards,

 /iaw

 
 Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] programming questions

2010-11-03 Thread Jonathan P Daily

For data frames you can also use with()

in your example:

with(d, exists(z))
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly



From:
ivo welch ivo.we...@gmail.com
To:
Erik Iverson er...@ccbr.umn.edu
Cc:
r-help r-h...@stat.math.ethz.ch
Date:
11/03/2010 02:20 PM
Subject:
Re: [R] programming questions
Sent by:
r-help-boun...@r-project.org



yikes.  this is all my fault.  it was the first thing that I ever
defined when I started using R.

   is.defined - function(name) exists(as.character(substitute(name)))

I presume there is something much better...

/iaw


On Wed, Nov 3, 2010 at 2:12 PM, Erik Iverson er...@ccbr.umn.edu wrote:


 ivo welch wrote:

 quick programming questions.  I want to turn on more errors.  there
 are two traps I occasionally fall into.

 * I wonder why R thinks that a variable is always defined in a data 
frame.

  is.defined(d)
 [1] FALSE
  d= data.frame( x=1:5, y=1:5 )
  is.defined(d$z)
 [1] TRUE
  is.defined(nonexisting$garbage)
 [1] TRUE

 Which package/version of R is the 'is.defined' function in?

 I don't seem to have it here on 2.11.1, which I know is not
 the latest version of R.

 What does 'defined' mean?


 this is a bit unfortunate for me, because subsequent errors become
 less clear.   right now, I need to do '(is.defined(d) and
 !is.null(d$z))' to check that my function inputs are valid.  It would
 be nicer if one could just write if (is.defined(d$z).

 z %in% names(d) ?


 * is there a way to turn off automatic recycling?  I would rather get
 an error than unexpected recycling.  I can force recycling with rep()
 when I need to.

 regards,

 /iaw

 
 Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] spliting first 10 words in a string

Hi all, 

 

Thanks for all the help. I realize i have a lot to learn in R but i love it.

 

m

 

From: steven mosher [mailto:mosherste...@gmail.com] 
Sent: Tuesday, November 02, 2010 11:45 PM
To: MatevÅ¾ PavliÄ
Cc: David Winsemius; Gaj Vidmar; r-h...@stat.math.ethz.ch
Subject: Re: [R] spliting first 10 words in a string

 

just merge the data.frames back together.

 

use merge or cbind()

 

cbind will be easier

 

DF1 - data.frame(x,y,z)

DF2 -data.frame(DF1$x) # copy a column

then you added columns to DF2

 

just put them back together

 

DF3 -cbind(DF2,DF1$y,DF$z)

 

if you spend more time with R you will be able to do things like this 
elegantly, but for

now This way will work and you will learn a bit about R.

 

As for counting instances of a string, I might suggest looking at the table 
command

 

k - c( all, but,all)

 table(k)

k

all but 

  2   1 

 

So you can do a table for each column in your dataframe

 

On Tue, Nov 2, 2010 at 12:53 PM, MatevÅ¾ PavliÄ matevz.pav...@gi-zrmk.si 
wrote:

Hi,

Ok, i got this now. At least i think so. I got a data.frame with 15 fields, all 
other words have bee truncated. Which is what i want. But ia have that in a 
seperate data.frame from that one it was before (would be nice if it would be 
in the same ...)

'data.frame':   22801 obs. of  15 variables:
 $ V1 : chr  HUMUS SLABO MALO SLABO ...
 $ V2 : chr  IN GRANULIRAN PREPEREL VEZAN ...
 $ V3 : chr  HUMUSNA PEÅ ÄEN MELJAST ,KONGLOMERAT, ...
 $ V4 : chr  GLINA PROD PROD P0ROZEN, ...
 $ V5 : chr  Z DO DO S ...
 $ V6 : chr  MALO r r PLASTMI ...
 $ V7 : chr  PODA, = = GFs, ...
 $ V8 : chr  LAHKO 8Q 60mm, SIVORJAV ...
 $ V9 : chr  GNETNA, mm, S  ...
 $ V10: chr  RJAVA S PRODNIKI,  ...
 $ V11: chr   PRODNIKI MALO  ...
 $ V12: chr   DO PEÅ ÄEN  ...
 $ V13: chr   R S  ...
 $ V14: chr   = TANKIMI  ...

Now, i have another problem. Is it possible to count which word occours most 
often each field (V1, V2, V3, ...) and which one is the second and so on. 
Ideally to create a table for each field (V1, V2, V3, ...) with the word and 
thenumber of occuraces in that field (column) .
I suppose it could be done in SQL, but what since i saw what R can do i guess 
this can be done here to?

Thanks, m


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]

Sent: Tuesday, November 02, 2010 8:23 PM
To: MatevÅ¾ PavliÄ

Cc: Gaj Vidmar; r-h...@stat.math.ethz.ch
Subject: Re: [R] spliting first 10 words in a string


On Nov 2, 2010, at 3:01 PM, MatevÅ¾ PavliÄ wrote:

 Hi all,

 Thanks for all the help. I managed to do it with what Gaj suggested
 (Excel :().

 The last solution from David is also freat i just don't undestand why
 R  put the words in 14 columns and thre rows?

Because the maximum number of words was 14 and the fill argument was TRUE. 
There were three rows because there were three items in the supplied character 
vector.

 I would like it to put just the first 10 words in source field to 10
 diefferent destiantion fields, but the same row. And so on...is that
 possible?

I don't know what a destination field might be. Those are not R data types.

This would trim the extra columns (in this example set to those greater than 8) 
by adding a lot of NULL's to the end of a colClasses specification  at 
the expense of a warning message which can be
ignored:

  read.table(textConnection(words), fill=T, colClasses = c(rep(character, 
  8), rep(NULL, 30) ) , stringsAsFactors=FALSE )
   V1V2V3  V4V5V6V7  V8
1   I  have a columnn  with  text  that has
2   I would  like  to split these words  in
3 but  just first ten wordsin   the string.
Warning message:
In read.table(textConnection(words), fill = T, colClasses = c(rep(character,  
:
  cols = 14 != length(data) = 38


If you want to assign the first column to a variable then just:
  first8 - read.table(textConnection(words), fill=T, colClasses = 
  c(rep(character, 8), rep(NULL, 30) ) , stringsAsFactors=FALSE)   var1 
  - first8[[1]]   var1
[1] I   I   but

--
David.


 Thank you, m
 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org
 ] On Behalf Of David Winsemius
 Sent: Tuesday, November 02, 2010 3:47 PM
 To: Gaj Vidmar
 Cc: r-h...@stat.math.ethz.ch
 Subject: Re: [R] spliting first 10 words in a string


 On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:

 Though forbidden in this list, in Excel it's just (literally!) five
 clicks away!
 (with the column in question selected) Data - Text to Columns -
 Delimited - tick Space - Finish Pa je! (~Voila in Slovenian) (then
 import back to R, keeping only the first 10 columns if so
 desired)

 You could do the same thing without needing to leave R. Just
 read.table( textConnection(..), header=FALSE, fill=TRUE)

 read.table(textConnection(words), fill=T)
V1V2V3  V4V5V6V7  V8   V9
 V10  V11   V12 V13 V14
 1   I  have a columnn  with  text  that

Re: [R] '=' vs '-'

2010-11-03 Thread Doran, Harold

Yes, but - is preferred. Note, there are also some differences. You can do the 
following:

 a - 10
 b = 10
 identical(a,b)
[1] TRUE

And you can also do
 myFun - function(x, y = 100){
+ result - x*y
+ result}
 myFun(x  = 20)
[1] 2000

But, you cannot use '-' to define the arguments of a function

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of km
 Sent: Wednesday, November 03, 2010 2:05 PM
 To: r-help@r-project.org
 Subject: [R] '=' vs '-'
 
 Hi all,
 
 can we use '=' instead of '-' operator for assignment in R programs?
 
 regards,
 KM
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] '=' vs '-'

On Wed, Nov 3, 2010 at 6:04 PM, km srikrishnamo...@gmail.com wrote:
 Hi all,

 can we use '=' instead of '-' operator for assignment in R programs?

 Yes, mostly, you can also use 'help' to ask such questions:

  help(=)

The operators ‘-’ and ‘=’ assign into the environment in which
 they are evaluated.  The operator ‘-’ can be used anywhere,
 whereas the operator ‘=’ is only allowed at the top level (e.g.,
 in the complete expression typed at the command prompt) or as one
 of the subexpressions in a braced list of expressions.

and so on...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] programming questions

On Wed, Nov 3, 2010 at 6:17 PM, ivo welch ivo.we...@gmail.com wrote:
 yikes.  this is all my fault.  it was the first thing that I ever
 defined when I started using R.

   is.defined - function(name) exists(as.character(substitute(name)))

 I presume there is something much better...

 You didn't do a good job testing your is.defined :)

 Let's see what happens when you feed it 'nonexisting$garbage'. What
gets passed into 'exists'?

acs=function(name){as.character(substitute(name))}

  acs(nonexisting$garbage)
[1] $   nonexisting garbage

 - and then your exists test is doing effectively exists($) which
exists. Hence TRUE.

 What you are getting here is the expression parsed up as a function
call ($) and its args. You'll see this if you do:

  acs(fix(me))
[1] fix me

Perhaps you meant to deparse it:

  acs=function(name){as.character(deparse(substitute(name)))}
  acs(nonexisting$garbage)
 [1] nonexisting$garbage
  exists(acs(nonexisting$garbage))
 [1] FALSE

But you'd be better off testing list elements with is.null

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] getting p-values from fitted ARIMA

2010-11-03 Thread h0453497


Hi

I fitted an ARIMA model using the function arima(). The output  
consists of the fitted coefficients with their standard errors.


However i need information about the significance of the coefficients,  
like p-values. I hope you can help me on that issue...


ciao
Stefan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] '=' vs '-'

2010-11-03 Thread Joshua Wiley

@all:  Does it seem reasonable to add a discussion of '=' vs. '-' to
the FAQ?  It seems a regular question and something of a hot topic
to debate.

@KM  Here are links I've accumulated to prior discussions on this
topic.  I am pretty certain they are all unique.


http://blog.revolutionanalytics.com/2008/12/use-equals-or-arrow-for-assignment.html

http://www.mail-archive.com/r-help@r-project.org/msg69310.html

http://www.mail-archive.com/r-help@r-project.org/msg99789.html

http://www.mail-archive.com/r-help@r-project.org/msg104102.html

http://www.mail-archive.com/r-help@r-project.org/msg16881.html

https://stat.ethz.ch/pipermail/r-sig-teaching/2010q4/000312.html

http://r.789695.n4.nabble.com/advice-opinion-on-vs-in-teaching-R-td1014502.html#a1014502


On Wed, Nov 3, 2010 at 11:04 AM, km srikrishnamo...@gmail.com wrote:
 Hi all,

 can we use '=' instead of '-' operator for assignment in R programs?

 regards,
 KM

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] getting p-values from fitted ARIMA

2010-11-03 Thread Jorge Ivan Velez

Hi Stefan,

Take a look at https://stat.ethz.ch/pipermail/r-help/2009-June/202173.html

HTH,
Jorge


On Wed, Nov 3, 2010 at 2:50 PM,  wrote:

 Hi

 I fitted an ARIMA model using the function arima(). The output consists of
 the fitted coefficients with their standard errors.

 However i need information about the significance of the coefficients, like
 p-values. I hope you can help me on that issue...

 ciao
 Stefan

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] NFFT on a Zoo?

2010-11-03 Thread Bob Cunningham

I have an irregular time series in a Zoo object, and I've been unable to 
find any way to do an FFT on it.  More precisely, I'd like to do an NFFT 
(non-equispaced / non-uniform time FFT) on the data.


The data is timestamped samples from a cheap self-logging 
accelerometer.  The data is weakly regular, with the following 
characteristics:

- short gaps every ~20ms
- large gaps every ~200ms
- jitter/noise in the timestamp

The gaps cover ~10% of the acquisition time.  And they occur often 
enough that the uninterrupted portions of the data are too short to 
yield useful individual FFT results, even without timestamp noise.


My searches have revealed no NFFT support in R, but I'm hoping it may be 
known under some other name (just as non-uniform time series are known 
as 'zoo' rather than 'nts' or 'nuts').


I'm using R through RPy, so any solution that makes use of numpy/scipy 
would also work.  And I care more about accuracy than speed, so a 
non-library solution in R or Python would also work.


Alternatively, is there a technique by which multiple FFTs over smaller 
(incomplete) data regions may be combined to yield an improved view of 
the whole?  My experiments have so far yielded only useless results, but 
I'm getting ready to try PCA across the set of partial FFTs.


TIA,

-BobC

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] getting p-values from fitted ARIMA

2010-11-03 Thread Achim Zeileis


On Wed, 3 Nov 2010, h0453...@wu.ac.at wrote:


Hi

I fitted an ARIMA model using the function arima(). The output consists of 
the fitted coefficients with their standard errors.


However i need information about the significance of the coefficients, like 
p-values. I hope you can help me on that issue...


If you want to use a standard normal approximation, you can use coeftest() 
from the lmtest package. For example:


fit3 - arima(presidents, c(3, 0, 0))
library(lmtest)
coeftest(fit3)

Whether or not this is a good approximation is a different question, 
though. See also the coments on ?arima wrt the Hessian.


Best,
Z


ciao
Stefan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lattice plots for images

2010-11-03 Thread Deepayan Sarkar

On Wed, Nov 3, 2010 at 8:13 AM, Neba Funwi-Gabga fusigabsm...@gmail.com wrote:
 Hello UseRs,
 I need help on how to plot several raster images (such as those obtained
 from a kernel-smoothed intensity function) in a layout
 such as that obtained from the lattice package. I would like to obtain
 something such as obtained from using the levelplot or xyplot
 in lattice. I currently use:

par(mfrow=c(3,3)

 to set the workspace, but the resulting plots leave a lot of blank space
 between individual plots. If I can get it to the lattice format,
 I think it will save me some white space.

 Any help is greatly appreciated.

It's not clear what your question is exactly, but you may want to look
at ?panel.levelplot.raster and ?panel.smoothScatter (particularly the
'raster' argument) in lattice.

-Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] NFFT on a Zoo?

2010-11-03 Thread Gabor Grothendieck

On Wed, Nov 3, 2010 at 2:59 PM, Bob Cunningham flym...@gmail.com wrote:
 I have an irregular time series in a Zoo object, and I've been unable to
 find any way to do an FFT on it.  More precisely, I'd like to do an NFFT
 (non-equispaced / non-uniform time FFT) on the data.

 The data is timestamped samples from a cheap self-logging accelerometer.
  The data is weakly regular, with the following characteristics:
 - short gaps every ~20ms
 - large gaps every ~200ms
 - jitter/noise in the timestamp

 The gaps cover ~10% of the acquisition time.  And they occur often enough
 that the uninterrupted portions of the data are too short to yield useful
 individual FFT results, even without timestamp noise.

 My searches have revealed no NFFT support in R, but I'm hoping it may be
 known under some other name (just as non-uniform time series are known as
 'zoo' rather than 'nts' or 'nuts').

 I'm using R through RPy, so any solution that makes use of numpy/scipy would
 also work.  And I care more about accuracy than speed, so a non-library
 solution in R or Python would also work.

 Alternatively, is there a technique by which multiple FFTs over smaller
 (incomplete) data regions may be combined to yield an improved view of the
 whole?  My experiments have so far yielded only useless results, but I'm
 getting ready to try PCA across the set of partial FFTs.


Check out the entire thread that starts here.

http://www.mail-archive.com/r-help@r-project.org/msg36349.html

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] programming questions

thanks, barry and eric.  I didn't do a good job---I did an awful job.

alas, should R not come with an is.defined() function?  a variable may
never have been created, and this is different from a variable
existing but holding a NULL.  this can be the case in the global
environment or in a data frame.

   is.null(never.before.seen)
  Error: objected 'never.before.seen' not found
   is.defined(never.before.seen)  ## I need this, because I do not
want an error:
  [1] FALSE

your acs function doesn't really do what I want, either, because {
d=data.frame( x=1:4); exists(acs(d$x)) } tells me FALSE .  I really
need

   d - data.frame( x=1:5, y=1:5 )
   is.defined(d$x)
  TRUE
   is.defined(d$z)
  FALSE
   is.defined(never.before.seen)
  FALSE
   is.defined(never.before.seen$anything)  ## if a list does not
exist, anything in it does not exist either
  FALSE

how would I define this function?

regards,

/iaw

On Wed, Nov 3, 2010 at 2:48 PM, Barry Rowlingson
b.rowling...@lancaster.ac.uk wrote:
 On Wed, Nov 3, 2010 at 6:17 PM, ivo welch ivo.we...@gmail.com wrote:
 yikes.  this is all my fault.  it was the first thing that I ever
 defined when I started using R.

   is.defined - function(name) exists(as.character(substitute(name)))

 I presume there is something much better...

  You didn't do a good job testing your is.defined :)

  Let's see what happens when you feed it 'nonexisting$garbage'. What
 gets passed into 'exists'?

 acs=function(name){as.character(substitute(name))}

   acs(nonexisting$garbage)
 [1] $           nonexisting garbage

  - and then your exists test is doing effectively exists($) which
 exists. Hence TRUE.

  What you are getting here is the expression parsed up as a function
 call ($) and its args. You'll see this if you do:

   acs(fix(me))
 [1] fix me

 Perhaps you meant to deparse it:

   acs=function(name){as.character(deparse(substitute(name)))}
   acs(nonexisting$garbage)
  [1] nonexisting$garbage
   exists(acs(nonexisting$garbage))
  [1] FALSE

 But you'd be better off testing list elements with is.null

 Barry


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] smooth: differences between R and S-PLUS

2010-11-03 Thread William Dunlap

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Nicola 
 Sturaro Sommacal (Quantide srl)
 Sent: Wednesday, November 03, 2010 10:41 AM
 To: r-help@r-project.org
 Subject: [R] smooth: differences between R and S-PLUS

 Hi!

 I am studying differences between R and S-PLUS smooth() 
 functions. I know
 from the help that they worked differently, so I ask:
  - exist a package that permit to have the same results?
  - alternatively, someone know how can I obtain the same 
 results in R, using
 a self made script?

 I know that S-PLUS use the 4(3RSR)2H running median smoothing 
 and I try to
 implement it with the code below. I obtain some result equal 
 to the S-PLUS
 one, so I think the main problem is understand how NA value 
 from moving
 median are treated.

 The R result is:
  [1] NA NA 4.6250 4.9375 4.7500 4. 3.2500 3.
  [9] 2.8750 2.5000 2.1250

 the S-PLUS one is:
  [1]   *   *   *   4.6250 4.9375 4.7500 4. 3.2500 3.**

In S+ I get:
   x1 - c(4, 1, 3, 6, 6, 4, 1, 6, 2, 4, 2)
   smooth(x1)
   1: 2.404297 3.283203 4.140625 4.789063 5.093750 4.886719
   7: 4.078125 3.269531 3.00 3.00 3.00
   start deltat frequency 
   1  1 1
   smooth(x1, twiceit=FALSE)
   1: 2.03125 3.0 3.93750 4.62500 4.93750 4.75000 4.0
   8: 3.25000 3.0 3.0 3.0
   start deltat frequency 
   1  1 1
Tukey's EDA book (1977) may give the details on how to deal
with the ends.  The code in S+ is unchanged since that era,
aside from being converted from single to double precision.
There are many better smoothers out there.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

 where * stand for a number different from the R one that I 
 don't remember.
  Unfortunately I cannot give more details about the S-PLUS 
 function now,
 because I am working on a machine without this software. If 
 someone can help
 me, tomorrow (CET time), I will provide more details.

 Thanks in advance.

 Nicola

 ### EXAMPLE
 # Comments indicates which step of the 4(3RSR)2H algorithm I try to
 replicate.

 # Data
 x1 - c(4, 1, 3, 6, 6, 4, 1, 6, 2, 4, 2)

 # 4
 out = NULL
 for (i in 1:11) {out[i] = median(x1[i:(i+3)])}
 out[is.na(out)] = x1[is.na(out)]
 out

 # (3RSR)
 x2 = smooth(out, 3RSR, twiceit = F)
 x2

 # 2
 out2 = NULL
 for (i in 1: 11) {out2[i] = median(x2[i:(i+1)])}
 out2[is.na(out2)] = x2[is.na(out2)]
 out2

 # H
 filter(out2, filter = c(1/4, 1/2, 1/4), sides = 2)

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] programming questions



alas, should R not come with an is.defined() function?  


?exists

a variable may

never have been created, and this is different from a variable
existing but holding a NULL.  this can be the case in the global
environment or in a data frame.

   is.null(never.before.seen)
  Error: objected 'never.before.seen' not found
   is.defined(never.before.seen)  ## I need this, because I do not
want an error:
  [1] FALSE


exists(never.before.seen) #notice the quotes
[1] FALSE



your acs function doesn't really do what I want, either, because {
d=data.frame( x=1:4); exists(acs(d$x)) } tells me FALSE .  I really
need

   d - data.frame( x=1:5, y=1:5 )
   is.defined(d$x)
  TRUE


with(d, exists(x))


   is.defined(d$z)
  FALSE


with(d, exists(z))


   is.defined(never.before.seen)
  FALSE


exists(never.before.seen)


   is.defined(never.before.seen$anything)  ## if a list does not
exist, anything in it does not exist either
  FALSE


This one I'm a bit confused about.  If you're
programming a function, then the user either:

1) passes in an object, which is bound to a
local variable, and therefore exists. You can
do checks on that object to see that it conforms
to any constraints you have set.

2) does not pass in the object, in which case
you can test for that with ?missing.

Is writing your own functions for others to
use what you're doing?

--Erik

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] deleteing all but some observations in data.frame

Hi, 

 

I am sure that can be done in R

How would i delete all but let say 20 observations in data.frame?

 

Thank you, M

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] deleteing all but some observations in data.frame


It depends on which 20 you want.

If you have a data.frame called 'test.df', you can do:

#first 20
test.df[20, ]

-or-

head(test.df, 20)

#random 20
test.df[sample(nrow(test.df), 20), ]

None of this was tested, but it should be a start.

--Erik

Matevž Pavlič wrote:
Hi, 

 


I am sure that can be done in R

How would i delete all but let say 20 observations in data.frame?

 


Thank you, M

 

 



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] deleteing all but some observations in data.frame


Note that these methods don't 'delete' observations.
They all create brand new objects that are
subsets of the test.df object.  You can effectively
'delete' the observations by replacing the original
data.frame with the returned object...

so

test.df - head(test.df, 20)


Erik Iverson wrote:

It depends on which 20 you want.

If you have a data.frame called 'test.df', you can do:

#first 20
test.df[20, ]

-or-

head(test.df, 20)

#random 20
test.df[sample(nrow(test.df), 20), ]

None of this was tested, but it should be a start.

--Erik

Matevž Pavlič wrote:

Hi,
 


I am sure that can be done in R

How would i delete all but let say 20 observations in data.frame?

 


Thank you, M

 

 



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] programming questions



On Nov 3, 2010, at 3:32 PM, ivo welch wrote:


thanks, barry and eric.  I didn't do a good job---I did an awful job.



is.defined(never.before.seen$anything)  ## if a list does not

exist, anything in it does not exist either


Except the $ function return NULL rather than an error and you already  
said you were willing to accept a NULL value as being different than  
not-existing.


You may want to look at the difference between `$` and `[` methods of  
accessing values.


You can test for never.before.seen as an object

is.defined - function(x) !(try-error %in% class(try(x)) )

But it won't give your desired result on d$never.before.seen which  
does not throw an error. For that you would need an additional test of  
the sort Iverson is suggesting.


--
David.


 FALSE

how would I define this function?

regards,

/iaw

On Wed, Nov 3, 2010 at 2:48 PM, Barry Rowlingson
b.rowling...@lancaster.ac.uk wrote:
On Wed, Nov 3, 2010 at 6:17 PM, ivo welch ivo.we...@gmail.com  
wrote:

yikes.  this is all my fault.  it was the first thing that I ever
defined when I started using R.

  is.defined - function(name)  
exists(as.character(substitute(name)))


I presume there is something much better...


 You didn't do a good job testing your is.defined :)

 Let's see what happens when you feed it 'nonexisting$garbage'. What
gets passed into 'exists'?

acs=function(name){as.character(substitute(name))}

  acs(nonexisting$garbage)
[1] $   nonexisting garbage

 - and then your exists test is doing effectively exists($) which
exists. Hence TRUE.

 What you are getting here is the expression parsed up as a function
call ($) and its args. You'll see this if you do:

  acs(fix(me))
[1] fix me

Perhaps you meant to deparse it:

  acs=function(name){as.character(deparse(substitute(name)))}
  acs(nonexisting$garbage)
 [1] nonexisting$garbage
  exists(acs(nonexisting$garbage))
 [1] FALSE

But you'd be better off testing list elements with is.null

Barry



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] optim works on command-line but not inside a function

2010-11-03 Thread Berend Hasselman



Damokun wrote:
 
 Dear all, 
 
 I am trying to optimize a logistic function using optim, inside the
 following functions: 
 #Estimating a and b from thetas and outcomes by ML
 
 IRT.estimate.abFromThetaX - function(t, X, inits, lw=c(-Inf,-Inf),
 up=rep(Inf,2)){
   optRes - optim(inits, method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan, 
   gr=IRT.gradZL, 
   lower=lw, upper=up, t=t, X=X)
   c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )
 }
 
 #Estimating a and b from thetas and outcomes by ML, avoiding 0*log(0)
 IRT.estimate.abFromThetaX2 - function(tar, Xes, inits, lw=c(-Inf,-Inf),
 up=rep(Inf,2)){
 
   optRes - optim(inits, method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan, 
   gr=IRT.gradZL, 
   lower=lw, upper=up, t=tar, X=Xes)
   c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )
 }
 
 The problem is that this does not work: 
 IRT.estimate.abFromThetaX(sx, st, c(0,0))
 Error in optim(inits, method = L-BFGS-B, fn = IRT.llZetaLambdaCorrNan, 
 : 
   L-BFGS-B needs finite values of 'fn'
 But If I try the same optim call on the command line, with the same data,
 it works fine:
 optRes - optim(c(0,0), method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan, 
 +   gr=IRT.gradZL, 
 +   lower=c(-Inf, -Inf), upper=c(Inf, Inf), t=st, X=sx)
 optRes
 $par
 [1] -0.6975157  0.7944972
 $convergence
 [1] 0
 $message
 [1] CONVERGENCE: REL_REDUCTION_OF_F = FACTR*EPSMCH
 

In your command line you have set t=st and X=sx.
However in the alternative you do:  IRT.estimate.abFromThetaX(sx, st,
c(0,0))

Therefore you are assigning sx to t and st to X in the
IRT.estimate.abFromThetaX function, which is reversed from your command line
call.

You should switch sx and st in the function call:
IRT.estimate.abFromThetaX(st, sx, c(0,0))

and then all will be well.

best

Berend
If yoou
-- 
View this message in context: 
http://r.789695.n4.nabble.com/optim-works-on-command-line-but-not-inside-a-function-tp3025414p3026099.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] programming questions

thanks, eric---I need a little more clarification.  *yes, I write
functions and then forget them.  so I want them to be self-sufficient.
 I want to write functions that check all their arguments for
validity.)  For example,

  my.fn - function( mylist ) {
  stop.if.not( is.defined(mylist) )  ## ok, superfluous
  stop.if.not( is.defined(mylist$dataframe.in.mylist ))
  stop.if.not( is.defined(mylist$dataframe.in.mylist$a.component.I.need) )
  ### other checks, such as whether the component I need is long
enough, positive, etc.
  ### could be various other operations
  mylist$dataframe.in.mylist$a.component.I.need
  }

so

  my.fn( asd )   ## R gives me an error, asd is not in existence
  my.fn( NULL )  ## second error: the list component
'dataframe.in.mylist' I need is not there
  my.fn( data.frame( some.other.component=1:4 ) )  ## second error;
the list component  'dataframe.in.mylist' I need is not there
  my.fn( list( hello=1, silly=data.frame( x=1:4 ) ) ) ## second error:
dataframe.in.mylist does not exist
  my.fn( list( hello=2, dataframe.in.mylist= data.frame(
a.component.I.need=1:4 ))  ## ok

exists() works on a stringified variable name.  how do I stringify in R?


PS: btw, is it possible to weave documentation into my user function,
so that I can type ?is.defined and I get a doc page that I have
written?  Ala perl pod.  I think I asked this before, and the answer
was no.

/iaw




Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
CV Starr Professor of Economics (Finance), Brown University
http://www.ivo-welch.info/





On Wed, Nov 3, 2010 at 3:40 PM, Erik Iverson er...@ccbr.umn.edu wrote:

 alas, should R not come with an is.defined() function?

 ?exists

 a variable may

 never have been created, and this is different from a variable
 existing but holding a NULL.  this can be the case in the global
 environment or in a data frame.

   is.null(never.before.seen)
  Error: objected 'never.before.seen' not found
   is.defined(never.before.seen)  ## I need this, because I do not
 want an error:
  [1] FALSE

 exists(never.before.seen) #notice the quotes
 [1] FALSE


 your acs function doesn't really do what I want, either, because {
 d=data.frame( x=1:4); exists(acs(d$x)) } tells me FALSE .  I really
 need

   d - data.frame( x=1:5, y=1:5 )
   is.defined(d$x)
  TRUE

 with(d, exists(x))

   is.defined(d$z)
  FALSE

 with(d, exists(z))

   is.defined(never.before.seen)
  FALSE

 exists(never.before.seen)

   is.defined(never.before.seen$anything)  ## if a list does not
 exist, anything in it does not exist either
  FALSE

 This one I'm a bit confused about.  If you're
 programming a function, then the user either:

 1) passes in an object, which is bound to a
 local variable, and therefore exists. You can
 do checks on that object to see that it conforms
 to any constraints you have set.

 2) does not pass in the object, in which case
 you can test for that with ?missing.

 Is writing your own functions for others to
 use what you're doing?

 --Erik



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] NFFT on a Zoo?

2010-11-03 Thread Bob Cunningham


On 11/03/2010 12:27 PM, Gabor Grothendieck wrote:

On Wed, Nov 3, 2010 at 2:59 PM, Bob Cunninghamflym...@gmail.com  wrote:
   

I have an irregular time series in a Zoo object, and I've been unable to
find any way to do an FFT on it.  More precisely, I'd like to do an NFFT
(non-equispaced / non-uniform time FFT) on the data.

The data is timestamped samples from a cheap self-logging accelerometer.
  The data is weakly regular, with the following characteristics:
- short gaps every ~20ms
- large gaps every ~200ms
- jitter/noise in the timestamp

The gaps cover ~10% of the acquisition time.  And they occur often enough
that the uninterrupted portions of the data are too short to yield useful
individual FFT results, even without timestamp noise.

My searches have revealed no NFFT support in R, but I'm hoping it may be
known under some other name (just as non-uniform time series are known as
'zoo' rather than 'nts' or 'nuts').

I'm using R through RPy, so any solution that makes use of numpy/scipy would
also work.  And I care more about accuracy than speed, so a non-library
solution in R or Python would also work.

Alternatively, is there a technique by which multiple FFTs over smaller
(incomplete) data regions may be combined to yield an improved view of the
whole?  My experiments have so far yielded only useless results, but I'm
getting ready to try PCA across the set of partial FFTs.

 

Check out the entire thread that starts here.

http://www.mail-archive.com/r-help@r-project.org/msg36349.html
   


Thanks for the instantaneous reply!

While I couldn't follow the details of that discussion, it seems the 
periodogram is intended to detect weak periodic signals in irregular 
data.  What I have is occasional strong signals of varying amplitude, 
duration and spectrum (events) in irregular data.


Are the two cases equivalent from the periodogram perspective?

Each event looks like an impulse with decaying oscillation (a smack 
followed by a fading ring), where the initial impulse can sometimes 
saturate the device.


I also don't yet know the bandwidth of the device: I do know that 
samples are taken at a nominal rate of 640 Hz, and I have about 4 
million samples.


My initial goal is to determine the accuracy of the timestamps:  Is the 
jitter in the time values real or not?  My initial plan was to do 2 
NFFTs:  One with unmodified data, and one with the time quantized 
(gridded) to periods of 1/640 Hz.  If the gridded FFT is 'sharper', then 
I'll know the time jitter is meaningless.


My secondary goal is to determine the signal bandwidth, with the hope of 
using a slower sampling rate, since rates of 160 Hz and below are free 
of gaps.


After that, I need to compare data from two devices taken under 
nominally identical conditions, to see if they record events 
equivalently (magnitude, duration, and frequency spectrum), with the 
goal being to determine a relative calibration for the pair of devices.


Finally, I'll take more data with the devices in separate (but linked) 
locations, to determine the mechanical characteristics of that 
environment.  After applying the calibration determined above, I'll 
again want to compare the events to quantify their differences.


Should I be pursuing analytical methods other than the NFFT?

-BobC

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] FW: optim works on command-line but not inside a function

2010-11-03 Thread Diederik Roijers

Well, the function should not be able to be infinite as IRT.llZetaLambdaCorrNan
is a sum of products of either one or zero and log(1+exp(x)) or
log(1+exp(-x)) (these logs are always bigger or equal to log(1)=0) Further
more, I bounded x to be finite to fix my problem (as I expected that it
might try x-Inf. But this did not help.

And it is a mystery to me why it would work on the command line, and not as
part of a function (it is just one call, and exactly the same one too.) (I
tried this in order to find proper intervals and start values where the
error would not arise. But to my surprise it just gave normal values when I
used the same settings as in the function.)

Thanks,
Diederik


On 3 November 2010 15:41, Roijers, D.M. d.m.roij...@students.uu.nl wrote:


 ---
 *From:* Jonathan P Daily[SMTP:jda...@usgs.gov smtp%3ajda...@usgs.gov]
 *Sent:* Wednesday, November 03, 2010 3:26:09 PM
 *To:* Damokun
 *Cc:* r-help@r-project.org; r-help-boun...@r-project.org
 *Subject:* Re: [R] optim works on command-line but not inside a function
 *Auto forwarded by a Rule*


 As the error message says, the values of your function must be finite in
 order to run the algorithm.

 Some part of your loop is passing arguments (inits maybe... you only tried
 (0,0) in the CLI example) that cause IRT.llZetaLambdaCorrNan to be
 infinite.
 --
 Jonathan P. Daily
 Technician - USGS Leetown Science Center
 11649 Leetown Road
 Kearneysville WV, 25430
 (304) 724-4480
 Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly


  From: Damokun dmroi...@students.cs.uu.nl To:
 r-help@r-project.org
 Date: 11/03/2010 10:19 AM Subject:
 [R] optim works on command-line but not inside a function
 Sent by: r-help-boun...@r-project.org
 --




 Dear all,

 I am trying to optimize a logistic function using optim, inside the
 following functions:
 #Estimating a and b from thetas and outcomes by ML

 IRT.estimate.abFromThetaX - function(t, X, inits, lw=c(-Inf,-Inf),
 up=rep(Inf,2)){

  optRes - optim(inits, method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan,

  gr=IRT.gradZL,

  lower=lw, upper=up, t=t, X=X)

  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )

 }

 #Estimating a and b from thetas and outcomes by ML, avoiding 0*log(0)
 IRT.estimate.abFromThetaX2 - function(tar, Xes, inits, lw=c(-Inf,-Inf),
 up=rep(Inf,2)){

  optRes - optim(inits, method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan,

  gr=IRT.gradZL,

  lower=lw, upper=up, t=tar, X=Xes)

  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )

 }

 The problem is that this does not work:
  IRT.estimate.abFromThetaX(sx, st, c(0,0))
 Error in optim(inits, method = L-BFGS-B, fn = IRT.llZetaLambdaCorrNan,  :

  L-BFGS-B needs finite values of 'fn'
 But If I try the same optim call on the command line, with the same data,
 it
 works fine:
  optRes - optim(c(0,0), method=L-BFGS-B, fn=IRT.llZetaLambdaCorrNan,
 +   gr=IRT.gradZL,
 +   lower=c(-Inf, -Inf), upper=c(Inf, Inf), t=st, X=sx)
  optRes
 $par
 [1] -0.6975157  0.7944972
 $convergence
 [1] 0
 $message
 [1] CONVERGENCE: REL_REDUCTION_OF_F = FACTR*EPSMCH

 Does anyone have an idea what this could be, and what I could try to avoid
 this error? I tried bounding the parameters, with lower=c(-10, -10) and
 upper=... but that made no difference.

 Thanks,
 Diederik Roijers
 Utrecht University MSc student.
 --
 PS: the other functions I am using are:

 #IRT.p is the function that represents the probability
 #of a positive outcome of an item with difficulty b,
 #discriminativity a, in combination with a student with
 #competence theta.

 IRT.p - function(theta, a, b){

   epow - exp(-a*(theta-b))

   result - 1/(1+epow)

   result

 }


 # = IRT.p^-1 ; for usage in the loglikelihood

 IRT.oneOverP - function(theta, a, b){

   epow - exp(-a*(theta-b))

   result - (1+epow)

   result

 }

 # = (1-IRT.p)^-1 ; for usage in the loglikelihood
 IRT.oneOverPneg - function(theta, a, b){

   epow - exp(a*(theta-b))

   result - (1+epow)

   result

 }


 #simulation-based sample generation of thetas and outcomes
 #based on a given a and b. (See IRT.p) The sample-size is n

 IRT.generateSample - function(a, b, n){

   x-rnorm(n, mean=b, sd=b/2)

   t-IRT.p(x,a,b)

   ch-runif(length(t))

   t[t=ch]=1

   t[tch]=0

   cbind(x,t)

 }


 #This loglikelihood function is based on the a and be parameters,
 #and requires thetas as input in X, and outcomes in t
 #prone to give NaN errors due to 0*log(0)

 IRT.logLikelihood2 - function(params, t, X){

   pos- sum(t * log(IRT.p(X,params[1],params[2])))

   neg- sum(  (1-t) * log( (1-IRT.p(X,params[1],params[2])) )  )

   -pos-neg

 }

 #Avoiding NaN problems due to 0*log(0)
 #otherwise equivalent to IRT.logLikelihood2
 IRT.logLikelihood2CorrNan - function(params, t, X){

   pos- sum(t *

Re: [R] programming questions

2010-11-03 Thread Henrik Bengtsson

On Wed, Nov 3, 2010 at 1:04 PM, ivo welch ivo.we...@gmail.com wrote:
 thanks, eric---I need a little more clarification.  *yes, I write
 functions and then forget them.  so I want them to be self-sufficient.
  I want to write functions that check all their arguments for
 validity.)  For example,

  my.fn - function( mylist ) {
      stop.if.not( is.defined(mylist) )  ## ok, superfluous
      stop.if.not( is.defined(mylist$dataframe.in.mylist ))
      stop.if.not( is.defined(mylist$dataframe.in.mylist$a.component.I.need) )
      ### other checks, such as whether the component I need is long
 enough, positive, etc.
      ### could be various other operations
      mylist$dataframe.in.mylist$a.component.I.need
  }

See the Arguments class in R.utils, e.g.

library(R.utils);

my.fn - function(mylist) {
  # Assert a data.frame element exists
  df - Arguments$getInstanceOf(mylist$dataframe.in.mylist, data.frame);

  # Assert x = 0 and of length 45:67.
  x - df$a.component.I.need;
  x - Arguments$getDoubles(x, range=c(0,Inf), length=c(45,67));

   ### could be various other operations
   mylist$dataframe.in.mylist$a.component.I.need
}

/Henrik


 so

  my.fn( asd )   ## R gives me an error, asd is not in existence
  my.fn( NULL )  ## second error: the list component
 'dataframe.in.mylist' I need is not there
  my.fn( data.frame( some.other.component=1:4 ) )  ## second error;
 the list component  'dataframe.in.mylist' I need is not there
  my.fn( list( hello=1, silly=data.frame( x=1:4 ) ) ) ## second error:
 dataframe.in.mylist does not exist
  my.fn( list( hello=2, dataframe.in.mylist= data.frame(
 a.component.I.need=1:4 ))  ## ok

 exists() works on a stringified variable name.  how do I stringify in R?


 PS: btw, is it possible to weave documentation into my user function,
 so that I can type ?is.defined and I get a doc page that I have
 written?  Ala perl pod.  I think I asked this before, and the answer
 was no.

 /iaw



 
 Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
 CV Starr Professor of Economics (Finance), Brown University
 http://www.ivo-welch.info/





 On Wed, Nov 3, 2010 at 3:40 PM, Erik Iverson er...@ccbr.umn.edu wrote:

 alas, should R not come with an is.defined() function?

 ?exists

 a variable may

 never have been created, and this is different from a variable
 existing but holding a NULL.  this can be the case in the global
 environment or in a data frame.

   is.null(never.before.seen)
  Error: objected 'never.before.seen' not found
   is.defined(never.before.seen)  ## I need this, because I do not
 want an error:
  [1] FALSE

 exists(never.before.seen) #notice the quotes
 [1] FALSE


 your acs function doesn't really do what I want, either, because {
 d=data.frame( x=1:4); exists(acs(d$x)) } tells me FALSE .  I really
 need

   d - data.frame( x=1:5, y=1:5 )
   is.defined(d$x)
  TRUE

 with(d, exists(x))

   is.defined(d$z)
  FALSE

 with(d, exists(z))

   is.defined(never.before.seen)
  FALSE

 exists(never.before.seen)

   is.defined(never.before.seen$anything)  ## if a list does not
 exist, anything in it does not exist either
  FALSE

 This one I'm a bit confused about.  If you're
 programming a function, then the user either:

 1) passes in an object, which is bound to a
 local variable, and therefore exists. You can
 do checks on that object to see that it conforms
 to any constraints you have set.

 2) does not pass in the object, in which case
 you can test for that with ?missing.

 Is writing your own functions for others to
 use what you're doing?

 --Erik



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] NFFT on a Zoo?

2010-11-03 Thread Mike Marchywka

From: ggrothendi...@gmail.com
Date: Wed, 3 Nov 2010 15:27:13 -0400
To: flym...@gmail.com
CC: r-help@r-project.org; rpy-l...@lists.sourceforge.net
Subject: Re: [R] NFFT on a Zoo?

On Wed, Nov 3, 2010 at 2:59 PM, Bob Cunningham wrote:
I have an irregular time series in a Zoo object, and I've been unable to
find any way to do an FFT on it. More precisely, I'd like to do an NFFT
(non-equispaced / non-uniform time FFT) on the data.

The data is timestamped samples from a cheap self-logging accelerometer.
The data is weakly regular, with the following characteristics:
- short gaps every ~20ms
- large gaps every ~200ms
- jitter/noise in the timestamp

The gaps cover ~10% of the acquisition time. And they occur often enough
that the uninterrupted portions of the data are too short to yield useful
individual FFT results, even without timestamp noise.

My searches have revealed no NFFT support in R, but I'm hoping it may be
known under some other name (just as non-uniform time series are known as
'zoo' rather than 'nts' or 'nuts').

I'm using R through RPy, so any solution that makes use of numpy/scipy would
also work. And I care more about accuracy than speed, so a non-library
solution in R or Python would also work.

Alternatively, is there a technique by which multiple FFTs over smaller
(incomplete) data regions may be combined to yield an improved view of the
whole? My experiments have so far yielded only useless results, but I'm
getting ready to try PCA across the set of partial FFTs.

I'm pretty sure all of this is in Oppenheim and Shaffer meaning it
is also in any newer books. I recall something about averaging
but you'd need to look at details. Alternatively, and this is from
distant memory so maybe someone else can comment, you can just
feed a regularly spaced time series to anyone, go get FFTW for example,
and insert zeroes for missing data. This is equivalent to multiplying
your real data with a window function that is zero at missing points.
I think you can prove that multiplication
in time domain is convolution in FT domain so you can back this out
by deconvolving with your window function spectrum. This probably is not
painless, the window spectrum will have badly placed zeroes etc, but it
may be helpful.
Apaprently this is still a bit of an open issue,

http://books.google.com/books?id=BW1PdOqZo6ACpg=PA2lpg=PA2dq=dft+window+missing+datasource=blots=fSY-iRoCNNsig=30cC0SdkrDcp62iWc-Mv26mfNjIhl=enei=AMTRTNmyMYP88AauxtzKDAsa=Xoi=book_resultct=resultresnum=6ved=0CDEQ6AEwBTgK#v=onepageqf=false

You should be able to do the case of a sine wave with pencil and paper
and see if or how this really would work.

Check out the entire thread that starts here.

http://www.mail-archive.com/r-help@r-project.org/msg36349.html

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Loop

Hi all, 

 

I managed to do what i want (with the great help of thi mailing list)  manually 
. Now i would like to automate it. I would probably need a for loop for to help 
me with this...but of course  I have no idea how to do that in R.  Bellow is 
the code that i would like to be replicated for a number of times (let say 20). 
I would like to achieve  that w1 would change to w2, w3, w4 ... up to w20 and 
by that create 20 data.frames that I would than bind together with cbind. 

 

(i did it like shown bellow -manually)

 

w1-table(lit$W1)

w1-as.data.frame(w1)

write.table(w1,file=w1.csv,sep=;,row.names=T, dec=.)

w1 - w1[order(w1$Freq, decreasing=TRUE),]

w1-head(w1, 20)

 

 

w2-table(lit$W2)

w2-as.data.frame(w2)

write.table(w2,file=w2.csv,sep=;,row.names=T, dec=.)

w2 - w2[order(w2$Freq, decreasing=TRUE),]

w2-head(w2, 20)

 

.

.

.

Thanks for the help,m

 

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] avoiding too many loops - reshaping data

Hello!

I have a data frame like this one:

mydf-data.frame(city=c(a,a,a,a,a,a,a,a,b,b,b,b,b,b,b,b),
  brand=c(x,x,y,y,z,z,z,z,x,x,x,y,y,y,z,z),
  value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
(mydf)

What I need to get is a data frame like the one below - cities as
rows, brands as columns, and the sums of the value within each
city/brand combination in the body of the data frame:

city x   yz
a3   23  336
b7   42  231


I have written a code that involves multiple loops and subindexing -
but it's taking too long.
I am sure there must be a more efficient way of doing it.

Thanks a lot for your hints!


-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] avoiding too many loops - reshaping data


Hadley's reshape package (google for it) can
do this. There's a nice intro on the site.

 library(reshape)

 cast(melt(mydf, measure.vars = value), city ~ brand,
  fun.aggregate = sum)

  city  x  y   z
1a  3 23 450
2b 12 42 231

Although the numbers differ slightly?

I've heard of the reshape2 package, but have no idea
if that's replaced the reshape package yet.

--Erik


Dimitri Liakhovitski wrote:

Hello!

I have a data frame like this one:

mydf-data.frame(city=c(a,a,a,a,a,a,a,a,b,b,b,b,b,b,b,b),
  brand=c(x,x,y,y,z,z,z,z,x,x,x,y,y,y,z,z),
  value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
(mydf)

What I need to get is a data frame like the one below - cities as
rows, brands as columns, and the sums of the value within each
city/brand combination in the body of the data frame:

city x   yz
a3   23  336
b7   42  231


I have written a code that involves multiple loops and subindexing -
but it's taking too long.
I am sure there must be a more efficient way of doing it.

Thanks a lot for your hints!




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] avoiding too many loops - reshaping data

2010-11-03 Thread Henrique Dallazuanna

Try this:

 xtabs(value ~ city + brand, mydf)

On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski 
dimitri.liakhovit...@gmail.com wrote:

 Hello!

 I have a data frame like this one:


 mydf-data.frame(city=c(a,a,a,a,a,a,a,a,b,b,b,b,b,b,b,b),
  brand=c(x,x,y,y,z,z,z,z,x,x,x,y,y,y,z,z),
  value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
 (mydf)

 What I need to get is a data frame like the one below - cities as
 rows, brands as columns, and the sums of the value within each
 city/brand combination in the body of the data frame:

 city x   yz
 a3   23  336
 b7   42  231


 I have written a code that involves multiple loops and subindexing -
 but it's taking too long.
 I am sure there must be a more efficient way of doing it.

 Thanks a lot for your hints!


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] avoiding too many loops - reshaping data

Thanks a lot!

Yes - I just found the reshape package too - and guess what, my math was wrong!
reshape2 seems like the more up-to-date version of reshape.

Wonder what's faster - xtabs or dcast...
Dimitri

On Wed, Nov 3, 2010 at 4:32 PM, Henrique Dallazuanna www...@gmail.com wrote:
 Try this:

  xtabs(value ~ city + brand, mydf)

 On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:

 Hello!

 I have a data frame like this one:


 mydf-data.frame(city=c(a,a,a,a,a,a,a,a,b,b,b,b,b,b,b,b),
  brand=c(x,x,y,y,z,z,z,z,x,x,x,y,y,y,z,z),
  value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
 (mydf)

 What I need to get is a data frame like the one below - cities as
 rows, brands as columns, and the sums of the value within each
 city/brand combination in the body of the data frame:

 city x   y    z
 a    3   23  336
 b    7   42  231


 I have written a code that involves multiple loops and subindexing -
 but it's taking too long.
 I am sure there must be a more efficient way of doing it.

 Thanks a lot for your hints!


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] avoiding too many loops - reshaping data

In reshape2 this does the job:

dcast(mydf,city~brand,sum)


On Wed, Nov 3, 2010 at 4:37 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Thanks a lot!

 Yes - I just found the reshape package too - and guess what, my math was 
 wrong!
 reshape2 seems like the more up-to-date version of reshape.

 Wonder what's faster - xtabs or dcast...
 Dimitri

 On Wed, Nov 3, 2010 at 4:32 PM, Henrique Dallazuanna www...@gmail.com wrote:
 Try this:

  xtabs(value ~ city + brand, mydf)

 On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:

 Hello!

 I have a data frame like this one:


 mydf-data.frame(city=c(a,a,a,a,a,a,a,a,b,b,b,b,b,b,b,b),
  brand=c(x,x,y,y,z,z,z,z,x,x,x,y,y,y,z,z),
  value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
 (mydf)

 What I need to get is a data frame like the one below - cities as
 rows, brands as columns, and the sums of the value within each
 city/brand combination in the body of the data frame:

 city x   y    z
 a    3   23  336
 b    7   42  231


 I have written a code that involves multiple loops and subindexing -
 but it's taking too long.
 I am sure there must be a more efficient way of doing it.

 Thanks a lot for your hints!


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O




 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] multi-level cox ph with time-dependent covariates

2010-11-03 Thread Mattia Prosperi

Dear all,

I would like to know if it is possible to fit in R a Cox ph model with
time-dependent covariates and to account for hierarchical effects at
the same time. Additionally, I'd like also to know if it would be
possible to perform any feature selection on this model fit.

I have a data set that is composed by multiple marker measurements
(and hundreds of covariates) at different time points from different
tissue samples of different patients. Suppose that the data were
coming from animal model with very few subjects (n=6) that were
followed up given a pathogen exposure, measured several times,
sampling different tissues in the same days, until a certain outcome
was reached (or outcome censored). Suppose that the pathogen can vary
over time (might be a bacteria that selects for drug-resistance) and
that also it can vary across different tissue reservoirs within the
same patient.

In other words: names(data) = patient_id, start_time, stop_time,
tissue_id, pathogen_type, marker1, ..., marker100, ..., outcome

If I had multiple observations per patient at different time
intervals, I would model it like this (hope it is correct)

model-coxph(Surv(start_time,stop_time,outcome)~all_covariates+cluster(patient_id))

But now I have both the patient and the tissue, and hundreds of
different variables. I thought I could use the coxme library, since it
has also a ridge regression feature. Shall I then model nested random
effects by considering both the patient_id and the tissue_id?

Like model-coxme(Surv(start_time,stop_time,outcome) ~ covariates + (1
| patient_id/tissue_id))

Then, how could I shrink the coefficients in order to select a subset
of them with non-neglegible effects? May I also consider the
possibility to run an AIC-based forward-backward selection?

thanks and apologies if I am completely out of the trails,

M.P.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] avoiding too many loops - reshaping data

Want to thank everyone once more for pointing in reshape direction.
Saved me about 16 hours of looping!
Dimitri

On Wed, Nov 3, 2010 at 4:38 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 In reshape2 this does the job:

 dcast(mydf,city~brand,sum)


 On Wed, Nov 3, 2010 at 4:37 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Thanks a lot!

 Yes - I just found the reshape package too - and guess what, my math was 
 wrong!
 reshape2 seems like the more up-to-date version of reshape.

 Wonder what's faster - xtabs or dcast...
 Dimitri

 On Wed, Nov 3, 2010 at 4:32 PM, Henrique Dallazuanna www...@gmail.com 
 wrote:
 Try this:

  xtabs(value ~ city + brand, mydf)

 On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:

 Hello!

 I have a data frame like this one:


 mydf-data.frame(city=c(a,a,a,a,a,a,a,a,b,b,b,b,b,b,b,b),
  brand=c(x,x,y,y,z,z,z,z,x,x,x,y,y,y,z,z),
  value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
 (mydf)

 What I need to get is a data frame like the one below - cities as
 rows, brands as columns, and the sums of the value within each
 city/brand combination in the body of the data frame:

 city x   y    z
 a    3   23  336
 b    7   42  231


 I have written a code that involves multiple loops and subindexing -
 but it's taking too long.
 I am sure there must be a more efficient way of doing it.

 Thanks a lot for your hints!


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O




 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com




 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Loop

Hi, 

Thanks for the help and the manuals. Will come very handy i am sure. 

But regarding the code i don't hink this is what i wantbasically i would 
like to repeat bellow code :

w1-table(lit$W1)
w1-as.data.frame(w1)
write.table(w1,file=w1.csv,sep=;,row.names=T, dec=.)
w1- w1[order(w1$Freq, decreasing=TRUE),] w1-head(w1, 20)

20 times, where W1-20 (capital letters) are the fields in a data.frame called 
lit and w1-20 are the data.frames being created.

Hope that explains it better, 

m


-Original Message-
From: Patrick Burns [mailto:pbu...@pburns.seanet.com]
Sent: Wednesday, November 03, 2010 9:30 PM
To: Matevž Pavlič
Subject: Re: [R] Loop

If I understand properly, you'll want
something like:

lit[[w2]]

instead of

lit$w2

more accurately:

for(i in 1:20) {
vari - paste(w, i)
lit[[vari]]

...
}

The two documents mentioned in my
signature may help you.

On 03/11/2010 20:23, Matevž Pavlič wrote:
 Hi all,



 I managed to do what i want (with the great help of thi mailing list)  
 manually . Now i would like to automate it. I would probably need a for loop 
 for to help me with this...but of course  I have no idea how to do that in R. 
  Bellow is the code that i would like to be replicated for a number of times 
 (let say 20). I would like to achieve  that w1 would change to w2, w3, w4 ... 
 up to w20 and by that create 20 data.frames that I would than bind together 
 with cbind.



 (i did it like shown bellow -manually)



 w1-table(lit$W1)

 w1-as.data.frame(w1)

 write.table(w1,file=w1.csv,sep=;,row.names=T, dec=.)

 w1- w1[order(w1$Freq, decreasing=TRUE),]

 w1-head(w1, 20)





 w2-table(lit$W2)

 w2-as.data.frame(w2)

 write.table(w2,file=w2.csv,sep=;,row.names=T, dec=.)

 w2- w2[order(w2$Freq, decreasing=TRUE),]

 w2-head(w2, 20)



 .

 .

 .

 Thanks for the help,m








   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


--
Patrick Burns
pbu...@pburns.seanet.com
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] multivariate Poisson distribution

2010-11-03 Thread Jourdan Gold





Hello, from a search of the archives and functions, I am looking for 
information on creating random correlated counts from a multivariate Poisson 
distribution.Â  I can not seem to find a function that doesÂ this. Perhaps, it 
has not yet Â been created. Has anyone created an R package that does this. 

Â  

thanks, 

Â  

Jourdan Gold 



Â  

Â 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] biding rows while merging at the same time

Hello!

I have 2 data frames like this (well, actually, I have 200 of them):

df1-data.frame(location=c(loc 1,loc 2,loc
3),date=c(1/1/2010,1/1/2010,1/1/2010), a=1:3,b=11:13,c=111:113)
df2-data.frame(location=c(loc 1,loc 2,loc
3),date=c(2/1/2010,2/1/2010,2/1/2010),
a=4:6,c=114:116,d=c(1,11,111))
(df1)
(df2)

I am trying to just rbind them, which is impossible, because not every
column is present in every data frame.
I can't merge them -
merge(df1,df2,by.x=location,by.y=location,all.x=T,all.y=T) -
because it kinda cbinds them.
What I need is something that looks like this:

location   datea  b  c  d
loc 1   1/1/2010  1  11   111   NA
loc 2   1/1/2010  2  12   112   NA
loc 3   1/1/2010  3  13   113   NA
loc 1   2/1/2010  3  NA  114  1
loc 2   2/1/2010  5  NA  115  11
loc 3   2/1/2010  6  NA  116   111

Thanks a lot for your suggestions!

-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] biding rows while merging at the same time

Never mind - I found it in reshape package: rbind.fill
I wonder if it's still in reshape2.
Dimitri

On Wed, Nov 3, 2010 at 5:34 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Hello!

 I have 2 data frames like this (well, actually, I have 200 of them):

 df1-data.frame(location=c(loc 1,loc 2,loc
 3),date=c(1/1/2010,1/1/2010,1/1/2010), a=1:3,b=11:13,c=111:113)
 df2-data.frame(location=c(loc 1,loc 2,loc
 3),date=c(2/1/2010,2/1/2010,2/1/2010),
 a=4:6,c=114:116,d=c(1,11,111))
 (df1)
 (df2)

 I am trying to just rbind them, which is impossible, because not every
 column is present in every data frame.
 I can't merge them -
 merge(df1,df2,by.x=location,by.y=location,all.x=T,all.y=T) -
 because it kinda cbinds them.
 What I need is something that looks like this:

 location   date    a  b      c      d
 loc 1   1/1/2010  1  11   111   NA
 loc 2   1/1/2010  2  12   112   NA
 loc 3   1/1/2010  3  13   113   NA
 loc 1   2/1/2010  3  NA  114  1
 loc 2   2/1/2010  5  NA  115  11
 loc 3   2/1/2010  6  NA  116   111

 Thanks a lot for your suggestions!

 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] biding rows while merging at the same time


Just

merge(df1, df2, all = TRUE)

does it, yes?

Dimitri Liakhovitski wrote:

Hello!

I have 2 data frames like this (well, actually, I have 200 of them):

df1-data.frame(location=c(loc 1,loc 2,loc
3),date=c(1/1/2010,1/1/2010,1/1/2010), a=1:3,b=11:13,c=111:113)
df2-data.frame(location=c(loc 1,loc 2,loc
3),date=c(2/1/2010,2/1/2010,2/1/2010),
a=4:6,c=114:116,d=c(1,11,111))
(df1)
(df2)

I am trying to just rbind them, which is impossible, because not every
column is present in every data frame.
I can't merge them -
merge(df1,df2,by.x=location,by.y=location,all.x=T,all.y=T) -
because it kinda cbinds them.
What I need is something that looks like this:

location   datea  b  c  d
loc 1   1/1/2010  1  11   111   NA
loc 2   1/1/2010  2  12   112   NA
loc 3   1/1/2010  3  13   113   NA
loc 1   2/1/2010  3  NA  114  1
loc 2   2/1/2010  5  NA  115  11
loc 3   2/1/2010  6  NA  116   111

Thanks a lot for your suggestions!



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Loop



On Nov 3, 2010, at 5:03 PM, Matevž Pavlič wrote:


Hi,

Thanks for the help and the manuals. Will come very handy i am sure.

But regarding the code i don't hink this is what i wantbasically  
i would like to repeat bellow code :


w1-table(lit$W1)
w1-as.data.frame(w1)


It appears you are not reading for meaning. Burns has advised you how  
to construct column names and use them in your initial steps. The `$`  
function is quite limited in comparison to `[[` , so he was showing  
you a method that would be more effective.  BTW the as.data.frame step  
is unnecessary, since the first thing write.table does is coerce an  
object to a data.frame. The write.table name is misleading. It  
should be write.data.frame. You cannot really write tables with  
write.table.


You would also use:

 file=paste(vari, csv, sep=.) as the file argument to write.table


write.table(w1,file=w1.csv,sep=;,row.names=T, dec=.)


What are these next actions supposed to do after the file is written?  
Are you trying to store a group of related w objects that will later  
be indexed in sequence? If so, then a list would make more sense.


--
David.


w1- w1[order(w1$Freq, decreasing=TRUE),] w1-head(w1, 20)

20 times, where W1-20 (capital letters) are the fields in a  
data.frame called lit and w1-20 are the data.frames being created.


Hope that explains it better,



m

-Original Message-
From: Patrick Burns [mailto:pbu...@pburns.seanet.com]
Subject: Re: [R] Loop

If I understand properly, you'll want
something like:

lit[[w2]]

instead of

lit$w2

more accurately:

for(i in 1:20) {
vari - paste(w, i)
lit[[vari]]

...
}

The two documents mentioned in my
signature may help you.

On 03/11/2010 20:23, Matevž Pavlič wrote:

Hi all,

I managed to do what i want (with the great help of thi mailing  
list)  manually . Now i would like to automate it. I would probably  
need a for loop for to help me with this...but of course  I have no  
idea how to do that in R.  Bellow is the code that i would like to  
be replicated for a number of times (let say 20). I would like to  
achieve  that w1 would change to w2, w3, w4 ... up to w20 and by  
that create 20 data.frames that I would than bind together with  
cbind.


(i did it like shown bellow -manually)

w1-table(lit$W1)
w1-as.data.frame(w1)
write.table(w1,file=w1.csv,sep=;,row.names=T, dec=.)
w1- w1[order(w1$Freq, decreasing=TRUE),]
w1-head(w1, 20)

w2-table(lit$W2)

w2-as.data.frame(w2)

write.table(w2,file=w2.csv,sep=;,row.names=T, dec=.)

w2- w2[order(w2$Freq, decreasing=TRUE),]

w2-head(w2, 20)
.
.
.

Thanks for the help,m






David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] biding rows while merging at the same time



On Nov 3, 2010, at 5:38 PM, Dimitri Liakhovitski wrote:


Never mind - I found it in reshape package: rbind.fill
I wonder if it's still in reshape2.


Look in plyr.

--
David.

Dimitri


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Auto-killing processes spawned by foreach::doMC

2010-11-03 Thread Steve Lianoglou

Hi all,

Sometimes I'll find myself ctrl-c-ing like a madman to kill some
code that's parallelized via foreach/doMC when I realized that I just
set my cpu off to do something boneheaded, and it will keep doing that
thing for a while.

In these situations, since I interrupted its normal execution,
foreach/doMC doesn't clean up after itself by killing the processes
that were spawned. Furthermore, I believe that when I quit my main R
session, the spawned processes still remain (there, but idle).

I can go in via terminal (or some task manager/activity monitor) and
kill them manually, but does foreach (or something else (maybe
multicore?)) keep track of the process IDs that it spawned?

Is there some simple doCleanup() function I can write to get these R
processes and kill them automagically?

For what it's worth, I'm running on linux  os x, R-2.12 and the
latest versions of foreach/doMC/multicore (though, I feel like this
has been true since I've started using foreach/doMC way back when).

Thanks,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] how to handle 'g...@gtdata' ?

2010-11-03 Thread karena


I have a few questions about GenABEL, gwaa data.

1) is there a universal way that most GenABEL people use to add more
individuals into a 'gwaa' data? For example, I have a 'gwaa' data, but I
need to add some dummy parents, for 'g...@phdata', it's easy to add these
rows, but for 'g...@gtdata', I think I need to create SNP data as '0 0 0 0
0.' for all the dummy parents first. I am using the function
'convert.snp.ped', so I need a 'pedfile' of this format:

#ped id fa mo sex trait snp1.allele1 snp1.allele2 snp2.allele1 snp2.allele2
...#

1 1 0 0 1 2 0 0 0 0 ...
1 2 0 0 1 0 0 0 0 0 ...
1 3 0 0 2 1 0 0 0 0 ...
.
.
100 101 0 0 2 1 0 0 0 0 ...

If we use the 1M microarray, usually, after QC, there will be ~800 thousands
SNPs, so this file is really huge. I created this matrix in R, and then try
to export this by using 'write.table(pedfile, file='pedfile', col.names=F,
row.names=F, quote=F), but seems like it's taking for ever, because the size
of this matrix is too large.  Anyone can tell me, how to create a
'gwaa' data efficiently?

2) Is there any way to add genotypic data to 'g...@gtdata' directly, without
converting data of other format to 'g...@gtdata' first?

thank you very much!

karena

-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-to-handle-gwaa-gtdata-tp3026206p3026206.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] avoiding too many loops - reshaping data

Here is the summary of methods. tapply is the fastest!

library(reshape)

system.time(for(i in 1:1000)cast(melt(mydf, measure.vars = value),
city ~ brand,fun.aggregate = sum))
  user  system elapsed

 18.400.00   18.44

library(reshape2)
system.time(for(i in 1:1000)dcast(mydf,city ~ brand, sum))
  user  system elapsed
 12.360.02   12.37


system.time(for(i in 1:1000)xtabs(value ~ city + brand, mydf))

 user  system elapsed

  2.450.002.47


system.time(for(i in 1:1000)tapply(mydf$value,mydf[c('city','brand')],sum))

  user  system elapsed

  0.780.000.79

Dimitri


On Wed, Nov 3, 2010 at 4:32 PM, Henrique Dallazuanna www...@gmail.com wrote:
 Try this:

  xtabs(value ~ city + brand, mydf)

 On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:

 Hello!

 I have a data frame like this one:


 mydf-data.frame(city=c(a,a,a,a,a,a,a,a,b,b,b,b,b,b,b,b),
  brand=c(x,x,y,y,z,z,z,z,x,x,x,y,y,y,z,z),
  value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
 (mydf)

 What I need to get is a data frame like the one below - cities as
 rows, brands as columns, and the sums of the value within each
 city/brand combination in the body of the data frame:

 city x   y    z
 a    3   23  336
 b    7   42  231


 I have written a code that involves multiple loops and subindexing -
 but it's taking too long.
 I am sure there must be a more efficient way of doing it.

 Thanks a lot for your hints!


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Rd installation (not markup language) primer?

I have a set of functions that I always load on startup.  for example,
there is my now infamous is.defined() function.

I would like to add some documentation for these functions, so that I can do a

   ?is.defined

inside R.  The documentation tells me how to mark up Rd files is very
good, but I wonder how one installs them for access by the R
executable (on OSX for me).  Do I drop them into a special directory?
Which ones are allowed?  Do I need to package everything into a
library, or can I just add Rd files for source'd files?  Do I parse
the Rd files for R, or does R parse the Rd files on demand?

so, is there a primer on installing Rd files?

/iaw

Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] multivariate Poisson distribution

2010-11-03 Thread Ben Bolker

Jourdan Gold jgold at uoguelph.ca writes:

 
 
 Hello, from a search of the archives and functions, 
 I am looking for information on creating random
 correlated counts from a multivariate Poisson distribution.Â  
 I can not seem to find a function that
 doesÂ this. Perhaps, it has not yet Â been created. 
 Has anyone created an R package that does this. 

  As far as I know this is a bit tricky (although I would be
happy to hear of simple solutions).
  Two possibilities are (1) generate a multivariate normal
distribution (e.g. MASS::mvrnorm), exponentiate it, and
take Poisson deviates [hard to specify what the final correlation
is]; (2) use copulas (library(sos); findFn(copula).
Haven't tried library(sos); findFn(correlated Poisson) but
you could ...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Rd installation (not markup language) primer?