Re: [R] slow dget

2010-11-06 Thread jim holtman
dput/dget were not intended to save/restore large objects.  Understand
what is happening in the use of dput/dget.  dput is creating a text
file that can reconstitute the object with dget.  dget is having to
read the file in and then parse it:

> dget
function (file)
eval(parse(file = file))
<environment: namespace:base>

This can be an expensive process if the object is large and complex.

save/load, by contrast, write and read the object in R's binary format with
little additional processing, so the load is just as fast as the save.
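
A minimal sketch of the two round trips, assuming a small made-up list as a
stand-in for the poster's 35MB structure (the object and the temporary file
names are illustrative, not from the original post). It only shows the
mechanics; the timing gap grows with the size and nesting of the object.

```r
# Small stand-in object; the real structure in the thread was far larger
set.seed(1)
obj <- list(id = 1:1000, x = rnorm(1000),
            label = sample(letters, 1000, replace = TRUE))

txt_file <- tempfile(fileext = ".txt")
bin_file <- tempfile(fileext = ".RData")

dput(obj, file = txt_file)   # writes R source text that recreates obj
save(obj, file = bin_file)   # writes the binary representation

# dget() must parse and evaluate the text; load() just deserializes
system.time(from_text <- dget(txt_file))
restored <- new.env()
system.time(load(bin_file, envir = restored))

# The text round trip keeps about 15 significant digits of each double,
# so compare with all.equal(); the binary round trip is exact
all.equal(obj, from_text)
identical(obj, restored$obj)
```

For a single object, saveRDS()/readRDS() is a similar binary route that
returns the object instead of restoring it under its old name.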

In general, most of the functions can be used both correctly and
incorrectly.  So should a warning for every potential
condition/criteria be put in the help file?  Probably not.  It is hard
to protect the user against him/herself.

So what you are doing in seeing how long alternatives take is a good
learning tool and will help you improve your use of the features.

On Fri, Nov 5, 2010 at 11:16 PM, Jack Tanner i...@hotmail.com wrote:
 I have a data structure that is fast to dput(), but very slow to dget(). On
 disk, the file is about 35MB.

 system.time(dget("r.txt"))
   user  system elapsed
  142.93    1.27  192.84

 The same data structure is fast to save() and fast to load(). The .RData file
 on disk is about 12MB.

 system.time(load("r.RData"))
   user  system elapsed
   4.89    0.08    7.82

 I imagine that this is a known speed issue with dget, and that the recommended
 solution is to use load, which is fine with me. If so, perhaps a note to this
 effect could be added to the dget help page.

 All timings above using

 R version 2.12.0 (2010-10-15)
 Platform: i386-pc-mingw32/i386 (32-bit)

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] Help required to remove \\N

2010-11-06 Thread Mohan L
Dear All,

I have a .csv file that I read like this:
rawdata <- read.csv(file='/home/Mohan/Rworks/tmp/VMList_User.txt', sep='\t',
                    header=FALSE)

> head(rawdata, n=5)
   Tenant Domain Owner Current State
1     \\N   ROOT admin       Running
2     \\N   ROOT admin       Stopped
3     \\N   ROOT admin       Running
4     \\N   ROOT admin       Running
5     \\N   ROOT admin       Running
20   DEMO   ROOT admin       Stopped
21   DEMO   ROOT admin       Stopped
22   Demo   ROOT admin       Stopped


The first column contains \\N up to row 19. I need to replace the
\\N value with a blank space. Any help will be really appreciated.

Thanks for your time.

Thanks  Rg
Mohan L



Re: [R] About 5.1 Arrays

2010-11-06 Thread Stephen Liu
Hi Richard,

 ## for an array with
 ## dim(a) == c(3,4,2)
 ## a[i,j,k] means select the element in position
 ##    i + (j-1)*3 + (k-1)*3*4


My understanding;

e.g.
1)
dim(a) == c(3,4,2)

3 + (4-1)*3 + (2-1)*3*4
3+9+12=24

2)
## dim(a) == c(1,2,1)

1 + (2-1)*3 + (1-1)*3*4
1+3+0=4

3)
## dim(a) == c(2,3,1)

2 + (3-1)*3 + (1-1)*3*4
2+6+0=8

etc.


It is NOT always the product of i*j*k as I thought before.  Thanks for your 
explanation.  



What do the values 3 and 4 stand for?  The values of position and dimension?

e.g.

> a <- sample(24)
> a
 [1] 22 18 17 10 24  1 11 13  9 19 20  8  2 21 23 16  7 14 12 15  4  5  3  6
> dim(a) <- c(3,4,2)
> a
, , 1

     [,1] [,2] [,3] [,4]   <-- positions 1, 2, 3, 4 of the second dimension ?
[1,]   22   10   11   19
[2,]   18   24   13   20
[3,]   17    1    9    8
  ^  the first dimension ?

, , 2

     [,1] [,2] [,3] [,4]
[1,]    2   16   12    5
[2,]   21    7   15    3
[3,]   23   14    4    6

?  If I'm wrong pls correct me.  TIA


Now I'm going to digest Joshua's advice.


B.R.
Stephen L





From: RICHARD M. HEIBERGER r...@temple.edu

Cc: Daniel Nordlund djnordl...@frontier.com; r-help@r-project.org
Sent: Sat, November 6, 2010 12:48:35 AM
Subject: Re: [R] About 5.1 Arrays


Continuing with Daniel's example, but with different data values



a <- sample(24)
a
dim(a) <- c(3,4,2)
a
as.vector(a)

## for an array with
## dim(a) == c(3,4,2)
## a[i,j,k] means select the element in position
##    i + (j-1)*3 + (k-1)*3*4

index <- function(i,j,k) {
   i + (j-1)*3 + (k-1)*3*4
}

## find the vector position described by row 2, column 1, layer 2
index(2,1,2)    ## this is the position in the original vector
a[2,1,2]        ## this is the value in that position with 3D indexing
a[index(2,1,2)] ## this is the same value with 1D vector indexing
a[14]           ## this is the same value with 1D vector indexing

## find the position in row 3, column 4, layer 1
index(3,4,1)    ## this is the position in the original vector
a[3,4,1]        ## this is the value in that position with 3D indexing
a[index(3,4,1)] ## this is the same value with 1D vector indexing
a[12]           ## this is the same value with 1D vector indexing


index(1,1,1)    ## this is the position in the original vector
index(2,1,1)    ## this is the position in the original vector
index(3,1,1)    ## this is the position in the original vector
index(1,2,1)    ## this is the position in the original vector
index(2,2,1)    ## this is the position in the original vector
index(3,2,1)    ## this is the position in the original vector
index(1,3,1)    ## this is the position in the original vector
index(2,3,1)    ## this is the position in the original vector
index(3,3,1)    ## this is the position in the original vector
index(1,4,1)    ## this is the position in the original vector
index(2,4,1)    ## this is the position in the original vector
index(3,4,1)    ## this is the position in the original vector
index(1,1,2)    ## this is the position in the original vector
index(2,1,2)    ## this is the position in the original vector
index(3,1,2)    ## this is the position in the original vector
index(1,2,2)    ## this is the position in the original vector
index(2,2,2)    ## this is the position in the original vector
index(3,2,2)    ## this is the position in the original vector
index(1,3,2)    ## this is the position in the original vector
index(2,3,2)    ## this is the position in the original vector
index(3,3,2)    ## this is the position in the original vector
index(1,4,2)    ## this is the position in the original vector
index(2,4,2)    ## this is the position in the original vector
index(3,4,2)    ## this is the position in the original vector
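
The 3 and the 3*4 in the formula are the lengths of the first dimension and
of the first two dimensions multiplied together. A sketch that makes this
explicit by passing the dimension vector in (the extra `dims` argument and
the name `index_dims` are additions for illustration, not part of the code
above):

```r
# Column-major position of a[i, j, k] in an array with dimensions `dims`
index_dims <- function(i, j, k, dims) {
  i + (j - 1) * dims[1] + (k - 1) * dims[1] * dims[2]
}

dims <- c(3, 4, 2)
a <- array(sample(24), dim = dims)

# Every (i, j, k) combination, with the first index varying fastest
ijk <- expand.grid(i = 1:3, j = 1:4, k = 1:2)
pos <- index_dims(ijk$i, ijk$j, ijk$k, dims)

all(pos == 1:24)                         # positions run 1, 2, ..., 24 in order
all(a[as.matrix(ijk)] == as.vector(a))   # 3D and 1D indexing give the same values
```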





[R] anova(lme.model)

2010-11-06 Thread Sibylle Stöckli
Dear R users

Topic: Linear mixed-effects model fitting using the nlme package (recommended
by Pinheiro et al. 2008 for unbalanced data sets).

The R-help archives contain much discussion of the controversy over using the
anova(lme.model) function to present numerator df and F-values, and of the
differing p-values reported by lme and anova. I have run into the same
problem, and I would very much appreciate some help fitting an anova that
gives p-values similar to those from lme and also provides the corresponding
F-values. I tried to use contrasts to deal with the "unbalanced data set".

Thanks
Sibylle

> Kaltenborn <- read.table("Kaltenborn_YEARS.txt", na.strings="*", header=TRUE)
>
> library(nlme)
>
> model5c <- lme(asin(sqrt(PropMortality))~Diversity+
+ Management+Species+Height+Height*Diversity, data=Kaltenborn,
+ random=~1|Plot/SubPlot, na.action=na.omit, weights=varPower(form=~Diversity),
+ subset=Kaltenborn$ADDspecies!=1, method="ML")
>
> summary(model5c)
Linear mixed-effects model fit by maximum likelihood
 Data: Kaltenborn
  Subset: Kaltenborn$ADDspecies != 1
        AIC       BIC   logLik
  -249.3509 -205.4723 137.6755

Random effects:
 Formula: ~1 | Plot
        (Intercept)
StdDev:  0.06162279

 Formula: ~1 | SubPlot %in% Plot
        (Intercept)   Residual
StdDev:  0.03942785 0.05946185

Variance function:
 Structure: Power of variance covariate
 Formula: ~Diversity
 Parameter estimates:
    power
0.7302087
Fixed effects: asin(sqrt(PropMortality)) ~ Diversity + Management + Species +
     Height + Height * Diversity
                      Value  Std.Error  DF   t-value p-value
(Intercept)       0.5422893 0.05923691 163  9.154585  0.0000
Diversity        -0.0734688 0.02333159  14 -3.148896  0.0071
Managementm+      0.0217734 0.02283375  30  0.953562  0.3479
Managementu      -0.0557160 0.02286694  30 -2.436532  0.0210
SpeciesPab       -0.2058763 0.02763737 163 -7.449198  0.0000
SpeciesPm         0.0308005 0.02827782 163  1.089210  0.2777
SpeciesQp         0.0968051 0.02689327 163  3.599602  0.0004
Height           -0.0017579 0.00031667 163 -5.551251  0.0000
Diversity:Height  0.0005122 0.00014443 163  3.546270  0.0005
 Correlation:
                 (Intr) Dvrsty Mngmn+ Mngmnt SpcsPb SpcsPm SpcsQp Height
Diversity        -0.867
Managementm+     -0.173 -0.019
Managementu      -0.206  0.005  0.499
SpeciesPab       -0.253  0.085  0.000  0.035
SpeciesPm        -0.239  0.058  0.001  0.064  0.521
SpeciesQp        -0.250  0.041 -0.001  0.032  0.502  0.506
Height           -0.518  0.532 -0.037 -0.004  0.038  0.004  0.033
Diversity:Height  0.492 -0.581  0.031 -0.008 -0.149 -0.099 -0.069 -0.904

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-2.99290873 -0.60522612 -0.05756772  0.62163049  2.80811502

Number of Observations: 216
Number of Groups:
             Plot SubPlot %in% Plot
               16                48

> anova(model5c)
                 numDF denDF   F-value p-value
(Intercept)          1   163 244.67887  <.0001
Diversity            1    14   1.53025  0.2364
Management           2    30   6.01972  0.0063
Species              3   163  51.86699  <.0001
Height               1   163  30.08090  <.0001
Diversity:Height     1   163  12.57603  0.0005
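
One likely reason for the mismatch: anova() on an lme fit tests terms
sequentially by default, while the t-tests in summary() are marginal. Only the
last term agrees in the output above (3.546270^2 = 12.57603 for
Diversity:Height). A sketch of the comparison, assuming the Orthodont data
that ships with nlme as a stand-in, since the Kaltenborn file is not
available here; it is not a recommendation for which test to report.

```r
library(nlme)

# Stand-in model on nlme's built-in Orthodont data (NOT the poster's model)
fit <- lme(distance ~ age * Sex, data = Orthodont,
           random = ~ 1 | Subject, method = "ML")

t_table  <- summary(fit)$tTable             # marginal t-tests per coefficient
seq_aov  <- anova(fit)                      # sequential F-tests (the default)
marg_aov <- anova(fit, type = "marginal")   # each term adjusted for all others

# For single-df terms the marginal F-value is the squared t-value
cbind(t_squared    = t_table[, "t-value"]^2,
      marginal_F   = marg_aov[, "F-value"],
      sequential_F = seq_aov[, "F-value"])
```

With a factor of more than two levels, such as Species above, the marginal
table gives one joint F-test where summary() gives one t-test per contrast.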




[R] calculate probability

2010-11-06 Thread Jumlong Vongprasert

 Dear All
I have some problem with calculate probability.
Assume I have data with normal distribution with mean = 5 sd = 2.
I want to approximate probability = 2.4.
I used pnorm(2.4, 5, 2) - pnorm(2.4, 5, 2, lower.tail = FLASE), correct 
or not.

Many Thanks
Jumlong



Re: [R] Help required to remove \\N

2010-11-06 Thread Henrique Dallazuanna
Try this:

gsub("\\N", "BlankSpace", rawdata$Tenant, fixed = TRUE)
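
A small self-contained sketch of that replacement, assuming the printed \\N
is R's escaped display of a single backslash followed by N (the toy data
frame below is made up to mirror the poster's columns):

```r
# In R source, "\\N" is the two characters backslash and N
rawdata <- data.frame(Tenant = c("\\N", "\\N", "DEMO"),
                      Domain = "ROOT",
                      stringsAsFactors = FALSE)

# fixed = TRUE matches the literal text instead of treating it as a regex
rawdata$Tenant <- gsub("\\N", " ", rawdata$Tenant, fixed = TRUE)
rawdata
```

If missing values are what is wanted rather than blanks, passing
na.strings = "\\N" to read.csv() turns those fields into NA at read time.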





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] calculate probability

2010-11-06 Thread Ted Harding
On 06-Nov-10 11:16:28, Jumlong Vongprasert wrote:
 Dear All
 I have some problem with calculate probability.
 Assume I have data with normal distribution with mean = 5 sd = 2.
 I want to approximate probability = 2.4.
 I used pnorm(2.4, 5, 2) - pnorm(2.4, 5, 2, lower.tail = FLASE),
 correct or not.
 Many Thanks
 Jumlong

[A]
Not correct because FLASE should be FALSE.

[B]
Not correct because
a) pnorm(2.4, 5, 2) is the probability that a value from a Normal
distribution with mean 5 will be less than 2.4, and this is less
than 1/2 (since 2.4 is less than the mean).
b) pnorm(2.4, 5, 2, lower.tail = FALSE) is the probability that
such a value will be greater than 2.4, and this is greater than 1/2
(for the same reason).
c) The difference will therefore be (< 1/2) - (> 1/2) which will
be less than 0, so cannot be a probability.

[C]
Not correct because a value sampled from a Normal distribution
(which is a continuous distribution) has probability 0 of being
exactly equal to any given value (e.g. 2.4); so I think your
question does not express what you want to know.

One possibility which could make your question realistic is
that the value 2.4 that you are interested in is a value
sampled from Normal(mean=5, sd=2) **that has been rounded to
1 decimal place** and so could have been any value between
2.35 and 2.45; in that case it makes sense to ask what is the
probability of a value in this range from Normal(mean=5, sd=2).

This would be

  pnorm(2.45, 5, 2) - pnorm(2.35, 5, 2) = 0.008569045

Of course, the degree of rounding may be different -- for
example rounding to *even* values of the first decimal place,
i.e. to values ... , 2.0, 2.2, 2.4, 2.6, 2.8, ...
in which case the event whose probability you want is that
the sampled value is between 2.3 and 2.5, whose probability is

  pnorm(2.5, 5, 2) - pnorm(2.3, 5, 2) = 0.01714178
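
The two interval probabilities can be reproduced as follows; the helper name
interval_prob is made up for this sketch:

```r
mu <- 5
sigma <- 2

# P(lower < X < upper) for X ~ Normal(mean, sd)
interval_prob <- function(lower, upper, mean, sd) {
  pnorm(upper, mean, sd) - pnorm(lower, mean, sd)
}

p_tenths <- interval_prob(2.35, 2.45, mu, sigma)  # 2.4 rounded to 1 decimal place
p_even   <- interval_prob(2.30, 2.50, mu, sigma)  # 2.4 rounded to even tenths

# The lower and upper tails at 2.4 sum to 1; their difference is negative
tails_sum  <- pnorm(2.4, mu, sigma) + pnorm(2.4, mu, sigma, lower.tail = FALSE)
tails_diff <- pnorm(2.4, mu, sigma) - pnorm(2.4, mu, sigma, lower.tail = FALSE)
c(p_tenths, p_even, tails_sum, tails_diff)
```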

Since the question you are really interested in cannot be
identified from what you have asked (see examples above),
you should try to make your question clear and definite!

Hoping this helps,
Ted.


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 06-Nov-10   Time: 12:06:17
-- XFMail --



[R] Extracting elements of a particular slot from S4 object

2010-11-06 Thread Megh Dal
Hi there, can anyone tell me how to extract the values of a particular slot of
some S4 object? Let's take the following example:

> library(fOptions)
> val <- GBSOption(TypeFlag = "c", S = 60, X = 65, Time = 1/4, r = 0.08, b =
+ 0.08, sigma = 0.30)
> val

Title:
 Black Scholes Option Valuation

Call:
 GBSOption(TypeFlag = "c", S = 60, X = 65, Time = 1/4, r = 0.08,
     b = 0.08, sigma = 0.3)

Parameters:
          Value:
 TypeFlag c
 S        60
 X        65
 Time     0.25
 r        0.08
 b        0.08
 sigma    0.3

Option Price:
 2.133372

Description:
 Sat Nov 06 19:25:39 2010

Here I have tried the following, but got an error:


> val@"Option Price"
Error: no slot of name "Option Price" for this object of class "fOPTION"

What is the ideal way to do that?

Thanks,



Re: [R] Extracting elements of a particular slot from S4 object

2010-11-06 Thread Henrique Dallazuanna
Try this: val@price
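
The headings printed by show() are labels, not slot names, which is why
"Option Price" fails. A sketch of how to look the real names up, using a
made-up class ToyOption so that it runs without fOptions installed (it is
NOT the real fOPTION definition):

```r
library(methods)

# Hypothetical stand-in class with two slots
setClass("ToyOption",
         representation(title = "character", price = "numeric"))
val <- new("ToyOption",
           title = "Black Scholes Option Valuation",
           price = 2.133372)

slotNames(val)       # lists the actual slot names
val@price            # direct slot access
slot(val, "price")   # the same, with the slot name given as a string
```

On the real object, slotNames(val) or str(val) should show which slot holds
the price.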







-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



[R] Removing NA in ggplot

2010-11-06 Thread Ottar Kvindesland
Hi list,

I just got stuck with this one:

In Data I have the sets age (numbers 1 to 99 and NA) and gender (M, F and
NA). Then getting some nice plots using

ggplot(data, aes(age[na.exclude(gender)])) +
  geom_histogram( binwidth = 3, aes(y = ..density.. ), fill = "lightblue" ) +
  facet_grid( gender ~ .)

I am trying to get a faceted graph of the age distribution, excluding the rows
where gender is NA.

Unfortunately I end up with the error message:

Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 206, 219

How do I wash out the NAs in this situation?


Regards

ottar



Re: [R] Removing NA in ggplot

2010-11-06 Thread Jeff Newmiller
Create a subset of your data that excludes the NAs before you feed it to ggplot.
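
A sketch of that approach with a made-up data frame (the name dat and its
values are illustrative). The plotting part assumes ggplot2 3.3.0 or later,
where after_stat(density) replaces the older ..density.. notation, and is
skipped if ggplot2 is not installed:

```r
# Toy data with missing values in both columns
dat <- data.frame(age    = c(23, 35, NA, 61, 47, 52),
                  gender = c("M", "F", NA, "F", NA, "M"))

# Drop the rows with missing gender first, so every aesthetic has the same length
clean <- subset(dat, !is.na(gender))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(clean, aes(x = age)) +
    geom_histogram(aes(y = after_stat(density)), binwidth = 3,
                   fill = "lightblue") +
    facet_grid(gender ~ .)
  print(p)
}
```

Indexing age inside aes() shortens one column but not the others, which is
where the "differing number of rows" error comes from.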


---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.



Re: [R] About 5.1 Arrays

2010-11-06 Thread Stephen Liu
Hi Joshua,

Thanks for your advice.

1)
Re your advice:-[quote]
> a3d
, , 1 <--- this is the first position of the third dimension

     [,1] [,2] [,3] [,4]  <--- positions 1, 2, 3, 4 of the second dimension
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
  ^  the first dimension

, , 2 <--- the second position of the third dimension
...
[/quote]

Where is the third dimension?


2)
Re your advice:-[quote]
so you can think that in the original vector a:
1 maps to a[1, 1, 1] in the 3d array
2 maps to a[2, 1, 1].
3 maps to a[3, 1, 1]
4 maps to a[1, 2, 1]
12 maps to a[3, 4, 1]
20 maps to a[2, 3, 2]
24 maps to a[3, 4, 2]
[/quote]

My finding;

# 1 maps to a[1, 1, 1] in the 3d array
> a3d <- array(a, dim = c(1, 1, 1))
> a3d
, , 1

     [,1]
[1,]    1

Correct

# 2 maps to a[2, 1, 1].
> a3d <- array(a, dim = c(2, 1, 1))
> a3d
, , 1

     [,1]
[1,]    1
[2,]    2

Correct

# 3 maps to a[3, 1, 1]
> a3d <- array(a, dim = c(3, 1, 1))
> a3d
, , 1

     [,1]
[1,]    1
[2,]    2
[3,]    3

Correct

# 4 maps to a[1, 2, 1]
> a3d <- array(a, dim = c(1, 2, 1))
> a3d
, , 1

     [,1] [,2]
[1,]    1    2

Incorrect.  It is 2


# 12 maps to a[3, 4, 1]
> a3d <- array(a, dim = c(3, 4, 1))
> a3d
, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

Correct

# 20 maps to a[2, 3, 2]
> a3d <- array(a, dim = c(2, 3, 2))
> a3d
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

Incorrect.  It is 12


#  24 maps to a[3, 4, 2]
> a3d <- array(a, dim = c(3, 4, 2))
> a3d
, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

, , 2

     [,1] [,2] [,3] [,4]
[1,]   13   16   19   22
[2,]   14   17   20   23
[3,]   15   18   21   24

Correct.

If I'm wrong, pls correct me.  Thanks


B.R.
Stephen




- Original Message 
From: Joshua Wiley jwiley.ps...@gmail.com
To: Stephen Liu sati...@yahoo.com
Cc: r-help@r-project.org
Sent: Sat, November 6, 2010 12:48:27 AM
Subject: Re: [R] About 5.1 Arrays

On Fri, Nov 5, 2010 at 9:17 AM, Stephen Liu sati...@yahoo.com wrote:
 Hi Daniel,

 Thanks for your detail advice.  I completely understand your explain.

 But I can't resolve what does "a" stand for there?

the "a" just represents some vector.  It is the name of the object
that stores your data.  Like you might tell someone to go look in a
book to find some information.


 a[1,1,1] is 1 * 1 * 1 = 1
 a[2,1,1] is 2 * 1 * 1 = 2
 a[2,4,2] is 2 * 4 * 2 = 16
 a[3,4,2] is 3 * 4 * 2 = 24

That is the basic idea, but it may not be the most helpful way to
think of it because it depends on the length of each dimension.
For example

a[1, 2, 1] is not 1 * 2 * 1 = 2
a[1, 1, 2] is not 1 * 1 * 2 = 2

in the little 3d array I show below, it would actually be

a[1, 2, 1] = 4
a[1, 1, 2] = 13


 ?


 B.R.
 Stephen L


 - Original Message 
 From: Daniel Nordlund djnordl...@frontier.com
 To: r-help@r-project.org
 Sent: Fri, November 5, 2010 11:54:15 PM
 Subject: Re: [R] About 5.1 Arrays

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Stephen Liu
 Sent: Friday, November 05, 2010 7:57 AM
 To: Steve Lianoglou
 Cc: r-help@r-project.org
 Subject: Re: [R] About 5.1 Arrays

 Hi Steve,

  It's not clear what you're having problems understanding. By
  setting the dim attribute of your (1d) vector, you are changing
  its dimensions.

 I'm following An Introduction to R to learn R

 On

 5.1 Arrays
 http://cran.r-project.org/doc/manuals/R-intro.html#Vectors-and-assignment


 It mentions:-
 ...
 For example if the dimension vector for an array, say a, is c(3,4,2) then
 there
 are 3 * 4 * 2 = 24 entries in a and the data vector holds them in the
 order
 a[1,1,1], a[2,1,1], ..., a[2,4,2], a[3,4,2].


 I don't understand "= 24 entries in a and the data vector holds them in
 the order a[1,1,1], a[2,1,1], ..., a[2,4,2], a[3,4,2]".  What does "the order
 a[1,1,1], a[2,1,1], ..., a[2,4,2], a[3,4,2]" mean?

because it is actually stored as a 1 dimensional vector, it is just
telling you the order.  For example, given some vector a that
contains the numbers 1 through 24, you could reshape this into a three
dimensional object.  It would be stored like:

# make a vector a and an array (built from a) called a3d
> a <- 1:24
> a3d <- array(a, dim = c(3, 4, 2))
> a
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
> a3d
, , 1 <--- this is the first position of the third dimension

     [,1] [,2] [,3] [,4]  <--- positions 1, 2, 3, 4 of the second dimension
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
  ^  the first dimension

, , 2 <--- the second position of the third dimension

     [,1] [,2] [,3] [,4]
[1,]   13   16   19   22
[2,]   14   17   20   23
[3,]   15   18   21   24
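
The mapping from a position in the vector to the three subscripts can also
be checked directly. A short sketch using base R's arrayInd() on the same
1:24 array, keeping dim = c(3, 4, 2) fixed and varying only the subscripts:

```r
a   <- 1:24
a3d <- array(a, dim = c(3, 4, 2))

# One row per vector position: the columns are the three subscripts
subs <- arrayInd(c(1, 2, 3, 4, 12, 20, 24), dim(a3d))
subs

# Index the one fixed array; do not rebuild it with different dims
a3d[1, 2, 1]
a3d[2, 3, 2]
a3d[3, 4, 2]
```

Rebuilding the array with dim = c(1, 2, 1) creates a different, smaller
array, which is why that experiment shows 2 instead of 4.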


a[1, 1, 1] is the first element of dimension 

Re: [R] anova(lme.model)

2010-11-06 Thread Bert Gunter
Sounds to me like you should really be seeking help from your local
statistician, not this list. What you request probably cannot be done.

What is wrong with what you get from lme, whose results seem fairly
clear whether the P values are accurate or not?

Cheers,
Bert









-- 
Bert Gunter
Genentech Nonclinical Biostatistics



[R] Where to get rcom for Linux

2010-11-06 Thread Stephen Liu
Hi folks,

Debian 600 64-bit

Is rcom for Linux available? 

rcom
rcom: R COM Client Interface and internal COM Server
http://cran.r-project.org/web/packages/rcom/index.html

If YES please advise where to get it.

TIA

B.R.
Stephen L






Re: [R] Where to get rcom for Linux

2010-11-06 Thread Shige Song
Isn't COM a Windows-only technology?

Shige



[R] compare and replace

2010-11-06 Thread Robert Ruser
Hello R Users,
I'm wondering if there exists any elegant and simple way to do the
following: I have a data.frame X filled with numbers. I have a vector y with
numbers as well. Every value in X that is equal to any values in y should
be replaced by e.g. 1. Any idea?

Robert



Re: [R] compare and replace

2010-11-06 Thread Ben Bolker
Robert Ruser robert.ruser at gmail.com writes:

 
 Hello R Users,
 I'm wondering if there exists any elegant and simple way to do the
 following: I have a data.frame X fills in numbers. I have a vector y with
 numbers as well. Every value in X that is equal to any values in y should
 be replaced by e.g. 1. Any idea?
 
 Robert
 

  If the data frames are completely filled with numbers you
should be able to operate on them as matrices (?as.matrix,
?as.data.frame).

X <- matrix(1:16,nrow=4)
Y <- matrix(rep(2:5,4),nrow=4)

(X[X %in% c(Y)] <- 1)
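
If X has to stay a data frame, the same idea can be applied column by column.
A sketch with made-up values (the column names a and b are illustrative):

```r
X <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
y <- c(2, 5)

# X[] keeps the data.frame structure while each column is replaced
X[] <- lapply(X, function(col) replace(col, col %in% y, 1))
X
```

Applying %in% to the data frame itself compares whole columns rather than
individual cells, hence the per-column lapply().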



Re: [R] compare and replace

2010-11-06 Thread Erik Iverson

On 11/06/2010 11:36 AM, Robert Ruser wrote:

Hello R Users,
I'm wondering if there exists any elegant and simple way to do the
following: I have a data.frame X fills in numbers. I have a vector y with
numbers as well. Every value in X that is equal to any values in y should
be replaced by e.g. 1. Any idea?


Keep in mind FAQ 7.31 when programming this...
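
FAQ 7.31 is about floating-point numbers that look equal but are not. A
sketch of how that bites %in%, with a tolerance-based helper (the name near
and the tolerance 1e-8 are assumptions for illustration):

```r
x <- c(0.1 + 0.2, 0.5, 1)
y <- c(0.3, 1)

# %in% matches exactly, so 0.1 + 0.2 is not found in y
exact <- x %in% y
exact

# Hypothetical helper: TRUE where x is within tol of any value in y
near <- function(x, y, tol = 1e-8) {
  vapply(x, function(v) any(abs(v - y) < tol), logical(1))
}

x[near(x, y)] <- -1
x
```

With whole numbers such as 40:60 the exact comparison is safe; the issue
arises with values produced by fractional arithmetic.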



Re: [R] table with values as dots in increasing sizes

2010-11-06 Thread Michael Friendly

This is a tableplot, available on R-Forge at
https://r-forge.r-project.org/projects/tableplot/

install.packages("tableplot", repos="http://R-Forge.R-project.org")
will install, as long as you are using R 2.12.x; otherwise, you'll
have to download the source package and install from source.


-Michael


On 11/5/2010 4:45 AM, fugelpitch wrote:


I was just thinking of a way to present data and if it is possible in R.

I have a data frame that looks as follows (this is just mockup data).

df
location,species1,species2,species3,species4,species5
loc1,0.44,0.28,0.37,-0.24,0.41
loc2,0.54,0.62,0.34,0.52,0.71
loc3,-0.33,0.75,-0.34,0.48,0.61

location is a factor while all the species are numerical vectors.

I would like to present this as a table (or something that looks like a
table) but instead of the numbers I would like to present circles (pch = 19)
that increases in size with increasing number. Is it also possible to make
it change color if the value is negative. (E.g. larger blue circles
represent larger +values while larger red circles represent larger -values)?


Jonas




Re: [R] compare and replace

2010-11-06 Thread Robert Ruser
Thank you very much, Ben and Erik.
It's exactly what I want. Below is my slightly modified example.

set.seed(12345)
X = sample(c(40:60),40,replace=TRUE)
x = matrix(X,nc=5)
y = c(40,43,55,60)
x[x %in% y] <- -1



Re: [R] About 5.1 Arrays

2010-11-06 Thread Daniel Nordlund
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Stephen Liu
 Sent: Saturday, November 06, 2010 7:38 AM
 To: Joshua Wiley
 Cc: r-help@r-project.org
 Subject: Re: [R] About 5.1 Arrays
 
 Hi Joshua,
 
 Thanks for your advice.
 
 1)
 Re your advice:-[quote]
  a3d
 , , 1 --- this is the first position of the third dimension
 
      [,1] [,2] [,3] [,4]  --- positions 1, 2, 3, 4 of the second
 dimension
 [1,]    1    4    7   10
 [2,]    2    5    8   11
 [3,]    3    6    9   12
 ^  the first dimension
 
 , , 2 --- the second position of the third dimension
 ...
 [/quote]
 
 Where is the third dimension?
 
 
 2)
 Re your advice:-[quote]
 so you can think that in the original vector a:
 1 maps to a[1, 1, 1] in the 3d array
 2 maps to a[2, 1, 1].
 3 maps to a[3, 1, 1]
 4 maps to a[1, 2, 1]
 12 maps to a[3, 4, 1]
 20 maps to a[2, 3, 2]
 24 maps to a[3, 4, 2]
 [/quote]
 
 My finding;
 
 # 1 maps to a[1, 1, 1] in the 3d array
  a3d <- array(a, dim = c(1, 1, 1))
  a3d
 , , 1
 
      [,1]
 [1,]    1
 
 Correct
 
 # 2 maps to a[2, 1, 1].
  a3d <- array(a, dim = c(2, 1, 1))
  a3d
 , , 1
 
      [,1]
 [1,]    1
 [2,]    2
 
 Correct
 
 # 3 maps to a[3, 1, 1]
  a3d <- array(a, dim = c(3, 1, 1))
  a3d
 , , 1
 
      [,1]
 [1,]    1
 [2,]    2
 [3,]    3
 
 Correct
 
 # 4 maps to a[1, 2, 1]
  a3d <- array(a, dim = c(1, 2, 1))
  a3d
 , , 1
 
      [,1] [,2]
 [1,]    1    2
 
 Incorrect.  It is 2
 
 
 # 12 maps to a[3, 4, 1]
  a3d <- array(a, dim = c(3, 4, 1))
  a3d
 , , 1
 
      [,1] [,2] [,3] [,4]
 [1,]    1    4    7   10
 [2,]    2    5    8   11
 [3,]    3    6    9   12
 
 Correct
 
 # 20 maps to a[2, 3, 2]
  a3d <- array(a, dim = c(2, 3, 2))
  a3d
 , , 1
 
      [,1] [,2] [,3]
 [1,]    1    3    5
 [2,]    2    4    6
 
 , , 2
 
      [,1] [,2] [,3]
 [1,]    7    9   11
 [2,]    8   10   12
 
 Incorrect.  It is 12
 
 
 #  24 maps to a[3, 4, 2]
  a3d <- array(a, dim = c(3, 4, 2))
  a3d
 , , 1
 
      [,1] [,2] [,3] [,4]
 [1,]    1    4    7   10
 [2,]    2    5    8   11
 [3,]    3    6    9   12
 
 , , 2
 
      [,1] [,2] [,3] [,4]
 [1,]   13   16   19   22
 [2,]   14   17   20   23
 [3,]   15   18   21   24
 
 Correct.
 
 If I'm wrong, pls correct me.  Thanks
 
 
 B.R.
 Stephen
 

Stephen,

I am correcting you. :-)  You are using dim incorrectly, and not accessing 
the array correctly.  In all of your examples you should be using 
dim = c(3,4,2).  Then you need to specify the indexes of the array element 
you want to look at.  So, to use your example

> a <- 1:24
> a3d <- array(a, dim = c(3,4,2))
> a3d
, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

, , 2

     [,1] [,2] [,3] [,4]
[1,]   13   16   19   22
[2,]   14   17   20   23
[3,]   15   18   21   24

> 
> # 1 maps to a[1, 1, 1] in the 3d array
> a3d[1, 1, 1]
[1] 1
> 
> # 2 maps to a[2, 1, 1].
> a3d[2, 1, 1]
[1] 2
> 
> # 3 maps to a[3, 1, 1]
> a3d[3, 1, 1]
[1] 3
> 
> # 4 maps to a[1, 2, 1]
> a3d[1, 2, 1]
[1] 4
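
The mapping from a position in the original vector to a triple of indexes can also be computed rather than read off the printout. A short sketch using base R's arrayInd() and which(); the positions checked are the ones listed earlier in the thread:

```r
a   <- 1:24
a3d <- array(a, dim = c(3, 4, 2))

# Position 20 of the vector corresponds to row 2, column 3, slice 2
idx20 <- arrayInd(20, dim(a3d))
idx20

# The reverse lookup: where does the value 12 sit in the array?
idx12 <- which(a3d == 12, arr.ind = TRUE)
idx12

# The ", , 1" and ", , 2" headers in the printout are the third dimension:
# each header introduces one 3 x 4 slice.
dim(a3d[, , 2])
```

The array is filled down the first dimension first, then across the second, then across the third, which is why 4 lands in a3d[1, 2, 1].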


Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
 



Re: [R] anova(lme.model)

2010-11-06 Thread Mike Marchywka

 Date: Sat, 6 Nov 2010 07:45:26 -0700
 From: gunter.ber...@gene.com
 To: sibylle.stoec...@gmx.ch
 CC: r-help@r-project.org
 Subject: Re: [R] anova(lme.model)

 Sounds to me like you should really be seeking help from your local
 statistician, not this list. What you request probably cannot be done.


I'm still bringing my install up to speed so I can't immediately
read the cited R stuff below but it sounds like the OP
mentions a controversy documented in the R packages. Is there
a list for discussing these topics? Offhand that seems legitimate
for a user help list unless you want people to believe that 
"it came out of a computer so it must be right", whatever a P value
is. 



 What is wrong with what you get from lme, whose results seem fairly
 clear whether the P values are accurate or not?

 Cheers,
 Bert





 On Sat, Nov 6, 2010 at 4:04 AM, Sibylle Stöckli
  wrote:
  Dear R users
 
  Topic: Linear mixed-effects model fitting using the nlme package (recommended by 
  Pinheiro et al. 2008 for unbalanced data sets).
 
  The R help provides much info about the controversy over using the 
  anova(lme.model) function to present numerator df and F values. 
  Additionally, different p-values calculated by lme and anova are reported. 
  However, I have come across the same problem, and I would very much appreciate 
  some R help to fit an anova function to get similar p-values compared to 
  the lme function and additionally to provide corresponding F-values. I 
  tried to use contrasts and to deal with the 'unbalanced data set'.
 
  Thanks
  Sibylle
 
  Kaltenborn <- read.table("Kaltenborn_YEARS.txt", na.strings="*", header=TRUE)
 
 
  library(nlme)
 
  model5c <- lme(asin(sqrt(PropMortality))~Diversity+ 
  Management+Species+Height+Height*Diversity, data=Kaltenborn, 
  random=~1|Plot/SubPlot, na.action=na.omit, 
  weights=varPower(form=~Diversity), subset=Kaltenborn$ADDspecies!=1, 
  method="ML")
 
  summary(model5c)
  Linear mixed-effects model fit by maximum likelihood
   Data: Kaltenborn
   Subset: Kaltenborn$ADDspecies != 1
          AIC       BIC   logLik
    -249.3509 -205.4723 137.6755
 
  Random effects:
   Formula: ~1 | Plot
          (Intercept)
  StdDev:  0.06162279
 
   Formula: ~1 | SubPlot %in% Plot
          (Intercept)   Residual
  StdDev:  0.03942785 0.05946185
 
  Variance function:
   Structure: Power of variance covariate
   Formula: ~Diversity
   Parameter estimates:
      power
  0.7302087
  Fixed effects: asin(sqrt(PropMortality)) ~ Diversity + Management + Species 
  +  Height + Height * Diversity
                        Value  Std.Error  DF   t-value p-value
  (Intercept)       0.5422893 0.05923691 163  9.154585  0.0000
  Diversity        -0.0734688 0.02333159  14 -3.148896  0.0071
  Managementm+      0.0217734 0.02283375  30  0.953562  0.3479
  Managementu      -0.0557160 0.02286694  30 -2.436532  0.0210
  SpeciesPab       -0.2058763 0.02763737 163 -7.449198  0.0000
  SpeciesPm         0.0308005 0.02827782 163  1.089210  0.2777
  SpeciesQp         0.0968051 0.02689327 163  3.599602  0.0004
  Height           -0.0017579 0.00031667 163 -5.551251  0.0000
  Diversity:Height  0.0005122 0.00014443 163  3.546270  0.0005
   Correlation:
                   (Intr) Dvrsty Mngmn+ Mngmnt SpcsPb SpcsPm SpcsQp Height
  Diversity        -0.867
  Managementm+     -0.173 -0.019
  Managementu      -0.206  0.005  0.499
  SpeciesPab       -0.253  0.085  0.000  0.035
  SpeciesPm        -0.239  0.058  0.001  0.064  0.521
  SpeciesQp        -0.250  0.041 -0.001  0.032  0.502  0.506
  Height           -0.518  0.532 -0.037 -0.004  0.038  0.004  0.033
  Diversity:Height  0.492 -0.581  0.031 -0.008 -0.149 -0.099 -0.069 -0.904
 
  Standardized Within-Group Residuals:
          Min          Q1         Med          Q3         Max
  -2.99290873 -0.60522612 -0.05756772  0.62163049  2.80811502
 
  Number of Observations: 216
  Number of Groups:
               Plot SubPlot %in% Plot
                 16                48
 
  anova(model5c)
                   numDF denDF   F-value p-value
  (Intercept)          1   163 244.67887  <.0001
  Diversity            1    14   1.53025  0.2364
  Management           2    30   6.01972  0.0063
  Species              3   163  51.86699  <.0001
  Height               1   163  30.08090  <.0001
  Diversity:Height     1   163  12.57603  0.0005
 
 

 --
 Bert Gunter
 Genentech Nonclinical Biostatistics







Mike Marchywka | V.P. Technology

415-264-8477
marchy...@phluant.com

Online Advertising and Analytics for Mobile
http://www.phluant.com


  


[R] 3-way interaction simple slopes

2010-11-06 Thread Michael Wood
Can anyone show me how to test for significant simple slopes of a 3-way
interaction, with covariates.

my equation
tmod <- glm(PCL ~ rank.f + gender.f + MONTHS + CEXPOSE.M + bf.m +
            MONTHS*CEXPOSE.M*bf.m,
            data=mhatv, family=gaussian, na.action=na.omit)

Thank you
Mike



[R] Using changing names in loop in R

2010-11-06 Thread Tuatara

Hello everybody, 

I have usually solved this problem by repeating lines of codes instead of a
loop, but it's such a waste of time, I thought I should really learn how to
do it with loops:

What I want to do:

Say, I have several data files that differ only in a number, e.g. data
points (or vector, or matrix...) Data_1, Data_2, Data_3,... and I want to
manipulate them 

e.g. a simple sum of several data points

data <- c(NA,n)
for (i in 1:n){
data[i] <- Data_i + Data_[i-1]
  } 

I know that the above code doesn't work, and I don't want to combine the
files into one vector to solve the problem etc. - I would just like to know
how to make sure R recognizes the extension _i. I have the same problem for,
say, reading in datafiles that only differ by one digit in the extension,
and I want to (instead of repeating code) combine the process in a loop.

I hope I made myself clear to what my problem is.

Thanks for your help,

//F
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Using-changing-names-in-loop-in-R-tp3030132p3030132.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Using changing names in loop in R

2010-11-06 Thread Daniel Nordlund
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Tuatara
 Sent: Saturday, November 06, 2010 9:22 AM
 To: r-help@r-project.org
 Subject: [R] Using changing names in loop in R
 
 
 Hello everybody,
 
 I have usually solved this problem by repeating lines of codes instead of
 a
 loop, but it's such a waste of time, I thought I should really learn how
 to
 do it with loops:
 
 What I want to do:
 
 Say, I have several data files that differ only in a number, e.g. data
 points (or vector, or matrix...) Data_1, Data_2, Data_3,... and I want to
 manipulate them
 
 e.g. a simple sum of several data points
 
 data <- c(NA,n)
 for (i in 1:n){
 data[i] <- Data_i + Data_[i-1]
   }
 
 I know that the above code doesn't work, and I don't want to combine the
 files into one vector to solve the problem etc. - I would just like to
 know
 how to make sure R recognizes the extension _i. I have the same problem for,
 say, reading in datafiles that only differ by one digit in the extension,
 and I want to (instead of repeating code) combine the process in a loop.
 
 I hope I made myself clear to what my problem is.
 
 Thanks for your help,
 

This is one of those cases where a commented, self-contained, reproducible 
example would be very helpful in helping you.  You mention you have several 
data files, but I see no reference to data files in your code.  Did you mean 
data frames?  What is Data_i?  A data frame or something else?

You said you normally do this by repeating lines of code.  Can you show us a 
simple example?  Someone should be able to show you how to optimize it.

Dan

Daniel Nordlund
Bothell, WA USA
 



[R] Prettier axis labels when using log scales in Lattice

2010-11-06 Thread Marc Paterno
Hello,

I am trying to alter the way in which lattice functions (specifically xyplot) 
print the axis labels when one uses the 'scales' parameter.
I can obtain the effect I want by using
  scales=list(y=list(log=10, labels=expression(yvalues)))
where yvalues are the values that would have been printed as the y-axis labels 
if the labels argument had not been present. To help clarify what I am 
looking for, compare the first of the following plots with the second:

data(iris)
xyplot(Sepal.Length~Sepal.Width, iris, scales=list(y=list(log=10)))

xyplot(Sepal.Length~Sepal.Width, iris, scales=list(y=list(log=10, 
labels=expression(10^0.65,10^0.7,10^0.75,10^0.8,10^0.85,10^0.85,10^0.9))))

The second is the effect I am trying to achieve. Is there a way to do this 
without explicitly entering the expressions to be printed on the y-axis?

thanks,
Marc Paterno



Re: [R] Prettier axis labels when using log scales in Lattice

2010-11-06 Thread Gabor Grothendieck
On Sat, Nov 6, 2010 at 3:02 PM, Marc Paterno pate...@fnal.gov wrote:
 Hello,

 I am trying to alter the way in which lattice functions (specifically xyplot) 
 print the axis labels when one uses the 'scales' parameter.
 I can obtain the effect I want by using
  scales=list(y=list(log=10, labels=expression(yvalues)))
 where yvalues are the values that would have been printed as the y-axis 
 labels if the labels argument had not been present. To help clarify what I 
 am looking for, compare the first of the following plots with the second:

 data(iris)
 xyplot(Sepal.Length~Sepal.Width, iris, scales=list(y=list(log=10)))

 xyplot(Sepal.Length~Sepal.Width, iris, scales=list(y=list(log=10, 
 labels=expression(10^0.65,10^0.7,10^0.75,10^0.8,10^0.85,10^0.85,10^0.9))))

 The second is the effect I am trying to achieve. Is there a way to do this 
 without explicitly entering the expressions to be printed on the y-axis?


Try:

library(latticeExtra)
xyplot(Sepal.Length~Sepal.Width, iris, scales=list(y=list(log=10)),
   yscale.components = yscale.components.logpower)
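
If latticeExtra is not at hand, roughly the same labels can be generated by hand instead of typed out. This is a sketch only: the exponent range is an assumption chosen to cover log10(Sepal.Length) for iris, and it relies on my understanding that lattice takes `at` in the original units when `log` is set.

```r
library(lattice)

# Exponents for the tick marks (assumed range, covering the iris data)
expo <- seq(0.65, 0.90, by = 0.05)

# Build the plotmath labels 10^0.65, 10^0.7, ... programmatically
labs <- parse(text = paste("10^", expo, sep = ""))

# 'at' is given in original units here; lattice is assumed to log-transform it
p <- xyplot(Sepal.Length ~ Sepal.Width, iris,
            scales = list(y = list(log = 10, at = 10^expo, labels = labs)))
# print(p)   # draw the plot
```

The yscale.components.logpower approach has the advantage that the tick positions are chosen automatically for whatever data are plotted.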


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Using changing names in loop in R

2010-11-06 Thread Liviu Andronic
On Sat, Nov 6, 2010 at 5:22 PM, Tuatara franziskabro...@gmail.com wrote:

 Hello everybody,

 I have usually solved this problem by repeating lines of codes instead of a
 loop, but it's such a waste of time, I thought I should really learn how to
 do it with loops:


Would the following construct help?
> for(i in 1:10) assign(paste('x', i, sep=''), c(i:10))
> ls()
 [1] "i"    "pkg"  "tbbt" "x1"   "x10"  "x2"   "x3"   "x4"   "x5"   "x6"
[11] "x7"   "x8"   "x9"
> for(i in 1:10) print(get(paste('x', i, sep='')))
 [1]  1  2  3  4  5  6  7  8  9 10
[1]  2  3  4  5  6  7  8  9 10
[1]  3  4  5  6  7  8  9 10
[1]  4  5  6  7  8  9 10
[1]  5  6  7  8  9 10
[1]  6  7  8  9 10
[1]  7  8  9 10
[1]  8  9 10
[1]  9 10
[1] 10

Read ?assign, ?get, but also this fortune:
> fortune('assign')

The only people who should use the assign function are those who fully
understand why you should never use the assign function.
   -- Greg Snow
  R-help (July 2009)

I haven't yet figured out why I should heed this. Regards
Liviu
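
A common alternative to assign()/get(), and probably what the fortune is hinting at, is to keep the numbered objects in one list and index it. The sketch below is an illustration under assumptions: the file names Data_1.csv ... Data_3.csv and the column `v` are invented, and the files are written to a temporary directory only so that the example is self-contained.

```r
# Create three hypothetical input files that differ only in a number
dir <- tempdir()
for (i in 1:3) {
  write.csv(data.frame(v = i * 1:2),
            file.path(dir, sprintf("Data_%d.csv", i)), row.names = FALSE)
}

# Build the file names from the number and read everything into one list
files <- file.path(dir, sprintf("Data_%d.csv", 1:3))
Data  <- lapply(files, read.csv)
names(Data) <- sprintf("Data_%d", 1:3)

# Data[[i]] now plays the role of Data_i, so a loop can use the index directly
sums <- vapply(Data, function(d) sum(d$v), numeric(1))
sums
```

With the list in place, an expression such as Data[[i]]$v + Data[[i - 1]]$v works inside a loop without any name pasting.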


 What I want to do:

 Say, I have several data files that differ only in a number, e.g. data
 points (or vector, or matrix...) Data_1, Data_2, Data_3,... and I want to
 manipulate them

 e.g. a simple sum of several data points

data <- c(NA,n)
for (i in 1:n){
data[i] <- Data_i + Data_[i-1]
                  }

 I know that the above code doesn't work, and I don't want to combine the
 files into one vector to solve the problem etc. - I would just like to know
 how to make sure R recognizes the extension _i. I have the same problem for,
 say, reading in datafiles that only differ by one digit in the extension,
 and I want to (instead of repeating code) combine the process in a loop.

 I hope I made myself clear to what my problem is.

 Thanks for your help,

 //F
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Using-changing-names-in-loop-in-R-tp3030132p3030132.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail



Re: [R] anova(lme.model)

2010-11-06 Thread Ben Bolker
Mike Marchywka marchywka at hotmail.com writes:

 
 
  Date: Sat, 6 Nov 2010 07:45:26 -0700
  From: gunter.berton at gene.com
  To: sibylle.stoeckli at gmx.ch
  CC: r-help at r-project.org
  Subject: Re: [R] anova(lme.model)
 
  Sounds to me like you should really be seeking help from your local
  statistician, not this list. What you request probably cannot be done.
 
 I'm still bringing my install up to speed so I can't immediately
 read the cited R stuff below but it sounds like the OP
 mentions a controversy documented in the R packages. Is there
 a list for discussing these topics? Offhand that seems legitimate
 for a user help list unless you want people to believe that 
 "it came out of a computer so it must be right", whatever a P value
 is. 

  It's not documented within the packages themselves, 
it's documented in the r-sig-mixed-models archives, and at 
http://rwiki.sciviews.org/doku.php?id=guides:lmer-tests,
and in the R FAQ
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-are-p_002dvalues-not-displayed-when-using-lmer_0028_0029_003f
and here
https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html

  cheers
   Ben Bolker
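
One likely reason the two sets of p-values in the original post differ (a reading of the output, not something stated in this thread): anova() on a single lme fit tests terms sequentially by default, while the t-tests in summary() are marginal. The sketch below uses the Orthodont data shipped with nlme, since the Kaltenborn data are not available; the model is illustrative only.

```r
library(nlme)

# Illustrative model with an interaction, fitted to a built-in data set
fm <- lme(distance ~ age * Sex, data = Orthodont, random = ~ 1 | Subject)

seq_tab  <- anova(fm)                     # sequential tests (the default)
marg_tab <- anova(fm, type = "marginal")  # each term adjusted for all others
t_tab    <- summary(fm)$tTable            # marginal t-tests

# For single-df terms the marginal F-tests agree with the t-tests
cbind(marginal = marg_tab[["p-value"]], t.test = t_tab[, "p-value"])
```

This only explains why the numbers differ; whether either set of p-values is trustworthy is the separate question covered by the links above.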



[R] Hashing and environments

2010-11-06 Thread Levy, Roger
Hi,

I'm trying to write a general-purpose lexicon class and associated methods 
for storing and accessing information about large numbers of specific words 
(e.g., their frequencies in different genres).  Crucial to making such a class 
practically useful is to get hashing working correctly so that information 
about specific words can be accessed quickly.  But I've never really understood 
very well how hashing works, so I'm having trouble.

Here is an example of what I've done so far:

***

setClass("Lexicon",representation(e="environment"))
setMethod("initialize","Lexicon",function(.Object,wfreqs) {
.Object@e <- new.env(hash=T,parent=emptyenv())
assign("wfreqs",wfreqs,envir=.Object@e)
return(.Object)
})

## function to access word frequencies
wfreq <- function(lexicon,word) {
return(get("wfreqs",envir=lexicon@e)[word])
}

## example of use
my.lexicon <- new("Lexicon",wfreqs=c(the=2,person=1))
wfreq(my.lexicon,"the")

***

However, testing indicates that the way I have set this up does not achieve the 
intended benefits of having the environment hashed:

***

sample.wfreqs <- trunc(runif(1e5,max=100))
names(sample.wfreqs) <- as.character(1:length(sample.wfreqs))
lex <- new("Lexicon",wfreqs=sample.wfreqs)
words.to.lookup <- trunc(runif(100,min=1,max=1e5))
## look up the words directly from the sample.wfreqs vector
system.time({
for(i in words.to.lookup)
sample.wfreqs[as.character(i)]
},gcFirst=TRUE)
## look up the words through the wfreq() function; time approx the same
system.time({
for(i in words.to.lookup)
wfreq(lex,as.character(i))
},gcFirst=TRUE)

***

I'm guessing that the problem is that the indexing of the wfreqs vector in my 
wfreq() function is not happening inside the actual lexicon's environment.  
However, I have not been able to figure out the proper call to get the lookup 
to happen inside the lexicon's environment.  I've tried

wfreq1 <- function(lexicon,word) {
return(eval(wfreqs[word],envir=lexicon@e))
}

which I'd thought should work, but this gives me an error:

> wfreq1(my.lexicon,'the')
Error in eval(wfreqs[word], envir = lexicon@e) : 
  object 'wfreqs' not found

Any advice would be much appreciated!

Best & many thanks in advance,

Roger

--

Roger Levy                      Email: rl...@ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy



[R] SMATR common slopes test

2010-11-06 Thread Eugenio Larios
Hi All,

I am confused by SMATR's test for common slope. My null hypothesis here is
that all slopes are parallel (common slopes?), right?
So if I get a p value < 0.05, does that mean we can confidently reject it,
i.e. that the slopes are different?
Or is it the other way around, meaning that we have statistical confidence that
the slopes are parallel?
thanks
-- 
Eugenio Larios
PhD Student
University of Arizona.
Ecology & Evolutionary Biology.
(520) 481-2263
elari...@email.arizona.edu



[R] saddle points in optim

2010-11-06 Thread Jonathan Phillips
Hi,
I've been trying to use optim to minimise least squares for a
function, and then get a guess at the error using the hessian matrix
(calculated from numDeriv::hessian, which I read in some other r-help
post was meant to be more accurate than the hessian given in optim).

To get the standard error estimates, I'm calculating
sqrt(diag(solve(x))), hope that's correct.

I've found that using numDeriv's hessian gets me some NaNs for errors,
whereas the one from optim gets me numbers for all parameters.  If I
look for eigenvalues for numDeriv::hessian, I get two negative numbers
(and six positive - I'm fitting to eight parameters), so does this
mean that optim hasn't converged correctly, and has hit a saddle
point?  If so, is there any way I could assist it to find the minimum?
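
The checks described above can be sketched on a toy least-squares problem. Everything here is an illustrative assumption (the straight-line model, the simulated data, base R's optimHess() instead of numDeriv). Note that for a sum-of-squares objective the Hessian has to be rescaled: the covariance is approximately 2 * sigma^2 * solve(H), so sqrt(diag(solve(H))) on its own is appropriate for a negative log-likelihood rather than for a raw sum of squares.

```r
set.seed(42)
x <- seq(0, 10, length.out = 50)
y <- 2 + 0.5 * x + rnorm(50, sd = 0.3)

# Objective: residual sum of squares for a straight line (toy model)
rss <- function(p) sum((y - p[1] - p[2] * x)^2)

fit <- optim(c(0, 0), rss, method = "BFGS")
H   <- optimHess(fit$par, rss)

# At a proper minimum all eigenvalues of the Hessian are positive;
# negative ones suggest a saddle point or a poorly converged fit
ev <- eigen(H, symmetric = TRUE)$values
all(ev > 0)

# Standard errors for least squares: rescale by the residual variance
sigma2 <- fit$value / (length(y) - length(fit$par))
se     <- sqrt(diag(2 * sigma2 * solve(H)))
se
```

If some eigenvalues are negative, restarting optim() from the reported parameters, rescaling the parameters via control = list(parscale = ...), or trying another method are the usual first things to try.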

Thanks,
Jon Phillips



[R] I want to call some R function in C code

2010-11-06 Thread 刘力平
Hi, all:

I want the user to give a function g as the parameter of our function f. In
function f, we use user's function g to compute something.  Since our
function f is implemented in C++, how do I further pass this function to C++
code?

Thank you very much!

best
Liping LIU



Re: [R] likelyhood maximization problem with polr

2010-11-06 Thread tjb


blackscorpio wrote:
 
 Dear community,
 
 I am currently trying to fit an ordinal logistic regression model with the
 polr function. I often get the same error message :
 
 attempt to find suitable starting values failed, for example with :
 ...
 Does anyone have a clue ?
 

Yes. The code that generates a starting value for the optimization is
flaky. lrm in the Design library appears to be more robust.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/likelyhood-maximization-problem-with-polr-tp2528818p3030397.html
Sent from the R help mailing list archive at Nabble.com.



[R] How to generate multivariate uniform distribution random numbers?

2010-11-06 Thread michael
I wish to generate 100 by 1 vector of x1 and x2 both are uniform distributed
with covariance matrix \Sigma.

Thanks,

Michael



Re: [R] bugs and misfeatures in polr(MASS).... fixed!

2010-11-06 Thread tjb

The start value generation code in polr is also known to fail quite
frequently. For example, against the Iris data as recently posted to this
list by blackscorpio (  Sep 6, 2010).

 polr(Species~Sepal_Length+Sepal_Width+Petal_Length+Petal_Width,data=iris)
Error in polr(Species ~ Sepal_Length + Sepal_Width + Petal_Length +
Petal_Width,  : 
  attempt to find suitable starting values failed
In addition: Warning messages:
1: In glm.fit(X, y1, wt, family = binomial(), offset = offset) :
  algorithm did not converge
2: In glm.fit(X, y1, wt, family = binomial(), offset = offset) :
  fitted probabilities numerically 0 or 1 occurred

I suggest that simply setting the coefficients beta to zero and the
cutpoints zeta to sensible values will always produce a feasible starting
point for non-pathological data. Here is my code that does this:

if(missing(start)) {
  # try something that should always work -tjb
  u <- as.integer(table(y))
  u <- (cumsum(u)/sum(u))[1:q]
  zetas <-
     switch(method,
            "logistic" = qlogis(u),
            "probit"   = qnorm(u),
            "cauchit"  = qcauchy(u),
            "cloglog"  = -log(-log(u)) )
  s0 <- c(rep(0,pc),zetas[1],log(diff(zetas)))
}

Using this start code the problem is not manifested. 

> source('fixed-polr.R')
> polr(Species~Sepal_Length+Sepal_Width+Petal_Length+Petal_Width,data=iris)
Call:
polr(formula = Species ~ Sepal_Length + Sepal_Width + Petal_Length + 
    Petal_Width, data = iris)

Coefficients:
Sepal_Length  Sepal_Width Petal_Length  Petal_Width 
   -2.466724    -6.671515     9.431689    18.270058 

Intercepts:
   setosa|versicolor versicolor|virginica 
            4.080189            42.639320 

Residual Deviance: 11.89857 
AIC: 23.89857 

My change would also likely fix the problem reported by Kevin Coombes on May
6, 2010.
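
The starting-value idea above can be tried outside polr. The following is a stand-alone sketch for the logistic link only, using the built-in iris data; `q` and `pc` are set by hand here, whereas inside polr they come from the model frame.

```r
y  <- iris$Species
q  <- nlevels(y) - 1    # number of cutpoints
pc <- 4                 # number of predictors (assumed: the four iris measurements)

# Cumulative class proportions, dropping the last one (which is always 1)
u <- (cumsum(table(y)) / length(y))[seq_len(q)]

# Cutpoints on the logit scale, with all slope coefficients started at zero
zetas <- qlogis(u)
s0 <- c(rep(0, pc), zetas[1], log(diff(zetas)))
s0
```

With slopes at zero, the model reduces to the observed class proportions, so the cutpoints derived from them should be finite whenever every class is non-empty.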
-- 
View this message in context: 
http://r.789695.n4.nabble.com/bugs-and-misfeatures-in-polr-MASS-fixed-tp3024677p3030405.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] bugs and misfeatures in polr(MASS).... fixed!

2010-11-06 Thread tjb

Note that the enhancements in my original post solve the unresolved
problem of Chaehyung Ahn (22 Mar 2005), whose data I reproduce:

y,x,lx
0,3.2e-02,-1.49485
0,3.2e-02,-1.49485
0,1.0e-01,-1.0
0,1.0e-01,-1.0
0,3.2e-01,-0.49485
0,3.2e-01,-0.49485
1,1.0e+00,0.0
0,1.0e+00,0.0
1,3.2e+00,0.50515
1,3.2e+00,0.50515
0,1.0e+01,1.0
1,1.0e+01,1.0
1,3.2e+01,1.50515
2,3.2e+01,1.50515
2,1.0e+02,2.0
1,1.0e+02,2.0
2,3.2e+02,2.50515
1,3.2e+02,2.50515
2,1.0e+03,3.0
2,1.0e+03,3.0

Using the MASS version we get

> ahn$y <- as.factor(ahn$y)
> summary(polr(y~lx,data=ahn))

Re-fitting to get Hessian

Error in optim(s0, fmin, gmin, method = "BFGS", hessian = Hess, ...) : 
  initial value in 'vmmin' is not finite

Whereas,

> source('fixed-polr.R')
> summary(polr(y~lx,data=ahn))

Re-fitting to get Hessian

Call:
polr(formula = y ~ lx, data = ahn)

Coefficients:
   Value Std. Error t value
lx 2.421 0.8146   2.971

Intercepts:
Value  Std. Error t value
0|1 0.5865 0.8118 0.7224 
1|2 4.8966 1.7422 2.8106 

Residual Deviance: 20.43286 
AIC: 26.43286 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/bugs-and-misfeatures-in-polr-MASS-fixed-tp3024677p3030411.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Hashing and environments

2010-11-06 Thread William Dunlap
I would make an environment called wfreqsEnv
whose entry names are your words and whose entry
values are the information about the words.  I find
it convenient to use [[ to make it appear to be
a list (instead of using exists(), assign(), and get()).
E.g., the following enters 100,000 words sampled from a
list of 17,576 and records their id numbers and the
number of times each is found in the sample.

> wfreqsEnv <- new.env(hash=TRUE, parent = emptyenv())
> words <- do.call(paste, c(list(sep=""), expand.grid(LETTERS,
+    letters, letters)))
> # length(words) == 17576
> set.seed(1)
> samp <- sample(seq_along(words), size=100000, replace=TRUE)
> system.time(for(i in samp) {
+    word <- words[i]
+    if (is.null(wfreqsEnv[[word]])) { # new entry
+        wfreqsEnv[[word]] <- list(Count=1, EntryNo=i)
+    } else { # update existing entry
+        wfreqsEnv[[word]]$Count <- wfreqsEnv[[word]]$Count + 1
+    }
+})
   user  system elapsed 
   2.28    0.00    2.14 
(The time, in seconds, is from an ancient Windows laptop, c. 2002.)

Here is a small check that we are getting what we expect:
> words[14736]
[1] "Tuv"
> wfreqsEnv[["Tuv"]]
$Count
[1] 8

$EntryNo
[1] 14736

> sum(samp==14736)
[1] 8

If we do this with a non-hashed environment we get the same
answers but the elapsed time is now 34.81 seconds instead of
2.14.  If you make wfreqsEnv be a list instead of an environment
then that time is 74.12 seconds (and the answers are the same).
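
The same idea can be packaged as two small functions. This is only a
minimal sketch: the helper names count_words() and lookup_count() are
hypothetical and not from the thread. It uses exists(..., inherits = FALSE)
rather than is.null(env[[word]]), so a stored NULL would not be mistaken
for a missing entry.

```r
# Hypothetical helpers (not from the thread): count words in a hashed
# environment, keeping one entry per word.
count_words <- function(words) {
  env <- new.env(hash = TRUE, parent = emptyenv())
  for (w in words) {
    if (exists(w, envir = env, inherits = FALSE)) {
      env[[w]] <- env[[w]] + 1L      # update existing entry
    } else {
      env[[w]] <- 1L                 # new entry
    }
  }
  env
}

# Look up one word; returns 0 for words that were never seen.
lookup_count <- function(env, word) {
  if (exists(word, envir = env, inherits = FALSE)) env[[word]] else 0L
}

counts <- count_words(c("the", "person", "the", "cat", "the"))
lookup_count(counts, "the")      # 3
lookup_count(counts, "dog")      # 0
sort(ls(counts))                 # "cat" "person" "the"
```

The benefit comes from having one environment entry per word. Storing a
whole named vector under a single name, as in the question quoted below,
lets the hash find only the vector; the lookup by name inside the vector
is then an ordinary vector search.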

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Levy, Roger
 Sent: Saturday, November 06, 2010 1:39 PM
 To: r-help@r-project.org
 Subject: [R] Hashing and environments
 
 Hi,
 
 I'm trying to write a general-purpose lexicon class and 
 associated methods for storing and accessing information 
 about large numbers of specific words (e.g., their 
 frequencies in different genres).  Crucial to making such a 
 class practically useful is to get hashing working correctly 
 so that information about specific words can be accessed 
 quickly.  But I've never really understood very well how 
 hashing works, so I'm having trouble.
 
 Here is an example of what I've done so far:
 
 ***
 
 setClass("Lexicon",representation(e="environment"))
 setMethod("initialize","Lexicon",function(.Object,wfreqs) {
   .Object@e <- new.env(hash=T,parent=emptyenv())
   assign("wfreqs",wfreqs,envir=.Object@e)
   return(.Object)
   })
 
 ## function to access word frequencies
 wfreq <- function(lexicon,word) {
   return(get("wfreqs",envir=lexicon@e)[word])
 }
 
 ## example of use
 my.lexicon <- new("Lexicon",wfreqs=c(the=2,person=1))
 wfreq(my.lexicon,"the")
 ***
 
 However, testing indicates that the way I have set this up 
 does not achieve the intended benefits of having the 
 environment hashed:
 
 ***
 
 sample.wfreqs <- trunc(runif(1e5,max=100))
 names(sample.wfreqs) <- as.character(1:length(sample.wfreqs))
 lex <- new("Lexicon",wfreqs=sample.wfreqs)
 words.to.lookup <- trunc(runif(100,min=1,max=1e5))
 ## look up the words directly from the sample.wfreqs vector
 system.time({
   for(i in words.to.lookup)
     sample.wfreqs[as.character(i)]
   },gcFirst=TRUE)
 ## look up the words through the wfreq() function; time approx the same
 system.time({
   for(i in words.to.lookup)
     wfreq(lex,as.character(i))
   },gcFirst=TRUE)
 
 ***
 
 I'm guessing that the problem is that the indexing of the 
 wfreqs vector in my wfreq() function is not happening inside 
 the actual lexicon's environment.  However, I have not been 
 able to figure out the proper call to get the lookup to 
 happen inside the lexicon's environment.  I've tried
 
 wfreq1 <- function(lexicon,word) {
   return(eval(wfreqs[word],envir=lexicon@e))
 }
 
 which I'd thought should work, but this gives me an error:
 
 > wfreq1(my.lexicon,'the')
 Error in eval(wfreqs[word], envir = lexicon@e) : 
   object 'wfreqs' not found
 
 Any advice would be much appreciated!
 
 Best & many thanks in advance,
 
 Roger
 
 --
 
 Roger Levy                      Email: rl...@ucsd.edu
 Assistant Professor             Phone: 858-534-7219
 Department of Linguistics       Fax:   858-534-4789
 UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy
 


Re: [R] Using changing names in loop in R

2010-11-06 Thread Joshua Wiley
Hi,

If you have data that is similar enough to warrant only changing the
extension (i.e., 1, 2, etc.) and that you (at least at times) wish to
perform operations on together, it is time to start thinking of a more
flexible framework.  Fortunately, such a framework already exists in
lists.  Lists let you keep diverse data structures (i.e., you do not
have to combine everything into a simple vector).  Lists make it
tremendously easy to do many tasks (including the two you mentioned).
For example, suppose I want to read in 10 files and then do some
manipulations:

## initialize an empty list
dat <- vector(mode = "list", length = 10)

for(i in 1:10) {
  # other read.table() arguments can be added as needed
  dat[[i]] <- read.table(paste("myfilename", i, sep = ''), header = TRUE)
}

# presumably these are data frames at this point

Now everything is nicely stored, dat[[1]] is even intuitively similar
to Data_1, Data_2, etc.  At this point, suppose I want to create some
new stuff:

dat[[11]] <- matrix(1, ncol = 1, nrow = 5) # add in a matrix

for(i in 2:10) { # start at 2, since there is no dat[[0]]
  dat[[i]] <- dat[[i]] + dat[[i - 1]]
}

dat[[12]] <- 5 # just a little vector

Another great advantage of lists is that it is possible to name their
elements.  This can make things more meaningful or aid the memory.
However, even when an element is named, it can still be accessed by
its index

# name the first element 'price'
names(dat)[[1]] <- "price"

Now I could access it as any of these:

dat$price
dat[["price"]]
dat[[1]]

If that weren't enough, you can easily use functions on every element
of a list with constructs such as lapply(), no for loop required!

lapply(X = dat, FUN = mean, na.rm = TRUE)

It is possible to not use lists and still do what you are after, but
frankly it is messier, more prone to error, and less effective in many
cases.  It is generally a very nice feature of the assignment operator
that it is aware of its environment and does not go assigning or
overwriting things where you do not expect.  You're left to the wolves
and your own wits with assign().
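
To make the contrast concrete, here is a small sketch with made-up
vectors standing in for Data_1, Data_2, ... (the object names and values
are illustrative only, not from the original question). The first half
uses assign()/get() with constructed names; the second half does the same
work with one named list.

```r
# Illustrative data only; 'Data_1' etc. are stand-ins, not real files.

# Approach 1: constructed variable names with assign()/get()
for (i in 1:3) {
  assign(paste("Data", i, sep = "_"), c(i, i * 10))
}
sums_assign <- numeric(3)
for (i in 1:3) {
  sums_assign[i] <- sum(get(paste("Data", i, sep = "_")))
}

# Approach 2: one named list, no name construction needed afterwards
dat <- lapply(1:3, function(i) c(i, i * 10))
names(dat) <- paste("Data", 1:3, sep = "_")
sums_list <- sapply(dat, sum)

sums_assign          # 11 22 33
sums_list            # same values, carrying the element names
dat$Data_2           # 2 20
```

With the list, every later step (sums, plots, models) is a single
lapply() or sapply() call instead of another round of paste() and get().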

HTH,

Josh

On Sat, Nov 6, 2010 at 9:22 AM, Tuatara franziskabro...@gmail.com wrote:

 Hello everybody,

 I have usually solved this problem by repeating lines of code instead of a
 loop, but it's such a waste of time, I thought I should really learn how to
 do it with loops:

 What I want to do:

 Say, I have several data files that differ only in a number, e.g. data
 points (or vector, or matrix...) Data_1, Data_2, Data_3,... and I want to
 manipulate them

 e.g. a simple sum of several data points

data <- c(NA,n)
for (i in 1:n){
data[i] <- Data_i + Data_[i-1]
                  }

 I know that the above code doesn't work, and I don't want to combine the
 files into one vector to solve the problem etc. - I would just like to know
 how to make sure R recognizes the extension _i. I have the same problem for
 say, reading in datafiles that only differ by one digit in the extension,
 and I want to (instead of repeating code) combine the process in a loop.

 I hope I made myself clear to what my problem is.

 Thanks for your help,

 //F
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Using-changing-names-in-loop-in-R-tp3030132p3030132.html
 Sent from the R help mailing list archive at Nabble.com.


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] SMATR common slopes test

2010-11-06 Thread Kevin Middleton

Eugenio -

 I am confused with SMATR's test for common slope. My null hypothesis here is
 that all slopes are parallel (common slopes?), right?
 So if I get a p value  0.05 means that we can have confidence to reject it?
 That slopes are different?
 Or the other way around? it means that we have statistical confidence that
 the slopes are parallel?

Try this:

library(smatr)  # provides slope.com()

set.seed(5)
n <- 20
x <- rnorm(n)

y1 <- 2 * x + rnorm(n)
y2 <- 2 * x + rnorm(n)
y3 <- 4 * x + rnorm(n)

# Slopes approximately equal
slope.com(x = c(x, x), y = c(y1, y2), groups = rep(c(1,2), each = n))

#$p
#[1] 0.4498037

# Slopes of 2 and 4
slope.com(x = c(x, x), y = c(y1, y3), groups = rep(c(1,2), each = n))

#$p
#[1] 0.0003850332
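
Reading the two results: the null hypothesis is that the groups share a
common slope, so the large p-value (0.45) gives no evidence against
parallel slopes, while the small one (0.0004) rejects them. The same
logic can be sketched without SMATR using an ordinary least-squares
interaction test in base R. Note that this is a swapped-in technique
(OLS, not standardised major axis), so its p-values will not match those
from slope.com(); the helper name slope_p() is hypothetical.

```r
set.seed(5)
n <- 20
x <- rnorm(n)
y1 <- 2 * x + rnorm(n)
y2 <- 2 * x + rnorm(n)
y3 <- 4 * x + rnorm(n)

# Hypothetical helper: p-value for H0 "both groups have the same OLS slope",
# taken from the F test that compares models with and without x:group.
slope_p <- function(x, ya, yb) {
  d <- data.frame(x = c(x, x), y = c(ya, yb),
                  g = factor(rep(c(1, 2), each = length(x))))
  fit0 <- lm(y ~ x + g, data = d)   # common slope, separate intercepts
  fit1 <- lm(y ~ x * g, data = d)   # separate slopes
  anova(fit0, fit1)[2, "Pr(>F)"]
}

p_same <- slope_p(x, y1, y2)   # true slopes 2 and 2
p_diff <- slope_p(x, y1, y3)   # true slopes 2 and 4
c(p_same = p_same, p_diff = p_diff)
```

A small p_diff means the common-slope model fits clearly worse than the
separate-slopes model, which is the "reject parallel slopes" case.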


Cheers,
Kevin

-
Kevin M. Middleton
Department of Biology
California State University San Bernardino



Re: [R] How to generate multivariate uniform distribution random

2010-11-06 Thread Ted Harding
On 06-Nov-10 21:54:41, michael wrote:
 I wish to generate 100 by 1 vector of x1 and x2 both are uniform
 distributed with covariance matrix \Sigma.
 
 Thanks,
 Michael

First, some comments.

1. I don't think you mean a 100 by 1 vector of x1 and x2 since
   you have two variables. 100 by 2 perhaps?

2. Given the range over which x1 is to be uniformly distributed
   (say of length A) and the range for x2 (say of length B),
   then by virtue of the uniform distribution the variance of x1
   will be (A^2)/12 and the variance of x2 will be (B^2)/12,
   so you don't have a choice about these. So your only free
   parameter is the covariance (or the correlation) between
   x1 and x2.

That said, it is a curious little problem. Here is one simple
solution (though it may have properties that you do not want).
Using X and Y for x1 and x2 (for simplicity), and using ranges
(-1/2,1/2) for the ranges of X and Y (you could rescale later):

1. Generate X uniformly distributed on (-1/2,1/2).
2. With probability p (0 < p < 1) let Y=X.
   Otherwise, let Y be independently uniform on (-1/2,1/2).

This can be implemented by the following code:

  n  <- 100
  p  <- 0.6             # (say)
  X  <- runif(n)-1/2
  Z  <- runif(n)-1/2
  U  <- rbinom(n,1,p)   # =1 where Y=X, = 0 when Y independent of X
  Y  <- X*U + Z*(1-U)
  XY <- cbind(X,Y)      # your n by 2 matrix of bivariate (X,Y)

Now the marginal distributions of X and Y are both uniform.
var(X) = 1/12 and var(Y) = 1/12. Since the means are 0,
cov(X,Y) = Exp(X*Y) = p*Exp(X^2) = p/12
cor(X,Y) = cov(X,Y)/sqrt(var(X)*var(Y)) = (p/12)/(1/12) = p.

So all you need to do to get a desired correlation rho between
X and Y is to set p = rho.

The one thing you may not like about this is the diagonal line
you will get for the cases where X=Y:

  plot(X,Y)

Test:

  n  <- 100000
  p  <- 0.6 # (say)
  X  <- runif(n)-1/2
  Z  <- runif(n)-1/2
  U  <- rbinom(n,1,p)
  Y  <- X*U + Z*(1-U)
  var(X)
  # [1] 0.08340525  # theory: var = 1/12 = 0.0833
  var(Y)
  # [1] 0.08318914  # theory: var = 1/12 = 0.0833
  cov(X,Y)
  # [1] 0.04953733  # theory: cov = p/12 = 0.6/12 = 0.05
  cor(X,Y)
  # [1] 0.5947063   # theory: cor = p = 0.6
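
The recipe can be wrapped in a function and rescaled to arbitrary ranges,
since a linear rescaling of X and Y leaves their correlation unchanged.
This is a sketch only: the name rmixunif() and its arguments are
assumptions made for illustration, not part of the method as posted.

```r
# Hypothetical helper: n draws of (X, Y), each uniform on its own range,
# with correlation rho (0 <= rho <= 1), using the "Y = X with
# probability rho" mixture described above.
rmixunif <- function(n, rho, xrange = c(0, 1), yrange = c(0, 1)) {
  stopifnot(rho >= 0, rho <= 1)
  X <- runif(n) - 1/2
  Z <- runif(n) - 1/2
  U <- rbinom(n, 1, rho)          # 1 where Y = X, 0 where Y is independent
  Y <- X * U + Z * (1 - U)
  # shift and scale from (-1/2, 1/2) to the requested ranges
  cbind(X = xrange[1] + (X + 1/2) * diff(xrange),
        Y = yrange[1] + (Y + 1/2) * diff(yrange))
}

set.seed(42)
XY <- rmixunif(1e5, rho = 0.6, xrange = c(0, 10), yrange = c(-1, 1))
cor(XY)[1, 2]            # close to 0.6
apply(XY, 2, range)      # within (0, 10) and (-1, 1)
var(XY[, "X"])           # close to 10^2 / 12
```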

It would be interesting to see a solution which did not involve
having cases with X=Y!

Hoping this helps,
Ted.


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 06-Nov-10   Time: 23:07:26
-- XFMail --



Re: [R] Hashing and environments

2010-11-06 Thread Kjetil Halvorsen
Some of this can be automated using the CRAN package hash.

Kjetil




Re: [R] SMATR common slopes test

2010-11-06 Thread Eugenio Larios
Great, thanks a lot!




-- 
Eugenio Larios
PhD Student
University of Arizona.
Ecology & Evolutionary Biology.
(520) 481-2263
elari...@email.arizona.edu



Re: [R] How to generate multivariate uniform distribution random

2010-11-06 Thread michael
Ted,

  Thanks for your help, it is right on the money!

for your comments:
1. Yes I mean 100 by 2, each variable x1, x2 is 100 by 1.
2. The correlation is the only free parameter.

Michael





Re: [R] Removing NA in ggplot

2010-11-06 Thread Ottar Kvindesland
OK, any reason why ggplot2 does not allow filtering of NA?


ottar

On 6 November 2010 15:23, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:

 Create a subset of your data that excludes the NAs before you feed it to
 ggplot.

 Ottar Kvindesland ottar.kvindesl...@gmail.com wrote:

 Hi list,
 
 I just got stuck with this one:
 
 In Data I have the sets age (numbers 1 to 99 and NA) and gender (M, F and
 NA). Then getting some nice plots using
 
 ggplot(data, aes(age[na.exclude(gender)])) +
   geom_histogram( binwidth = 3, aes(y = ..density.. ), fill = "lightblue" ) +
   facet_grid( gender ~ .)
 
 I am trying to get a faceted graph of age distribution excluding the NA
 data for gender.
 
 Unfortunately I end up with the error message:
 
 Error in data.frame(..., check.names = FALSE) :
 arguments imply differing number of rows: 206, 219
 
 How do I wash out NAs in this situation?
 
 
 Regards
 
 ottar
 

 ---
 Jeff Newmiller                        The     .       .  Go Live...
 DCN:jdnew...@dcn.davis.ca.us        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
 /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
 ---
 Sent from my phone. Please excuse my brevity.




Re: [R] Removing NA in ggplot

2010-11-06 Thread Joshua Wiley
On Sat, Nov 6, 2010 at 4:43 PM, Ottar Kvindesland
ottar.kvindesl...@gmail.com wrote:
 OK, any reason why ggplot2 does not allow filtering of NA?

It is not so much that ggplot2 does not allow the filtering of NA
values, it is that you need to use data from the dataset you
specified.  By subsetting in aes() rather than in data, ggplot2 has
differing datasets that it is being told to work with, so it returns
an error (I'm sure that is a simplification, but the general point).

Do your exclusion in the data argument.  I imagine something like
this, but untested since I have nothing to test it on.

ggplot(data[!is.na(data$gender), ], aes(age)) +
  geom_histogram( binwidth = 3, aes(y = ..density.. ), fill = "lightblue" ) +
  facet_grid( gender ~ .)
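
A sketch of the subsetting step with made-up data (the data frame d and
its values are invented for illustration). The key point is that the row
filter is a logical vector built with is.na(), applied to the data frame
before it reaches ggplot(). The plotting call is only attempted if
ggplot2 is installed.

```r
# Invented example data: ages, with some genders missing
d <- data.frame(age    = c(23, 35, 41, 58, 62, NA, 30),
                gender = c("M", "F", NA, "F", "M", "M", NA))

# Keep only rows where gender is known; age may still be NA
d_known <- d[!is.na(d$gender), ]
nrow(d_known)            # 5
table(d_known$gender)    # F: 2, M: 3

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d_known, aes(age)) +
    geom_histogram(binwidth = 3, aes(y = ..density..), fill = "lightblue") +
    facet_grid(gender ~ .)
  # print(p) to draw; newer ggplot2 versions prefer after_stat(density)
}
```

Because every aesthetic now comes from the same filtered data frame, the
"differing number of rows" error cannot arise.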

HTH,

Josh




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] Using changing names in loop in R

2010-11-06 Thread Tuatara

A more detailed example:

Say I would like to read in data files that are set up identically and have
nearly identical text names (see below):

data_1 <- read.csv("data1.txt")
data_2 <- read.csv("data2.txt")
data_3 <- read.csv("data3.txt")

How do I automate this process?

(I assume the way I make R understand that the data file extension is to be
read as a number rather than a string is the same for things like applying
functions to matrices with different extensions, e.g. data_i, i = 1,2,3)
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Using-changing-names-in-loop-in-R-tp3030132p3030412.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to generate multivariate uniform distribution random

2010-11-06 Thread G. Jay Kerns
Dear Michael,

On Sat, Nov 6, 2010 at 7:27 PM, michael tufemich...@gmail.com wrote:
 Ted,

      Thanks for your help, it is right on the money!

 for your comments:
 1. Yes I mean 100 by 2, each variable x1, x2 is 100 by 1.
 2. The correlation is the only free parameter.

 Michael



I like Ted's solution.  If all you are looking for is unif(0,1), you
could use the Probability Integral Transform;  something like this:

set.seed(1)

library(MASS)
S <- matrix(c(1, 0.9, 0.9, 1), nrow = 2)
X <- mvrnorm(100, mu = c(0,0), Sigma = S)
Y <- pnorm(X)

var(Y)
cor(Y)

You could also use copulas, but those depend on contributed packages
(and you can read more about them on the CRAN Task View for
probability distributions).
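
One caveat with the transform: the correlation of the uniforms in Y is
somewhat smaller than the 0.9 used for the normals. For a bivariate
normal with correlation r, the resulting uniforms have correlation
(6/pi) * asin(r/2), so a target correlation rho for the uniforms can be
hit by starting from r = 2 * sin(pi * rho / 6). Below is a sketch under
that relation; the function name rcopunif() is hypothetical.

```r
library(MASS)

# Hypothetical helper: n draws of a matrix of unif(0,1) columns whose
# pairwise Pearson correlations are given by the matrix R_unif.
rcopunif <- function(n, R_unif) {
  R_norm <- 2 * sin(pi * R_unif / 6)   # correlation needed on the normal scale
  diag(R_norm) <- 1
  X <- mvrnorm(n, mu = rep(0, ncol(R_norm)), Sigma = R_norm)
  pnorm(X)
}

set.seed(1)
R <- matrix(c(1, 0.9, 0.9, 1), nrow = 2)
Y <- rcopunif(1e5, R)
cor(Y)[1, 2]     # close to 0.9
range(Y)         # inside (0, 1)
var(Y[, 1])      # close to 1/12
```

Unlike the mixture approach, this construction has no points lying
exactly on the line X = Y.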

Hope this helps,
Jay


__
G. Jay Kerns, Ph.D.
Youngstown State University
http://people.ysu.edu/~gkerns/



Re: [R] Using changing names in loop in R

2010-11-06 Thread Joshua Wiley
On Sat, Nov 6, 2010 at 3:13 PM, Tuatara franziskabro...@gmail.com wrote:

 A more detailed example:

 Say I would like to read in data files that are set up identically and have
 nearly identical text names (see below):

 data_1 <- read.csv("data1.txt")
 data_2 <- read.csv("data2.txt")
 data_3 <- read.csv("data3.txt")

 How do I automate this process?

Well, the (nearly) verbatim automated duplicate would be:

for(i in 1:3) {
  assign(x = paste("data", i, sep = "_"), value =
    read.csv(paste("data", i, ".txt", sep = '')))
}

but the preferred way would be:

dat <- lapply(1:3, function(x) {read.csv(paste("data", x, ".txt", sep = ''))})

which would read in and store all three files in one convenient list.
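
For a version that can be run as-is, the sketch below first writes three
small CSV files to a temporary directory (the file contents are
invented), then reads them back with lapply() and names the list
elements. file.path() and sprintf() are used here instead of paste()
purely as a matter of taste.

```r
# Write three tiny example files so the sketch is self-contained
tmp <- tempfile("demo_")
dir.create(tmp)
for (i in 1:3) {
  write.csv(data.frame(id = 1:2, value = c(i, i * 2)),
            file = file.path(tmp, sprintf("data%d.txt", i)),
            row.names = FALSE)
}

# Read them all into one list, then name the elements data_1, data_2, ...
files <- file.path(tmp, sprintf("data%d.txt", 1:3))
dat <- lapply(files, read.csv)
names(dat) <- paste("data", 1:3, sep = "_")

names(dat)                                 # "data_1" "data_2" "data_3"
dat$data_2$value                           # 2 4
sapply(dat, function(d) sum(d$value))      # 3 6 9

unlink(tmp, recursive = TRUE)              # clean up
```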

Josh






-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] How to generate multivariate uniform distribution random

2010-11-06 Thread michael
Jay,

  Yes I'm looking for unif(0,1) and your method works just fine. I
suppose your method should work for dimensions greater than 2, am I right?

Michael



Re: [R] How to generate multivariate uniform distribution random

2010-11-06 Thread G. Jay Kerns
On Sat, Nov 6, 2010 at 8:22 PM, michael tufemich...@gmail.com wrote:
 Jay,

   Yes I'm looking for unif(0,1) and your method works just fine. I
 suppose your method should work for dimensions greater than 2, am I right?

 Michael


Yes, but it gets that much more tricky to specify the covariance
matrix.  Two ways around this are to suppose that Sigma has a
simplified correlation structure, or again, to use copulas.
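
As an illustration of a simplified correlation structure, the sketch
below uses a single common correlation r between every pair of variables
(compound symmetry), which gives a valid correlation matrix for any
0 <= r < 1. The dimension, the value of r, and the sample size are
arbitrary choices for the example.

```r
library(MASS)

set.seed(1)
d <- 4            # number of variables
r <- 0.5          # common correlation on the normal scale
S <- matrix(r, nrow = d, ncol = d)
diag(S) <- 1      # compound-symmetry correlation matrix

X <- mvrnorm(1e4, mu = rep(0, d), Sigma = S)
Y <- pnorm(X)     # each column is unif(0,1)

dim(Y)            # 10000 4
round(cor(Y), 2)  # off-diagonal entries near (6/pi) * asin(r/2), about 0.48
```

Only one number has to be chosen, however many variables there are, at
the cost of forcing all pairs to be equally correlated.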

Jay



Re: [R] Where to get rcom for Linux

2010-11-06 Thread Stephen Liu
- Original Message 

From: Shige Song shiges...@gmail.com
To: Stephen Liu sati...@yahoo.com
Cc: r-help@r-project.org
Sent: Sun, November 7, 2010 12:17:42 AM
Subject: Re: [R] Where to get rcom for Linux

 isn't COM a Windows-only technology?

Hi Shige

Thanks.

I see.  I was surprised to be unable to find it after searching all over the
Internet.  Which Windows version (Win7/Vista/Win Server 2008) would be more
suitable for running RBloomberg?

B.R.
Stephen L


On Sat, Nov 6, 2010 at 12:12 PM, Stephen Liu sati...@yahoo.com wrote:
 Hi folks,

 Debian 600 64-bit

 Is rcom for Linux available?

 rcom
 rcom: R COM Client Interface and internal COM Server
 http://cran.r-project.org/web/packages/rcom/index.html

 If YES please advise where to get it.

 TIA

 B.R.
 Stephen L











[R] postscript window size

2010-11-06 Thread threshold

Dear R users, simple figure: 

postscript(file="~/Desktop/figure.ps", horizontal=T, width=20, height=10)
par(mfcol=c(2,5))
plot(rnorm(100), type='l')
plot(rnorm(100), type='l')
plot(rnorm(100), type='l')
plot(rnorm(100), type='l')
plot(rnorm(100), type='l')
##-
plot(rnorm(100), type='l')
plot(rnorm(100), type='l')
plot(rnorm(100), type='l')
plot(rnorm(100), type='l')
plot(rnorm(100), type='l')
dev.off()

The figure does not resize when saved to the Desktop. I can leave out 'width'
and 'height' and get the same result. The individual plot(rnorm()) panels come
out too rectangular; I want them closer to squares (stretched horizontally).

Working on Ubuntu and need to paste 'figure.ps' into latex (Kile). 
Thanks a lot, robert

-- 
View this message in context: 
http://r.789695.n4.nabble.com/postscript-window-size-tp3030514p3030514.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] postscript window size

2010-11-06 Thread Joshua Wiley
Hi Robert,

You need to add paper = "special" to the postscript() call.

postscript(file="~/Desktop/figure.ps", horizontal=TRUE,
   width=20, height=10, paper="special")
par(mfcol=c(2,5))
for (i in 1:10) plot(rnorm(100), type='l')  # the ten panels
dev.off()

Otherwise it is reset because you are specifying a size outside of
what can fit on the default paper size.

Once that is changed, it should work (tested both ways on Debian :))
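
Since the file is headed for LaTeX, here is a hedged variant that writes a
single-page EPS instead. It relies on the combination described on the
?postscript help page (onefile = FALSE, horizontal = FALSE,
paper = "special"); the output path under tempdir() is only a placeholder
for this sketch.

```r
# Write a 2 x 5 grid of line plots to an EPS file sized 20 x 10 inches.
eps_file <- file.path(tempdir(), "figure.eps")  # placeholder output path

postscript(file = eps_file, width = 20, height = 10,
           paper = "special",    # use width/height as given, not a paper size
           horizontal = FALSE,   # no landscape rotation
           onefile = FALSE)      # one plot per file, EPS-compatible output
par(mfcol = c(2, 5))
for (i in 1:10) plot(rnorm(100), type = "l")
dev.off()

file.exists(eps_file)
```

The resulting file can then be pulled in with \includegraphics from the
graphicx package, scaled to whatever width the document needs.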

Cheers,

Josh

On Sat, Nov 6, 2010 at 6:23 PM, threshold r.kozar...@gmail.com wrote:

 Dear R users, simple figure:

 postscript(file="~/Desktop/figure.ps", horizontal=T, width=20, height=10)
 par(mfcol=c(2,5))
 plot(rnorm(100), type='l')
 plot(rnorm(100), type='l')
 plot(rnorm(100), type='l')
 plot(rnorm(100), type='l')
 plot(rnorm(100), type='l')
 ##-
 plot(rnorm(100), type='l')
 plot(rnorm(100), type='l')
 plot(rnorm(100), type='l')
 plot(rnorm(100), type='l')
 plot(rnorm(100), type='l')
 dev.off()

 Does not resize when saved on Desktop. I could ignore 'width' and 'height',
 and the same result. Particular figures plot(rnorm()) are too rectangular,
 when I want to have them more like squares (stretched horizontally).

 Working on Ubuntu and need to paste 'figure.ps' into latex (Kile).
 Thanks a lot, robert





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



[R] can't load nlme on windoze 7

2010-11-06 Thread Mike Marchywka

Hi,

I've got a problem that sounds a lot like this,

http://r.789695.n4.nabble.com/Re-R-R-2-12-0-hangs-while-loading-RGtk2-on-FreeBSD-td3005929.html

under windoze 7.

but it seems to hang with this stack trace,

#0  0x77830190 in ntdll!LdrFindResource_U ()

   from /cygdrive/c/Windows/system32/ntdll.dll




building goes as follows,

$ ./R CMD INSTALL --no-test-load nlme_3.1-97.tar.gz
* installing to library 'C:/pfs/R/R-2.11.1/library'
* installing *source* package 'nlme' ...
** libs
  making DLL ...
gcc -IC:/pfs/R/R-2.11.1/include -O3 -Wall  -std=gnu99 -c corStruct.c -o corStruct.o
gcc -IC:/pfs/R/R-2.11.1/include -O3 -Wall  -std=gnu99 -c gnls.c -o gnls.o
gcc -IC:/pfs/R/R-2.11.1/include -O3 -Wall  -std=gnu99 -c init.c -o init.o
gcc -IC:/pfs/R/R-2.11.1/include -O3 -Wall  -std=gnu99 -c matrix.c -o matrix.o
gcc -IC:/pfs/R/R-2.11.1/include -O3 -Wall  -std=gnu99 -c nlOptimizer.c -o nlOptimizer.o
gcc -IC:/pfs/R/R-2.11.1/include -O3 -Wall  -std=gnu99 -c nlme.c -o nlme.o
gcc -IC:/pfs/R/R-2.11.1/include -O3 -Wall  -std=gnu99 -c nlmefit.c -o nlmefit.o
gcc -IC:/pfs/R/R-2.11.1/include -O3 -Wall  -std=gnu99 -c nls.c -o nls.o
gcc -IC:/pfs/R/R-2.11.1/include -O3 -Wall  -std=gnu99 -c pdMat.c -o pdMat.o
gcc -shared -s -static-libgcc -o nlme.dll tmp.def corStruct.o gnls.o init.o matrix.o nlOptimizer.o nlme.o nlmefit.o nls.o pdMat.o -LC:/pfs/R/R-2.11.1/bin -lR
installing to C:/pfs/R/R-2.11.1/library/nlme/libs
  ... done
** R
** data
**  moving datasets to lazyload DB
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...

* DONE (nlme)


$ gcc --version
gcc (GCC) 4.3.4 20090804 (release) 1
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


$ gdb
GNU gdb 6.8.0.20080328-cvs (cygwin-special)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as i686-pc-cygwin.
(gdb) target exec R.exe
(gdb) run
Starting program: /cygdrive/c/pfs/R/R-2.11.1/bin/R.exe
[New thread 20844.0x5368]
Error: dll starting at 0x7742 not found.
Error: dll starting at 0x769c not found.
Error: dll starting at 0x7742 not found.
Error: dll starting at 0x7754 not found.
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
Error: dll starting at 0x4a0b not found.

R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(nlme)
[New thread 20844.0x5154]
[Switching to thread 20844.0x5154]
Quit
(gdb) bt
#0  0x77830190 in ntdll!LdrFindResource_U ()
   from /cygdrive/c/Windows/system32/ntdll.dll
(gdb)






Mike Marchywka | V.P. Technology

415-264-8477
marchy...@phluant.com

Online Advertising and Analytics for Mobile
http://www.phluant.com


  


Re: [R] 3-way interaction simple slopes

2010-11-06 Thread David Winsemius


On Nov 6, 2010, at 9:06 AM, Michael Wood wrote:

Can anyone show me how to test for significant simple slopes of a 3-way
interaction, with covariates.



You might start by defining what you mean by simple slopes when  
discussing a model with a three way interaction. You might also  
include a description of the variables involved.




my equation
tmod <- glm(PCL ~ rank.f + gender.f + MONTHS + CEXPOSE.M + bf.m +
            MONTHS*CEXPOSE.M*bf.m,
            data=mhatv, family=gaussian, na.action=na.omit)

Thank you
Mike
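
Under one common reading, a "simple slope" is the slope of one predictor
evaluated at chosen values of the other two. For a fitted three-way
interaction it is a linear combination of coefficients, so its standard error
follows from vcov(). The sketch below uses simulated data and the stand-in
names M, C, B and y (assumptions, since the mhatv data are not available),
and lm() in place of the Gaussian glm(); covariates would simply be extra
terms in the formula.

```r
set.seed(1)
n <- 200
d <- data.frame(M = rnorm(n), C = rnorm(n), B = rnorm(n))  # simulated stand-ins
d$y <- 1 + 0.5 * d$M + 0.3 * d$C - 0.2 * d$B +
       0.4 * d$M * d$C * d$B + rnorm(n)

fit <- lm(y ~ M * C * B, data = d)

# Slope of M when C = c0 and B = b0:
#   b_M + b_{M:C} * c0 + b_{M:B} * b0 + b_{M:C:B} * c0 * b0
simple_slope <- function(fit, c0, b0) {
  w   <- c("M" = 1, "M:C" = c0, "M:B" = b0, "M:C:B" = c0 * b0)
  est <- sum(w * coef(fit)[names(w)])
  V   <- vcov(fit)[names(w), names(w)]
  se  <- sqrt(drop(t(w) %*% V %*% w))      # SE of the linear combination
  tv  <- est / se
  p   <- 2 * pt(abs(tv), df = df.residual(fit), lower.tail = FALSE)
  c(estimate = est, se = se, t = tv, p = p)
}

ss <- simple_slope(fit, c0 = 1, b0 = -1)
ss
```

An equivalent check is to refit the model after shifting C and B so that the
values of interest sit at zero; the coefficient on M in the refit is then the
same simple slope.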



David Winsemius, MD
West Hartford, CT



Re: [R] About 5.1 Arrays

2010-11-06 Thread Stephen Liu
Hi Daniel,


 I am correcting you. :-)  You are using dim incorrectly, and not accessing
 the array correctly.  In all of your examples you should be using
 dim = c(3,4,2).  Then you need to specify the indexes of the array element
 you want to look at. So, to use your example

Thanks for your correction.  So the index to be used in my example should be 
(3,4,2) only.


But I'm still not very clear about your advice, as follows

> a <- 1:24
> a3d <- array(a, dim = c(3,4,2))
> a3d
, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

, , 2

     [,1] [,2] [,3] [,4]
[1,]   13   16   19   22
[2,]   14   17   20   23
[3,]   15   18   21   24

>
> # 1 maps to a[1, 1, 1] in the 3d array
> a3d[1, 1, 1]
[1] 1
>
> # 2 maps to a[2, 1, 1].
> a3d[2, 1, 1]
[1] 2
>
> # 3 maps to a[3, 1, 1]
> a3d[3, 1, 1]
[1] 3
>
> # 4 maps to a[1, 2, 1]
> a3d[1, 2, 1]
[1] 4


What does it mean;

[1] 1
[1] 2
[1] 3
[1] 4

as mentioned ?
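
For what it is worth, the leading [1] on those lines is not part of the
value. R labels each printed line with the position of the first element
shown on it, and a single array element is a vector of length one, so the
label is always [1]. The number after it is the element itself. A small
sketch using the same a and a3d as above:

```r
a   <- 1:24
a3d <- array(a, dim = c(3, 4, 2))

x <- a3d[1, 2, 1]   # one element: row 1, column 2, layer 1
length(x)           # a vector of length one, hence the "[1]" label
x == a[4]           # it is the 4th value of the original vector

col2 <- a3d[, 2, 1] # leaving an index empty returns every position there
length(col2)        # three elements, still printed starting from "[1]"
```

So the four printed lines simply say that the selected elements are 1, 2, 3
and 4.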

Anyway I'll move/continue on the manual to see what will happen.

B.R.
Stephen L






- Original Message 
From: Daniel Nordlund djnordl...@frontier.com
To: r-help@r-project.org
Sent: Sun, November 7, 2010 2:08:04 AM
Subject: Re: [R] About 5.1 Arrays

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Stephen Liu
 Sent: Saturday, November 06, 2010 7:38 AM
 To: Joshua Wiley
 Cc: r-help@r-project.org
 Subject: Re: [R] About 5.1 Arrays
 
 Hi Joshua,
 
 Thanks for your advice.
 
 1)
 Re your advice:-[quote]
 > a3d
 , , 1 <--- this is the first position of the third dimension
 
      [,1] [,2] [,3] [,4]  <--- positions 1, 2, 3, 4 of the second dimension
 [1,]    1    4    7   10
 [2,]    2    5    8   11
 [3,]    3    6    9   12
  ^  the first dimension
 
 , , 2 <--- the second position of the third dimension
 ...
 [/quote]
 
 Where is the third dimension?
 
 
 2)
 Re your advice:-[quote]
 so you can think that in the original vector a:
 1 maps to a[1, 1, 1] in the 3d array
 2 maps to a[2, 1, 1].
 3 maps to a[3, 1, 1]
 4 maps to a[1, 2, 1]
 12 maps to a[3, 4, 1]
 20 maps to a[2, 3, 2]
 24 maps to a[3, 4, 2]
 [/quote]
 
 My finding;
 
 # 1 maps to a[1, 1, 1] in the 3d array
 > a3d <- array(a, dim = c(1, 1, 1))
 > a3d
 , , 1
 
      [,1]
 [1,]    1
 
 Correct
 
 # 2 maps to a[2, 1, 1].
 > a3d <- array(a, dim = c(2, 1, 1))
 > a3d
 , , 1
 
      [,1]
 [1,]    1
 [2,]    2
 
 Correct
 
 # 3 maps to a[3, 1, 1]
 > a3d <- array(a, dim = c(3, 1, 1))
 > a3d
 , , 1
 
      [,1]
 [1,]    1
 [2,]    2
 [3,]    3
 
 Correct
 
 # 4 maps to a[1, 2, 1]
 > a3d <- array(a, dim = c(1, 2, 1))
 > a3d
 , , 1
 
      [,1] [,2]
 [1,]    1    2
 
 Incorrect.  It is 2
 
 
 # 12 maps to a[3, 4, 1]
 > a3d <- array(a, dim = c(3, 4, 1))
 > a3d
 , , 1
 
      [,1] [,2] [,3] [,4]
 [1,]    1    4    7   10
 [2,]    2    5    8   11
 [3,]    3    6    9   12
 
 Correct
 
 # 20 maps to a[2, 3, 2]
 > a3d <- array(a, dim = c(2, 3, 2))
 > a3d
 , , 1
 
      [,1] [,2] [,3]
 [1,]    1    3    5
 [2,]    2    4    6
 
 , , 2
 
      [,1] [,2] [,3]
 [1,]    7    9   11
 [2,]    8   10   12
 
 Incorrect.  It is 12
 
 
 #  24 maps to a[3, 4, 2]
 > a3d <- array(a, dim = c(3, 4, 2))
 > a3d
 , , 1
 
      [,1] [,2] [,3] [,4]
 [1,]    1    4    7   10
 [2,]    2    5    8   11
 [3,]    3    6    9   12
 
 , , 2
 
      [,1] [,2] [,3] [,4]
 [1,]   13   16   19   22
 [2,]   14   17   20   23
 [3,]   15   18   21   24
 
 Correct.
 
 If I'm wrong, pls correct me.  Thanks
 
 
 B.R.
 Stephen
 

Stephen,

I am correcting you. :-)  You are using dim incorrectly, and not accessing
the array correctly.  In all of your examples you should be using
dim = c(3,4,2).  Then you need to specify the indexes of the array element
you want to look at. So, to use your example

> a <- 1:24
> a3d <- array(a, dim = c(3,4,2))
> a3d
, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

, , 2

     [,1] [,2] [,3] [,4]
[1,]   13   16   19   22
[2,]   14   17   20   23
[3,]   15   18   21   24

>
> # 1 maps to a[1, 1, 1] in the 3d array
> a3d[1, 1, 1]
[1] 1
>
> # 2 maps to a[2, 1, 1].
> a3d[2, 1, 1]
[1] 2
>
> # 3 maps to a[3, 1, 1]
> a3d[3, 1, 1]
[1] 3
>
> # 4 maps to a[1, 2, 1]
> a3d[1, 2, 1]
[1] 4
[1] 4
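
The mapping behind these lookups can be spelled out. R fills arrays in
column-major order, so the first index varies fastest. With dim = c(3,4,2),
element [i, j, k] sits at position i + (j-1)*3 + (k-1)*3*4 of the original
vector. The helper name lin_index below is made up for this sketch;
arrayInd() is the base R function that goes the other way.

```r
dims <- c(3, 4, 2)
a3d  <- array(1:24, dim = dims)

# Hypothetical helper: position in the underlying vector of element [i, j, k].
lin_index <- function(i, j, k, dims) {
  i + (j - 1) * dims[1] + (k - 1) * dims[1] * dims[2]
}

lin_index(1, 2, 1, dims)  # 4, so a3d[1, 2, 1] holds the 4th value
lin_index(2, 3, 2, dims)  # 20, so a3d[2, 3, 2] holds the 20th value
a3d[2, 3, 2]              # 20

arrayInd(20, dims)        # back from position 20 to indices 2, 3, 2
```

This is why changing dim changes which value appears at a given index: the
data vector stays the same, only the arithmetic that locates an element
differs.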


Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA








Re: [R] Hashing and environments

2010-11-06 Thread Levy, Roger
Wow, that is perfect: the hash package is exactly what I needed.  Thank you!

Roger

On Nov 6, 2010, at 4:09 PM, Kjetil Halvorsen wrote:

 some of this can be automated using the CRAN package
 hash.
 
 Kjetil
 
 On Sat, Nov 6, 2010 at 10:43 PM, William Dunlap wdun...@tibco.com wrote:
 I would make an environment called wfreqsEnv
 whose entry names are your words and whose entry
 values are the information about the words.  I find
 it convenient to use [[ to make it appear to be
 a list (instead of using exists(), assign(), and get()).
 E.g., the following enters the 100,000 words from a
 list of 17,576 and records their id numbers and the
 number of times each is found in the sample.
 
 > wfreqsEnv <- new.env(hash=TRUE, parent = emptyenv())
 > words <- do.call(paste, c(list(sep=""), expand.grid(LETTERS,
 +     letters, letters)))
 > # length(words) == 17576
 > set.seed(1)
 > samp <- sample(seq_along(words), size=1e5, replace=TRUE)
 > system.time(for(i in samp) {
 +    word <- words[i]
 +    if (is.null(wfreqsEnv[[word]])) { # new entry
 +        wfreqsEnv[[word]] <- list(Count=1, EntryNo=i)
 +    } else { # update existing entry
 +        wfreqsEnv[[word]]$Count <- wfreqsEnv[[word]]$Count + 1
 +    }
 + })
    user  system elapsed
    2.28    0.00    2.14
 (The time, in seconds, is from an ancient Windows laptop, c. 2002.)
 
 Here is a small check that we are getting what we expect:
 > words[14736]
 [1] "Tuv"
 > wfreqsEnv[["Tuv"]]
 $Count
 [1] 8
 
 $EntryNo
 [1] 14736
 
 > sum(samp==14736)
 [1] 8
 
 If we do this with a non-hashed environment we get the same
 answers but the elapsed time is now 34.81 seconds instead of
 2.14.  If you make wfreqsEnv be a list instead of an environment
 then that time is 74.12 seconds (and the answers are the same).
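 
 Applied to the Lexicon class from the original question, the same idea might
 look like the sketch below. It is only one possible arrangement, not the
 poster's code: each word gets its own binding in the hashed environment
 (via list2env()), so a lookup is a single hashed name lookup rather than a
 name search through one long vector stored under a single name. Returning
 NA for unknown words is an assumption made for this sketch.

```r
library(methods)

setClass("Lexicon", representation(e = "environment"))

setMethod("initialize", "Lexicon", function(.Object, wfreqs) {
  e <- new.env(hash = TRUE, parent = emptyenv())
  # One binding per word: the names of wfreqs become the keys.
  list2env(as.list(wfreqs), envir = e)
  .Object@e <- e
  .Object
})

# Look up one word's frequency; unknown words give NA (a choice for this sketch).
wfreq <- function(lexicon, word) {
  if (exists(word, envir = lexicon@e, inherits = FALSE)) {
    get(word, envir = lexicon@e, inherits = FALSE)
  } else {
    NA_real_
  }
}

my.lexicon <- new("Lexicon", wfreqs = c(the = 2, person = 1))
wfreq(my.lexicon, "the")
wfreq(my.lexicon, "absent")
```

 In the original class the environment holds exactly one object, the whole
 wfreqs vector, so hashing the environment cannot speed up the per-word
 search inside that vector.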
 
 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com
 
 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On Behalf Of Levy, Roger
 Sent: Saturday, November 06, 2010 1:39 PM
 To: r-help@r-project.org
 Subject: [R] Hashing and environments
 
 Hi,
 
 I'm trying to write a general-purpose lexicon class and
 associated methods for storing and accessing information
 about large numbers of specific words (e.g., their
 frequencies in different genres).  Crucial to making such a
 class practically useful is to get hashing working correctly
 so that information about specific words can be accessed
 quickly.  But I've never really understood very well how
 hashing works, so I'm having trouble.
 
 Here is an example of what I've done so far:
 
 ***
 
 setClass("Lexicon",representation(e="environment"))
 setMethod("initialize","Lexicon",function(.Object,wfreqs) {
   .Object@e <- new.env(hash=T,parent=emptyenv())
   assign("wfreqs",wfreqs,envir=.Object@e)
   return(.Object)
   })
 
 ## function to access word frequencies
 wfreq <- function(lexicon,word) {
   return(get("wfreqs",envir=lexicon@e)[word])
 }
 
 ## example of use
 my.lexicon <- new("Lexicon",wfreqs=c(the=2,person=1))
 wfreq(my.lexicon,"the")
 
 ***
 
 However, testing indicates that the way I have set this up
 does not achieve the intended benefits of having the
 environment hashed:
 
 ***
 
 sample.wfreqs <- trunc(runif(1e5,max=100))
 names(sample.wfreqs) <- as.character(1:length(sample.wfreqs))
 lex <- new("Lexicon",wfreqs=sample.wfreqs)
 words.to.lookup <- trunc(runif(100,min=1,max=1e5))
 ## look up the words directly from the sample.wfreqs vector
 system.time({
   for(i in words.to.lookup)
   sample.wfreqs[as.character(i)]
   },gcFirst=TRUE)
 ## look up the words through the wfreq() function; time approx the same
 system.time({
   for(i in words.to.lookup)
   wfreq(lex,as.character(i))
   },gcFirst=TRUE)
 
 ***
 
 I'm guessing that the problem is that the indexing of the
 wfreqs vector in my wfreq() function is not happening inside
 the actual lexicon's environment.  However, I have not been
 able to figure out the proper call to get the lookup to
 happen inside the lexicon's environment.  I've tried
 
 wfreq1 <- function(lexicon,word) {
   return(eval(wfreqs[word],envir=lexicon@e))
 }
 
 which I'd thought should work, but this gives me an error:
 
 > wfreq1(my.lexicon,'the')
 Error in eval(wfreqs[word], envir = lexicon@e) :
   object 'wfreqs' not found
 
 Any advice would be much appreciated!
 
 Best & many thanks in advance,
 
 Roger
 
 --
 
 Roger Levy  Email: rl...@ucsd.edu
 Assistant Professor Phone: 858-534-7219
 Department of Linguistics   Fax:   858-534-4789
 UC San Diego                Web:   http://ling.ucsd.edu/~rlevy
 
 
 
 

[R] is this matrix symmetric

2010-11-06 Thread Jun Shen
Hi,

I have this symmetric matrix, at least I think so.

 col1  col2  col3
[1,] 0.20 0.05 0.06
[2,] 0.05 0.10 0.03
[3,] 0.06 0.03 0.08

or

structure(c(0.2, 0.05, 0.06, 0.05, 0.1, 0.03, 0.06, 0.03, 0.08
), .Dim = c(3L, 3L), .Dimnames = list(NULL, c("var1", "var2",
"var3")))

But isSymmetric() doesn't agree. Any comment? I am on R 2.10.1 Thanks.
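
A likely explanation, offered as a sketch rather than a verified diagnosis of
this exact setup: the numbers are symmetric, but the dimnames are not. The
matrix has column names and no row names, and isSymmetric() treats a matrix
as symmetric only if its row and column names agree as well (the
?isSymmetric help page suggests unname() for this case). The name m is
introduced here just to hold the posted matrix.

```r
m <- structure(c(0.2, 0.05, 0.06, 0.05, 0.1, 0.03, 0.06, 0.03, 0.08),
               .Dim = c(3L, 3L),
               .Dimnames = list(NULL, c("var1", "var2", "var3")))

isSymmetric(m)            # FALSE: row names (NULL) differ from column names
isSymmetric(unname(m))    # TRUE: the values themselves are symmetric

# Alternatively, give the rows the same names as the columns.
m2 <- m
rownames(m2) <- colnames(m2)
isSymmetric(m2)           # TRUE
```

Either dropping the names or making them match leaves the numeric content
unchanged.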

Jun

