date:20100322

The updated package has been submitted to CRAN and has begun to propagate to
CRAN mirrors.

Package: WriteXLS

Version: 1.9.0

Description: Cross-platform perl based R function to create Excel (XLS) files
from one or more data frames. Each data frame will be written to a separate
named worksheet in the Excel spreadsheet. The worksheet name will be the name
of the data frame it contains or can be specified by the user.

Author(s): Marc Schwartz marc_schwa...@me.com
Maintainer: Marc Schwartz marc_schwa...@me.com

License: GPL (>= 2)

URL: http://r-forge.r-project.org/projects/writexls/

Key changes since version 1.8.1:

New arguments:

1. 'AdjWidth' for approximate auto column width adjustments to the longest
(widest) entry in each column. This is approximate because the built-in AutoFit
functions are not accessible from Perl. The approximation used will typically
result in a column width that is somewhat too wide rather than too narrow and
is based upon using the default font of Arial 10. Default is FALSE.

2. 'AutoFilter' for setting up autofiltering for each column. Default is FALSE.

3. 'BoldHeaderRow' to add bold font to header row entries. Default is FALSE.

4. 'FreezeRow' and 'FreezeCol' to set up frozen panes in each worksheet.
Default values are 0 and 0, where there are no frozen panes created.

The above new options will apply to ALL worksheets created in the XLS file.
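
As a rough illustration, the new arguments might be combined in a single call along these lines. This is only a sketch: the data frame `scores` and the output file name are made up for the example, and the export is skipped unless the WriteXLS package and a working Perl installation are found.

```r
# Hypothetical example data; not part of the announcement.
scores <- data.frame(id    = 1:3,
                     name  = c("alpha", "beta", "a much longer entry"),
                     value = c(1.5, 2.25, 3))

out_file <- file.path(tempdir(), "scores.xls")

# Only attempt the export when the package and Perl are usable.
if (requireNamespace("WriteXLS", quietly = TRUE) &&
    isTRUE(WriteXLS::testPerl(verbose = FALSE))) {
  WriteXLS::WriteXLS("scores", ExcelFileName = out_file,
                     AdjWidth      = TRUE,  # approximate auto column width
                     AutoFilter    = TRUE,  # autofilter on each column
                     BoldHeaderRow = TRUE,  # bold header row
                     FreezeRow     = 1,     # freeze the header row
                     FreezeCol     = 0)     # no frozen columns
}
```

Since the options apply to every worksheet, the same settings would be used for each data frame named in the first argument.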

Please note that after researching the potential for being able to append new
worksheets to an existing XLS file, this does not appear to be a robust option
via Perl. The combination of the required Perl packages Spreadsheet::ParseExcel
and Spreadsheet::WriteExcel does not support the preservation of many
pre-existing worksheet objects as noted in:

http://search.cpan.org/~jmcnamara/Spreadsheet-WriteExcel-2.37/lib/Spreadsheet/WriteExcel.pm#MODIFYING_AND_REWRITING_EXCEL_FILES

These include embedded graphics, cell formulae, macros, etc. which would be
lost during the worksheet appending process. Via Perl, it appears that one
cannot simply open an XLS file, add a new worksheet and then close the file.
One has to open the existing file, read each existing worksheet, write the
existing worksheets to a new file, append the new worksheets to the new file
and then close both files. Thus, given this limitation using Perl and the
potential for compromising the content of existing XLS files, there are no
plans at present to add the ability to append new worksheets to an existing
file to this package.

Thanks and regards,

Marc Schwartz

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SEM PACKAGE

2010-03-22 Thread John Fox

Dear Isaac,

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of Isaac SAGAON TEYSSIER
> Sent: March-22-10 8:31 AM
> To: r-help@r-project.org
> Subject: [R] SEM PACKAGE
>
> Dear all,
>
> I would like to know if it is possible to estimate multi-group SEM by using
> R...

Not at present with the sem package, but take a look at OpenMx
http://openmx.psyc.virginia.edu/, currently under development.

Regards,
 John

John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

> Thank you


Re: [R] importing .bil files

On Mon, Mar 22, 2010 at 12:09 PM, Sebastian Leuzinger
sebastian.leuzin...@env.ethz.ch wrote:
> Dear list
>
> Has anyone got a recipe at hand to import .bil files into R? From what I
> understand the .bil files I got contain layered matrices which I would like
> to make available in R as an array or list.
>
> GIS people seem to be familiar with the .bil format but I am not using any
> GIS software and would prefer to deal with the data in R.
>
> I use the latest version of R on Mac OS X 10.5.8.

GIS and spatial data formats can often be handled by readGDAL (for
raster grids) from the rgdal package.

.bil files seem to be handled by the EHdr driver in GDAL:

http://www.gdal.org/frmt_various.html

so if your rgdal package has that driver (run gdalDrivers() to see)
then you may be sorted.

Barry


Re: [R] Gamma parametrization

2010-03-22 Thread Randall Wrong

Thank you very much Jay.

2010/3/19 G. Jay Kerns gke...@ysu.edu

> Dear Randall,
>
> On Fri, Mar 19, 2010 at 10:24 AM, Randall Wrong randall.wr...@gmail.com
> wrote:
>> Dear R users,
>>
>> ?rgamma gives me :
>>
>>     rgamma(n, shape, rate = 1, scale = 1/rate)
>>
>>     rate: an alternative way to specify the scale.
>>
>>     The Gamma distribution with parameters shape = a and
>>     scale = s has density
>>     f(x)= 1/(s^a Gamma(a)) x^(a-1) e^-(x/s)
>>
>> Should I understand that scale=1/rate ? Is it written somewhere ?
>
> You are kidding, right?  It is written 8 lines above your question, by
> my count.  :-)
>
> Perhaps you meant rate = 1/scale.
>
>> Then rgamma(n, shape=a, scale = s) should be equivalent to rgamma(n,
>> shape=a, rate =1/s).
>
> Yep:
> dgamma(2, shape = 3, scale = 4)
> dgamma(2, shape = 3, rate = 1/4)
>
>> I don't find this very clear.
>>
>> Thanks for your help.
>>
>> Randall
>
> The point is that some books (and software) parameterize by the
> 'scale', and a whole other bunch parameterize by the 'rate'.  The
> reader (and user) always needs to be careful that the version used is
> the one expected. And the help file says that S doesn't have a 'scale'
> parameter at all.
>
> Just be careful, and you should be fine.  And IMHO, given that the PDF
> of the density is shown it is reasonably clear as-is.
>
> Best,
> Jay
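
The equivalence discussed above can be checked directly. The following is a small sketch using only base R; the particular shape, scale and seed values are arbitrary choices for illustration.

```r
# Density: scale = s and rate = 1/s describe the same distribution.
d_scale <- dgamma(2, shape = 3, scale = 4)
d_rate  <- dgamma(2, shape = 3, rate = 1/4)
all.equal(d_scale, d_rate)

# Random draws: with the same seed, both parameterizations give the same sample.
set.seed(123); x_scale <- rgamma(5, shape = 3, scale = 4)
set.seed(123); x_rate  <- rgamma(5, shape = 3, rate = 1/4)
all.equal(x_scale, x_rate)

# Sanity check: the mean of Gamma(shape = a, scale = s) is a * s, here 12.
set.seed(1)
mean(rgamma(1e5, shape = 3, scale = 4))
```

Only one of `rate` and `scale` should be supplied in a call; the other is derived from it.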





> ***
> G. Jay Kerns, Ph.D.
> Associate Professor
> Department of Mathematics & Statistics
> Youngstown State University
> Youngstown, OH 44555-0002 USA
> Office: 1035 Cushwa Hall
> Phone: (330) 941-3310 Office (voice mail)
> -3302 Department
> -3170 FAX
> VoIP: gjke...@ekiga.net
> E-mail: gke...@ysu.edu
> http://people.ysu.edu/~gkerns/



Re: [R] fixed effects regression

2010-03-22 Thread Doran, Harold

Well, the best approach is not to model so many fixed effects. But, if you 
must, there are a few options. First, have you considered treating them as 
random effects and using a mixed effects linear model? 

If you must build such a large model matrix for the fixed effects, the best 
thing to do is to use some functions in the Matrix namespace to use sparse 
matrices. For instance,

fm <- Matrix:::lm.fit.sparse(sparse.model.matrix(~data$yourFactor), 
data$yourOutcomeVariable)

where data$yourFactor is the factor variable with the postal IDs and 
data$yourOutcomeVariable is the DV for the regression.
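
A self-contained sketch of the sparse approach follows. It is an illustration under assumptions, not the method from the message above: the data are simulated, the names `id` and `y` are stand-ins for the postal-ID factor and the outcome, and the fit uses the normal equations with exported Matrix functions rather than the unexported `Matrix:::lm.fit.sparse`, which may not be present in every version of Matrix.

```r
library(Matrix)  # recommended package shipped with R

set.seed(1)
n <- 2000
# Hypothetical stand-ins for the postal-ID factor and the outcome variable.
id <- factor(sample(sprintf("zip%03d", 1:200), n, replace = TRUE))
y  <- rnorm(n, mean = as.integer(id) / 50)

# Sparse design matrix: intercept plus one indicator column per remaining
# level; almost all entries are zero.
X <- sparse.model.matrix(~ id)

# Least squares via the normal equations, kept sparse throughout.
beta <- solve(crossprod(X), crossprod(X, y))
head(as.vector(beta))
```

With tens of thousands of levels the dense model matrix built by lm() would be very large, while the sparse version stores only the nonzero entries.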


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Roy Lowrance
Sent: Sunday, March 21, 2010 8:01 PM
To: r-help@r-project.org
Subject: [R] fixed effects regression

Hi All:

I am trying to move a model from Stata to R.

It is a linear regression model with about 90,000 indicator variables.

What is the best approach to follow in R?

- Roy


[R] how to analyze repeated measures count data?

2010-03-22 Thread René Mayer


Dear R community,

I have a data set with reaction times and count data (answers: yes, no)
of N subjects under conditions A, B.

For the analysis of reaction time I used aov.

fit.rt = aov(rt ~ A * B + Error(subjects/(A*B)), data = m )

But how do I analyze the frequencies correctly?

Example table of frequencies from one subject:

, , = A1

       B1  B2  B3
  yes  31  36  19
  no   22  27  10

, , = A2

       B1  B2  B3
  yes  22  27  10
  no   31  36  19

Is a generalized linear model the right method?
How do I specify the same model for the count data (frequencies) in glm?

Is this right: glm(count~A*B*answer+(1|subject),family=poisson)?

Regards, René


Re: [R] how to analyze repeated measures count data?

2010-03-22 Thread Wincent

Not glm, it should be glmer in lme4 package.

Ronggui
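
A sketch of what the glmer call might look like, assuming the lme4 package is installed. The data below are simulated purely so the example runs; the column names follow the formula in the question, and the fit is skipped if lme4 is not available.

```r
set.seed(1)
# Simulated long-format data: one count per subject x A x B x answer cell.
d <- expand.grid(subject = factor(1:12),
                 A       = c("A1", "A2"),
                 B       = c("B1", "B2", "B3"),
                 answer  = c("yes", "no"))
subj_eff <- rnorm(12, sd = 0.3)  # hypothetical subject-level variation
d$count  <- rpois(nrow(d), lambda = exp(3 + subj_eff[as.integer(d$subject)]))

if (requireNamespace("lme4", quietly = TRUE)) {
  # Poisson mixed model with a random intercept per subject.
  fit <- lme4::glmer(count ~ A * B * answer + (1 | subject),
                     data = d, family = poisson)
  print(lme4::fixef(fit))
}
```

The `(1 | subject)` term is what glm() cannot handle; glm() has no syntax for random effects.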


-- 
Wincent Ronggui HUANG
Doctoral Candidate
Dept of Public and Social Administration
City University of Hong Kong
http://asrr.r-forge.r-project.org/rghuang.html


[R] a simple statistic question

Hi, please suggest a method to answer the questions below:


Factory_ID  Factory_Location  Factory_Size  Total_Sample  Good_Sample  Fair_Sample  Bad_Sample
----------  ----------------  ------------  ------------  -----------  -----------  ----------
 1          City_A            Big                    100           90           10          10
 2          City_A            Big                    120           55           35          30
 3          City_A            Small                   80           40           25          15
 4          City_A            Small                   75           50           15          10
 5          City_B            Big                    150           80           30          40
 6          City_B            Big                    120           55           25          40
 7          City_B            Big                    125           40           80           5
 8          City_B            Big                    100           60           25          15
 9          City_B            Small                   70           45           15          10
10          City_B            Small                   85           65            5          15


(1) Is there a statistically significant difference between City_A and City_B
in the amount of Good_Quality_Sample that they produce?
(2) Is there a statistically significant difference between Big and Small
factories in the amount of Good_Quality_Sample that they produce?

I don't think that a t-test works here because the Total_Sample (i.e., the
total number of samples) from each factory is different.
I don't like to pool data from individual factories together. For example, I
don't like to pool Factory 1 and 2 together, because the variance among
individual factories can be quite big in real data.


Thank you

Xiang


[R] add information above bars of a barplot()

2010-03-22 Thread Martin Batholdy

hi,


I have a barplot with six clusters of four bars each.
Now I would like to add the exact value of each bar as a number above the bar.

I hoped to get some tips here.
I could simply add text at the different positions, but I don't understand how 
the margins on the x-axis are calculated
(how can I get / calculate the x-ticks of a barplot?).

Also I would like to code this flexibly enough so that it still works when I 
have more bars in each cluster.



thanks for any suggestions!


[R] Using dev.copy

2010-03-22 Thread Dan Davison

I'm working over an ssh connection without X11 graphics. I'm making a
plot, the first stage of which takes a long time to draw. I want to
experiment with adding details. Here is what I was hoping to do, which
results in an error.

## Draw the master plot on png dev 2
png(file="master.png")
plot(1:10)

## Save a copy on png dev 3
png(file="copy1.png")
dev.set(2)
dev.copy(which=3)

## Add details to copy, write to disk and view
abline(v=5)
Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) : 
  plot.new has not been called yet

Can someone tell me how to do this correctly?
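
One possible way to make this work, offered as a sketch rather than a confirmed answer from the list: file devices such as png() and pdf() do not record a display list by default, so dev.copy() has nothing to replay. Enabling recording with dev.control() right after opening the master device should allow the copy. The example uses pdf() and temporary file names only because pdf() is available in every R build; png() should behave the same way where it is supported.

```r
master_file <- tempfile(fileext = ".pdf")
copy_file   <- tempfile(fileext = ".pdf")

## Draw the master plot, recording the drawing operations
pdf(master_file)
master_dev <- dev.cur()
dev.control(displaylist = "enable")  # must be set before plotting
plot(1:10)                           # stands in for the slow master plot

## Open the device that will receive the copy
pdf(copy_file)
copy_dev <- dev.cur()

## Replay the master plot on the copy; the copy becomes the current device
dev.set(master_dev)
dev.copy(which = copy_dev)

## Add details to the copy only, then write both files to disk
abline(v = 5)
dev.off(copy_dev)
dev.off(master_dev)
```

Storing the device numbers returned by dev.cur() avoids relying on the devices being numbered 2 and 3.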

Thanks a lot,

Dan


[R] calculate response probabilities using sem-analysis

2010-03-22 Thread Tryntsje Wesselius

 Hi everyone,

I just conducted a structural equation model for estimating a response
model. This model should predict the probability that someone is responding
to a direct mailing. I used the sem package for this. When I have my
coefficients I want to know how well my model predicts the probability of
response. How can I calculate these probabilities?
I tried to use the unstandardized coefficients, just like a regression
coefficient in the following formula:
Y = b1*x1 + b2*x2
But then I have values larger than 1, so those aren't probabilities. Has
anyone dealt with this problem before?
You can be of great help to me!!

Kind regards,

Tryntsje


Re: [R] a simple statistic question

2010-03-22 Thread Joshua Wiley

Dear Xiang,

Unequal sample size is not a problem for t-tests.  If I understand
correctly, you do not want to pool your data because you believe the
variance of individual factories is heterogeneous.  Are you willing to
pool the means?  You could calculate the variance for factories
individually and then pool the variances using the weighted.mean()
function (variance of each factory weighted by its sample size minus
1).  Then you could just compare the means between all the factories
from City A and B or Big and Small factories.  Another option could be
to use an ANOVA (see ?aov).  This should let you keep your data broken
down into subgroups.

If you have specific theories, I would also recommend looking into
using contrast weights.  With contrasts, you would end up basically
doing a one-sample t-test but it would be testing whether your theory
(given by the weights you assigned) fit the data well.  The nice thing
about it, is you can include a lot of predictions (e.g., that there
will be more good samples than bad samples and that big factories will
be better than small factories and that City A will be better than
City B) all in one test.

HTH,


Joshua



-- 
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/


Re: [R] add information above bars of a barplot()

2010-03-22 Thread Rubén Roa


If you are barplotting x

bp <- barplot(x)   # bar midpoints on the x-axis
text(x=bp, y=x, labels=format(x), pos=3)

should get you closer to what you want.

HTH

Rubén


Re: [R] calculate response probabilities using sem-analysis

2010-03-22 Thread Jarrett Byrnes


Did you back-calculate to estimate an intercept?
Alternately, I've been working on a function that takes a fitted sem  
and gets predicted values given an input.  Contact me off-list and  
I'll send it to you.



Re: [R] a simple statistic question

Dear Joshua,

Thank you so much for such a fast reply. Here are my thoughts:

I don't know if it is fair to compare means because the total samples from
each factory can be very different (e.g., Factory_5 with 150 total
samples vs. Factory_9 with 70 total samples). Maybe it is fairer to
compare the frequency of Good_samples than to compare means. But the frequency is
bounded by 100%. Is there any way to deal with frequencies? I appreciate your
input!

Xiang
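
One common way to handle proportions with differing totals, not proposed anywhere in this thread and shown here only as a sketch, is a binomial-type GLM on the good / not-good counts. The quasibinomial family is an assumption made to allow for extra variation between factories; the numbers are transcribed from the table in the original question.

```r
# Data transcribed from the table in the original question.
factories <- data.frame(
  id       = 1:10,
  location = rep(c("City_A", "City_B"), times = c(4, 6)),
  size     = c("Big", "Big", "Small", "Small",
               "Big", "Big", "Big", "Big", "Small", "Small"),
  total    = c(100, 120, 80, 75, 150, 120, 125, 100, 70, 85),
  good     = c(90, 55, 40, 50, 80, 55, 40, 60, 45, 65)
)

# Each factory contributes (good, not good) counts, so larger samples
# carry more weight and fitted proportions stay between 0 and 1.
fit <- glm(cbind(good, total - good) ~ location + size,
           family = quasibinomial, data = factories)
summary(fit)$coefficients
```

The rows for `locationCity_B` and `sizeSmall` correspond to questions (1) and (2), on the log-odds scale.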


-- 
Xiang Gao, Ph.D.
Department of Biology
University of North Texas


[R] Embed R code in C++

2010-03-22 Thread mans


Hi, 
Can anyone tell me how to embed R code in a C++ file?

I am using a Mac running OS X 10.6.2 and the IDE Xcode
Version 3.2, and I would like to embed basic functions such as the geometric,
binomial, normal and hypergeometric distributions in a sample cpp file.

I heard about the library RInside and I have downloaded the source code for
Mac, but I do not know how to build it in order to use it with Xcode.

Could anyone show me step by step how to get this done? I am new to
programming.

Thanks for your help. 

Mans. 

[R] Lag Function

2010-03-22 Thread Downey, Patrick

Can anyone tell me what's going on here?

x <- matrix(data=c(1,2,3,4,5),ncol=1)
x1 <- lag(x,k=1)
x
x1
x - x1

That's with x specified as a column vector, but the same thing happens when
it's a row vector.

x <- c(1,2,3,4,5)
x1 <- lag(x,k=1)
x
x1
x - x1

When the documentation says "Vector or matrix arguments x are coerced to
time series", what does that mean?

Thank you,
Mitch 


Re: [R] IRT simulation repeated

2010-03-22 Thread Greg Snow

Or use the replicate function (which is basically a wrapper for lapply).
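
A minimal sketch of the replicate approach. The function `simulate_once` is hypothetical: it generates one data set of 0/1 responses under a simple Rasch-type model and merely stands in for the poster's own single-simulation code.

```r
# Hypothetical single-run simulator standing in for the poster's code.
simulate_once <- function(n_persons = 50, n_items = 10) {
  theta <- rnorm(n_persons)               # person abilities
  b     <- rnorm(n_items)                 # item difficulties
  p     <- plogis(outer(theta, b, "-"))   # response probabilities
  matrix(rbinom(length(p), 1, p), n_persons, n_items)
}

set.seed(42)
# Run the simulation 100 times and keep every result in a list.
results <- replicate(100, simulate_once(), simplify = FALSE)

length(results)    # number of simulated data sets
dim(results[[1]])  # persons x items for the first one
```

With `simplify = FALSE` the results stay in a list, one element per run, which matches what lapply would return.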

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of jim holtman
> Sent: Thursday, March 18, 2010 9:53 AM
> To: Helena
> Cc: r-h...@stat.math.ethz.ch
> Subject: Re: [R] IRT simulation repeated
>
> result <- lapply(1:100, yourFunction)
>
> On Thu, Mar 18, 2010 at 9:05 AM, Helena helenaguchen...@hotmail.com
> wrote:
>
>> Hello R:
>> I work on IRT simulation research. I've written code to generate a
>> single dataset. As I will repeat simulating the data 100 times under every
>> condition, how can I write the R code to make it run the single simulation
>> code 100 times and save the generated results each time?
>> Thanks so much~
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?

Re: [R] add information above bars of a barplot()

2010-03-22 Thread William Dunlap

You didn't say how you made the original barplot, but
here is one way to use barplot()'s return value (the
x coordinates of the bar centers) to add text a little
above the top of each bar:

  z <- rbind(log2(1:10), sqrt(1:10), (1:10)/3) # data matrix
  barX <- barplot(z, beside=TRUE)
  text(cex=.5, x=barX, y=z+par("cxy")[2]/2, round(z,2), xpd=TRUE)

The xpd=TRUE means to plot the text even if it is outside
of the plot area and par("cxy") gives the size of a typical
character in the current user coordinate system.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 


Re: [R] Find a rectangle of maximal area

2010-03-22 Thread Hans W Borchers

Hans W Borchers hwborchers at googlemail.com writes:

> For an application in image processing -- using R for statistical purposes --
> I need to solve the following task:
>
> Given n (e.g. n = 100 or 200) points in the unit square, more or less
> randomly distributed. Find a rectangle of maximal area within the square
> that does not contain any of these points in its interior.
>
> If a, b are height and width of the rectangle, other constraints may have to
> be imposed such as  a, b = 0.5  and/or  0.5 <= a/b <= 2.0 . The rectangle
> is allowed to touch the border of the square.

And yes, the sides of the rectangle shall be parallel to the sides of the
enclosing unit square (which could be a rectangle of some size, too).

> snip
>
> Thanks in advance for any suggestions,
> Hans Werner

Erwin Kalvelagen erwin.kalvelagen at gmail.com writes:

> I solved this with a simple minded MINLP formulation using BARON
> (a global solver).
> This seems to produce solutions relatively quickly
> (somewhat slower for n=200).
> Actually this solved easier than I expected. See:

Dear Erwin,

yes, it is possible to emulate an exhaustive search by applying binary
variables and utilizing an MI(N)LP solver. What did you need the
'non-linearity' for? (I am asking as you did not disclose your model.)

The examples on your blog do not take into account that the ratio of longer
to shorter side length of the rectangle shall be smaller than 2. Would it be
difficult to add this restriction to your model?

Unfortunately, there is no free MINLP solver available. In the past I have called
a Python program to use the solvers at NEOS. It would probably be possible to
write a similar R function to do this.

Still I believe that a clever approach might be possible avoiding the need to
call a commercial solver. I am getting this hope from one of Jon Bentley's
articles in the series Programming Pearls.

Regards,
Hans Werner

P.S.: If you copy my request into your blog, wouldn't it be nice to add a
  pointer back to the R-help entry where this question has been asked?

 http://yetanothermathprogrammingconsultant.blogspot.com/2010/03/
 looks-difficult-to-me-2.html

 
 Erwin Kalvelagen
 Amsterdam Optimization Modeling Group
 erwin at amsterdamoptimization.com
 http://amsterdamoptimization.com



Re: [R] Lag Function

2010-03-22 Thread Erik Iverson


Hello,

Downey, Patrick wrote:

Can anyone tell me what's going on here?

x <- matrix(data=c(1,2,3,4,5),ncol=1)
x1 <- lag(x,k=1)
x
x1
x - x1

That's with x specified as a column vector, but the same thing happens when
it's a row vector.

x <- c(1,2,3,4,5)
x1 <- lag(x,k=1)
x
x1
x - x1



I'm not sure what you're expecting to happen.  Can you clarify what 
needs explaining?  My guess is that the 'lag' function is not doing what 
you expect, but you don't say what you expect.




When the documentation says "Vector or matrix arguments 'x' are coerced to
time series", what does that mean?


Time series are a class of objects in R. It means that if you don't pass the
lag function a time series object, it's going to try to turn it into one.



[R] Plot symbols on dendrogram leaves

2010-03-22 Thread Wade Wall

Hi all,

I am wondering if there is a way to plot symbols onto the leaves of a
dendrogram.  Thanks for any help.

Wade


Re: [R] add information above bars of a barplot()

2010-03-22 Thread William Dunlap

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
 Sent: Monday, March 22, 2010 9:31 AM
 To: Martin Batholdy; r help
 Subject: Re: [R] add information above bars of a barplot()

  -Original Message-
  From: r-help-boun...@r-project.org 
  [mailto:r-help-boun...@r-project.org] On Behalf Of Martin Batholdy
  Sent: Monday, March 22, 2010 7:53 AM
  To: r help
  Subject: [R] add information above bars of a barplot()

  hi,

  I have a barplot with six clusters of four bars each.
  Now I would like to add the exact value of each bar as a 
  number above the bar.

  I hoped to get some tips here.
  I could simply add text at the different positions, but I 
  don't understand how the margins on the x-axis are calculated
  (how can I get / calculate the x-ticks of a barplot?).

  Also I would like to code this flexible enough so that it 
  still works when I have more bars in each cluster.

 You didn't say how you made the original barplot, but
 here is one way to use barplot()'s return value (the
 x coordinates of the bar centers) to add text a little
 above the top of each bar:

   z <- rbind(log2(1:10), sqrt(1:10), (1:10)/3) # data matrix
   barX <- barplot(z, beside=TRUE)
   text(cex=.5, x=barX, y=z+par("cxy")[2]/2, round(z,2), xpd=TRUE)

 The xpd=TRUE means to not plot the text even if it is outside
 : I meant either "not clip" or "plot"
 of the plot area and par("cxy") gives the size of a typical
 character in the current user coordinate system.

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com 

  thanks for any suggestions!


Re: [R] Lag Function

2010-03-22 Thread Gabor Grothendieck

It seems to mean that it adds a Tsp attribute but it does not change
the class to ts:

> dput(lag(1:3))
structure(1:3, .Tsp = c(0, 2, 1))

Try this:

> ts(1:3) - structure(lag(1:3), class = "ts")
Time Series:
Start = 1
End = 2
Frequency = 1
[1] -1 -1

or

> ts(1:3) - lag(ts(1:3))
Time Series:
Start = 1
End = 2
Frequency = 1
[1] -1 -1
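
The behaviour discussed in this thread can be reproduced with a short sketch. It assumes only base R; the variable names (v, v1, plain_diff, ts_diff) are made up for illustration. The point it tries to show is that lag() shifts the time index rather than the values, so subtracting plain vectors gives zeros while subtracting "ts" objects aligns on the common times first.

```r
# Illustration only: lag() shifts the time index, not the values.
v  <- c(1, 2, 3, 4, 5)
v1 <- stats::lag(v, k = 1)        # plain vector: only gains a "tsp" attribute

# Subtraction of the plain vectors is element by element, ignoring time.
plain_diff <- as.numeric(v - v1)  # every element is 0

x  <- ts(v)                       # observations at times 1..5
x1 <- stats::lag(x, k = 1)        # same values, now at times 0..4

# Arithmetic on "ts" objects first aligns both series on the common times 1..4.
ts_diff <- x - x1                 # x[t] - x[t + 1] at each common time

# For "value minus previous value", diff() is usually the simpler tool.
d <- diff(v)
```

If the goal is a shifted copy of the values themselves, building it by hand, for example c(NA, head(v, -1)), may be closer to what was expected than lag().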



On Mon, Mar 22, 2010 at 12:15 PM, Downey, Patrick pdow...@urban.org wrote:
 Can anyone tell me what's going on here?

 x <- matrix(data=c(1,2,3,4,5),ncol=1)
 x1 <- lag(x,k=1)
 x
 x1
 x - x1

 That's with x specified as a column vector, but the same thing happens when
 it's a row vector.

 x <- c(1,2,3,4,5)
 x1 <- lag(x,k=1)
 x
 x1
 x - x1

 When the documentation says "Vector or matrix arguments 'x' are coerced to
 time series", what does that mean?

 Thank you,
 Mitch


[R] Error message

2010-03-22 Thread Mathew, Abraham T



I'm recoding variables and running a logit. Unfortunately, I get the following 
error.



data04$V043114
part <- data04$V043114
attributes(part)
summary(part)

partb <- part
partb[part %in% levels(part)[4]] <- NA
partb[part %in% levels(part)[5]] <- NA
partb[part %in% levels(part)[6]] <- NA
partb[part %in% levels(part)[7]] <- NA
partb <- factor(partb)

attributes(partb)
summary(partb)
table(partb)
table(part, partb)
cbind(part, partb)

partisan041 <- partb
partisan042 <- as.numeric(partb)

summary(partisan041)
summary(partisan042)


Then when I try to run the logit model using Zelig, I get an error message

anes04one <- zelig(trade041a ~ age042 + education042 + personal042 + economy042 
+ partisan042 + employment042 + union042 + home042 + market042 + race042 + 
income042 + gender042, model="logit", data=data04)

#Error in model.frame.default(formula = trade041a ~ age042 + education042 +  : 
#  variable lengths differ (found for 'partisan042')


Can anyone help???
Abraham M.
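
One possible cause of "variable lengths differ" is a formula variable that is found in the workspace rather than in the data frame and has a different length from the data frame's columns. The sketch below reproduces that situation with simulated data and base glm() instead of Zelig; the data frame d and the variables y, x and z are hypothetical, and whether this is what happened with partisan042 cannot be told from the post.

```r
# Illustration only, with simulated data and glm() standing in for zelig().
set.seed(1)
d <- data.frame(y = rbinom(20, 1, 0.5), x = rnorm(20))

# z lives in the workspace, not in d, and has a different length.
z <- rnorm(15)
res <- try(glm(y ~ x + z, family = binomial, data = d), silent = TRUE)
failed <- inherits(res, "try-error")   # TRUE: model.frame() rejects the lengths

# Storing the recoded variable as a column of the data frame keeps lengths aligned.
d$z <- rnorm(20)
fit <- glm(y ~ x + z, family = binomial, data = d)
```

Comparing length(partisan042) with nrow(data04), and with the lengths of the other model variables, would show quickly whether this applies.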


[R] help needed with boxplot

2010-03-22 Thread kathy_BJ


I am new to R. Can anyone help with a boxplot for a dataset like:
file1
col1 col2 col3 col4 col5
050350005 101 56.625 48.318 RED
051010002 106 50.625 46.990 GREEN
051190007 25 65.875 74.545 BLUE
051191002 246 52.875 57.070 RED
220050004 55 70 80.274 BLUE
220150008 75 67.750 62.749 RED
220170001 77 65.750 54.307 GREEN
file2
col1 col2 col3 col4 col5
050350005 101 56.625 57 RED
051010002 106 50.625 77 GREEN
051190007 25 65.875 51.6 BLUE
051191002 246 52.875 55.070 RED
220050004 55 70 32 BLUE
220150008 75 67.750 32.49 RED

For each color (red, green and blue), I need to compare file1 and file2 by
making box plots of MB and RMSE for (col4-col3) for file1 and file2,
dividing col2 into different groups: col2 < 20, 20 <= col2 < 50, 50 <= col2 < 70,
col2 >= 70. That is, for the boxplot, the x is (<20, 20-50, 50-70, >=70), while
y is the MB (and RMSE) of the difference of col4 and col3.

I hope I didn't confuse anybody. Thank you so much
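
A minimal sketch of the grouping step, using the file1 rows from the post. It assumes MB means mean bias and RMSE means root mean squared error of (col4 - col3), and that the bins are half-open on the right; the column names err and bin are made up here.

```r
# Illustration only: bin col2 with cut(), then summarise col4 - col3 per bin.
file1 <- read.table(header = TRUE,
  colClasses = c("character", "numeric", "numeric", "numeric", "character"),
  text = "col1 col2 col3 col4 col5
050350005 101 56.625 48.318 RED
051010002 106 50.625 46.990 GREEN
051190007 25 65.875 74.545 BLUE
051191002 246 52.875 57.070 RED
220050004 55 70 80.274 BLUE
220150008 75 67.750 62.749 RED
220170001 77 65.750 54.307 GREEN")

file1$err <- file1$col4 - file1$col3
file1$bin <- cut(file1$col2, breaks = c(-Inf, 20, 50, 70, Inf), right = FALSE,
                 labels = c("<20", "20-50", "50-70", ">=70"))

# Per-bin mean bias and RMSE (NA for bins with no rows).
mb   <- tapply(file1$err, file1$bin, mean)
rmse <- tapply(file1$err, file1$bin, function(e) sqrt(mean(e^2)))

# Distribution of the differences within each col2 bin.
boxplot(err ~ bin, data = file1, xlab = "col2 group", ylab = "col4 - col3")
```

To compare the two files and the three colors, one option is to add a column naming the file, stack the two data frames with rbind(), and repeat the plot for each subset of col5.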

-- 
View this message in context: 
http://n4.nabble.com/help-needed-with-boxplot-tp1677678p1677678.html
Sent from the R help mailing list archive at Nabble.com.


[R] maxNR - Error in p(a, b) : element 1 is empty; the part of the args list of '*' being evaluated was: (b, t)

2010-03-22 Thread 4-real


Hello everyone...
We were trying to implement the Newton-Raphson method in R to estimate the
parameters a and b of a function, F, but we can't seem to implement
this the right way. I hope you can show me the right way to do this. I think
what we want R to do is to read the data from the website and then perform
maxNR on the function, F. By the way, the version of R being used is RGui for
Windows, if it helps to know this.

R-code below:

> library(maxLik)
> require(maxLik)
> 
> x <- read.table('http://www.math.ku.dk/kurser/2008-09/blok4/stat2/doku/data/Eksempel_6_3.txt',
+                 header = TRUE);
> t <- log(x$Koncentration);
> X <- x$Status;
> 
> p <- function(a,b) exp(a+b*t)/(1+exp(a+b*t));
> S <- sum(X);
> SP <- sum(t*X);
> 
> F <- function(a,b) {
+ c(sum(p(a,b)) - S,
+   sum(t*p(a,b)) - SP)
+ }
> 
> z <- maxNR(F, start=1, print.level=2)
Error in p(a, b) : element 1 is empty;
   the part of the args list of '*' being evaluated was:
   (b, t)
 


Thanks and best regards.
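
As far as I can tell, the error arises because maxNR() hands the function a single parameter vector, so the second argument b is never supplied; in addition, F above returns the two score equations rather than an objective to maximise. The sketch below sidesteps maxLik entirely and solves the same two equations by a hand-written Newton-Raphson iteration in base R. It uses simulated data, since the course data set may no longer be reachable, and every name in it (logconc, status, score, info) is hypothetical.

```r
# Illustration only: Newton-Raphson for a two-parameter logistic model.
set.seed(42)
logconc <- log(runif(200, 0.1, 10))                 # simulated log-concentrations
status  <- rbinom(200, 1, plogis(-0.5 + 1.2 * logconc))

# Success probability for theta = c(a, b).
p <- function(theta) plogis(theta[1] + theta[2] * logconc)

# Score vector: the two estimating equations from the post, up to sign.
score <- function(theta) {
  r <- status - p(theta)
  c(sum(r), sum(logconc * r))
}

# Information matrix (negative Jacobian of the score).
info <- function(theta) {
  w <- p(theta) * (1 - p(theta))
  matrix(c(sum(w), sum(logconc * w),
           sum(logconc * w), sum(logconc^2 * w)), nrow = 2)
}

theta <- c(0, 0)                                     # one start value per parameter
for (iter in 1:25) {
  step  <- solve(info(theta), score(theta))          # Newton step
  theta <- theta + step
  if (max(abs(step)) < 1e-10) break
}

# Cross-check against glm(), which fits the same model.
ref <- coef(glm(status ~ logconc, family = binomial))
```

With maxNR() itself, the function would presumably need to take one vector argument, for example function(theta) with a = theta[1] and b = theta[2], return the log-likelihood, and be given two start values.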
-- 
View this message in context: 
http://n4.nabble.com/maxNR-Error-in-p-a-b-element-1-is-empty-the-part-of-the-args-list-of-being-evaluated-was-b-t-tp1677790p1677790.html
Sent from the R help mailing list archive at Nabble.com.


[R] calculate response probabilities using sem-analysis

2010-03-22 Thread Tryntsje Wesselius

Hi everyone,

I just conducted a structural equation model for estimating a response
model. This model should predict the probability that someone is responding
to a direct mailing. I used the sem package for this. When I have my
coefficients I want to know how well my model predicts the probability of
response. How can I calculate these probabilities?
I tried to use the unstandardized coefficients, just like a regression
coefficient in the following formula:
Y = b1*x1 + b2*x2
But then I get values larger than 1, so those aren't probabilities. Has
anyone dealt with this problem before?
You would be of great help to me!

Kind regards,

Tryntsje


Re: [R] Find a rectangle of maximal area

On Mon, Mar 22, 2010 at 4:28 PM, Hans W Borchers
hwborch...@googlemail.com wrote:

 Still I believe that a clever approach might be possible avoiding the need to
 call a commercial solver. I am getting this hope from one of Jon Bentley's
 articles in the series Programming Pearls.


Is this the 'Largest Empty Rectangle' problem?

http://en.wikipedia.org/wiki/Largest_empty_rectangle

I had a look at some of the references from Wikipedia, but they all
follow a similar pattern, one I have noticed in many computer science
journal articles:

 1. State a problem that looks tricky.
 2. Say "We have an efficient algorithm for the problem stated in #1".
 3. Proceed to derive, using much algebra and theory, the efficient algorithm.
 4. Stop.

The idea of actually producing some dirty, filthy, actual code to
implement their shiny algorithms never seems to cross their minds.

 I also found a similar question from 2008 asked on the R-sig-geo
mailing list. That didn't get much help either!

Sorry.

Barry
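
For a few hundred points a brute-force search is feasible without any solver. The sketch below is not one of the algorithms from the literature mentioned above; it relies on the observation that a maximal empty rectangle can be widened until its left and right sides hit a point or the border, so only x-coordinates from {0, points, 1} need to be tried. It ignores the side-length and aspect-ratio constraints from the original question, and the function name largest_empty_rectangle is made up.

```r
# Illustration only: brute-force largest empty axis-parallel rectangle
# in the unit square, roughly O(n^3 log n) for n points.
largest_empty_rectangle <- function(px, py) {
  xs <- sort(unique(c(0, px, 1)))          # candidate left/right edges
  best <- c(area = 0, xl = 0, xr = 0, yb = 0, yt = 0)
  for (i in seq_len(length(xs) - 1)) {
    for (j in (i + 1):length(xs)) {
      xl <- xs[i]; xr <- xs[j]
      inside <- px > xl & px < xr          # points strictly between the edges
      ys <- sort(c(0, py[inside], 1))      # these y values split the strip
      gaps <- diff(ys)
      k <- which.max(gaps)                 # tallest empty slab in the strip
      area <- (xr - xl) * gaps[k]
      if (area > best[["area"]]) {
        best <- c(area = area, xl = xl, xr = xr, yb = ys[k], yt = ys[k + 1])
      }
    }
  }
  best
}

set.seed(1)
px <- runif(100); py <- runif(100)
res <- largest_empty_rectangle(px, py)

# No point may lie strictly inside the reported rectangle.
n_inside <- sum(px > res[["xl"]] & px < res[["xr"]] &
                py > res[["yb"]] & py < res[["yt"]])
```

Constraints such as a bound on a/b could be added by filtering candidate rectangles inside the loop, although the tallest slab in a strip is then no longer automatically the best choice.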


[R] Factors attribute?

I noticed that when I fit a linear model using 'lm' there is an attribute 
called factors that is added to the term. It doesn't seem to appear for 
'model.matrix', just 'lm'. I have been unable to find where it gets constructed 
or what it means? It looks like a two dimensional array that I may be able to 
use so I would just like to get some 'official' statement regarding what it is 
and how it is constructed. I would rather not go on my assumptions. An example 
would be like:

> l <- lm(prestige ~ income + education, data=Duncan)
> attr(l$terms,"factors")
          income education
prestige       0         0
income         1         0
education      0         1

Thank you.

Kevin Burton
rkevinbur...@charter.net


[R] superfluous distribution found with mclust

2010-03-22 Thread Denis Chabot

Dear R users,

I use mclust to fit a mixture of normal distributions to many datasets. Usually 
the Mclust function finds 1 or two normal distributions, rarely, 3.

But I hit a strange case today.

my.data <- c(57.96920, 51.79415, 51.20538, 55.53637, 51.64291, 56.61476, 
51.28855, 55.56169, 51.85113, 54.03330, 51.37370, 49.48561, 52.41580, 53.51176, 
60.49293, 55.77012, 51.59270, 56.29660, 55.90048, 53.05432, 50.87498, 58.47613, 
54.60827, 54.16143, 52.94914, 58.89408, 51.17116, 54.16909, 51.94852, 53.29897, 
57.21962, 66.94420, 56.65536, 53.38147, 52.79163, 52.55879, 55.54395, 54.33984, 
51.79235, 52.93464, 50.03343, 59.04797, 51.85276, 53.16419, 53.27404, 60.08775, 
52.96493, 54.15129, 58.53050, 51.74431, 50.67817, 51.22570, 57.60541, 51.32998, 
56.73625, 55.99371, 50.41035, 52.79797, 59.75973, 52.03613, 56.59133, 51.66319, 
51.06316, 55.57699, 50.12779, 56.04503, 55.75857, 57.55347, 51.48167, 52.22395, 
54.96204, 59.58895, 55.49020, 50.50893, 49.97572, 53.26222, 57.10047, 51.25523, 
52.38768, 56.42965, 51.83258, 55.40537, 51.60564, 54.68883, 53.48098, 58.47231, 
70.15088, 51.68805, 52.82636, 52.97804, 51.90228, 53.49184, 52.24366, 52.36895, 
53.26520, 52.27327, 50.85403)

cl <- mclustBIC(my.data)
myModel <- summary(cl, my.data)

Warning message:
In map(out$z) : no assignment to 1

I do not know why this happens, but this confirms that a first distribution was 
found but no data was assigned to it:

myModel$classification
 [1] 3 2 2 3 2 3 2 3 2 2 2 2 2 2 3 3 2 3 3 2 2 3 2 2 2 3 2 2 2 2 3 4 3 2 2 2 3 2 2 2
[41] 2 3 2 2 2 3 2 2 3 2 2 2 3 2 3 3 2 2 3 2 3 2 2 3 2 3 3 3 2 2 3 3 3 2 2 2 3 2 2 3
[81] 2 3 2 2 2 3 4 2 2 2 2 2 2 2 2 2 2


Furthermore, the first and second distributions have almost the same mean:

myModel$parameters$mean
       1        2        3        4 
52.33903 52.33948 57.14263 68.54754 



Graphically, I don't see a reason for the distribution with mean=52.33903 to be 
there:


hist(my.data, breaks=99, freq=F, main="", border=grey(0.5))
rug(my.data, ticksize = 0.01, quiet = TRUE)

newx <- seq(from = min(my.data), to = max(my.data), length = 500)
Dens <- dens(modelName = myModel$modelName, data = newx,
parameters = myModel$parameters)
lines(newx, Dens, col="blue")


Do you know why I get this first distribution with no member?

Thanks in advance,

Denis Chabot


Re: [R] Arima forecasting

2010-03-22 Thread Matteo Bertini

 Matteo Bertini schrieb:

 Hello everyone,

 I'm doing some benchmark comparing Arima [1] and SVR on time series data.

 I'm using an out-of-sample one-step-ahead prediction from Arima using
 the fitted method [2].

 Does anyone know how to get a two-steps-ahead forecast time series from
 Arima?


 Thanks,
 Matteo Bertini

 [1] http://robjhyndman.com/software/forecast
 [2] AirPassengers example on page 5

On Fri, Mar 19, 2010 at 5:31 PM, Stephan Kolassa stephan.kola...@gmx.de wrote:
 Hi Matteo,

 just use forecast.Arima() with h=2 to get forecasts up to 2 steps ahead. R
 will automatically use forecast.Arima() if you call forecast() with an Arima
 object.

 library(forecast)
 model <- auto.arima(AirPassengers)
 forecast(model,h=2)

 HTH,
 Stephan


Perhaps I can reformulate my question. Suppose I have, as in [2]:

air.model <- Arima(AirPassengers[1:100],c(0,1,1))
air.model2 <- Arima(AirPassengers,model=air.model)
outofsample <- ts(fitted(air.model2)[-c(1:100)],s=1957+4/12,f=12)

As far as I understand, 'outofsample' is the time series of t+1 forecasts.

What is the equivalent code to obtain the 'outofsample' time series
using forecast.Arima()?

Something like this pseudo code?

for i in range(100, 200):
    air.model <- Arima(AirPassengers[1+i:100+i], c(0,1,1))
    air.model2 <- Arima(AirPassengers, model=air.model)
    outofsample.append( forecast(air.model2, h=1) )

Thanks,
Matteo Bertini
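
One way to turn the pseudo code into a rolling-origin loop is sketched below. It uses base R's arima() and predict() rather than the forecast package, which is a substitution made here to keep the example self-contained: the coefficient is estimated once on the first 100 observations, then held fixed while the forecast origin moves forward, and the h-th step of each forecast is kept. The names fit0, origins and two_step are hypothetical.

```r
# Illustration only: rolling-origin two-steps-ahead forecasts with base R.
y <- as.numeric(AirPassengers)
h <- 2                                        # forecast horizon

# Estimate the ARIMA(0,1,1) coefficient on the first 100 observations only.
fit0 <- arima(y[1:100], order = c(0, 1, 1))

# For each origin i, re-apply the fixed model to y[1:i] and forecast y[i + h].
origins <- 100:(length(y) - h)
fc <- sapply(origins, function(i) {
  fit_i <- arima(y[1:i], order = c(0, 1, 1),
                 fixed = coef(fit0), transform.pars = FALSE)
  predict(fit_i, n.ahead = h)$pred[h]
})

# Forecasts for observations 102..144, on the original monthly time base.
two_step <- ts(fc, start = time(AirPassengers)[100 + h], frequency = 12)
```

With the forecast package the same loop could presumably be written with Arima(y[1:i], model = air.model) followed by forecast(..., h = 2), keeping the second element of the mean.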


[R] Why \\ instead of simple / to specify a file path

2010-03-22 Thread Bogaso


Hi all,

I have saved my workspace in .RData format. However, if I want to open
it, I need to use the following code:

load("C:\\..")

Here my question is: why \\? Generally we use /, for example when
we use the read.delim() function. Is there any possibility of having some
consistency there? Is there any other way to re-open the .RData file?

Thanks,
-- 
View this message in context: 
http://n4.nabble.com/Why-instead-of-simple-to-specify-a-file-path-tp1677973p1677973.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Factors attribute?

2010-03-22 Thread Henrique Dallazuanna

See ?terms

On Mon, Mar 22, 2010 at 2:08 PM,  rkevinbur...@charter.net wrote:
 I noticed that when I fit a linear model using 'lm' there is an attribute 
 called factors that is added to the term. It doesn't seem to appear for 
 'model.matrix', just 'lm'. I have been unable to find where it gets 
 constructed or what it means? It looks like a two dimensional array that I 
 may be able to use so I would just like to get some 'official' statement 
 regarding what it is and how it is constructed. I would rather not go on my 
 assumptions. An example would be like:

 > l <- lm(prestige ~ income + education, data=Duncan)
 > attr(l$terms,"factors")
          income education
 prestige       0         0
 income         1         0
 education      0         1

 Thank you.

 Kevin Burton
 rkevinbur...@charter.net





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


Re: [R] Why \\ instead of simple / to specify a file path

On Mon, Mar 22, 2010 at 5:22 PM, Bogaso bogaso.christo...@gmail.com wrote:

 Hi all,

 I have saved my workplace in a .RData format. However if I want to open
 that, I need to use following code :

 load("C:\\..")

 Here my question is why \\. In all the time generally we use / like when
 we use read.delim() function etc. Is there any possibility to have some
 consistency there? Is there any other way to re-open the .RData file?

 Single forward slash works for me (I use H: drive here, our 'home'
network folder on our system):

  > x=1
  > y=2
  > save.image(file="H:/test.rdata")
  > ls()
 [1] "x" "y"
  > rm(x)
  > rm(y)
  > load("h:/test.rdata")
  > x
 [1] 1

 How does it not work for you? Error message and R version please!

Barry
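
The doubled backslash comes from R's string syntax rather than from load(): inside a quoted string the backslash is the escape character, so "\\" stands for a single backslash. A short sketch, using a made-up path and a temporary file so that it runs on any platform:

```r
# Illustration only: the two spellings below name the same Windows file.
win_path <- "C:\\Users\\me\\workspace.RData"   # hypothetical path
fwd_path <- "C:/Users/me/workspace.RData"

n_chars <- nchar("\\")                          # one character: a single backslash
same <- identical(gsub("\\", "/", win_path, fixed = TRUE), fwd_path)

# Round trip through a temporary file, built portably with file.path().
f <- file.path(tempdir(), "test.RData")
x <- 1; y <- 2
save(x, y, file = f)
rm(x, y)
load(f)                                         # x and y are back
```

On Windows, file.choose() is another way to pick the file interactively without typing the path at all.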


Re: [R] a simple statistic question

2010-03-22 Thread Joshua Wiley

Dear Xiang,

Now I understand what you meant.  If you are only interested in
comparing the Good Samples, I think you would have to use the
proportion (Good Sample/Total Sample) or something similar.  Another
thought would be to dummy code the data (e.g., Good = +1, Fair = 0,
Bad = -1).  Then you could compare the means.  Obviously in my example
the mean would be relatively less impacted by Fair than Bad samples.
Another benefit of this approach (over comparing mean number of good
samples from each city) is that you estimate variability within
factories (based on the dummy codes) which controls for many variables
relative to within city.

Once you dummy code, the test itself is not difficult.  Below is a
sample function.  You can either include the mean, standard deviation,
and n of each group OR include the raw data with each factory being a
separate object in a list that you use as x.  The lambdas are the
weights to apply to the means.  If you want to compare City A to City
B, you could use -1 for every factory in A and +1 for every factory in
B (same idea as a two sample t-test, you just estimate the variability
more finely because it is by factory and then pooled).  A cautionary
note, I just wrote this function myself.  I tested that it gives the
same result as the t.test() function on two samples (mine uses
one-tailed p-values) for simple numeric vectors; however, I have no
idea what it would do with other types of data and it may even appear
to have worked but returned wrong results.


##
t.contrast.test <- function(x=NA, lambda, m=NULL, s=NULL, n=NULL,
                            raw=TRUE, na.rm=TRUE) {
  # With raw=TRUE, compute each group's mean, sd and n from the list x;
  # otherwise use the supplied summary statistics m, s and n.
  if (identical(raw, TRUE)) {
    for(i in seq_along(x)) {m[i] <- mean(x[[i]], na.rm=na.rm)}
    for(i in seq_along(x)) {s[i] <- sd(x[[i]], na.rm=na.rm)}
    for(i in seq_along(x)) {n[i] <- length(x[[i]])}  # note: counts NAs too
  }
  df <- sum(n-1)
  effect <- sum(m*lambda)                      # contrast estimate
  s2.pooled <- weighted.mean(x=s^2, w=n-1)     # pooled within-group variance
  sample.correction <- sum((lambda^2)/n)
  variability <- sqrt(sample.correction*s2.pooled)
  t.score <- effect/variability
  p.score <- pt(q=t.score, df=df, lower.tail=FALSE)  # one-tailed (upper)
  r.score <- t.score/sqrt((t.score^2)+df)
  value <- list(t.score, p.score, r.score, s2.pooled, df)
  names(value) <- c("t.contrast", "p.value", "r.contrast",
                    "pooled.variance", "df")
  return(value)}
##


I hope that all made sense.

Best Regards,

Joshua

-- 
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/
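
The claim that the contrast test reduces to a pooled two-sample t-test when there are two groups with weights -1 and +1 can be checked with a compact sketch. The helper contrast_t below is a made-up, stripped-down restatement of the same formulas, not the function from the message, and the data are simulated.

```r
# Illustration only: contrast t statistic with a pooled within-group variance.
contrast_t <- function(groups, lambda) {
  m  <- vapply(groups, mean, numeric(1))
  s2 <- vapply(groups, var, numeric(1))
  n  <- lengths(groups)
  df <- sum(n - 1)
  s2_pooled <- sum((n - 1) * s2) / df
  t_val <- sum(lambda * m) / sqrt(s2_pooled * sum(lambda^2 / n))
  c(t = t_val, df = df, p_upper = pt(t_val, df, lower.tail = FALSE))
}

set.seed(1)
a <- rnorm(12, mean = 0)      # simulated scores, group A
b <- rnorm(15, mean = 0.5)    # simulated scores, group B

res <- contrast_t(list(a, b), lambda = c(-1, 1))

# Reference: pooled-variance t-test of B against A, one-sided (greater).
ref <- t.test(b, a, var.equal = TRUE, alternative = "greater")
```

With more than two groups, lambda simply gets one weight per group, for example -1 for each factory in city A and +1 for each factory in city B.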


Re: [R] add information above bars of a barplot()

2010-03-22 Thread Greg Snow

Adding text at the top of the bars will tend to distort the perception of their 
heights.  It is better to place numbers (if they are even needed) in the 
margins.  Switching to a dotplot instead of a barplot may be more meaningful as 
well.
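
A minimal sketch of the dotplot alternative, using base R's dotchart() on the same kind of matrix as in the earlier reply. The row and column labels are invented for the example.

```r
# Illustration only: a Cleveland dot chart instead of a grouped barplot.
z <- rbind(log2(1:10), sqrt(1:10), (1:10) / 3)
dimnames(z) <- list(c("log2", "sqrt", "third"), as.character(1:10))

# Each column of z becomes a group; each row becomes a labelled point in it.
dotchart(z, xlab = "value", pch = 19)
```

Because the values are read off a common scale, exact numbers can go into a table or the margin if they are needed at all.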

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Martin Batholdy
 Sent: Monday, March 22, 2010 8:53 AM
 To: r help
 Subject: [R] add information above bars of a barplot()
 
 hi,
 
 
 I have a barplot with six clusters of four bars each.
 Now I would like to add the exact value of each bar as a number above
 the bar.
 
 I hoped to get some tips here.
 I could simply add text at the different positions, but I don't
 understand how the margins on the x-axis are calculated
 (how can I get / calculate the x-ticks of a barplot?).
 
 Also I would like to code this flexible enough so that it still works
 when I have more bars in each cluster.
 
 
 
 thanks for any suggestions!
 

Re: [R] Factors attribute?


I am sorry but I didn't see factors mentioned in this documentation.

Kevin
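
The attribute is described on the ?terms.object help page rather than on ?terms itself: "factors" is a matrix of variables by terms whose entries show which variables appear in which terms (0 for absent, 1 or 2 for present, depending on how the variable is coded). A short sketch with the built-in mtcars data instead of Duncan, which is a substitution made here to avoid needing an extra package:

```r
# Illustration only: the "factors" attribute of a terms object.
fit <- lm(mpg ~ wt * hp, data = mtcars)
f <- attr(terms(fit), "factors")
f   # rows: mpg, wt, hp; columns: wt, hp, wt:hp

# The same matrix comes straight from the formula, without fitting anything.
f2 <- attr(terms(mpg ~ wt * hp), "factors")
```

The row for the response is all zeros, and the interaction column wt:hp has a 1 in both the wt and hp rows. A model matrix does not carry this attribute, but it does have an "assign" attribute that maps its columns back to the terms.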

 Henrique Dallazuanna www...@gmail.com wrote: 
 See ?terms
 
 On Mon, Mar 22, 2010 at 2:08 PM,  rkevinbur...@charter.net wrote:
  I noticed that when I fit a linear model using 'lm' there is an attribute 
  called factors that is added to the term. It doesn't seem to appear for 
  'model.matrix', just 'lm'. I have been unable to find where it gets 
  constructed or what it means? It looks like a two dimensional array that I 
  may be able to use so I would just like to get some 'official' statement 
  regarding what it is and how it is constructed. I would rather not go on my 
  assumptions. An example would be like:
 
  > l <- lm(prestige ~ income + education, data=Duncan)
  > attr(l$terms,"factors")
           income education
  prestige       0         0
  income         1         0
  education      0         1
 
  Thank you.
 
  Kevin Burton
  rkevinbur...@charter.net
 
 
 
 
 
 -- 
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O


Re: [R] Embed R code in C++

2010-03-22 Thread Sharpie



mans wrote:
 
 Hi, 
 Can anyone  tell me how to embed R code in a C++ file. 
 
 I am actually using a mac running on the OSX 10.6.2 and the IDE Xcode
 Version 3.2 and I would like to embed the basic function like geometric,
 binomial, normal and hyper geometric distributions in a sample cpp file.
 
 I heard about the library RInside and i have downloaded the source code
 for mac but i do not know how to build it in order to use it with my IDE
 XCode.
 
 Could anyone help me step by step because I am new in programming to show
 me how to get this done? 
 
 Thanks for your help. 
 
 Mans. 
 


I use R on OS X, but haven't used Xcode that much, so I can't really offer
any advice.  However, you may try posting this question on the Mac-specific
mailing list:

  https://stat.ethz.ch/mailman/listinfo/r-sig-mac

Your question will probably get better answers there.

Good luck!

-Charlie

-
Charlie Sharpsteen
Undergraduate-- Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://n4.nabble.com/Embed-R-code-in-C-tp1677784p1678051.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] a simple statistic question

Thank you very much Joshua. I was thinking to use logistic regression with
glm(). But this will pool the individual factories which share the same
factor levels together. I was puzzled by how to deal with individual
factory. Any idea? I will try your method anyway.

Xiang




-- 
Xiang Gao, Ph.D.
Department of Biology
University of North Texas


Re: [R] Factors attribute?


I am sorry but I didn't see factors mentioned in this documentation.

Kevin

 Henrique Dallazuanna www...@gmail.com wrote: 
 See ?terms
 
 On Mon, Mar 22, 2010 at 2:08 PM,  rkevinbur...@charter.net wrote:
  I noticed that when I fit a linear model using 'lm' there is an attribute 
  called factors that is added to the term. It doesn't seem to appear for 
  'model.matrix', just 'lm'. I have been unable to find where it gets 
  constructed or what it means? It looks like a two dimensional array that I 
  may be able to use so I would just like to get some 'official' statement 
  regarding what it is and how it is constructed. I would rather not go on my 
  assumptions. An example would be like:
 
  l <- lm(prestige ~ income + education, data=Duncan)
  attr(l$terms, "factors")
           income education
  prestige       0         0
  income         1         0
  education      0         1
 
  Thank you.
 
  Kevin Burton
  rkevinbur...@charter.net
 
 
 
 
 
 -- 
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O


Re: [R] importing .bil files

2010-03-22 Thread Sharpie



Barry Rowlingson wrote:
 
 GIS and spatial data formats can often be handled by readGDAL (for
 raster grids) from the rgdal package.
 
  .bil files seem to be handled by the EHdr driver in GDAL:
 
 http://www.gdal.org/frmt_various.html
 
  so if your rgdal package has that driver (run gdalDrivers() to see)
 then you may be sorted.
 
 Barry
 


I think it may be useful to add that rgdal is not provided as a pre-built
binary package for OS X-- you have to build it from source.  This is because
it interfaces with the GDAL library which is a fairly intricate piece of
software that one may wish to custom build.

There are two options I know of for installing rgdal on OS X.  The first is
to use GDAL binaries provided by William Kyngesburye-- he has framework
versions of GDAL along with a pre-built rgdal binary at:

  http://www.kyngchaos.com/software/frameworks#gdal

The second option is to roll your own GDAL and build rgdal from source to
link against it.  You may want to use this option if William's binary
version doesn't include the EHdr driver.  I posted some instructions for
building the rgdal package from source at:

  http://n4.nabble.com/Help-with-RGDAL-td908487.html#a908488

Hope this helps!

-Charlie


-
Charlie Sharpsteen
Undergraduate-- Environmental Resources Engineering
Humboldt State University

Re: [R] Factors attribute?

Kevin,

See ?terms.object, which is indicated in the Value section of ?terms and listed 
in the See Also of ?terms.

HTH,

Marc Schwartz


Re: [R] importing .bil files

On Mon, Mar 22, 2010 at 12:09 PM, Sebastian Leuzinger
sebastian.leuzin...@env.ethz.ch wrote:
 Dear list

 Has anyone got a recipe at hand to import .bil files into R? From what I
 understand the .bil files I got contain layered matrices which I would like
 to make available in R as an array or list.

 GIS people seem to be familiar with the .bil format but I am not using any
 GIS software and would prefer to deal with the data in R.

 I use the latest version of R on Mac OS X 10.5.8.

 There's another complication that might make things harder,
or, perversely, make them simpler...

 .bil files should come with a .hdr file. The .bil is just the NxMxZ
raw data. No definition of the structure or coordinates at all - it
could be NxMxZ or ZxNxM or even any numbers with the right product.
All that info is held in the accompanying .hdr file. If you can't get
the .hdr file and don't know the structure and it isn't a product of
three primes times 4, 8, or 16, then you are possibly in trouble...

 So without a .hdr and with dimension known you can try using R's
binary connection functions to read in the raw bytes and whack up an
array of the right dimension. See ?readBin.
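
A minimal sketch of the readBin route, under stated assumptions: the dimensions are known, the values are 32-bit little-endian floats, and the layout is band-interleaved-by-line (columns vary fastest, then bands, then rows). The file here is a temporary one written by the sketch itself, so the numbers are illustrative and not from any real .bil file.

```r
nrows <- 3; ncols <- 4; nbands <- 2      # assumed, normally taken from the .hdr

# Build a known [row, col, band] array and write it out in BIL order
truth <- array(as.numeric(1:(nrows * ncols * nbands)),
               dim = c(nrows, ncols, nbands))
bil_order <- aperm(truth, c(2, 3, 1))    # column fastest, then band, then row
f <- tempfile(fileext = ".bil")
writeBin(as.vector(bil_order), f, size = 4, endian = "little")

# Read the raw values back and reshape them into [row, col, band]
raw_vals <- readBin(f, what = "numeric", n = nrows * ncols * nbands,
                    size = 4, endian = "little")
img <- aperm(array(raw_vals, dim = c(ncols, nbands, nrows)), c(3, 1, 2))

all.equal(img, truth)
```

If the file uses another pixel type or byte order, the what, size, signed, and endian arguments of readBin would need to change accordingly.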

 Or you can create a .hdr file yourself. They are plain text and quite
descriptive - here's one I made earlier:

BYTEORDER  I
LAYOUT BIL
NROWS  22
NCOLS  20
NBANDS 1
NBITS  32
BANDROWBYTES   80
TOTALROWBYTES  80
PIXELTYPE  FLOAT
ULXMAP 22.7946212255725
ULYMAP 5.45149245118748
XDIM   0.333594138544999
YDIM   0.333594138544999

this is for a single layer 22x20 grid of floating point numbers. You
can use writeGDAL with the EHdr driver to create these things (and the
.bil files) to see what it should be, or read a spec somewhere...

How's that?

Barry


Re: [R] Why \\ instead of simple / to specify a file path



On Mar 22, 2010, at 1:22 PM, Bogaso wrote:



Hi all,

I have saved my workspace in .RData format. However, if I want to open
it, I need to use the following code:

load("C:\\..")

My question is: why \\? Generally we use /, as when we use the
read.delim() function etc. Is there any possibility of having some
consistency there?


Depends what you mean by consistency. The same rule applies on all
currently supported OS platforms. Doubling the backslashes is needed
because \ is an escape character inside R string literals.
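
A short sketch of the escaping rule. The path is hypothetical and no file is opened; the point is only how the strings are stored.

```r
# "\\" in R source code is ONE backslash in the resulting string
p_back <- "C:\\data\\work.RData"   # hypothetical path, for illustration only
p_fwd  <- "C:/data/work.RData"     # forward slashes need no escaping

nchar("\\")                        # a single character
cat(p_back, "\n")                  # shows the string as stored

# file.path() joins components with "/", so no escaping is needed
p_built <- file.path("C:", "data", "work.RData")
identical(p_built, p_fwd)
```

On Windows, R's file functions generally accept either form, so the forward-slash version is usually the less error-prone one to type. R 4.0.0 and later also offer raw strings, written r"(...)", in which backslashes are taken literally.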



Is there any other way to re-open the .RData file?


If your OS supports it you may be able to use drag-drop with file  
icons or you can use file.choose()



--
David.


Thanks,

[R] Error while trying to save summary() output as csv

2010-03-22 Thread Kamil Sijko

Hi,

I need to save output of summary() procedure to a csv file. It's all
OK when it's applied to a 'factor' class variable, but when I try to
save a 'integer' class summary to csv it gives me :

 summary(rnorm(100, 10)) -> object
 write.csv2(object, file='name.csv')
Error in do.call(expand.grid, c(dimnames(x), stringsAsFactors =
stringsAsFactors)) :
  second argument must be a list

It's the same when I use write.csv instead of write.csv2

summary() produces a very simple table:

structure(c(7.803, 9.633, 10.15, 10.17, 10.75, 12.41), .Names = c("Min.",
"1st Qu.", "Median", "Mean", "3rd Qu.", "Max."), class = "table")

I have no idea, what to do... So Group, please help me: what does this
error mean, and how to cope with it?

Thanks for your help.
Kamil


[R] Factors attribute format

Thanks to Marc Schwartz I found the documentation on the factors attribute 
under ?terms.object. It states:

factors: A matrix of variables by terms showing which variables appear
  in which terms.  The entries are 0 if the variable does not
  occur in the term, 1 if it does occur and should be coded by
  contrasts, and 2 if it occurs and should be coded via dummy
  variables for all levels (as when an intercept or lower-order
  term is missing).  If there are no terms other than an
  intercept and offsets, this is ‘numeric(0)’.

So now this brings up another question. It seems that the attribute is a
two-dimensional array. When I fit the formula prestige ~ income + education
and print the attribute in R, I get:

          income education
prestige       0         0
income         1         0
education      0         1

This matrix says to me that 'income' occurs in the term 'income' etc. So it 
seems that this matrix will always be a diagonal matrix with an added row of 
zeros for the response term. If the formula is such that the response is 
a function of one or more of the dependent variables then of course it will be 
something other than a row of zeros. So far OK?

My problem in understanding comes with using a formula that contains R factors. 
I am using the following (from the TSA package)  for an example:

l <- lm(tempdub ~ season(tempdub))
attr(l$terms, "factors")

                season(tempdub)
tempdub                       0
season(tempdub)               1

The function 'season' produces a factor (in this case with 12 levels, one for 
each month). But the factor attribute still has a '1' and not a '2' indicating 
that the variable should be coded as a dummy variable (factor).
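
A small sketch that may help with the 1 versus 2 distinction. As far as the ?terms.object text goes, the entry describes how a variable is coded within a term (by contrasts, or by dummy variables for all levels), not whether the variable is an R factor. The variable names y, a and b are hypothetical; terms() works on a bare formula, so no data are needed.

```r
# Main effects only: every variable enters its own term with code 1
f_main <- attr(terms(y ~ a + b), "factors")
f_main

# 'a:b' is present but the main effect 'b' is not: within the a:b term,
# 'a' is marked 2 (dummy variables for all levels) and 'b' is marked 1
f_nested <- attr(terms(y ~ a + a:b), "factors")
f_nested
```

In this sketch a 2 only shows up inside an interaction term whose lower-order margin is missing, which would be consistent with a single factor term such as season(tempdub) showing a 1.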

Please help my misunderstanding.

Thank you.

Kevin Burton


Re: [R] Error while trying to save summary() output as csv

2010-03-22 Thread Ista Zahn

Hi Kamil,
You can use something like
write.csv(t(as.matrix(object)), file="name.csv")
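
An alternative sketch, not the method suggested above: drop the summary classes and build a plain two-column data frame before writing. The seed and the temporary file are arbitrary choices made for this illustration.

```r
set.seed(1)                           # arbitrary seed, for reproducibility only
s <- summary(rnorm(100, 10))

# Keep only the labels and the numbers, leaving the table class behind
summary_df <- data.frame(statistic = names(s), value = as.numeric(s))

out <- tempfile(fileext = ".csv")     # temporary file for the sketch
write.csv(summary_df, file = out, row.names = FALSE)
read.csv(out)
```

This gives one row per statistic; the matrix approach above gives one column per statistic, which may suit a spreadsheet better when several summaries are stacked.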

-Ista




-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


Re: [R] Error while trying to save summary() output as csv



On Mar 22, 2010, at 3:06 PM, Ista Zahn wrote:


Hi Kamil,
You can use something like
write.csv(t(as.matrix(object)), file="name.csv")

-Ista
On Mon, Mar 22, 2010 at 2:54 PM, Kamil Sijko kamil.si...@swps.edu.pl wrote:

Hi,

I need to save output of summary() procedure to a csv file. It's all
OK when it's applied to a 'factor' class variable, but when I try to
save a 'integer' class summary to csv it gives me :

summary(rnorm(100, 10)) -> object
write.csv2(object, file='name.csv')

Error in do.call(expand.grid, c(dimnames(x), stringsAsFactors =
stringsAsFactors)) :
 second argument must be a list

It's the same when I use write.csv instead of write.csv2

summary() produces a very simple table:

structure(c(7.803, 9.633, 10.15, 10.17, 10.75, 12.41), .Names = c("Min.",
"1st Qu.", "Median", "Mean", "3rd Qu.", "Max."), class = "table")

I have no idea, what to do... So Group, please help me: what does this
error mean, and how to cope with it?


Not sure why you got that error but if you convert that table into a  
matrix the writing proceeds as expected:


 write.csv(as.matrix(structure(c(7.803, 9.633, 10.15, 10.17, 10.75, 12.41),
     .Names = c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max."),
     class = "table")), file = "test.csv")



--
David.


Thanks for your help.
Kamil


[R] Setting breaks to data more appropriately

2010-03-22 Thread LCOG1


Basic question.  For the data below, I would like to put each of the values
in a bin that represents its value.  So the code below would hopefully put .1
in the 0-.1 bin, .2 in the .11-.2 bin and so forth.  The outlying values
would then be put into an outer category representing everything > 1.  I'm
using the breaks to inform some code for making a choropleth map that
represents probabilities, which in some cases IS greater than 1, and I need
to identify those better.  As my code stands now, my real data is put
into this form when brks is called:

        0%        10%        20%        30%        40%        50%        60% 
0.00000000 0.05054675 0.07787235 0.11235238 0.14424786 0.18089360 0.21475990 
       70%        80%        90%       100% 
0.26309899 0.30807771 0.39478573 0.67573483

But what I want is for the values to be placed in bins corresponding to
their value (0-.1, .11-.2, .21-.3 etc.)

Pct.SFD <- c(.1,.2,.3,.4,.5,.6,.7,.8,.9,1,2,3)
brks <- quantile(Pct.SFD, seq(0,1,1/10))

I think this is clear.  Thanks

Re: [R] Embed R code in C++

2010-03-22 Thread Romain Francois


Hello,

I don't know specifics of Xcode, etc ... it looks nice but I have not 
used it myself yet.


There seem to be issues on OS X with the currently released version of 
RInside (we will release 0.2.2 soon), so I would suggest you download 
and install the next version of RInside from r-forge:


$ svn checkout svn://svn.r-forge.r-project.org/svnroot/rinside
$ cd rinside
$ R CMD INSTALL pkg

Then you can find an example application:

$ cd pkg/inst/examples/standard
$ make
$ ./rinside_sample0
Hello, world!

There are several examples in this directory and you can use the 
Makefile as a template to get the bits and pieces (link against Rcpp and 
RInside libraries, include path, etc ...)


If you have further questions about RInside, I would encourage you to 
use the Rcpp-devel mailing list on r-forge: 
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel


If you have questions about xcode, ..., then a better place might be the 
r-sig-mac mailing list: https://stat.ethz.ch/mailman/listinfo/r-sig-mac


Romain

Le 22/03/10 16:25, mans a écrit :

Hi,
Can anyone  tell me how to embed R code in a C++ file.

I am actually using a mac running on the OSX 10.6.2 and the IDE Xcode
Version 3.2 and I would like to embed the basic function like geometric,
binomial, normal and hyper geometric distributions in a sample cpp file.

I heard about the library RInside and i have downloaded the source code for
mac but i do not know how to build it in order to use it with my IDE XCode.

Could anyone help me step by step because I am new in programming to show me
how to get this done?

Thanks for your help.

Mans.



--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7
`- http://tr.im/O1wO : highlight 0.1-5


Re: [R] Find a rectangle of maximal area

2010-03-22 Thread Hans W Borchers

Barry Rowlingson b.rowlingson at lancaster.ac.uk writes:

 
 On Mon, Mar 22, 2010 at 4:28 PM, Hans W Borchers
 hwborchers at googlemail.com wrote:
 
  Still I believe that a clever approach might be possible avoiding the need
  to call a commercial solver. I am getting this hope from one of Jon Bentley's
  articles in the series "Programming Pearls".
 
 
 Is this the 'Largest Empty Rectangle' problem?
 
 http://en.wikipedia.org/wiki/Largest_empty_rectangle

Dear Barry,

thanks for this pointer. I never suspected this problem could have a name of its
own. Rethinking the many possible applications makes it clear: I should have
searched for it before.

I looked in some of the references of the late 80s and found two algorithms 
that appear to be appropriate for implementation in R. The goal is to solve the
problem for n=200 points in less than 10-15 secs.

Thanks again, Hans Werner


 I had a look at some of the references from Wikipedia, but they all
 follow a similar pattern, one I have noticed in many computer science
 journal articles:
 
  1. State a problem that looks tricky.
  2. Say "We have an efficient algorithm for the problem stated in #1"
  3. Proceed to derive, using much algebra and theory, the efficient algorithm.
  4. Stop.
 
 The idea of actually producing some dirty, filthy, actual code to
 implement their shiny algorithms never seems to cross their minds.
 
  I also found a similar question from 2008 asked on the R-sig-geo
 mailing list. That didn't get much help either!
 
 Sorry.
 
 Barry
 



Re: [R] Setting breaks to data more appropriately



On Mar 22, 2010, at 1:49 PM, LCOG1 wrote:



Basic question.  For the data below, I would like to put each of the
values in a bin that represents its value.  So the code below would hopefully
put .1 in the 0-.1 bin, .2 in the .11-.2 bin and so forth.  The outlying
values would then be put into an outer category representing everything
above 1.  I'm using the breaks to inform some code for making a choropleth map
that represents probabilities, which in some cases IS greater than 1


... not if it's a quantile or a probability.


and i need
to identify those better.


Define better.


As my code stands now, my real data is put
into this form when brks is called:

        0%        10%        20%        30%        40%        50%        60%
0.00000000 0.05054675 0.07787235 0.11235238 0.14424786 0.18089360 0.21475990
       70%        80%        90%       100%
0.26309899 0.30807771 0.39478573 0.67573483

But what I want is for the values to be placed in bins corresponding to
their value (0-.1, .11-.2, .21-.3 etc.)

Pct.SFD <- c(.1,.2,.3,.4,.5,.6,.7,.8,.9,1,2,3)
brks <- quantile(Pct.SFD, seq(0,1,1/10))

I think this is clear.


It's not. You need to decide whether you want the breaking to be  
driven by you or by the data. If you are doing the driving then use


cut(object, breaks=c(seq(0,1, by=0.1), Inf) , right=TRUE)

If the data is doing the driving then:

cut(object, breaks=quantile(object, probs= seq(0,1,1/10 ) ) ,  
right=TRUE)
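
Both options can be tried on the example vector from the question. This is a sketch: the breaks are written as (0:10)/10 rather than seq(0, 1, by = 0.1) purely as a precaution against floating-point surprises at the bin edges, and include.lowest = TRUE is an addition so that the minimum is not left unbinned when the breaks come from the data.

```r
Pct.SFD <- c(.1, .2, .3, .4, .5, .6, .7, .8, .9, 1, 2, 3)

# Analyst-driven breaks: (0,0.1], (0.1,0.2], ..., (0.9,1], then everything above 1
fixed_breaks <- c((0:10) / 10, Inf)
bins <- cut(Pct.SFD, breaks = fixed_breaks, right = TRUE)
table(bins)

# Data-driven breaks: deciles of the data itself
dec_breaks <- quantile(Pct.SFD, probs = seq(0, 1, 1/10))
dec_bins <- cut(Pct.SFD, breaks = dec_breaks, right = TRUE,
                include.lowest = TRUE)
table(dec_bins)
```

With the fixed breaks, the values 2 and 3 share the last bin, which is the "everything above 1" category asked for; a value of exactly 0 would need include.lowest = TRUE there as well.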


--
David.


Thanks

[R] dotplot

2010-03-22 Thread Veerappa Chetty

Hi,
I am trying to make a dot plot in increasing order of the values. It does
not work. How do I do it? Here are the codes I used. I am also attaching the
data. I use lattice library.
Thanks.
Chetty
--


y <- state.resid$Hospital.Name[state.resid$State == "MA" &
    is.na(state.resid$reg.resid) == FALSE]
x <- state.resid$reg.resid[state.resid$State == "MA" &
    is.na(state.resid$reg.resid) == FALSE]
dotplot(reorder(y, x) ~ x, xlab = "Regression Adjusted Rates")
---

-- 
Professor of Family Medicine
Boston University
Tel: 617-414-6221, Fax:617-414-3345
emails: chett...@gmail.com,vche...@bu.edu

[R] Parameter col.axis

2010-03-22 Thread Enio Jelihovschi

I have a very short question.
Is there any possibility to give to the parameter col.axis of graphics
function axis a vector value of many colors instead of just one color,
otherwise is there any way around it?
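
One possible workaround, offered as a sketch and not as an official answer: suppress the default axis and draw the labels one tick at a time, each with its own col.axis. The data and colours are made up for illustration.

```r
ticks <- 1:5
cols  <- c("red", "blue", "darkgreen", "orange", "purple")  # one colour per tick

# Plot without the default x axis
plot(ticks, ticks^2, xaxt = "n", xlab = "x", ylab = "x squared")

# Draw the axis line and tick marks once, without labels
axis(side = 1, at = ticks, labels = FALSE)

# Add the labels one at a time so each gets its own colour
for (i in seq_along(ticks)) {
  axis(side = 1, at = ticks[i], labels = ticks[i], tick = FALSE,
       col.axis = cols[i])
}
```

The same loop works for side = 2; another route would be mtext() with a vector passed to its col argument.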
Thank you very much
Enio Jelihovschi
UESC - Brazil


[R] Writing out result of tapply

2010-03-22 Thread James Rome

I need to write out the result of a tapply
avtaxi = tapply(mdf$TaxiTime, list(mdf$Runway, mdf$OnHour,
mdf$ArrivalGate), FUN=mean, na.rm = TRUE)

to a data file that I can import into Excel.
dim(avtaxi)
[1]  10  24 100

dput(avtaxi, file = outfile, control = c("keepNA", "keepInteger",
"showAttributes"))

seems to munge things up. I like the way avtaxi appears in the R console,
which gives (first of 100):
 avtaxi
, , A01

       0   1   2   3   4        5        6        7
08L  420  NA  NA  NA  NA 634.2857 545.7143 673.8462
08R   NA  NA 480  NA  NA 420.0000       NA       NA
09L   NA  NA  NA  NA  NA       NA       NA       NA
09R   NA  NA 480  NA  NA 432.0000 480.0000 540.0000
10    NA  NA  NA  NA  NA       NA       NA 743.0769
26L   NA 240  NA  NA  NA       NA 390.0000       NA
26R 1070 420  NA 540 660 393.7500 402.3529 613.0000
27L  600  NA 420  NA  NA 320.0000 330.0000 484.2857
27R   NA 240  NA  NA  NA 600.0000       NA 480.0000
28   600  NA  NA  NA  NA       NA       NA 717.3913

            8        9        10       11       12        13       14       15
08L  917.6471 750.0000  705.0000 764.2105 634.2857  777.1429 697.8947 649.4118
08R        NA       NA        NA       NA       NA 1260.0000       NA       NA
09L        NA 540.0000        NA       NA       NA        NA       NA       NA
09R  851.4286 847.0588  790.5882 540.0000 555.3846  642.5806 663.0000 717.8571
10  1122.8571 986.0870 1162.5000 756.0000 670.0000  720.0000 837.8571 745.7143
26L        NA       NA        NA       NA       NA  720.0000       NA       NA
26R  854.0000 649.2308  517.8947 550.3448 506.0377  555.0000 604.8649 588.0000
27L  859.2000 686.6667  486.6667 482.8571 460.9091  544.8000 616.6667 691.4286
27R  870.0000       NA  460.0000 480.0000 480.0000  780.0000       NA 480.0000
28   894.5455 796.0000  728.5714       NA 612.0000  671.4286 646.0000 777.6000

          16       17        18        19        20        21        22    23
08L 767.3684 695.4545  755.2941  872.0000  952.5000 1026.6667  684.0000 540.0
08R       NA       NA        NA        NA        NA  780.0000        NA    NA
09L       NA       NA        NA        NA        NA  600.0000        NA    NA
09R 880.0000 645.0000  687.5000  812.0930 1008.6486  678.4615  740.0000 450.0
10  708.0000 832.5000  906.6667  970.0000 1089.2308  850.0000 1020.0000    NA
26L       NA       NA  600.0000  480.0000 1260.0000        NA        NA 480.0
26R 557.1429 526.5517  752.0000  692.9412  787.5000  754.5455  676.3636 742.5
27L 604.0000 613.0435  720.0000  681.1765  815.4545  904.2857  720.0000    NA
27R       NA       NA        NA        NA        NA        NA        NA    NA
28  920.0000 720.0000 1050.0000 1107.0968  903.0000 1170.0000  630.0000 660.0

. . .

How do I get this written out with commas or spaces between the fields,
and no line wraps?
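
One way to get an Excel-friendly file, sketched here with a tiny made-up data frame standing in for mdf: flatten the three-dimensional tapply result into a long table with one row per Runway/OnHour/ArrivalGate cell, then write it with write.csv. The column names follow the question; the values and the temporary file are assumptions for illustration.

```r
# Small stand-in for the real data (illustrative values only)
mdf <- data.frame(
  Runway      = c("08L", "08L", "09R", "09R", "08L", "09R"),
  OnHour      = c(0, 0, 1, 1, 1, 0),
  ArrivalGate = c("A01", "A01", "A01", "A02", "A02", "A02"),
  TaxiTime    = c(420, 480, 540, 600, NA, 660)
)

# Naming the list elements gives the array named dimensions
avtaxi <- tapply(mdf$TaxiTime,
                 list(Runway = mdf$Runway, OnHour = mdf$OnHour,
                      ArrivalGate = mdf$ArrivalGate),
                 FUN = mean, na.rm = TRUE)

# One row per cell of the array, with the mean in its own column
long <- as.data.frame(as.table(avtaxi), responseName = "MeanTaxiTime")

out <- tempfile(fileext = ".csv")        # temporary file for the sketch
write.csv(long, file = out, row.names = FALSE)
head(long)
```

In Excel the long table can be turned back into a Runway by OnHour grid per gate with a pivot table; alternatively, each slice avtaxi[, , k] is an ordinary matrix that write.csv accepts directly.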

Thanks,
Jim Rome


Re: [R] outputing text colors

2010-03-22 Thread Greg Snow

Another possibility (depending on what you want to do/preferences) is to create 
a text file that can be postprocessed to give the colors.  One example of this 
is the etxtStart and related functions in the TeachingDemos package.  These 
produce a text file with extra notations that when processed with the enscript 
program includes different text colors.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of rtist
 Sent: Thursday, March 18, 2010 7:37 PM
 To: r-help@r-project.org
 Subject: [R] outputing text colors
 
 
 Hi all,
 
 I was wondering if there is a way to output text tables with the color of
 the text corresponding to a condition.

 More specifically, I'm outputting a time series table and want the console
 colors to be green for values >0 and red for values <0.
 This is very easy to do in Excel using conditional formatting. Any ideas on
 how to do it here?

 thanks.
 P.S. I've thought about using a heatmap, but it might be complicated using
 a ts series object.

[R] summary.formula and continuous variables

2010-03-22 Thread Erik Iverson


Hello,

I am using the summary.formula function in the Hmisc package to produce 
tables.


With the method argument set to response, the help says,

Continuous independent variables (see the ‘continuous’ parameter below) 
are automatically stratified into ‘g’ (see below) quantile groups.


By my reading, this makes it impossible to summarize a continuous 
variable with, for example, its correlation with the response variable.


Is there some sort of functionality I'm missing here, or is this just 
not possible with how summary.formula is written now?


Thanks,
Erik Iverson


Re: [R] a simple statistic question

Maybe I should simplify the problem with the following smaller table. I
just want to ask whether there is any significant difference in the
proportion of Good_Sample produced by factories located in City_A and City_B.

Factory_ID   Factory_Location   Total_Sample   Good_Sample
1            City_A             100            90
2            City_A             120            55
3            City_A             80             40
4            City_A             75             50
5            City_B             150            80
6            City_B             120            55
7            City_B             125            40
8            City_B             100            60
9            City_B             70             45
10           City_B             85             65
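
A base-R sketch of two ways to look at this table; neither is the multi-level approach suggested in the quoted reply below. The first pools the counts per city and so ignores factory-to-factory variation; the second keeps the factory as the unit and uses a quasibinomial family to allow for extra-binomial spread. The shortened column names are choices made for this illustration.

```r
factories <- data.frame(
  Factory_ID = 1:10,
  Location   = rep(c("City_A", "City_B"), times = c(4, 6)),
  Total      = c(100, 120, 80, 75, 150, 120, 125, 100, 70, 85),
  Good       = c(90, 55, 40, 50, 80, 55, 40, 60, 45, 65)
)

# 1. Pooled counts per city, compared with a two-sample test of proportions
pooled <- aggregate(cbind(Good, Total) ~ Location, data = factories, FUN = sum)
pooled
prop.test(x = pooled$Good, n = pooled$Total)

# 2. Factory-level logistic regression with a dispersion parameter
fit <- glm(cbind(Good, Total - Good) ~ Location,
           family = quasibinomial, data = factories)
summary(fit)$coefficients
```

If the factories differ a lot among themselves, the pooled test will tend to look more decisive than the data justify, which is one reason to prefer the second form or a mixed model.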


On Mon, Mar 22, 2010 at 2:56 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:

 I am not completely sure what your regression model looks like (what
 your outcome and predictors are).  It seems like you have different
 levels of data (samples nested in factories nested in cities).  What
 question do you really want to answer?  You might consider looking
 into multi-level analyses.  Douglas Bates has an excellent package
 lme4 that works with nested models.  Particularly check out ?glmer
 for the multi-level equivalent of glm().  I don't know if that really
 gets to your question of dealing with individual factory, but it is at
 least designed to handle different levels.  I only have a rudimentary
 knowledge of multi-level models or logistic regression so I cannot
 offer much advice.

 Best of luck,

 Joshua




-- 
Xiang Gao, Ph.D.
Department of Biology
University of North Texas


[R] Fw: Re: Why \\ instead of simple / to specify a file path [modified]

2010-03-22 Thread Ron Michael

Hi, I was following this thread and would like to ask is there any way to save 
and open a .RData file after using some Password. What I mean to say, how to 
make my workplace password-protected?
Also would like to know how same can be done for .R file.
Thanks for your time.
Thanks and regards,



Re: [R] Fw: Re: Why \\ instead of simple / to specify a file path [modified]

How about saving, then issuing system() calls to run a zipping program  
with password? Then remove the original.


?file.remove

(.R source files are just text.)
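
A rough sketch of that suggestion follows. It assumes the Info-ZIP `zip` program is on the PATH; the file names and the password are placeholders invented for this example. Note that a password passed with -P is visible to other users of the machine, and classic zip encryption is weak, so this illustrates the workflow rather than recommending it.

```r
# Assumptions: Info-ZIP 'zip' is installed; names and password are made up.
x <- 1:10
rdata   <- file.path(tempdir(), "workspace_demo.RData")
zipfile <- file.path(tempdir(), "workspace_demo.zip")
save(x, file = rdata)

ok <- FALSE
if (nzchar(Sys.which("zip"))) {
  # -j: store the file without its directory path
  # -P: take the password from the command line
  status <- system2("zip",
                    c("-j", "-P", "not-a-real-password",
                      shQuote(zipfile), shQuote(rdata)),
                    stdout = FALSE, stderr = FALSE)
  ok <- isTRUE(status == 0) && file.exists(zipfile)
}

# Remove the unprotected original only if the archive was really written.
if (ok) file.remove(rdata)
```

The same pattern would apply to a .R file, since it is ordinary text.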

--
David.
On Mar 22, 2010, at 4:53 PM, Ron Michael wrote:

Hi, I was following this thread and would like to ask is there any  
way to save and open a .RData file after using some Password. What I  
mean to say, how to make my workplace password-protected?

Also would like to know how same can be done for .R file.
Thanks for your time.
Thanks and regards,



Re: [R] Factors attribute format

On Mar 22, 2010, at 2:00 PM, rkevinbur...@charter.net wrote:

 Thanks to Marc Schultz I found the documentation on the factors attribute 
 under ?terms.object. It states:

cough   ;-)

 factors: A matrix of variables by terms showing which variables appear
  in which terms.  The entries are 0 if the variable does not
  occur in the term, 1 if it does occur and should be coded by
  contrasts, and 2 if it occurs and should be coded via dummy
  variables for all levels (as when an intercept or lower-order
  term is missing).  If there are no terms other than an
  intercept and offsets, this is ‘numeric(0)’.


The key is 'dummy variables for *all* levels'. In other words your example 
below of 12 months, would be represented by 12 individual binary (0/1) 
encodings, rather than, for example using default treatment contrasts, 11 
individual binary (0/1) encodings, where the base or reference level is not 
included in the resultant model matrix.

I have not spent a lot of time on this internal R/S model design point, but in 
rather simple cases as an example, a '2' will appear in the presence of 
interaction terms lacking the main effects term for the second factor:

> attr(terms(y ~ x + z), "factors")
  x z
y 0 0
x 1 0
z 0 1

> attr(terms(y ~ x + x:z), "factors")
  x x:z
y 0   0
x 1   2
z 0   1


Compare the second example above with the more common:

> attr(terms(y ~ x * z), "factors")
  x z x:z
y 0 0   0
x 1 0   1
z 0 1   1

which is of course equivalent to:

> attr(terms(y ~ x + z + x:z), "factors")
  x z x:z
y 0 0   0
x 1 0   1
z 0 1   1


The difference in the encodings will be reflected in the model matrix. See 
?model.matrix and play around with the examples there, including adding 
interaction terms. For example, model.matrix( ~ a + a:b, dd), etc.
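
As a small sketch of that suggestion: the data frame dd below mirrors the one built in the examples of ?model.matrix, and the object names f_main and f_nob are only labels chosen for this illustration.

```r
# 'dd' mirrors the small data frame used in the examples of ?model.matrix.
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))

f_main <- attr(terms(~ a + b + a:b), "factors")  # both main effects present
f_nob  <- attr(terms(~ a + a:b), "factors")      # main effect of b missing

f_main
f_nob

# The column names of the model matrices show the two codings side by side.
colnames(model.matrix(~ a + b + a:b, dd))
colnames(model.matrix(~ a + a:b, dd))
```

In the second formula the a:b column of the factors matrix carries a 2 for a, so within the interaction a is expanded into dummy variables for all of its levels while b keeps its contrasts.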

This discussion leads into the complex issue of the internal representation of 
R (and S) models. If you really want to dig deeper, then you should get a copy 
of Statistical Models in S by Chambers and Hastie 1993 (aka The White Book) 
and specifically note the rule described on the bottom of page 38 therein, 
perhaps pre-reading the entire chapter leading up to that particular point.

HTH,

Marc


 So now this brings up another question. It seems that the attribute is a 
 two-dimensional array. When I print it out in R, 
 fitting the formula prestige ~ income + education, I get:
 
           income education
 prestige       0         0
 income         1         0
 education      0         1
 
 This matrix says to me that 'income' occurs in the term 'income' etc. So it 
 seems that this matrix will always be a diagonal matrix with an added row of 
 zeros containing the response term. If the formula is such that the response 
 is a function of one or more of the dependent variables then of course it 
 will be something other than a row of zeros. So far OK?
 
 My problem in understanding comes with using a formula that contains R 
 factors. I am using the following (from the TSA package)  for an example:
 
 l <- lm(tempdub ~ season(tempdub))
 attr(l$terms, "factors")
 
                 season(tempdub)
 tempdub                       0
 season(tempdub)               1
 
 The function 'season' produces a factor (in this case with 12 levels, one for 
 each month). But the factor attribute still has a '1' and not a '2' 
 indicating that the variable should be coded as a dummy variable (factor).
 
 Please help my misunderstanding.
 
 Thank you.
 
 Kevin Burton


[R] using reorder in dotplot

2010-03-22 Thread Veerappa Chetty

Hi,

Name                             rate
HEALTHALLIANCE HOSPITALS, INC   -1.06211747
MOUNT AUBURN HOSPITAL            0.50960291
STURDY MEMORIAL HOSPITAL         2.64233232
LAWRENCE GENERAL HOSPITAL        2.15628558
CAMBRIDGE HEALTH ALLIANCE        1.23623144

I would like to use reorder in the dotplot function. I want the dots in
increasing order. I know how to do it using dotchart.

I would appreciate help. Also I could not easily find a method to post data
when I seek help in the posting guide.
Thanks.
Chetty
-- 
Professor of Family Medicine
Boston University
Tel: 617-414-6221, Fax:617-414-3345
emails: chett...@gmail.com,vche...@bu.edu


[R] sets package: converting a set to data frame?

2010-03-22 Thread Czerminski, Ryszard

I just started using the nice package sets,
and I wonder if there are utilities to convert (some) sets to a data frame
(as in the example below).

> library(sets)
> a <- gset(elements = list(e('A', 0.1), e('B', 0.8)))
> lst <- as.list(a)
> nr <- length(lst)
> rnames <- character()
> for (i in 1:nr) rnames[i] <- lst[[i]]
> df <- data.frame(row.names=rnames)
> df$memberships <- attr(lst, 'memberships')
> a
{"A" [0.1], "B" [0.8]}
> df
  memberships
A         0.1
B         0.8


Best regards,
Ryszard
---
Ryszard Czerminski



[R] determine upper convex hull, 2-dimensional case

2010-03-22 Thread Richard and Barbara Males

For an environmental planning example that involves looking at the
relative efficiencies of one plan over another, I need to determine
the pareto-efficient plans (which I have done), and then, within that
set of plans, determine the convex hull representing the outer upper
boundary of those points.  I have a dataframe, dfPlans,  as follows,
representing the pareto-efficient  cost and benefit of different
environmental restoration plans out of a larger set of plans, where
A1, EC, A6, and A4 are identifiers for the plans.

       Cost   Benefit
A1     0.00     0.000
EC     0.00  7821.689
A6 76783.19 16094.142
A4 78703.73 22245.760

I am interested in determining what I believe is called the upper
convex hull, i.e. the upper outer boundary if I plot benefit on the y
axis, cost on the x axis.  This should be plans A1, EC, and A4, and
not point A6.   I have used chull, which returns all of the points,
including A6, and have tried to use convhulln with the QU option, but
I am unclear as to how to interpret the results, which are returned as
follows:

> chull(dfPlans)
[1] 1 2 3 4
> convhulln(dfPlans, option="QU")
     [,1] [,2]
[1,]    3    2
[2,]    3    4
[3,]    1    2
[4,]    1    4

Any assistance greatly appreciated, any way to accomplish my goal
(need not use convhulln or chull).
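
One possible approach, sketched here in base R without chull or convhulln, is the upper half of a monotone-chain hull scan: sort the points by cost, then drop any point that does not make a clockwise turn. The function name upper_hull is hypothetical, and the data frame is retyped from the values in this post. Because A1 and EC share the same cost, the scan keeps the vertical edge between them, which matches the expected answer here; whether that is wanted in general is an assumption.

```r
# Data retyped from the post (Cost on the x axis, Benefit on the y axis).
dfPlans <- data.frame(
  Cost    = c(0.00, 0.00, 76783.19, 78703.73),
  Benefit = c(0.000, 7821.689, 16094.142, 22245.760),
  row.names = c("A1", "EC", "A6", "A4")
)

# Hypothetical helper: indices of the points on the upper hull, left to right.
upper_hull <- function(x, y) {
  ord  <- order(x, y)          # sort by x, ties broken by y
  hull <- integer(0)
  for (i in ord) {
    while (length(hull) >= 2) {
      a <- hull[length(hull) - 1]
      b <- hull[length(hull)]
      # Cross product of (b - a) and (i - a); >= 0 means b lies on or
      # below the segment a -> i, so b is not on the upper boundary.
      cross <- (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
      if (cross >= 0) hull <- hull[-length(hull)] else break
    }
    hull <- c(hull, i)
  }
  hull
}

idx <- upper_hull(dfPlans$Cost, dfPlans$Benefit)
rownames(dfPlans)[idx]
```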

Thanks in advance.

--
Richard M. Males
Cincinnati, OH USA


Re: [R] using reorder in dotplot



On Mar 22, 2010, at 5:17 PM, Veerappa Chetty wrote:


  Hi,

  Name                             rate
  HEALTHALLIANCE HOSPITALS, INC   -1.06211747
  MOUNT AUBURN HOSPITAL            0.50960291
  STURDY MEMORIAL HOSPITAL         2.64233232
  LAWRENCE GENERAL HOSPITAL        2.15628558
  CAMBRIDGE HEALTH ALLIANCE        1.23623144

 I would like to use reorder in the dotplot function. I want the dots in
 increasing order. I know how to do it using dotchart.


The Posting Guide also suggests that you offer code that constructs a  
dummy dataset if you cannot provide a representative real dataset.


With the barley dataset you can see the effects of sorting a factor  
variable with this code:


> library(lattice)   # barley and dotplot() come from lattice
> dotplot(variety ~ yield | year * site, data=barley)

> str(barley)
'data.frame':   120 obs. of  4 variables:
 $ yield  : num  27 48.9 27.4 39.9 33 ...
 $ variety: Factor w/ 10 levels "Svansota","No. 462",..: 3 3 3 3 3 3 7 7 7 7 ...
 $ year   : Factor w/ 2 levels "1932","1931": 2 2 2 2 2 2 2 2 2 2 ...
 $ site   : Factor w/ 6 levels "Grand Rapids",..: 3 6 4 5 1 2 3 6 4 5 ...

> levels(barley$site)
[1] "Grand Rapids"    "Duluth"          "University Farm" "Morris"          "Crookston"       "Waseca"

> # Rebuild the factor with sorted levels. Assigning sorted names with
> # levels(x) <- ... would relabel the data instead of reordering it.
> barley$site <- factor(barley$site, levels = sort(levels(barley$site)))
> dotplot(variety ~ yield | year * site, data=barley)

> # Order the varieties by their mean yield.
> barley$variety <- reorder(barley$variety, barley$yield, FUN = mean)
> levels(barley$variety)
 [1] "Svansota"         "Manchuria"        "No. 475"          "Velvet"           "Glabron"          "Peatland"
 [7] "No. 462"          "No. 457"          "Wisconsin No. 38" "Trebi"
> dotplot(variety ~ yield | year * site, data=barley)
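
Applied to the data in the question, the same idea might look like the sketch below. The data frame name hosp is invented for this example and the values are retyped from the post. It assumes the lattice package is installed.

```r
library(lattice)

# Values retyped from the original question; 'hosp' is a made-up name.
hosp <- data.frame(
  Name = c("HEALTHALLIANCE HOSPITALS, INC",
           "MOUNT AUBURN HOSPITAL",
           "STURDY MEMORIAL HOSPITAL",
           "LAWRENCE GENERAL HOSPITAL",
           "CAMBRIDGE HEALTH ALLIANCE"),
  rate = c(-1.06211747, 0.50960291, 2.64233232, 2.15628558, 1.23623144)
)

# reorder() returns a factor whose levels are sorted by 'rate' (ascending).
hosp$Name <- reorder(factor(hosp$Name), hosp$rate)
levels(hosp$Name)

# dotplot() draws the factor levels in level order, lowest at the bottom.
p <- dotplot(Name ~ rate, data = hosp)
print(p)
```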




I would appreciate help. Also I could not easily find a method to  
post data

when I seek help in the posting guide.


I was under the impression that files with extension .txt would pass  
the server filter. I am attaching two copies of the same file, one  
with .txt and the other with .csv as extensions. My experience tells  
me that only the .txt file will pass.



-- David
test.csv




name.txt
;V1
Min.;7,803
1st Qu.;9,633
Median;10,15
Mean;10,17
3rd Qu.;10,75
Max.;12,41






Re: [R] sets package: converting a set to data frame?

2010-03-22 Thread Henrique Dallazuanna

Try this:

data.frame(row.names = unlist(a), gset_memberships(a))

On Mon, Mar 22, 2010 at 6:26 PM, Czerminski, Ryszard
ryszard.czermin...@astrazeneca.com wrote:
 I just started using nice package sets
 and I wonder if there are utilities to convert (some) sets to data frame
 (as in the example below)

 > library(sets)
 > a <- gset(elements = list(e('A', 0.1), e('B', 0.8)))
 > lst <- as.list(a)
 > nr <- length(lst)
 > rnames <- character()
 > for (i in 1:nr) rnames[i] <- lst[[i]]
 > df <- data.frame(row.names=rnames)
 > df$memberships <- attr(lst, 'memberships')
 > a
 {"A" [0.1], "B" [0.8]}
 > df
   memberships
 A         0.1
 B         0.8


 Best regards,
 Ryszard
 ---
 Ryszard Czerminski






-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


Re: [R] Password Protection of Data Files and R Code (Was: Fw: Re: Why \\ instead of simple / to specify a file path [modified])

You need to specify more detail on your functional requirements relative to 
protection. R itself does not support the direct use of encrypted data or R 
source files.

If you simply want to encrypt/decrypt files before and after use in R, you can 
use third party programs such as GnuPG (http://www.gnupg.org/) or commercial 
equivalents.

However, once the file/data is in memory (RAM) during an R session, it may be 
written to a tmp or swap partition on the disk by R itself or by the OS, in 
which case, it will be available 'in the clear' to someone else with physical 
access to your computer.

If this is a concern, then you want to look at whole disk/volume encryption 
programs that protect your entire disk and require a password at system boot or 
during the mounting of an encrypted disk/partition. The details of this will 
depend upon the OS. For example on higher end Windows systems, there is 
BitLocker, on Linux there is DM-Crypt/LUKS and on OSX there is FileVault. There 
are also third party applications such as TrueCrypt, PGP Desktop/WDE (which I 
use on OSX) and WinMagic, which are available on multiple operating systems.

In the second scenario, everything on the physical disk is encrypted and the 
reading and writing of these files, which includes the decryption and 
encryption, is done transparently during disk I/O. Thus, there is no need to 
manage individual files.

Each of these approaches has pros and cons relative to security, the impact on 
operating procedures and to some extent, system performance.

There was also a thread covering related matter back in late 2007:

  http://thread.gmane.org/gmane.comp.lang.r.general/94290/

In the future, with a significant subject matter change like this, please start 
a new thread.

HTH,

Marc Schwartz

On Mar 22, 2010, at 3:53 PM, Ron Michael wrote:

 Hi, I was following this thread and would like to ask is there any way to 
 save and open a .RData file after using some Password. What I mean to say, 
 how to make my workplace password-protected?
 Also would like to know how same can be done for .R file.
 Thanks for your time.
 Thanks and regards,

snip


[R] Distance between lines

2010-03-22 Thread Janmaat, John

Hello,

I'm trying to assess the similarity of two lines that are represented as points 
(output of differential equation solvers).  Is there a function or a package 
that deals with things like this?

Thanks,

John.



Johannus (John) Janmaat
Assistant Professor of Economics
Barber School of Arts and Sciences
University of British Columbia - Okanagan
john.janm...@ubc.ca



Re: [R] Find a rectangle of maximal area

2010-03-22 Thread Rolf Turner


On 23/03/2010, at 6:03 AM, Barry Rowlingson wrote:

(Commenting on the sort of articles to be found in Computer
Science journals)

SNIP

 The idea of actually producing some dirty, filthy, actual code to
 implement their shiny algorithms never seems to cross their minds.

SNIP


Fortune?

cheers,

Rolf Turner


[R] swutching rows to columns

2010-03-22 Thread LCOG1


Hi All, 
  
Consider the following:
TRN <- c(5.809657, 3.1, 1.774901e-02)
TRN_CLUST <- c(-4.174682e-05, 5.538742e-05, 1.2)
EmpCo <- data.frame(TRN, TRN_CLUST)
row.names(EmpCo) <- c("Slope", "Fwy", "Univ")

returns:
             TRN     TRN_CLUST
Slope 5.80965700 -4.174682e-05
Fwy   3.1000      5.538742e-05
Univ  0.01774901  1.20e+00

Now my own data is actually first constructed in list form (see below), so
perhaps it would be easier to perform the rows-to-columns operation from
that.

List form:
$TRN
     Slope        Fwy   UnivDist
5.80965700     3.1000 0.01774901

What I would like to do is switch the rows to columns so that the above now
shows:

              Slope      Fwy       Univ
TRN        5.809657      3.1 0.01774901
TRN_CLUST -4.17E-05 5.54E-05   1.20E+00


I tried some things from the reshape package, but I don't think that's what I
want. I will need to do this for more variables and initial columns than
shown here, so if the process is automated or easily put into an
automated (loop) form, that would be best. Gracias
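
For what it is worth, a minimal sketch of one way to do this in base R is t(), which transposes a data frame into a matrix. When the values already sit in a list of named vectors, do.call(rbind, ...) builds the same layout directly. The list lst below is a stand-in constructed from the numbers in this post, and its element names are assumed to match the row names used above.

```r
TRN       <- c(5.809657, 3.1, 1.774901e-02)
TRN_CLUST <- c(-4.174682e-05, 5.538742e-05, 1.2)
EmpCo <- data.frame(TRN, TRN_CLUST)
row.names(EmpCo) <- c("Slope", "Fwy", "Univ")

# t() returns a matrix; wrap it in as.data.frame() if a data frame is needed.
EmpCo_t <- as.data.frame(t(EmpCo))
EmpCo_t

# Starting from a list of named vectors (stand-in for the poster's list),
# each list element becomes one row.
lst <- list(
  TRN       = c(Slope = 5.809657, Fwy = 3.1, Univ = 1.774901e-02),
  TRN_CLUST = c(Slope = -4.174682e-05, Fwy = 5.538742e-05, Univ = 1.2)
)
from_list <- do.call(rbind, lst)
from_list
```

Both forms scale to any number of variables without an explicit loop.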

JR

Re: [R] dotplot