date:20110518

[R] Constrained Nonlinear Optimization - lack of convergence

2011-05-18 Thread mcgrete

Hello,

I am attempting to utilize the 'alabama' package to solve a constrained
nonlinear optimization problem.

The problem has both equality and inequality constraints (heq and hin
functions are used).  All constraints are smooth, i.e. I can differentiate
easily to produce heq.jac and hin.jac functions. 

My initial solution is feasible; I am attempting to maximize a function,
phi.  As such, I create an objective or cost function '-phi', and the
gradient of the cost function '-dphi/dz'.  

I will gladly provide the detailed code, but perhaps an overview of the
problem may be sufficient.

0.  I installed 'alabama' and was successful at solving the example problem.

1.  My constraints are:
z=0 (for several elements in the vector z)
z>=0 (for remaining elements in vector z)
Z - sum(z) >=0, where Z is a constant real number.

2.  My cost function to maximize is (or, minimize -phi):  phi==SUM[
p[i]*LN{f[i]} ], where sum is for i=1:length(z)
 and where f[i]=={(1-r)*SUM[z+s]-z[i]-s[i]}*z[i]/(z[i]+s[i]) + Z -
sum(z)  
 and where s, p are vectors of length(z) and are constants.  Note,
elements of p & s are all >0
 and where (1-r) is a scalar >0
 Note: f[i], under the constraints listed above that is, should always
be >= 0
 

3.  I can readily calculate the gradient of phi, where in general:
dphi/dz=d/dz[i] of phi== p*f'/f, where f' is df/dz[i].

4.  I created functions for inequality and equality constraints and their
jacobians, the cost function, and the grad of cost function.
 
5.  I utilize the alabama package, and the 'auglag' function.  As a first
attempt, I utilized only a single inequality constraint for z>0, all other z
constraints are z=0, and the Z - sum(z) > 0 inequality constraint.  I used
default settings, except for attempts to utilize various 'methods', e.g.
BFGS, Nelder-Mead.

Review of the alabama package source code leads me to believe that this code
automatically generates the Lagrangian of the cost function augmented with
Lagrangian multipliers, and also generates the gradient of the augmented
Lagrangian.  Hence, I assume (perhaps incorrectly) that auglag
automatically generates the dual problem, and attempts to find a solution
to the dual problem by calling 'optim'.

MY ISSUE:
The code often runs successfully (converges); sometimes with satisfying
(TRUE) KKT1 and KKT2, sometimes only 1 of the 2.  Sometimes it fails to
converge at all.  When it does converge, I do not obtain the same optimum
condition when I utilize different initial conditions.  When it does fail to
converge, I often end up with a NaN, generated when attempting to take
log(f[i]), meaning that f[i]<0, and I interpret and observe that some or all
of the elements of the vector z are less than zero, despite my constraints.

QUESTION
Other than the obvious - review my code for typos, etc, which I believe have
been resolved...
1.  Can the alabama procedure take a solution path that may not satisfy the
constraints?  If not, then I must have an error in my code despite attempts
to eliminate and I must review yet again.  

2.  If the path may not satisfy all of the constraints (perhaps due to
steep gradients), how can I avoid this situation?

2a.  I presume that some of the issues may be with difference in scaling,
e.g. say s=[200,500,400,300,100], p=[0.1,0.2,0.4,0.1,0.2], Z=1000,
(1-r)=0.8, and initial starting point for z=[0,0,200,0,0].  However, I am
not experienced at scaling these or the constraints.  Any suggestions?

2b.  I am not an expert in optimization, but have some background in
math/engineering.  I suspect and hope that something as simple as relaxing
the constraints on z>=0 to z>=delta, where delta is a small positive number,
may help - any comments?  I admit, I am lazy for not trying this, as I just
thought of it while writing this post.

2c.  I am dangerously knowledgeable that penalty functions exist, but I am
uncertain on how to utilize and how to determine how to select the term
'sig0'.  Suggestions?

2d.  Thinking more, I have not rigorously attempted to modify the tolerance
for convergence, thinking that perhaps my issue is more related to the
solution path not remaining in the constraints being the issue, and not my
convergence.  Am I incorrect in thinking so?  
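
On question 1: augmented-Lagrangian methods in general only satisfy the constraints at convergence, so intermediate iterates can leave the feasible region and log(f[i]) can then be evaluated at f[i] <= 0. Below is a minimal sketch of one defensive pattern, assuming the f[i] formula and the example constants from this post: return a large finite value instead of NaN whenever any f[i] is non-positive. The names f_vec, neg_phi and the penalty value `big` are illustrative and not part of alabama; a hard penalty like this may also upset gradient-based methods such as BFGS, so treat it as a diagnostic aid rather than a fix.

```r
# Example constants taken from question 2a of the post
s <- c(200, 500, 400, 300, 100)
p <- c(0.1, 0.2, 0.4, 0.1, 0.2)
Z <- 1000
one_minus_r <- 0.8  # the scalar (1-r)

# f[i] = {(1-r)*SUM[z+s] - z[i] - s[i]} * z[i]/(z[i]+s[i]) + Z - sum(z)
f_vec <- function(z) {
  (one_minus_r * sum(z + s) - z - s) * z / (z + s) + Z - sum(z)
}

# Cost function -phi, guarded so that infeasible iterates give a large
# finite value rather than NaN from log() of a non-positive number.
neg_phi <- function(z, big = 1e10) {
  f <- f_vec(z)
  if (any(!is.finite(f)) || any(f <= 0)) return(big)
  -sum(p * log(f))
}

z0 <- c(0, 0, 200, 0, 0)        # the feasible starting point from the post
neg_phi(z0)                     # finite cost at the starting point
neg_phi(c(0, 0, 2000, 0, 0))    # violates Z - sum(z) >= 0, returns `big`
```

Logging z and f_vec(z) inside neg_phi during a run is a cheap way to see whether the NaN really comes from an infeasible iterate or from a coding error.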

I would appreciate any assistance that someone can provide.  Again, if the
code is required, I will share, but I hope that I have defined my problem
well enough above so as to avoid anyone having to sort through / debug my
own code.

Much appreciated,
Tim



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] PAM Clustering Ignores Cluster Number Parameter

2011-05-18 Thread Dario Strbenac

I am using PAM with k = 10 clusters, but I only get one cluster ID for all my 
observations. I couldn't find any discussion about this in the help file, or 
mailing lists.

Is there a reasonable explanation for this result ?

> cIDs <- pam(all, 10, cluster.only = TRUE, do.swap = FALSE)
> table(cIDs)
cIDs
0 
16671

The matrix of observations can be found at : 
http://129.94.136.7/file_dump/dario/all.obj

I'm using R version 2.13.0 (2011-04-13) on Platform: x86_64-unknown-linux-gnu 
(64-bit) and have cluster_1.13.3.

--
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia


[R] scatter plot: multiple Y variables and error bars

2011-05-18 Thread B77S

# Hi all,
# Using the example data that follows, can someone please show me how to get
# a scatterplot of points with error bars in the Y direction. something like
# this works for one Y:

xYplot(Cbind(y1, l1, u1) ~x1,  data=y)

# but  this:

xYplot(Cbind(y1, l1, u1) + Cbind(y2, l2, u2)~x1,  data=y)

# doesn't give me what I would have expected, which is both sets of points
# to have their respective error bars.  Any examples would be greatly
# appreciated, and I am not partial to xYplot, so please share anything you
# like.

y1 <- c(1, 1.2, 0.9, 1, 1.2)
u1 <- c(1.3, 1.4, 1.3, 1.2, 1.4)
l1 <- c(0.8, 0.9, 0.85, 0.8, 0.9)
x1 <- c(1:5)
y2 <- c(1.2, 1.4, 1.2, 1.4, 1.5)
u2 <- c(1.5, 1.8, 1.6, 1.6, 1.7)
l2 <- c(1.1, 1.3, 1.0, 1.2, 1.4)
y <- data.frame(y1,u1,l1,x1)


## thanks ahead of time!
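
Since the post is open to alternatives to xYplot, here is one possible base-graphics sketch using the example data above. It draws each series with points() and uses arrows() with flat heads (angle = 90, code = 3) as error bars. The function name plot_two_series, the colours and the plotting symbols are illustrative choices, and the data frame here includes y2, u2 and l2 as well.

```r
dat <- data.frame(
  x1 = 1:5,
  y1 = c(1, 1.2, 0.9, 1, 1.2),
  l1 = c(0.8, 0.9, 0.85, 0.8, 0.9),
  u1 = c(1.3, 1.4, 1.3, 1.2, 1.4),
  y2 = c(1.2, 1.4, 1.2, 1.4, 1.5),
  l2 = c(1.1, 1.3, 1.0, 1.2, 1.4),
  u2 = c(1.5, 1.8, 1.6, 1.6, 1.7)
)

plot_two_series <- function(dat) {
  # Common y range so that both sets of error bars fit in the plot
  ylim <- range(unlist(dat[c("l1", "u1", "l2", "u2")]))
  plot(dat$x1, dat$y1, ylim = ylim, pch = 16, xlab = "x1", ylab = "y")
  # arrows() with angle = 90 and code = 3 draws a bar with a cap at each end
  arrows(dat$x1, dat$l1, dat$x1, dat$u1, angle = 90, code = 3, length = 0.05)
  points(dat$x1, dat$y2, pch = 17, col = "red")
  arrows(dat$x1, dat$l2, dat$x1, dat$u2, angle = 90, code = 3, length = 0.05,
         col = "red")
  invisible(ylim)
}

ylim_used <- plot_two_series(dat)
```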


[R] How to make array of regression objects

2011-05-18 Thread Dmitrij Kudriavcev

Dear all,

I have fitted a couple of logistic regressions that model the distribution
of some event.

Currently, I store them like this:

o1 <- lrm(...)
o2 <- lrm(...)
o3 <- lrm(...)
...

Then, I wrote a function to pick the required regression object from these
variables by its number:

get_object <- function(obj_name, nModel) {
  eval(parse(text=paste("o <- ", obj_name, nModel, sep="")))
  o
}

Is there a better way to do it? I have tried to store them in a matrix using
data.frame(), but the objects are destroyed by that and the predict()
function does not recognize them.
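
A list is the usual R container for objects like fitted models, because it keeps each element's class intact. The sketch below is only an illustration: it uses glm() from base R in place of lrm() so that it runs without extra packages, and the simulated data, the formulas and the name `models` are made up for the example.

```r
set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- rbinom(50, 1, plogis(d$x))

# One formula per model; fit them all and keep the fits in a list
formulas <- list(y ~ 1, y ~ x, y ~ x + I(x^2))
models <- lapply(formulas, function(f) glm(f, family = binomial, data = d))

# Pick a model by its number; the element is still a full glm object,
# so predict() works on it directly
fit2 <- models[[2]]
pred <- predict(fit2, newdata = data.frame(x = 0), type = "response")
```

With this layout, get_object(nModel) reduces to models[[nModel]] and no eval(parse()) is needed.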

Regards,
Dmitrij Kudriavcev


[R] text mining analysis and word visualization of pdfs

2011-05-18 Thread Ajay Ohri

Dear Lists,

What is the appropriate software package for dumping, say, 20 PDFs in a
folder, then creating data visualizations with frequency counts of
certain words, as well as measuring correlation within each file for
certain key relationships or key words?

I am doing text analysis of biases in publications sponsored by enterprise
software vendors, and need to come up with a statistical threshold.

Regards,

Ajay Ohri

Websites-
http://decisionstats.com


[R] Date_Time detected as Duplicated (but they are not!)

2011-05-18 Thread Agustin Lobo

I have a problem with duplicated date_time stamps that I do not see as
duplicated.

I read a file with observations taken every 30 minutes:

> aur2009=read.csv(paste(datadir,"AUR_ECPP_2009.csv",sep="/"),sep=";",stringsAsFactors=F)
> aur2009[1:3,1:5]
      Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
1 1/1/2009 0:00        0           NaN      5.86            NaN
2 1/1/2009 0:30        0           NaN      5.05            NaN
3 1/1/2009 1:00        0           NaN      5.56            NaN

> delme = strptime(aur2009[,1], "%m/%d/%Y %H:%M")
> aur2009[,1]=as.POSIXct(delme)
            Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
1 2009-01-01 00:00:00        0           NaN      5.86            NaN
2 2009-01-01 00:30:00        0           NaN      5.05            NaN
3 2009-01-01 01:00:00        0           NaN      5.56            NaN

> aur2009ts = ts(aur2009)
> row.names(aur2009ts) = as.character(delme)
> aur2009ts[1:3,1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-01-01 00:00:00 1230764400        0           NaN      5.86            NaN
2009-01-01 00:30:00 1230766200        0           NaN      5.05            NaN
2009-01-01 01:00:00 1230768000        0           NaN      5.56            NaN

Then:
> aur2009z = zoo(aur2009[,2:12],as.POSIXct(delme))
Warning message:
In zoo(aur2009[, 2:12], as.POSIXct(delme)) :
  some methods for “zoo” objects do not work if the index entries in
‘order.by’ are not unique

So I investigate:
> any(duplicated(aur2009ts[,1]))
[1] TRUE

> aur2009ts[(duplicated(aur2009ts[,1])),1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-03-29 02:00:00 1238284800        0           NaN       1.2            NaN
2009-03-29 02:30:00 1238286600        0           NaN       1.2            NaN

But note the surprise:
> aur2009ts[aur2009ts[,1]==1238284800,1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-03-29 01:00:00 1238284800        0           NaN     -0.58            NaN
2009-03-29 02:00:00 1238284800        0           NaN      1.20            NaN
> aur2009ts[aur2009ts[,1]==1238286600,1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-03-29 01:30:00 1238286600        0           NaN     -0.34            NaN
2009-03-29 02:30:00 1238286600        0           NaN      1.20            NaN

The dates detected as duplicated are actually different times that got
the same value in the ts version of the object!
What am I doing wrong? They are all observations every 30min, why are
these 2 encoded as the
same time?

Any help appreciated

Agus


Re: [R] Date_Time detected as Duplicated (but they are not!)

2011-05-18 Thread Michael Sumner

See under Note in ?strptime:

     Remember that in most timezones some times do not occur and some
     occur twice because of transitions to/from summer time.
     ‘strptime’ does not validate such times (it does not assume a
     specific timezone), but conversion by ‘as.POSIXct’ will do so.








-- 
Michael Sumner
Institute for Marine and Antarctic Studies, University of Tasmania
Hobart, Australia
e-mail: mdsum...@gmail.com


Re: [R] Help with PLSR with jack knife

2011-05-18 Thread Bjørn-Helge Mevik

Amit Patel <amitrh...@yahoo.co.uk> writes:

> BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, validation = "LOO")
>
> and
>
> BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, validation = "CV")
[...]
> Now I am unsure of how to utilise these to identify the significant
> variables.

You can use the jackknife built into plsr to get an indication about
significant variables, by adding the argument jackknife = TRUE to the
plsr call.  Use jack.test(BHPLS1) to do the test.

But _PLEASE_ do read the Warning section in ?jack.test!

-- 
Regards,
Bjørn-Helge Mevik


Re: [R] help with PLSR Loadings

2011-05-18 Thread Bjørn-Helge Mevik

Amit Patel <amitrh...@yahoo.co.uk> writes:

> x <- loadings(BHPLS1)
>
> my loadings contain variable names rather than numbers.

No, they don't.

> > str(x)
>  loadings [1:94727, 1:10] -0.00113 -0.03001 -0.00059 -0.00734 -0.02969 ...
>  - attr(*, "dimnames")=List of 2
>   ..$ : chr [1:94727] "PCIList1" "PCIList2" "PCIList3" "PCIList4" ...
>   ..$ : chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...
>  - attr(*, "explvar")= Named num [1:10] 14.57 6.62 7.59 5.91 3.26 ...
>   ..- attr(*, "names")= chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...

Look at the first line of output.  These are the values, and they are
numeric (it is a matrix).  The other lines are attributes of the matrix.

> > plot(BHPLS1, "loadings", comps = 1:2, legendpos = "topleft", labels = "numbers", xlab = "nm")
> Error in loadingplot.default(x, ...) :
>   Could not convert variable names to numbers.

This says that loadingplot.default could not convert variable _names_ to
numbers.  That is not surprising, since the variable names are "PCIList1",
"PCIList2", etc., and the documentation for loadingplot says:

 with 'numbers', the variable names are converted to numbers, if
 possible.  Variable names of the forms 'number' or 'number
 text' (where the space is optional), are handled.

So don't ask the plot function to use numbers as labels.  Use e.g. names
instead: labels = "names".

Tip: It is always a good idea to read the output and error messages very
carefully.

-- 
Regards,
Bjørn-Helge Mevik


Re: [R] text mining analysis and word visualization of pdfs

2011-05-18 Thread Karl Ove Hufthammer

Ajay Ohri wrote:

> What is the appropriate software package for dumping say 20 PDFS in a
> folder, then creating data visualization with frequency counts of
> certain words as well as measure correlation within each file for
> certain key relationships or key words.

pdftotext + Unix™ for Poets + R (ggplot2)
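
The R end of that pipeline could look roughly like the sketch below. It assumes the PDFs have already been converted to plain text (for example with pdftotext) and read into a named character vector, one string per file. The two short strings, the keyword list and the helper name count_words are placeholders made up for the illustration.

```r
# Placeholder texts standing in for the pdftotext output of two files
texts <- c(
  doc1 = "Enterprise software vendors sponsor reports. Software reports vary.",
  doc2 = "Independent reports compare software, hardware and services."
)
keywords <- c("software", "reports")

# Count how often each keyword occurs in one text (case-insensitive)
count_words <- function(text, keywords) {
  words <- tolower(unlist(strsplit(text, "[^[:alpha:]]+")))
  words <- words[nzchar(words)]
  counts <- table(factor(words, levels = keywords))
  setNames(as.integer(counts), keywords)
}

# One row per document, one column per keyword
freq <- t(sapply(texts, count_words, keywords = keywords))
freq
```

A matrix like freq can then be passed to cor() across documents or reshaped for plotting with ggplot2.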

HTH.

-- 
Karl Ove Hufthammer


[R] XLSM Question

2011-05-18 Thread Gabrielsen, Alexandros

Hi,
 
I would like to ask how to read an array from an *.xlsm file. I have tried
different packages such as xlsReadWrite and RODBC. Everything was run with
the latest versions of the add-on packages and R.

Additionally, when I tried RODBC I received the following error:

> library(RODBC)
> con = odbcConnectExcel("C:\\Temp.xlsm")
Warning messages:
1: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
  [RODBC] ERROR: state HY000, code -5120, message [Microsoft][ODBC Excel Driver]
  External table is not in the expected format.
2: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
  ODBC connection failed

 

Many thanks,

Alexandros



Re: [R] Date_Time detected as Duplicated (but they are not!)

2011-05-18 Thread Agustin Lobo

And is it not possible to ignore daylight saving time? My data are in UTC,
with no daylight saving time changes:

> delme = strptime(aur2009[,1], "%m/%d/%Y %H:%M", tz="UTC")
> any(duplicated(delme))
[1] TRUE

> delme = as.POSIXct(aur2009[,1], "%m/%d/%Y %H:%M", tz="UTC")
> any(duplicated(delme))
[1] TRUE

Agus


Re: [R] Date_Time detected as Duplicated (but they are not!)

2011-05-18 Thread Timothy Bates

Dear Agustin: What are the duplicated times? It looks like they really do
occur twice or more in your original data: perhaps two stamps closer together
than the resolution of your clock?

delme[duplicated(delme)]
aur2009[duplicated(delme),1]
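
As a small self-contained check of that reasoning: when the four half-hour stamps around the 2009-03-29 transition are parsed with tz = "UTC", they stay distinct and 30 minutes apart. So if duplicated() still reports TRUE after parsing in UTC, the repeats are probably already present in the raw strings. The vector `stamps` below is typed in by hand for the illustration and is not read from the poster's file.

```r
stamps <- c("3/29/2009 1:00", "3/29/2009 1:30",
            "3/29/2009 2:00", "3/29/2009 2:30")

# Parse as UTC so that no daylight saving transition is applied
utc <- as.POSIXct(strptime(stamps, "%m/%d/%Y %H:%M", tz = "UTC"))

anyDuplicated(utc)       # 0 means no duplicated time stamps
diff(as.numeric(utc))    # spacing between consecutive stamps, in seconds

# On real data, inspecting the raw strings shows genuine repeats, if any
stamps[duplicated(stamps)]
```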


[R] Integral Symbol

2011-05-18 Thread Javi Hidalgo


Dear All,

I am documenting an R package, which means writing the *.Rd files inside the
man/ folder of the package structure.
I was wondering how to write the symbol for an integral in a formula,
similar to this one in LaTeX:

\int_{0}^{10} \Omega(t)dt

I already tried

\deqn{\int_{0}^{10} \Omega(t)dt}

but it does not work. Any idea? Which math symbols does R-help recognise?

Regards,

Javier Hidalgo Carrio
  

[R] DCC-GARCH model

2011-05-18 Thread Marcin P?�ciennik

Hello,
I have a few questions concerning the DCC-GARCH model and its programming in
R.
So here is what I want to do:
I take quotes of two indices - SP500 and DJ. And the aim is to estimate
coefficients of the DCC-GARCH model for them. This is how I do it:


library(tseries)
p1 = get.hist.quote(instrument = "^gspc", start = "2005-01-07", end =
"2009-09-04", compression = "w", quote = "AdjClose")
p2 = get.hist.quote(instrument = "^dji", start = "2005-01-07", end =
"2009-09-04", compression = "w", quote = "AdjClose")
p = cbind(p1,p2)
y = diff(log(p))*100
y[,1] = y[,1]-mean(y[,1])
y[,2] = y[,2]-mean(y[,2])
T = length(y[,1])

library(ccgarch)
library(fGarch)

f1 = garchFit(~ garch(1,1), data=y[,1],include.mean=FALSE)
f1 = f1@fit$coef
f2 = garchFit(~ garch(1,1), data=y[,2],include.mean=FALSE)
f2 = f2@fit$coef

a = c(f1[1], f2[1])
A = diag(c(f1[2],f2[2]))
B = diag(c(f1[3], f2[3]))
dccpara = c(0.2,0.6)
dccresults = dcc.estimation(inia=a, iniA=A, iniB=B, ini.dcc=dccpara, dvar=y,
model="diagonal")

dccresults$out
DCCrho = dccresults$DCC[,2]
matplot(DCCrho, type='l')


dccresults$out gives me the estimated coefficients of the DCC-GARCH model.
And here is my first question:
How can I check whether these coefficients are significant or not? How can I
test them for significance?

My second question would be:
Is it true that matplot(DCCrho, type='l') shows the conditional correlation
between the two indices in question?

and the third one:
What is actually dccpara and why do I get totally different DCC-alpha and
DCC-beta coefficients if I change dccpara from c(0.2,0.6) to, let's say,
c(0.01, 0.98) ? What determines which values should be chosen?
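
On the first question, one common (asymptotic) approach is a Wald-type ratio: divide each estimate by its standard error and compare the result with a standard normal distribution. The sketch below assumes you can extract a vector of estimates and a matching vector of standard errors from dccresults$out; check the layout of that object in your version of ccgarch. The helper name wald_table is hypothetical, and the numbers are placeholders, not estimates from the SP500/DJ data. Such tests can be unreliable when a parameter lies close to the boundary of its admissible range.

```r
# Wald-type z ratios and two-sided p-values from estimates and std. errors
wald_table <- function(est, se) {
  z <- est / se
  data.frame(estimate = est, std.error = se, z = z,
             p.value = 2 * pnorm(-abs(z)))
}

# Placeholder numbers only, NOT results from the data in this post
est <- c(dcc.alpha = 0.05, dcc.beta = 0.90)
se  <- c(dcc.alpha = 0.02, dcc.beta = 0.05)
tab <- wald_table(est, se)
tab
```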


Hopefully someone will find time to give me a hand.

Thank you very much in advance, people of good will, for looking at/checking
what I wrote and helping me.

Best regards
Marcin


[R] Need expert help with model.matrix

2011-05-18 Thread Axel Urbiz

Dear experts:

Is it possible to create a new function based
on stats:::model.matrix.default so that an alternative factor coding is used
when the function is called instead of the default factor coding?

Basically, I'd like to reproduce the results in 'mat' below, without having
to explicitly specify my desired factor coding (identity matrices) in the
'contrasts.arg'.

dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
ca <- contrasts(dd$a, contrasts= FALSE)  # 3 x 3 identity matrix
cb <- contrasts(dd$b, contrasts= FALSE)  # 4 x 4 identity matrix
mat <- model.matrix(~ a + b, dd, contrasts.arg = list(a=ca, b=cb))

My approach was to modify the code in model.matrix by explicitly setting the
contrasts argument in the contr.identity and contrasts function to FALSE.
This is shown at the bottom of the email in the function model.matrix2:

contr.identity <- contr.treatment
formals(contr.identity)$contrasts <- FALSE

contrasts <- contrasts
formals(contrasts)$contrasts <- FALSE

However, I believe this function is using contrasts = TRUE, as it doesn't
return the identity contrasts:
mat2 <- model.matrix2(~ a + b, dd)

Any help here is much appreciated.
Axel.

-

model.matrix2 <-
function (object, data = environment(object), contrasts.arg = NULL,
    xlev = NULL, ...)
{
    t <- if (missing(data))
        terms(object)
    else terms(object, data = data)
    if (is.null(attr(data, "terms")))
        data <- model.frame(object, data, xlev = xlev)
    else {
        reorder <- match(sapply(attr(t, "variables"), deparse,
            width.cutoff = 500)[-1L], names(data))
        if (any(is.na(reorder)))
            stop("model frame and formula mismatch in model.matrix()")
        if (!identical(reorder, seq_len(ncol(data))))
            data <- data[, reorder, drop = FALSE]
    }
    int <- attr(t, "response")

    contr.identity <- contr.treatment
    formals(contr.identity)$contrasts <- FALSE

    contrasts <- contrasts
    formals(contrasts)$contrasts <- FALSE

    if (length(data)) {
        contr.funs <- c('contr.identity', 'contr.poly')
        namD <- names(data)
        for (i in namD) if (is.character(data[[i]])) {
            data[[i]] <- factor(data[[i]])
            warning(gettextf("variable '%s' converted to a factor",
                i), domain = NA)
        }
        isF <- sapply(data, function(x) is.factor(x) || is.logical(x))
        isF[int] <- FALSE
        isOF <- sapply(data, is.ordered)
        for (nn in namD[isF]) if (is.null(attr(data[[nn]], "contrasts")))
            contrasts(data[[nn]]) <- contr.funs[1 + isOF[nn]]
        #browser()
        if (!is.null(contrasts.arg) && is.list(contrasts.arg)) {
            if (is.null(namC <- names(contrasts.arg)))
                stop("invalid 'contrasts.arg' argument")
            for (nn in namC) {
                if (is.na(ni <- match(nn, namD)))
                    warning(gettextf("variable '%s' is absent, its contrast will be ignored",
                        nn), domain = NA)
                else {
                    ca <- contrasts.arg[[nn]]
                    if (is.matrix(ca))
                        contrasts(data[[ni]], ncol(ca)) <- ca
                    else contrasts(data[[ni]]) <- contrasts.arg[[nn]]
                }
            }
        }
    }
    else {
        isF <- FALSE
        data <- list(x = rep(0, nrow(data)))
    }
    ans <- .Internal(model.matrix(t, data))
    cons <- if (any(isF))
        lapply(data[isF], function(x) attr(x, "contrasts"))
    else NULL
    attr(ans, "contrasts") <- cons
    ans
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Integral Symbol

2011-05-18 Thread Uwe Ligges

See the section on writing Mathematics in Rd file in the manual Writing 
R Extensions. This will show how to produce high quality formulas in 
LaTeX generated output and ASCII versions otherwise.
If you want to provide an excellent HTML version as well, the section on 
Conditional text is also worth reading.


Uwe Ligges


On 18.05.2011 10:55, Javi Hidalgo wrote:


Dear All,

I am documenting a R package. That means writing the *.Rd files inside the \man 
folder of the package structure
I was wondering how to write the symbol for an integral function in a formula.
Similar to this one in LaTeX:

\int_{0}^{10} \Omega(t)dt

I already tried

\deqn{\int_{0}^{10} \Omega(t)dt}

but it does not work. Any idea? Which math symbols does R-help recognise?

Regards,

Javier Hidalgo Carrio

Re: [R] How to make array of regression objects

2011-05-18 Thread Ista Zahn

Hi Dmitrij,
I think the usual way is to store the results in a list:

o <- list()
o[[1]] <- lrm(...)
o[[2]] <- lrm(...)
o[[...]] <- lrm(...)

then you can access the results like o[[1]], o[[2]] ...
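
A minimal runnable sketch of the list approach follows. It uses glm() from base R instead of lrm() so that it runs without the rms package, and the data frame d and the formulas are made up for illustration.

```r
set.seed(1)
# Made-up data: a binary outcome and two predictors
d <- data.frame(y = rbinom(50, 1, 0.5), x1 = rnorm(50), x2 = rnorm(50))

# One formula per model; lapply() returns the fitted models as a list
formulas <- list(y ~ x1, y ~ x2, y ~ x1 + x2)
models <- lapply(formulas, glm, family = binomial, data = d)

# Pick a model by its number and predict from it, no eval(parse()) needed
p <- predict(models[[2]], newdata = d, type = "response")
```

Unlike a data.frame, a list keeps each fitted object intact, so predict() works on models[[i]] as it would on the original object.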

Best,
Ista

On Tue, May 17, 2011 at 11:53 PM, Dmitrij Kudriavcev
dimitrij.kudriav...@ntsg.lt wrote:
 Dear all,

 I have fitted a couple of logistic regressions that together describe the
 distribution of some event.

 Currently, I store them like this:

 o1 <- lrm(...)
 o2 <- lrm(...)
 o3 <- lrm(...)
 ...

 Then I wrote a function to pick the required regression object from these
 variables by its number:

 get_object <- function(obj_name, nModel) {
     eval(parse(text = paste("o <- ", obj_name, nModel, sep = "")))
     o
 }

 Is there a better way to do it? I have tried to store them in a matrix using
 data.frame(), but the objects are destroyed by that and the predict() function
 does not recognize them.

 Regards,
 Dmitrij Kudriavcev

-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

[R] Multidimensional List access

2011-05-18 Thread Ingo Reinhold

Dear all,

I have data organized in a way like this

Data[[CharacteristicsList1]][[CharacteristicsList2]][[CharacteristicsList3]][[CharacteristicsList4]]

where under CharacteristicsList4 a data frame is stored with columns named
V1, V2, ..., Vn.


Is there an easy way to get a vector of all the values in, let's say, V3 across
all CharacteristicsLists without the need for for-loops?

I figured there could be something like

Data[[:]][[:]][[:]][[:]][[V3]]


Many thanks,

Ingo
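
One way this could be approached is a small recursive helper that walks the nested list and collects the named column from every data frame it finds. The sketch below assumes the leaves are data frames; the helper name collect_column and the example data (nested two levels deep rather than four, for brevity) are made up.

```r
# Hypothetical helper: gather one column from all data frames in a nested list.
collect_column <- function(x, column) {
  if (is.data.frame(x)) {
    return(x[[column]])
  }
  if (is.list(x)) {
    return(unlist(lapply(x, collect_column, column = column), use.names = FALSE))
  }
  NULL  # anything that is neither a list nor a data frame is ignored
}

# Made-up example data
Data <- list(
  A = list(a1 = data.frame(V1 = 1:2, V3 = c(10, 20)),
           a2 = data.frame(V1 = 3,   V3 = 30)),
  B = list(b1 = data.frame(V1 = 4:5, V3 = c(40, 50)))
)

v3 <- collect_column(Data, "V3")
```

The recursion does not depend on the nesting depth, so the same call should work on the four-level structure described above.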

[R] R-code in R-file documentation

2011-05-18 Thread Brian Oney


Hello List,
I would like to insert code from .r files into a LaTeX appendix 
(possibly using Sweave).

I was considering:

<<results=tex,eval=true,echo=true>>=
source("file.r")
@
but I would just like to echo the code and not evaluate the code within
the file.

maybe:
<<results=tex,eval=true,echo=false>>=
cat("\\begin{verbatim}")
readLines("file.r")
cat("\\end{verbatim}")
@

The above works well other than the line numbers which are included 
(which isn't so bad).
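
The "line numbers" are probably the [1], [2], ... index prefixes that R prints when the character vector returned by readLines() is auto-printed. A sketch that writes the raw lines with cat() instead is shown below; the function name echo_file_verbatim and the temporary file are made up for illustration.

```r
# Hypothetical helper: write a file's contents inside a LaTeX verbatim block.
echo_file_verbatim <- function(path) {
  cat("\\begin{verbatim}\n")
  cat(readLines(path), sep = "\n")   # raw lines, no [1] prefixes or quotes
  cat("\n\\end{verbatim}\n")
}

# Small demonstration with a temporary file
tmp <- tempfile(fileext = ".r")
writeLines(c("x <- 1", "y <- x + 1"), tmp)
out <- capture.output(echo_file_verbatim(tmp))
```

Inside a Sweave chunk with results=tex and echo=false, calling the helper on the .r file should emit the code without evaluating it.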


Thanks for the help and ideas!
Brian

[R] [R-pkgs] new package SamplingStrata

2011-05-18 Thread Giulio Barcaroli

Dear R users,

I would like to announce that a new package, SamplingStrata (version 0.9), for
the optimal stratification of sampling frames is now available on CRAN.

This package offers an approach for determining the best stratification of a
sampling frame: the one that ensures the minimum sample size while satisfying
precision constraints in a multivariate and multidomain case. The approach is
based on a genetic algorithm: each solution (i.e. a particular partition of the
sampling frame into strata) is considered as an individual in a population to
be evolved; the fitness of all individuals is evaluated by calculating (using
the Bethel-Chromy algorithm) the sample size satisfying accuracy constraints on
the target estimates.

The package covers all the phases, from the optimisation of the sampling frame, 
up to the design of the stratified sample, ending with the selection of the 
units.

In the tar.gz (directory: \inst\doc) it is possible to find a vignette 
('SamplingStrataVignette.pdf') showing a complete application, from the 
optimisation of the sampling frame to the selection of the required sample.

I would appreciate any feedback

Sincerely,

Giulio Barcaroli

-- 
Giulio Barcaroli
Methods, Tools and Methodological Support
Italian National Institute of Statistics
barca...@istat.it

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

[R] Query Gene ontology

2011-05-18 Thread LEMAITRE Hervé Université Paris Sud

Dear R-users,

 

I'm looking for a way to query the gene ontology in R like in the GO browser 
(AmiGO). I tried different packages (NCBI2R, GOsim ...) but I did not find the 
way to extract genes names associated to a GO term. Could you tell me if there 
is a way to do that?

 

Thanks,

 

Hervé

 

`·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´

Hervé Lemaître

U1000 Imagerie et Psychiatrie

INSERM - CEA - Faculté de Médecine Paris Sud 11

Service Hospitalier Frédéric Joliot

4, Place du Général Leclerc

91401 ORSAY, FRANCE

Tél:  (+33) 1 69 86 77 84

Fax: (+33) 1 69 86 78 10

`·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´´¯``·.¸¸.·´

 


Re: [R] text mining analysis and word visualization of pdfs

2011-05-18 Thread Ashim Kapoor

On Wed, May 18, 2011 at 1:44 PM, Karl Ove Hufthammer k...@huftis.org wrote:

 Ajay Ohri wrote:

  What is the appropriate software package for dumping say 20 PDFS in a
  folder, then creating data visualization with frequency counts of
  certain words as well as measure correlation within each file for
  certain key relationships or key words.

 pdftotext + Unix for Poets + R (ggplot2)

 What about the tm package ? I am a beginner and I don't know much about
this but I recall that it does have the ability to handle PDF's. A few words
from the experts would be nice.

Re: [R] Query Gene ontology

2011-05-18 Thread David Winsemius

You will find a wider and more experienced audience for this question  
on the Bioconductor mailing list.


--
David

On May 18, 2011, at 5:14 AM, LEMAITRE Hervé Université Paris Sud wrote:


Dear R-users,



I'm looking for a way to query the gene ontology in R like in the GO  
browser (AmiGO). I tried different packages (NCBI2R, GOsim ...) but  
I did not find the way to extract genes names associated to a GO  
term. Could you tell me if there is a way to do that?





David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Re: [R] Query Gene ontology

2011-05-18 Thread Ben Bolker

LEMAITRE Hervé Université Paris Sud herve.lemaitre at cea.fr writes:

 I'm looking for a way to query the gene ontology in R 
 like in the GO browser (AmiGO). I tried different
 packages (NCBI2R, GOsim ...) but I did not find the way 
 to extract genes names associated to a GO term. Could
 you tell me if there is a way to do that?

  You will probably have better luck posting this question on
the bioconductor mailing list (read the posting guide, and
search the list archives, first ...)

  Ben Bolker

Re: [R] R-code in R-file documentation

2011-05-18 Thread Ben Bolker

Brian Oney zenlines at gmail.com writes:

 
 Hello List,
 I would like to insert code from .r files into a LaTeX appendix 
 (possibly using Sweave).
 I was considering:
 

 [snip]

  Wouldn't it be easier to use the LaTeX listings package?
https://stat.ethz.ch/pipermail/r-help/2006-September/113688.html
https://stat.ethz.ch/pipermail/r-help/2006-September/113103.html

Re: [R] Need expert help with model.matrix

2011-05-18 Thread Gabor Grothendieck

On Wed, May 18, 2011 at 7:15 AM, Axel Urbiz axel.ur...@gmail.com wrote:
 Dear experts:

 Is it possible to create a new function based
 on stats:::model.matrix.default so that an alternative factor coding is used
 when the function is called instead of the default factor coding?

 Basically, I'd like to reproduce the results in 'mat' below, without having
 to explicitly specify my desired factor coding (identity matrices) in the
 'contrasts.arg'.

 dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
 ca <- contrasts(dd$a, contrasts = FALSE)  # 3 x 3 identity matrix
 cb <- contrasts(dd$b, contrasts = FALSE)  # 4 x 4 identity matrix
 mat <- model.matrix(~ a + b, dd, contrasts.arg = list(a = ca, b = cb))

 My approach was to modify the code in model.matrix by explicitly setting the
 contrasts argument in the contr.identity and contrasts function to FALSE.
 This is shown at the bottom of the email in the function model.matrix2:

 contr.identity <- contr.treatment
 formals(contr.identity)$contrasts <- FALSE

 contrasts <- contrasts
 formals(contrasts)$contrasts <- FALSE

 However, I believe this function is using contrasts = TRUE, as it doesn't
 return the identity contrasts:
 mat2 <- model.matrix2(~ a + b, dd)

 Any help here is much appreciated.
 Axel.


If your objective in all this is ultimately to get lm coefficients in
the original coding then see ?dummy.coef
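
A short sketch of what dummy.coef() returns, assuming the default treatment contrasts; the response y is random noise added only so that lm() has something to fit.

```r
set.seed(42)
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))
dd$y <- rnorm(12)   # made-up response, for illustration only

fit <- lm(y ~ a + b, data = dd)

# One coefficient per factor level, including the baseline level (which is 0
# under treatment contrasts)
dc <- dummy.coef(fit)
```

Here dc$a has one entry per level of a and dc$b one per level of b, so the coefficients can be read in the original coding without rebuilding the model matrix.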

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Re: [R] Question on approximations of full logistic regression model

2011-05-18 Thread khosoda

Thank you for your advice, Tim.

I am reading your paper and other materials in your website.
I could not find R package of your bootknife method. Is there any R
package for this procedure?

(11/05/17 14:13), Tim Hesterberg wrote:
 My usual rule is that whatever gives the widest confidence intervals
 in a particular problem is most accurate for that problem :-)
 
 Bootstrap percentile intervals tend to be too narrow.
 Consider the case of the sample mean; the usual formula CI is
  xbar +- t_alpha sqrt( (1/(n-1)) sum((x_i - xbar)^2)) / sqrt(n)
 The bootstrap percentile interval for symmetric data is roughly
  xbar +- z_alpha sqrt( (1/(n  )) sum((x_i - xbar)^2)) / sqrt(n)
 It is narrower than the formula CI because
* z quantiles rather than t quantiles
* standard error uses divisor of n rather than (n-1)
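
 The two effects listed above can be combined into a single ratio of interval half-widths, (z_alpha / t_alpha) * sqrt((n-1)/n). The sketch below computes that ratio for a few sample sizes; the function name narrowness_factor is made up, and the ratio only describes the symmetric sample-mean case discussed here.

```r
# Hypothetical helper: ratio of the bootstrap percentile half-width to the
# t-interval half-width for the sample mean with symmetric data.
narrowness_factor <- function(n, conf = 0.95) {
  alpha <- (1 - conf) / 2
  z <- qnorm(1 - alpha)            # normal quantile used implicitly by the percentile interval
  t <- qt(1 - alpha, df = n - 1)   # t quantile used by the formula interval
  (z / t) * sqrt((n - 1) / n)      # divisor n versus divisor n - 1
}

factors <- sapply(c(5, 10, 30, 100), narrowness_factor)
```

 The ratio is below 1 for every n and approaches 1 as n grows, which is why the narrowness matters most for small samples or small strata.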
 
 In stratified sampling, the narrowness factor depends on the
 stratum sizes, not the overall n.
 In regression, estimates for some quantities may be based on a small
 subset of the data (e.g. coefficients related to rare factor levels).
 
 This doesn't mean we should give up on the bootstrap.
 There are remedies for the bootstrap biases, see e.g.
 Hesterberg, Tim C. (2004), Unbiasing the Bootstrap-Bootknife Sampling
 vs. Smoothing, Proceedings of the Section on Statistics and the
 Environment, American Statistical Association, 2924-2930.
 http://home.comcast.net/~timhesterberg/articles/JSM04-bootknife.pdf
 
 And other methods have their own biases, particularly in nonlinear
 applications such as logistic regression.
 
 Tim Hesterberg
 
 Thank you for your reply, Prof. Harrell.

 I agree with you. Dropping only one variable does not actually help a lot.

 I have one more question.
 During analysis of this model I found that the confidence
 intervals (CIs) of some coefficients provided by bootstrapping (bootcov
 function in rms package) was narrower than CIs provided by usual
 variance-covariance matrix and CIs of other coefficients wider.  My data
 has no cluster structure. I am wondering which CIs are better.
 I guess bootstrapping one, but is it right?

 I would appreciate your help in advance.
 --
 KH



 (11/05/16 12:25), Frank Harrell wrote:
 I think you are doing this correctly except for one thing.  The validation
 and other inferential calculations should be done on the full model.  Use
 the approximate model to get a simpler nomogram but not to get standard
 errors.  With only dropping one variable you might consider just running the
 nomogram on the entire model.
 Frank


 KH wrote:

 Hi,
 I am trying to construct a logistic regression model from my data (104
 patients and 25 events). I built a full model consisting of five
 predictors, using penalization from the rms package (lrm, pentrace,
 etc.) because of the events-per-variable issue. Then I tried to approximate
 the full model by a step-down technique, predicting L from all of the
 component variables using ordinary least squares (ols in the rms package), as
 follows. I would like to know whether I am doing this right.

 library(rms)
 plogit <- predict(full.model)
 full.ols <- ols(plogit ~ stenosis+x1+x2+ClinicalScore+procedure, sigma=1)
 fastbw(full.ols, aics=1e10)

 Deleted       Chi-Sq d.f. P      Residual d.f. P      AIC    R2
 stenosis       1.41  1    0.2354   1.41   1    0.2354  -0.59 0.991
 x2            16.78  1    0.0000  18.19   2    0.0001  14.19 0.882
 procedure     26.12  1    0.0000  44.31   3    0.0000  38.31 0.711
 ClinicalScore 25.75  1    0.0000  70.06   4    0.0000  62.06 0.544
 x1            83.42  1    0.0000 153.49   5    0.0000 143.49 0.000

 Then I fitted an approximation to the full model using the most important
 variables (until the R^2 for predictions from the reduced model against the
 original Y drops below 0.95), that is, dropping stenosis.

 full.ols.approx <- ols(plogit ~ x1+x2+ClinicalScore+procedure)
 full.ols.approx$stats
           n  Model L.R.        d.f.          R2           g       Sigma
 104.0000000 487.9006640   4.0000000   0.9908257   1.3341718   0.1192622

 This approximate model had R^2 against the full model of 0.99.
 Therefore, I updated the original full logistic model dropping
 stenosis as predictor.

 full.approx.lrm <- update(full.model, ~ . -stenosis)

 validate(full.model, bw=F, B=1000)
           index.orig training   test optimism index.corrected    n
 Dxy           0.6425   0.7017 0.6131   0.0887          0.5539 1000
 R2            0.3270   0.3716 0.3335   0.0382          0.2888 1000
 Intercept     0.0000   0.0000 0.0821  -0.0821          0.0821 1000
 Slope         1.0000   1.0000 1.0548  -0.0548          1.0548 1000
 Emax          0.0000   0.0000 0.0263   0.0263          0.0263 1000

 validate(full.approx.lrm, bw=F, B=1000)
           index.orig training   test optimism index.corrected    n
 Dxy           0.6446   0.6891 0.6265   0.0626          0.5820 1000
 R2            0.3245   0.3592 0.3428   0.0164          0.3081 1000
 Intercept     0.0000   0.0000

Re: [R] Smooth contour of a map

2011-05-18 Thread Pierre Bruyer

I've practically solved my problem (the code is below), but one last thing
is not perfect:
when I call plot() before polygon(), there is a margin between my raster and
the window. I think it comes from the axes of plot(), but I have not found how
to remove it. Does anyone have a solution?

Pierre Bruyer

##smooth contour

contours <- contourLines(V2b, levels = paliers)


par(mar = c(0,0,0,0))
plot(1, col = "white", main = "polygon()", asp = 1, axes = FALSE, ann = FALSE,
     xlim = c(0,1), ylim = c(0,1), type = "n", method = c("image"))
for (i in seq_along(contours)) {
    x <- contours[[i]]$x
    y <- contours[[i]]$y
    c <- contours[[i]]$level
    j <- 1
    tmp <- 0
    while (j < length(level[,1]) && tmp == 0) {
        if (level[j,1] == c) {
            tmp <- j
        }
        j <- j + 1
    }

    polygon(spline(seq_along(x), x)$y, spline(seq_along(y), y)$y,
            col = colgraph[tmp+1], border = NA)
}
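
As a side note on the loop above, the search for the row whose first column equals the contour level could be sketched with match(), which also checks the last row. The name level_index is made up, the level values are the ones listed in the level matrix quoted later in this message, and the comparison is an exact equality test just as in the loop.

```r
# Contour levels as given in the first column of the level matrix
level_values <- c(1, 3, 5, 10, 15, 20, 30, 40, 50, 75)

# Hypothetical helper: index of the matching level, or 0 when there is none
level_index <- function(value, levels) {
  match(value, levels, nomatch = 0L)
}

idx_found   <- level_index(5, level_values)
idx_last    <- level_index(75, level_values)
idx_missing <- level_index(2, level_values)
```

The returned index plays the role of tmp, so colgraph[level_index(c, level[,1]) + 1] would select the fill colour.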



Le 17 mai 2011 à 16:44, Pierre Bruyer a écrit :

 The result is good, thanks a lot, but how can I with this method fill my 
 raster to color?
 
 Le 17 mai 2011 à 15:43, Duncan Murdoch a écrit :
 
 I don't think filled.contour gives you access to the contour lines.  If you 
 use contourLines() to compute them, then you can draw them using code like 
 this:
 
 contours <- contourLines(V2b, levels = paliers)
 for (i in seq_along(contours)) {
     x <- contours[[i]]$x
     y <- contours[[i]]$y
     lines(spline(seq_along(x), x)$y, spline(seq_along(y), y)$y)
 }
 
 but as I said, you won't get great results.  A better way is to use a finer 
 grid, e.g. by fitting a smooth surface to your set of points and using 
 predictions from the model to interpolate.
 
 Duncan Murdoch
 
 
 On 17/05/2011 9:35 AM, Pierre Bruyer wrote:
 I work with large datasets (1 points) so I can't post them , but my 
 function is :
 
 create_map <- function(grd, level, map_output, format = c("jpeg"),
                        width_map = 150, height_map = 150, ...)
 {
     ##sp <- spline(x = grd[,1], y = grd[,2])

     grd2 <- matrix(grd[,3], nrow = sqrt(length(grd[,3])),
                    ncol = sqrt(length(grd[,3])), byrow = FALSE)

     V2b <- grd2

     ##creation of breaks for colors
     i <- 1
     paliers <- c(-1.0E300)
     while (i <= length(level[,1]))
     {
         paliers <- c(paliers, level[i,1])
         i <- i + 1
     }
     paliers <- c(paliers, 1.0E300)

     ##scale color creation
     i <- 1
     colgraph <- c(rgb(255,255,255, maxColorValue = 255))
     while (i <= length(level[,2]))
     {
         colgraph <- c(colgraph, rgb(level[i,2], level[i,3], level[i,4],
                                     maxColorValue = 255))
         i <- i + 1
     }

     ##user can choose the output format (default is jpeg)
     switch(format,
         png = png(map_output, width = width_map, height = height_map),
         jpeg = jpeg(map_output, width = width_map, height = height_map,
                     quality = 100),
         bmp = bmp(map_output, width = width_map, height = height_map),
         tiff = tiff(map_output, width = width_map, height = height_map),
         jpeg(map_output, width = width_map, height = height_map))

     ## drawing map

     ##remove margins
     par(mar = c(0,0,0,0))
     filled.contour(V2b, col = colgraph, levels = paliers, asp = 1,
                    axes = FALSE, ann = FALSE)
     dev.off()

 }
 
 where grd is a xyz data frame,
 map_output is the path+name of the output image file,
 and level is a matrix like this :
 
 
 level <- matrix(0,10,4)
 level[1,1] <- 1.0000E+00
 level[2,1] <- 3.0000E+00
 level[3,1] <- 5.0000E+00
 level[4,1] <- 1.0000E+01
 level[5,1] <- 1.5000E+01
 level[6,1] <- 2.0000E+01
 level[7,1] <- 3.0000E+01
 level[8,1] <- 4.0000E+01
 level[9,1] <- 5.0000E+01
 level[10,1] <- 7.5000E+01


 level[1,2] <- 102
 level[2,2] <- 102
 level[3,2] <- 102
 level[4,2] <- 93
 level[5,2] <- 204
 level[6,2] <- 248
 level[7,2] <- 241
 level[8,2] <- 239
 level[9,2] <- 224
 level[10,2] <- 153

 level[1,3] <- 153
 level[2,3] <- 204
 level[3,3] <- 204
 level[4,3] <- 241
 level[5,3] <- 255
 level[6,3] <- 243
 level[7,3] <- 189
 level[8,3] <- 126
 level[9,3] <- 14
 level[10,3] <- 0

 level[1,4] <- 153
 level[2,4] <- 204
 level[3,4] <- 153
 level[4,4] <- 107
 level[5,4] <- 102
 level[6,4] <- 33
 level[7,4] <- 59
 level[8,4] <- 63
 level[9,4] <- 14
 level[10,4] <- 51
 
 Le 17 mai 2011 à 15:17, Duncan Murdoch a écrit :
 
 On 17/05/2011 8:24 AM, Pierre Bruyer wrote:
 Thank you for your answer, but the function spline() (and a lot of other 
 function in R)  can't take in its parameters the original contour which 
 are define by a vector, i.e. :
 
 
 If you post some reproducible code to generate the contours, someone will 
 show you how to use splines to interpolate them.
 
 Duncan Murdoch
 
   ##creation of breaks for colors
   i <- 1
   paliers <- c(-1.0E300)
   while (i <= length(level[,1]))
   {

Re: [R] Integral Symbol

2011-05-18 Thread Javi Hidalgo


Thanks.
I was reading exactly that part of the manual, Writing R Extensions, section
Mathematics, which describes basic LaTeX-style support.
However, it seems that it does not support the LaTeX integral symbol \int,
although it does support, e.g., the summation symbol \sum.

Has anyone had this experience when documenting R packages?
Does anyone know of an R package where the integral symbol appears in the help
files?

Regards,

Javier Hidalgo Carrio


 Date: Wed, 18 May 2011 13:14:54 +0200
 From: lig...@statistik.tu-dortmund.de
 To: havyhida...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R]  Integral Symbol
 
 See the section on writing Mathematics in Rd file in the manual Writing 
 R Extensions. This will show how to produce high quality formulas in 
 LaTeX generated output and ASCII versions otherwise.
 If you want to provide an excellent HTML version as well, the section on 
 Conditional text is also worth reading.
 
 Uwe Ligges
 
 
 On 18.05.2011 10:55, Javi Hidalgo wrote:
 
  Dear All,
 
  I am documenting a R package. That means writing the *.Rd files inside the 
  \man folder of the package structure
  I was wondering how to write the symbol for an integral function in a 
  formula.
  Similar to this one in LaTeX:
 
  \int_{0}^{10} \Omega(t)dt
 
  I already tried
 
  \deqn{\int_{0}^{10} \Omega(t)dt}
 
  but it does not work. Any idea? Which math symbols does R-help recognise?
 
  Regards,
 
  Javier Hidalgo Carrio
  
Re: [R] How to make array of regression objects

2011-05-18 Thread Dmitrij Kudriavcev

Thank you, Ista, that is exactly what I was looking for :)

Regards,
Dmitrij Kudriavcev

Re: [R] subsetting a list of dataframes

2011-05-18 Thread David Winsemius



On May 17, 2011, at 7:13 PM, Lara Poplarski wrote:

Thank you all, this is exactly what I had in mind, except that I still have
to get my head around apply et al. Back to the books for me then!



Read the lapply( ...)  call as:

For every element in the object named `data`, send that element to a  
function that returns TRUE if its first dimension is greater than one,  
returns FALSE if its first dimension is one, and return nothing  
(actually a vector with zero elements) if it doesn't have a (first)  
dim attribute, and finally return the ordered collection of those  
values as a list which is assigned the name 'entries.with.nrows'. 
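
A small runnable sketch of the same idea follows. It uses vapply() so that the TRUE/FALSE values come back as a logical vector that can index the list directly (lapply() returns a list, which cannot be used as a subscript), and it shows Filter() as an equivalent one-liner. The example list is made up.

```r
# Made-up list of data frames with 1, 3 and 0 rows
data <- list(one   = data.frame(x = 1),
             many  = data.frame(x = 1:3),
             empty = data.frame(x = numeric(0)))

# Logical vector: TRUE for data frames with more than one row
keep <- vapply(data, function(d) nrow(d) > 1, logical(1))
data.reduced <- data[keep]

# Equivalent shortcut from base R
same <- Filter(function(d) nrow(d) > 1, data)
```

Both forms keep the names of the list elements, so the surviving data frames can still be looked up by name.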




Lara

On Tue, May 17, 2011 at 2:41 PM, Jannis bt_jan...@yahoo.de wrote:


Have a look at lapply(). Something like:

entries.with.nrows=lapply(data,function(x)dim(x)[1]>1)

should give you a vector with the elements of the list that you  
seek marked

with TRUE.

This vector can then be used to extract a subset from your list by:

data.reduced=data[entries.with.nrows]

Or similar


HTH
Jannis

--- Lara Poplarski larapoplar...@gmail.com schrieb am Di,  
17.5.2011:



Von: Lara Poplarski larapoplar...@gmail.com
Betreff: [R] subsetting a list of dataframes
An: r-help@r-project.org
Datum: Dienstag, 17. Mai, 2011 20:24 Uhr
Hello All,

I have a list of dataframes, and I need to subset it by keeping only those
dataframes in the list that meet a certain criterion. Specifically, I need
to generate a second list which only includes those dataframes whose number
of rows is > 1.

Could someone suggest how to do this? I have come close to what I need with
loops and such, but there must be a less clumsy way...

Many thanks,
Lara

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Re: [R] Smooth contour of a map

2011-05-18 Thread David Winsemius

You may be looking for the par settings xaxs="i" and yaxs="i", which,
if you add them to the plot call, will prevent the regular behavior
of adding 4% padding to the axis widths.


?par
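
A quick way to see the effect is to compare par("usr"), the extent of the plot region in user coordinates, with and without these settings. The sketch below draws on a null PDF device so that it needs no display; it omits asp = 1, since a fixed aspect ratio can widen the region again on a non-square device.

```r
# Open a null PDF device so the sketch runs without a display.
pdf(NULL)
op <- par(mar = c(0, 0, 0, 0))

# Default axis style ("r"): the plot region is padded by 4% on each side.
plot(NA, xlim = c(0, 1), ylim = c(0, 1), type = "n", axes = FALSE, ann = FALSE)
usr_padded <- par("usr")

# Internal axis style ("i"): the plot region matches xlim/ylim exactly.
plot(NA, xlim = c(0, 1), ylim = c(0, 1), type = "n", axes = FALSE, ann = FALSE,
     xaxs = "i", yaxs = "i")
usr_tight <- par("usr")

par(op)
invisible(dev.off())
```

With mar set to zero and the "i" style, polygons drawn in the 0 to 1 range should reach the edge of the device.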

-- David.

On May 18, 2011, at 8:27 AM, Pierre Bruyer wrote:

I've practically solved my problem (the code is below), but one last
thing is not perfect: when I call plot() before polygon(), there is a
margin between my raster and the window. I think it comes from the axes
of plot(), but I have not found how to remove it. Does anyone have a
solution?


Pierre Bruyer

##smooth contour

contours - contourLines(V2b,levels=paliers)


par(mar=c(0,0,0,0))
	plot(1,col=white,main=polygon(), asp = 1, axes = FALSE, ann =  
FALSE,xlim=c(0,1), ylim = c(0,1),type = n, method = c(image))

for (i in seq_along(contours)) {
 x - contours[[i]]$x
 y - contours[[i]]$y
 c - contours[[i]]$level
 j - 1
 tmp - 0
 while(j  length(level[,1])  tmp == 0){
if(level[j,1] == c){
tmp - j
}
j - j+1
}   

		polygon( spline( seq_along(x), x)$y, spline( seq_along(y), y) 
$y ,col = colgraph[tmp+1], border = NA)

}



Le 17 mai 2011 à 16:44, Pierre Bruyer a écrit :

The result is good, thanks a lot, but how can I with this method  
fill my raster to color?


Le 17 mai 2011 à 15:43, Duncan Murdoch a écrit :

I don't think filled.contour gives you access to the contour  
lines.  If you use contourLines() to compute them, then you can  
draw them using code like this:


contours - contourLines(V2b,levels=paliers)
for (i in seq_along(contours)) {
x - contours[[i]]$x
y - contours[[i]]$y
lines( splines( seq_along(x), x)$y, splines( seq_along(y), y)$y )
}

but as I said, you won't get great results.  A better way is to  
use a finer grid, e.g. by fitting a smooth surface to your set of  
points and using predictions from the model to interpolate.


Duncan Murdoch


On 17/05/2011 9:35 AM, Pierre Bruyer wrote:
I work with large datasets (1 points) so I can't post them ,  
but my function is :


create_map <- function(grd, level, map_output, format = c("jpeg"),
                       width_map = 150, height_map = 150, ...)
{
    ## sp <- spline(x = grd[,1], y = grd[,2])

    grd2 <- matrix(grd[,3], nrow = sqrt(length(grd[,3])),
                   ncol = sqrt(length(grd[,3])), byrow = FALSE)

    V2b <- grd2

    ## creation of breaks for colors
    i <- 1
    paliers <- c(-1.0E300)
    while (i <= length(level[,1]))
    {
        paliers <- c(paliers, level[i,1])
        i <- i + 1
    }
    paliers <- c(paliers, 1.0E300)

    ## scale color creation
    i <- 1
    colgraph <- c(rgb(255, 255, 255, maxColorValue = 255))
    while (i <= length(level[,2]))
    {
        colgraph <- c(colgraph, rgb(level[i,2], level[i,3], level[i,4],
                                    maxColorValue = 255))
        i <- i + 1
    }

    ## user can choose the output format (default is jpeg)
    switch(format,
        png  = png(map_output, width = width_map, height = height_map),
        jpeg = jpeg(map_output, width = width_map, height = height_map,
                    quality = 100),
        bmp  = bmp(map_output, width = width_map, height = height_map),
        tiff = tiff(map_output, width = width_map, height = height_map),
        jpeg(map_output, width = width_map, height = height_map))

    ## drawing map

    ## remove the margins
    par(mar = c(0, 0, 0, 0))
    filled.contour(V2b, col = colgraph, levels = paliers, asp = 1,
                   axes = FALSE, ann = FALSE)

    dev.off()
}

where grd is a xyz data frame,
map_output is the path+name of the output image file,
and level is a matrix like this :


level <- matrix(0, 10, 4)
## column 1: contour break values
level[, 1] <- c(1, 3, 5, 10, 15, 20, 30, 40, 50, 75)
## columns 2-4: red, green and blue components (0-255) for each break
level[, 2] <- c(102, 102, 102, 93, 204, 248, 241, 239, 224, 153)
level[, 3] <- c(153, 204, 204, 241, 255, 243, 189, 126, 14, 0)
level[, 4] <- c(153, 204, 153, 107, 102, 33, 59, 63, 14, 51)

On 17 May 2011, at 15:17, Duncan Murdoch wrote:


On 17/05/2011 8:24 AM, Pierre Bruyer wrote:
Thank you for your answer, but the function spline() (and a lot  
of other functions in R) can't take as its parameters the  
original contours, which are defined by a vector, i.e.:




If you post some reproducible code to

[R] matrix help (first occurrence of variable in column)

2011-05-18 Thread Michael Denslow

Dear R help,
Apologies for the less than informative subject line. I will do my
best to describe my problem.

Consider the following matrix:

mdat <- matrix(c(1,0,1,1,1,0), nrow = 2, ncol=3, byrow=TRUE,
   dimnames = list(c("T1", "T2"),
   c("sp.1", "sp.2", "sp.3")))

mdat

In my actual data I have time (rows) and species occurrences (0/1
values, columns). I want to count the number of new species that occur
at a given time sample. For the matrix above the answer would be 1.

Is there a simple way to figure out if the species has never occurred
before and then sum them up?

Thanks in advance,
Michael

-- 
Michael Denslow

I.W. Carpenter Jr. Herbarium [BOON]
Department of Biology
Appalachian State University
Boone, North Carolina U.S.A.
-- AND --
Communications Manager
Southeast Regional Network of Expertise and Collections
sernec.org

36.214177, -81.681480 +/- 3103 meters

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
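
One possible way to approach the question above, offered as a sketch rather than a reply from the thread: for each species, find the first row in which it occurs, then tabulate those first occurrences by time sample. The names first_seen and new_species are made up for this illustration, and the code assumes the rows are already in time order.

```r
mdat <- matrix(c(1,0,1,1,1,0), nrow = 2, ncol = 3, byrow = TRUE,
               dimnames = list(c("T1", "T2"), c("sp.1", "sp.2", "sp.3")))

# Row index of the first occurrence of each species (NA if it never occurs).
first_seen <- apply(mdat, 2, function(occ) which(occ > 0)[1])

# Number of species seen for the first time in each time sample.
new_species <- table(factor(first_seen,
                            levels = seq_len(nrow(mdat)),
                            labels = rownames(mdat)))
new_species
```

For the example matrix this counts one new species at T2, which agrees with the answer expected in the post.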

[R] Change pattern in histograms in ggplot2

2011-05-18 Thread Christopher Desjardins

Hi,
I am wondering if there is a way to change the pattern of the fill in
histogram in ggplot2? By default the fill is solid and I'd like to add some
sort of pattern to make it more visible that these are different levels of a
factor.

Thanks!
Chris


Re: [R] Integral Symbol

2011-05-18 Thread Duncan Murdoch


On 18/05/2011 9:09 AM, Javi Hidalgo wrote:

Thanks.
I was just reading the manual "Writing R Extensions", in the section on 
Mathematics, which describes the basic LaTeX-style support.
However, it seems that it does not support the LaTeX integral symbol \int, 
although it does support, e.g., the summation symbol \sum.

Has anyone had this experience when documenting R packages?


It appears in the topic shown by ?Special, "Special Functions of 
Mathematics", but only in the LaTeX version.  You can see that in the 
PDF version of the Reference Manual, or (if you have things set up 
correctly), by saying


options(help_type="pdf")
?Special

I don't know of any examples where someone has shown an integral sign in 
text or html versions.  It's not a symbol supported by the R help 
system; it would depend on hand coding the right thing.


Duncan Murdoch
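
As a rough illustration of the point above, the two-argument form of \deqn lets an Rd file give LaTeX for the PDF manual and a plain-text fallback for the other help formats. The sketch below writes a small Rd file (the topic name omega_integral and the fallback wording are invented for this example) and renders it with the tools package to show which argument ends up where. It is only a sketch of how the pieces fit, not a statement of what the help system will support.

```r
# A made-up, minimal Rd page; only the \deqn line matters here.
rd_text <- c(
  "\\name{omega_integral}",
  "\\alias{omega_integral}",
  "\\title{Hypothetical help page with an integral}",
  "\\description{",
  "  The quantity of interest is",
  "  \\deqn{\\int_{0}^{10} \\Omega(t) dt}{integral from 0 to 10 of Omega(t) dt}",
  "}"
)
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_text, rd_file)

rd <- tools::parse_Rd(rd_file)

# Text help uses the second (plain-text) argument of \deqn ...
txt <- capture.output(tools::Rd2txt(rd))
# ... while the LaTeX conversion keeps the first argument, including \int.
latex <- capture.output(tools::Rd2latex(rd))

grep("integral", txt, value = TRUE)
grep("int_", latex, value = TRUE, fixed = TRUE)
```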



Does anyone know of any R package where the integral symbol appears in the 
help files?

Regards,

Javier Hidalgo Carrio


  Date: Wed, 18 May 2011 13:14:54 +0200
  From: lig...@statistik.tu-dortmund.de
  To: havyhida...@hotmail.com
  CC: r-help@r-project.org
  Subject: Re: [R]  Integral Symbol

  See the section on writing "Mathematics" in Rd files in the manual "Writing
  R Extensions". This will show how to produce high quality formulas in
  LaTeX generated output and ASCII versions otherwise.
  If you want to provide an excellent HTML version as well, the section on
  "Conditional text" is also worth reading.

  Uwe Ligges


  On 18.05.2011 10:55, Javi Hidalgo wrote:
  
Dear All,
  
I am documenting an R package. That means writing the *.Rd files inside 
the man folder of the package structure.
I was wondering how to write the symbol for an integral in a 
formula.
Similar to this one in LaTeX:
  
\int_{0}^{10} \Omega(t)dt
  
I already tried
  
\deqn{\int_{0}^{10} \Omega(t)dt}
  
but it does not work. Any idea? Which math symbols does R-help recognise?
  
Regards,
  
Javier Hidalgo Carrio


[R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

2011-05-18 Thread Bert Gunter

Thanks Bill. Do you and others think that a link to this guide (or
another) should be included in the Posting Guide and/or R FAQ?

-- Bert

On Tue, May 17, 2011 at 4:07 PM,  bill.venab...@csiro.au wrote:
 Amen to all of that, Bert.  Nicely put.  The Google style guide (not perfect, 
 but a thoughtful contribution on these kinds of issues) has avoiding attach() 
 as its very first line.  See 
 http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html

 I would add, though, that not enough people seem yet to be aware of 
 within(...), a companion of with(...) in a way, but used for modifying data 
 frames or other kinds of list objects.  It should be seen as a more flexible 
 replacement for transform() (well, almost).

 The difference between with() and within() is as follows:

 with(data, expr, ...)

 allows you to evaluate 'expr' with 'data' providing the primary source for 
 variables, and returns *the evaluated expression* as the result.  By contrast

 within(data, expr, ...)

 again uses 'data' as the primary source for variables when evaluating 'expr', 
 but now 'expr' is used to modify the variables in 'data' and returns *the 
 modified data set* as the result.

 I use this a lot in the data preparation phase of a project, especially, 
 which is usually the longest, trickiest, most important, but least discussed 
 aspect of any data analysis project.

 Here is a simple example using within() for something you cannot do in one 
 step with transform():

 polyData <- within(data.frame(x = runif(500)), {
  x2 <- x^2
  x3 <- x*x2
  b <- runif(4)
  eta <- cbind(1,x,x2,x3) %*% b
  y <- eta + rnorm(x, sd = 0.5)
  rm(b)
 })

 check:

 str(polyData)
 'data.frame':   500 obs. of  5 variables:
  $ x  : num  0.5185 0.185 0.5566 0.2467 0.0178 ...
  $ y  : num [1:500, 1] 1.343 0.888 0.583 0.187 0.855 ...
  $ eta: num [1:500, 1] 1.258 0.788 1.331 0.856 0.63 ...
  $ x3 : num  1.39e-01 6.33e-03 1.72e-01 1.50e-02 5.60e-06 ...
  $ x2 : num  0.268811 0.034224 0.309802 0.060844 0.000315 ...


 Bill Venables.

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of Bert Gunter
 Sent: Wednesday, 18 May 2011 12:08 AM
 To: Peter Ehlers
 Cc: R list
 Subject: Re: [R] Post-hoc tests in MASS using glm.nb

 Folks:

 Only if the user hasn't yet been introduced to the with() function,
 which is linked to on the ?attach page.

 Note also this sentence from the ?attach page:
   "attach can lead to confusion."

 I can't remember the last time I needed attach().

 Peter Ehlers

 Yes. But perhaps it might be useful to flesh this out with a bit of
 commentary. To this end, I invite others to correct or clarify the
 following.

 The potential confusion comes from requiring R to search for the
 data. There is a rigorous process by which this is done, of course,
 but it requires that the runtime environment be consistent with that
 process, and the programmer who wrote the code may not have control
 over that environment. The usual example is that one has an object
 named, say, "a" in the formula and in the attached data and another
 "a" also in the global environment. Then the wrong "a" would be found.
 The same thing can happen if another data set gets attached in a
 position before the one of interest. (Like Peter, I haven't used
 attach() in so long that I don't know whether any warning messages are
 issued in such cases).

 Using the data =  argument when available or the with() function
 when not avoids this potential confusion and tightly couples the data
 to be analyzed with the analysis.

 I hope this clarifies the previous posters' comments.

 Cheers,
 Bert


 [... non-germane material snipped ...]

-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
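
The masking problem described in the quoted discussion can be seen in a few lines. This is only an illustrative sketch with made-up objects (dat, a): a variable with the same name as a column is found before the attached data, whereas with() looks in the data first.

```r
dat <- data.frame(a = c(10, 20, 30))
a <- c(1, 2, 3)                   # an unrelated object that happens to share the name

attach(dat)                       # dat goes on the search path, behind the workspace
mean_attach <- mean(a)            # finds the workspace 'a', not dat$a
detach(dat)

mean_with <- with(dat, mean(a))   # evaluates 'a' inside dat first

c(attach = mean_attach, with = mean_with)
```

The first value comes from the stray vector and the second from the data frame, which is the kind of silent mix-up that data = arguments and with() avoid.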

Re: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

2011-05-18 Thread Steve_Friedman

This is the first time I've seen an R Style Guide.  I will admit that I
haven't looked for one previously, but nevertheless I still hadn't seen
one. My code style simply evolved (perhaps, chugged along) by reading posts
from other users who post to the r-help community.

I regularly program with a colleague who is a Java software development
specialist, hacking together code that we both develop.   Since his coding
style differs substantially from mine and from the conventions described for
R, we end up modifying my code to follow his convention.  For example, he
typically likes to name variables in this form: variable_ , which the
guide frowns on.

I think this guide will be very helpful: first, for me to become more
proficient and conventional in following R stylistics; second, so that he
can see why R users do things the way they do.

I appreciate you posting the link to the guide.

Steve

Steve Friedman Ph. D.
Ecologist  / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147


   

[R] Changing order of facet grid in ggplot2

2011-05-18 Thread Christopher Desjardins

Hi I am running the following code:

sym <- c(sym1,sym2,sym4)

lifedxm <- c("O-BD","O-WELL","O-UNI")

life <- c(lifedxm,lifedxm,lifedxm)

tp <- c("TP-ANY","TP-ANY", "TP-ANY", "TP-SUB", "TP-SUB", "TP-SUB", "TP-CLIN",
"TP-CLIN", "TP-CLIN")

data <- data.frame(sym,life,tp)

qplot(life,geom="bar",weight=sym,ylim=c(0,1),legend=F,data=data) +
facet_grid(. ~ tp)



This creates a facet grid where TP-ANY is followed by TP-CLIN and  then
TP-SUB. I'd like to create a grid where TP-ANY is followed by TP-SUB then
TP-CLIN.


Is this possible?


Thanks,

Chris


Re: [R] subsetting a list of dataframes

2011-05-18 Thread William Dunlap

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Lara Poplarski
 Sent: Tuesday, May 17, 2011 4:14 PM
 To: r-help@r-project.org
 Subject: Re: [R] subsetting a list of dataframes

 Thank you all, this is exactly what I had in mind, except 
 that I still have
 to get my head around apply et al. Back to the books for me then!

 Lara

 On Tue, May 17, 2011 at 2:41 PM, Jannis bt_jan...@yahoo.de wrote:

  Have a look at lapply(). Something like:

  entries.with.nrows=lapply(data,function(x)dim(x)[1]>1)

Note that the above suggestion does not work in R 2.13.0:
   listOfDataFrames <- list(three=data.frame(x=11:13,y=101:103),
       one=data.frame(x=1,y=2),
       five=data.frame(x=1:5,y=11:15))
   listOfDataFrames[lapply(listOfDataFrames,function(x)nrow(x)>1)]
  Error in listOfDataFrames[lapply(listOfDataFrames, function(x) nrow(x) >  :
    invalid subscript type 'list'
lapply(...) always returns a list and lists are not acceptable as
subscripts.  Instead, make the subscript one of the following:
  as.logical(lapply(...))
  sapply(...) # and hope that FUN always returns TRUE or FALSE and
              # length(list)>0
  vapply(..., FUN.VALUE=FALSE)

It may be a bit quicker to do the >0 outside of the loop, as in
  as.integer(lapply(listOfDataFrames, FUN=nrow)) > 0
or
  vapply(listOfDataFrames, FUN=nrow, FUN.VALUE=0L) > 0
but you need a pretty long list to notice.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  
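
Put together, the advice above might look like the following sketch for the original question (keep only the data frames with more than one row). The names keep, subsetted and same are invented for this example; Filter() from base R is shown as an alternative that was not mentioned in the thread.

```r
listOfDataFrames <- list(three = data.frame(x = 11:13, y = 101:103),
                         one   = data.frame(x = 1, y = 2),
                         five  = data.frame(x = 1:5, y = 11:15))

# vapply() returns a plain logical vector, which is a valid subscript.
keep <- vapply(listOfDataFrames, function(d) nrow(d) > 1, FUN.VALUE = logical(1))
subsetted <- listOfDataFrames[keep]

# Filter() applies the predicate and does the subsetting in one step.
same <- Filter(function(d) nrow(d) > 1, listOfDataFrames)

names(subsetted)
```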

  should give you a vector with the elements of the list that 
 you seek marked
  with TRUE.

  This vector can then be used to extract a subset from your list by:

  data.reduced=data[entries.with.nrows]

  Or similar

  HTH
  Jannis

  --- Lara Poplarski larapoplar...@gmail.com schrieb am Di, 
 17.5.2011:

   Von: Lara Poplarski larapoplar...@gmail.com
   Betreff: [R] subsetting a list of dataframes
   An: r-help@r-project.org
   Datum: Dienstag, 17. Mai, 2011 20:24 Uhr
   Hello All,

   I have a list of dataframes, and I need to subset it by
   keeping only those
   dataframes in the list that meet a certain criterion.
   Specifically, I need
   to generate a second list which only includes those
   dataframes whose number
   of rows is > 1.

   Could someone suggest how to do this? I have come close to
   what I need with
   loops and such, but there must be a less clumsy way...

   Many thanks,
   Lara


[R] Help with 2-D plot of k-mean clustering analysis

2011-05-18 Thread Meng Wu

Hi, all

 I would like to use R to perform k-means clustering on my data which
included 33 samples measured with ~1000 variables. I have already used
kmeans package for this analysis, and showed that there are 4 clusters in my
data. However, it's really difficult to plot this cluster in 2-D format
since the huge number of variables. One possible way is to project the
multidimensional space into 2-D platform, but I could not find any good way
to do that. Any suggestions or comments will be really helpful!

Thanks,

Meng


Re: [R] Changing order of facet grid in ggplot2

2011-05-18 Thread Scott Chamberlain

data$tp <- factor(data$tp, levels = c("TP-ANY","TP-SUB","TP-CLIN"))
qplot(life,geom="bar",weight=sym,ylim=c(0,1),legend=F,data=data) +
facet_grid(. ~ tp)

Re: [R] Change pattern in histograms in ggplot2

2011-05-18 Thread Scott Chamberlain


There are a number of discussion threads on the google groups ggplot2 page: 
here are two of them.
http://groups.google.com/group/ggplot2/browse_thread/thread/ca546f7f4d636deb/e0763a54b7735c35?lnk=gstq=fill+pattern#e0763a54b7735c35

http://groups.google.com/group/ggplot2/browse_thread/thread/9a9c081d235efc24/d319d4500174cdd7?lnk=gstq=fill+pattern#d319d4500174cdd7
Scott

[R] Convolution confusion:

2011-05-18 Thread Alex Hofmann


Hi,

I'm new to R, and I'm a bit confused with the convolve() function.
If I do:
x <- c(1, 2, 3)
convolve(x, rev(x), TRUE, "open")
= 9 12 10 4 1

But I expected: 3 8 14 8 3 (like in Octave/MATLAB - conv(x, reverse(x)) )

3 2 1 x 1 2 3
= 3 2 1
0 6 4 2
0 0 9 6 3
= 3 8 14 8 3

The thing is that convolve(x, x, TRUE, "open") works.
I find it very confusing that convolve() does the reversal itself 
while the help suggests reversing it again.


The help file says: "Note that the usual definition of convolution of 
two sequences x and y is given by convolve(x, rev(y), type = "o")".


Thanks for your help,

Alex
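
A small check may help untangle the conventions here. The sketch below compares convolve() with a textbook convolution written out by hand (conv_direct is a made-up helper, not part of R), using an asymmetric y so that the two orderings can be told apart. It rests on the sentence from the help file quoted above.

```r
x <- c(1, 2, 3)
y <- c(0, 1, 0.5)

# Textbook convolution, written out directly for comparison only.
conv_direct <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) {
    for (j in seq_along(b)) {
      out[i + j - 1] <- out[i + j - 1] + a[i] * b[j]
    }
  }
  out
}

usual   <- convolve(x, rev(y), type = "open")  # usual convolution of x and y
r_style <- convolve(x, y, type = "open")       # equals conv_direct(x, rev(y))

all.equal(usual, conv_direct(x, y))
all.equal(r_style, conv_direct(x, rev(y)))

# The hand computation in the post, conv(x, reverse(x)) in Octave terms:
round(convolve(x, x, type = "open"))
```

In other words, convolve(x, y, type = "open") already reverses y internally, so passing rev(y) undoes that reversal and gives the result that Octave's conv(x, y) would.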


Re: [R] R example code of Split-plot Manova

2011-05-18 Thread riccardo

Hi,
I'm a PhD student in Milan (Italy). I read the OBrienKaiser example in
?Anova in the car package. I think that this is not a split-plot MANOVA
design. Does anyone know of R code for a split-plot MANOVA? Is
there someone who can help me? 
Thanks for your kindness

Riccardo


[R] retrieving gbif data

2011-05-18 Thread Rafael Rubio de Casas

Hi,
I am trying to use the "gbif" function in the "dismo" package and it 
does not seem to work. I get the same error message every time:
Error in if (sp) geo <- TRUE : argument is not interpretable as logical
It does not matter whether I query for my species of interest or if I 
copy and paste those included in the help for the function or the 
vignette of the package.
I am not sure whether it is because I am doing something wrong, although 
I am inclined to think there is a bug in the gbif code.
Any help will be appreciated.
Thank you in advance,
Rafa
-- 
National Evolutionary Synthesis Center
*NESCent http://www.nescent.org/*
2024 W. Main Street, Suite A200
Durham, NC27705
r...@nescent.org mailto:r...@duke.edu
919.668.9107


Re: [R] Integral Symbol

2011-05-18 Thread Javi Hidalgo


Thanks! This is what I was looking for.
Apparently it is not supported, so it has to be written out as "integral", 
but the integral symbol does appear in the PDF version.

Regards,

Javier.


Re: [R] Help with 2-D plot of k-mean clustering analysis

2011-05-18 Thread David Cross

I wonder if it makes sense to reduce the dimensionality of the variables 
somehow?

David Cross
d.cr...@tcu.edu
www.davidcross.us
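
One common way to do this kind of reduction is to project the samples onto their first two principal components and color the points by cluster. The sketch below uses random stand-in data with the dimensions mentioned in the question (33 samples, about 1000 variables), so x, km, pca and scores are all made up for illustration. With real data the clusters should separate more clearly than they do with noise.

```r
set.seed(42)                                # reproducible stand-in data
x <- matrix(rnorm(33 * 1000), nrow = 33)    # 33 samples (rows) x 1000 variables

km  <- kmeans(x, centers = 4, nstart = 10)  # cluster in the full space
pca <- prcomp(x)                            # principal components of the samples

scores <- pca$x[, 1:2]                      # coordinates on PC1 and PC2

plot(scores, col = km$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2",
     main = "k-means clusters on the first two principal components")
```

The clustering is still done on all variables; only the display is two-dimensional, so points that overlap in the plot may be well separated in other directions.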





[R] Loop stopping after 1 iteration

2011-05-18 Thread armstrwa

Hi all,

This is a very basic question, but I just can't figure out why R is handling
a loop I'm writing the way it is.

Here is the script I have written:

grid_2_series <- function(gage_handle, data_type, filename)

series_name <- paste(gage_handle, data_type, sep="_")
data_grid <- read.table(file=paste(filename, ".txt", sep=""))

num_rows_data <- nrow(data_grid)-1
num_cols_data <- ncol(data_grid)-4
num_obs <- num_rows_data*num_cols_data

time_series <- matrix(nrow=0, ncol=2)

for(i in 1:length(num_obs)){
    rownum <- ceiling(i/31)+1
    colnum <- if(i%%31==0){
        35
    }else{
        (i%%31)+4
    }
    year <- data_grid[rownum,2]
    month <- data_grid[rownum,3]
    day <- colnum-4
    date_string <- paste(month, day, year, sep="/")
    date <- as.Date(date_string, format='%m/%d/%Y')
    value <- as.character(data_grid[rownum,colnum])
    time_series <- rbind(time_series, c(date,value))
}

The script is working as I intended it to (goes through a matrix of data
where column 2 is the year, column 3 is the month, and row 1 columns 5-35
are the day of the month the observation was recorded [I have included a
screenshot below to help visualize what I'm talking about] and converts the
grid into a 2 column time series where column 1 is the date and column 2 is
the value of the observation), but it is stopping after only 1 iteration.

[Image: matrix_screenshot.jpg]

Two questions:

1.) Does anyone know of an existing function to accomplish this task?
2.) Why is the loop stopping after 1 iteration?  I have it written to
iterate up to the total number of observations (20,615 in one case).

Thank you for your help and sorry for this question which I'm sure has a
very simple answer.

Thanks again,

Billy
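
A likely explanation for the second question, sketched here with the observation count given in the post: num_obs is a single number, so length(num_obs) is 1 and 1:length(num_obs) is just the value 1. Looping over seq_len(num_obs) (or 1:num_obs) visits every observation. The names n_wrong and n_right are made up for this illustration.

```r
num_obs <- 20615                        # one number: the total count of observations

length(num_obs)                         # how many elements num_obs has, not its value

n_wrong <- length(1:length(num_obs))    # iterations of the loop as written
n_right <- length(seq_len(num_obs))     # iterations when looping over seq_len(num_obs)

c(as_written = n_wrong, with_seq_len = n_right)
```

seq_len() is also safer than 1:num_obs when the count could be zero, because seq_len(0) is empty while 1:0 counts down through 1 and 0.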


Re: [R] Integral Symbol

2011-05-18 Thread Timothy Bates

Can’t you just embed it in the HTML as a symbol?

&#x222b;  or &int;

I’d have thought you could also just put it straight into the document as a
character (∫), as long as the HTML is stored as Unicode.

U+222B
http://en.wikipedia.org/wiki/Integral_symbol


On 18 May 2011, at 4:04 PM, Javi Hidalgo wrote:
 Thanks! This is what I was looking for.
 Apparently it is not supported, so it has to be written out as "integral", but in the
 PDF version the integral symbol does appear.
 
 Regards,
 
 Javier.
 
 Date: Wed, 18 May 2011 09:39:28 -0400
 From: murdoch.dun...@gmail.com
 To: havyhida...@hotmail.com
 CC: r-help@r-project.org; lig...@statistik.tu-dortmund.de
 Subject: Re: [R] Integral Symbol
 
 On 18/05/2011 9:09 AM, Javi Hidalgo wrote:
 Thanks.
 I was exactly reading the manual Writing R Extensions, on section 
 Mathematics. Where, it informs about basic LaTeX style support.
 However, It seems like it does not support the LaTeX integral symbol \int, 
 but it does support i.e.: the summation symbol \sum.
 
 Has anyone had this experience on documenting R packages?
 
 It appears in the topic shown by ?Special, Special Functions of 
 Mathematics, but only in the LaTeX version.  You can see that in the 
 PDF version of the Reference Manual, or (if you have things set up 
 correctly), by saying
 
 options(help_type="pdf")
 ?Special
 
 I don't know of any examples where someone has shown an integral sign in 
 text or html versions.  It's not a symbol supported by the R help 
 system, it would depend on hand coding the right thing.
 
 Duncan Murdoch
 
 
 Does anyone know any R-package where the integral symbol appear in the help 
 files.
 
 Regards,
 
 Javier Hidalgo Carrio
 
 
 Date: Wed, 18 May 2011 13:14:54 +0200
 From: lig...@statistik.tu-dortmund.de
 To: havyhida...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R]  Integral Symbol
 
 See the section on writing Mathematics in Rd file in the manual Writing
 R Extensions. This will show how to produce high quality formulas in
 LaTeX generated output and ASCII versions otherwise.
 If you want to provide an excellent HTML version as well, the section on
 Conditional text is also worth reading.
 
 Uwe Ligges
 
 
 On 18.05.2011 10:55, Javi Hidalgo wrote:
 
 Dear All,
 
 I am documenting a R package. That means writing the *.Rd files inside 
 the \man folder of the package structure
 I was wondering how to write the symbol for an integral function in a 
 formula.
 Similar to this one in LaTeX:
 
 \int_{0}^{10} \Omega(t)dt
 
 I already tried
 
 \deqn{\int_{0}^{10} \Omega(t)dt}
 
 but it does not work. Any idea? Which math symbols does R-help recognise?
 
 Regards,
 
 Javier Hidalgo Carrio
   

Re: [R] Help with 2-D plot of k-mean clustering analysis

2011-05-18 Thread Peter Langfelder

On Wed, May 18, 2011 at 7:41 AM, Meng Wu mengwu1...@gmail.com wrote:
 Hi, all

  I would like to use R to perform k-means clustering on my data which
 included 33 samples measured with ~1000 variables. I have already used
 kmeans package for this analysis, and showed that there are 4 clusters in my
 data. However, it's really difficult to plot this cluster in 2-D format
 since the huge number of variables. One possible way is to project the
 multidimensional space into 2-D platform, but I could not find any good way
 to do that. Any suggestions or comments will be really helpful!

You could use multidimensional scaling, function cmdscale(), to
produce a 2-dimensional representation of your data, then plot it
using colors that correspond to the clusters.

For example, suppose your data is stored in matrix X (1000x33). I
assume you clustered the samples, not the variables, so you have a
vector label[] with length 33 that has values between 1 and 4. Since
k-means uses Euclidean distance, you would re-create the distance

dst = dist(t(X))

then feed it into cmdscale()

mds = cmdscale(dst);

then plot it:

plot(mds, col = label)
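
For reference, the steps above can be put together into one self-contained sketch. The simulated matrix X, the four artificial groups, and the seed are assumptions made purely for illustration (they are not the real data); kmeans(), dist() and cmdscale() are the standard functions from the stats package.

```r
# Simulated stand-in for the real data: 1000 variables (rows) x 33 samples
# (columns), with four artificial groups of samples. All values are made up.
set.seed(1)
group <- rep(1:4, length.out = 33)
X <- matrix(rnorm(1000 * 33), nrow = 1000) + rep(3 * group, each = 1000)

# Cluster the samples, so the samples must be the rows: hence t(X)
km <- kmeans(t(X), centers = 4, nstart = 10)
label <- km$cluster              # length 33, values between 1 and 4

# Euclidean distances between samples, then classical MDS down to 2 dimensions
dst <- dist(t(X))
mds <- cmdscale(dst, k = 2)      # 33 x 2 matrix of coordinates

# One point per sample, colored by cluster
plot(mds, col = label, pch = 19, xlab = "MDS 1", ylab = "MDS 2")
```

With real data, only X needs to be replaced; how well the clusters separate in two dimensions depends on how much of the between-sample distance the first two MDS axes capture.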

HTH,

Peter


Re: [R] Loop stopping after 1 iteration

2011-05-18 Thread Martyn Byng

Hi,

The answer to (2) is that num_obs is a scalar, so length(num_obs) is 1.

You probably wanted to do

for (i in 1:num_obs)

instead.
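
A small sketch of the difference, for illustration only. The row and column counts below are hypothetical values chosen so that their product matches the 20,615 observations mentioned in the question; seq_len() is used as a slightly safer alternative to 1:num_obs, since it yields an empty sequence when num_obs is 0.

```r
# Hypothetical dimensions: 665 rows x 31 day columns = 20,615 observations
num_rows_data <- 665
num_cols_data <- 31
num_obs <- num_rows_data * num_cols_data

length(num_obs)      # num_obs is a single number, so its length is 1
1:length(num_obs)    # therefore this is just the sequence 1: one iteration

# Loop over 1, 2, ..., num_obs instead
n_iter <- 0
for (i in seq_len(num_obs)) {
  n_iter <- n_iter + 1
}
n_iter               # number of iterations actually performed
```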

Best wishes

Martyn


Re: [R] Loop stopping after 1 iteration

2011-05-18 Thread armstrwa

I knew it would be something simple.  Thanks for catching that, Martyn.

Billy


Re: [R] Loop stopping after 1 iteration

2011-05-18 Thread David Winsemius



On May 18, 2011, at 11:18 AM, armstrwa wrote:



The jpg file will not be seen by most readers of this list.



[image: matrix_screenshot.jpg]

Two questions:

1.) Does anyone know of an existing function to accomplish this task?


Since you have only defined the task in terms of a loop that is not
working properly and a picture that is not attached, and have included no
test data, I have only a vague understanding of the task. Perhaps you
want stack() or melt() (the latter from the reshape2 package). You might
consider explaining more completely what you want in natural language.
(And reading the Posting Guide with attention to acceptable attachment
formats.)
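
As a rough illustration of what such a reshaping could look like in base R, here is a sketch on a toy grid. The layout (one row per year and month, 31 day columns) is only guessed from the description in the question, and the column names d1..d31 and all values are invented for the example.

```r
# Toy grid in a guessed layout: one row per year/month, columns d1..d31 hold
# the daily values. The column names and values are invented for illustration.
day_cols <- paste0("d", 1:31)
grid <- data.frame(year = c(2010, 2010), month = c(1, 2),
                   matrix(1:62, nrow = 2, dimnames = list(NULL, day_cols)))

# Stack the 31 day columns into long format without an explicit loop
long <- data.frame(
  year  = rep(grid$year,  times = 31),
  month = rep(grid$month, times = 31),
  day   = rep(1:31, each = nrow(grid)),
  value = unlist(grid[day_cols], use.names = FALSE)
)

# Build dates; impossible dates such as 30 February become NA and are dropped
long$date <- as.Date(paste(long$year, long$month, long$day, sep = "-"),
                     format = "%Y-%m-%d")
long <- long[!is.na(long$date), ]
long <- long[order(long$date), c("date", "value")]
head(long, 3)
```

Working on whole columns this way also avoids growing the result with rbind() inside a loop, which becomes slow for tens of thousands of observations.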



2.) Why is the loop stopping after 1 iteration?  I have it written to
iterate up to the total number of observations (20,615 in one case).


Most likely you are misunderstanding how length() is interpreted
for a vector. You probably want 1:num_obs rather than
1:length(num_obs), since length(num_obs) is most probably 1 in this case.


Thank you for your help and sorry for this question which I'm sure  
has a

very simple answer.



David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Loop stopping after 1 iteration

2011-05-18 Thread Hugo Mildenberger

William,

num_obs is a single number (a vector of length one), so length(num_obs)
evaluates to one. Hence your for loop control part will expand to

for (i in 1:1) 

while it should probably read:

for (i in 1:num_obs)

Best

Hugo


[R] Dataset Quasi Poisson

2011-05-18 Thread ilpoeta84

Hello, I'm looking for a dataset for quasi-Poisson regression. The result must
be significantly different from that of classic Poisson regression.
Can you help me?
Please, it is for my last university exam.
Thanks a lot


Re: [R] Email out of R (code)

2011-05-18 Thread Kevin Wright

How does this compare to create.post() ?

Kevin


On Tue, May 17, 2011 at 3:44 PM, Daniel Malter dan...@umd.edu wrote:

 Hi all,

 I thought I would post code to send an email out of R. The code uses
 Grothendieck and Bellosta's interface package rJython for executing Python
 from R. The code itself provides basic email functionality for email
 servers
 requiring authentication. It should be easy to extend it (e.g., for sending
 attachments). I hope it's useful.

 require(rJython)
 rJython <- rJython()
 rJython$exec("import smtplib")
 rJython$exec("from email.MIMEText import MIMEText")
 rJython$exec("import email.utils")

 mail <- c(
 #Email settings
 "fromaddr = 'sender email address'",
 "toaddrs  = 'recipient email address'",
 "msg = MIMEText('This is the body of the message.')",
 "msg['From'] = email.utils.formataddr(('sender name', fromaddr))",
 "msg['To'] = email.utils.formataddr(('recipient name', toaddrs))",
 "msg['Subject'] = 'Simple test message'",

 #SMTP server credentials
 "username = 'sender login'",
 "password = 'sender password'",

 #Set SMTP server and send email, e.g., google mail SMTP server
 "server = smtplib.SMTP('smtp.gmail.com:587')",
 "server.ehlo()",
 "server.starttls()",
 "server.ehlo()",
 "server.login(username,password)",
 "server.sendmail(fromaddr, toaddrs, msg.as_string())",
 "server.quit()")

 jython.exec(rJython,mail)



 Best,
 Daniel


Re: [R] Loop stopping after 1 iteration

2011-05-18 Thread armstrwa

Didn't mean to snub you guys, Hugo and David.  I didn't see your posts
before.  Thanks for the advice.



[R] logistic regression lrm() output

2011-05-18 Thread array chip

Hi, I am trying to run a simple logistic regression using lrm() to calculate an
odds ratio. I found the output confusing when I used summary() on the fit object:
it gave an OR that is totally different from simply taking
exp(coefficient), see below:

 dat <- read.table("dat.txt", sep='\t', header=T, row.names=NULL)

 d <- datadist(dat)
 options(datadist='d')
 library(rms)
 (fit <- lrm(response~x, data=dat, x=T, y=T))

Logistic Regression Model
lrm(formula = response ~ x, data = dat, x = T, y = T)

                      Model Likelihood     Discrimination    Rank Discrim.
                         Ratio Test            Indexes          Indexes

Obs           150    LR chi2      17.11    R2       0.191    C       0.763
 0            128    d.f.             1    g        1.209    Dxy     0.526
 1             22    Pr(> chi2) <0.0001    gr       3.350    gamma   0.528
max |deriv| 1e-11                          gp       0.129    tau-a   0.132
                                           Brier    0.111

          Coef    S.E.   Wald Z Pr(>|Z|)
Intercept -5.0059 0.9813 -5.10  <0.0001
x          0.5647 0.1525  3.70   0.0002

As you can see, the odds ratio for x is exp(0.5647)=1.75892.

But if I run the following using summary():

 summary(fit)
             Effects              Response : response

 Factor      Low    High   Diff.  Effect S.E. Lower 0.95 Upper 0.95
 x           3.9003 6.2314 2.3311 1.32   0.36 0.62       2.01
  Odds Ratio 3.9003 6.2314 2.3311 3.73     NA 1.86       7.49

What is this output? None of the numbers is the odds ratio (1.75892) that I
calculated by using exp().

Can anyone explain?

Thanks

John

[R] strucchange package Linux help

2011-05-18 Thread Hasan Diwan

When I run the code below on Macintosh and Windows, the plot comes out
fine. However, on Linux, the png generated is invalid from R console,
and loading strucchange crashes rkward. Is this a known issue on Linux
and, if so, is there a workaround? Many thanks!

require(strucchange)
data(RealInt)
bp.ri <- breakpoints(RealInt~1, h=15)
summary(bp.ri)
fac.ri <- breakfactor(bp.ri, breaks = 3, label='seg')
fm.ri <- lm(RealInt~0 + fac.ri)
summary(fm.ri)
vcov.ri <- function(x,...) kernHAC(x, kernel = 'Quadratic Spectral',
                                   prewhite = 1, approx = 'AR(1)', ...)
coef(bp.ri, breaks = 3)
sapply(vcov(bp.ri, breaks = 3, vcov=vcov.ri), sqrt)
confint(bp.ri, breaks = 3, vcov=vcov.ri)

png('SCC2.png')
plot(RealInt)
lines(as.vector(time(RealInt)), fitted(fm.ri), col=4)
lines(confint(bp.ri, breaks = 3, vcov=vcov.ri))
dev.off()
print(paste('Plot in SCC2.png in', getwd()))


Re: [R] Email out of R (code)

2011-05-18 Thread Daniel Malter

I do not know. I was not aware and could hardly find any information on
create.post(). From what I have seen at first glance, it seems that
create.post() either opens your standard email program or web browser, which
the python code does not. Instead it needs the R-library interfacing Python.
I also do not know how create.post() handles server authentication (though,
my blind guess would be with the settings of your email program or browser
mail). To stop guessing, if you want a solid comparison, I am afraid you
have to do it yourself.

Best,
Daniel


Re: [R] Help with 2-D plot of k-mean clustering analysis

2011-05-18 Thread Claudia Beleites


Hi Meng,


  I would like to use R to perform k-means clustering on my data which
included 33 samples measured with ~1000 variables. I have already used
kmeans package for this analysis, and showed that there are 4 clusters in my
data. However, it's really difficult to plot this cluster in 2-D format
since the huge number of variables. One possible way is to project the
multidimensional space into 2-D platform, but I could not find any good way
to do that. Any suggestions or comments will be really helpful!
For suggestions it would be extremely helpful to tell us what kind of 
variables your 1000 variables are.


Parallel coordinate plots plot values over (many) variables. Whether 
this is useful, depends very much on your variables: E.g. I have 
spectral channels, they have an intrinsic order and the values have 
physically the same meaning (and almost the same range), so the parallel 
coordinate plot comes naturally (it produces in fact the spectra).


Claudia







--
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany

email: claudia.belei...@ipht-jena.de
phone: +49 3641 206-133
fax:   +49 2641 206-399


Re: [R] Help with 2-D plot of k-mean clustering analysis

2011-05-18 Thread Albyn Jones

One idea: pick the three largest clusters; their centers determine a plane.
Project your data onto that plane.

albyn


-- 
Albyn Jones
Reed College
jo...@reed.edu


Re: [R] logistic regression lrm() output

2011-05-18 Thread Frank Harrell

Why is a one unit change in x an interesting range for the purpose of
estimating an odds ratio?

The default in summary() is the inter-quartile-range odds ratio as clearly
stated in the rms documentation.
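
To make the connection between the two outputs concrete, the numbers in the summary() table can be reproduced by hand from the coefficient. This is only a base-R arithmetic sketch using the values printed in the question; it does not call rms, and the variable names are made up for the example.

```r
# Numbers copied from the lrm() and summary() output quoted in this thread
beta <- 0.5647     # coefficient of x (log odds ratio per one-unit change)
se   <- 0.1525     # its standard error
low  <- 3.9003     # "Low" value of x in the summary() table
high <- 6.2314     # "High" value of x in the summary() table

exp(beta)                          # odds ratio for a one-unit change in x
effect <- beta * (high - low)      # log odds ratio for the change low -> high
or_iqr <- exp(effect)              # odds ratio for the change low -> high

# Wald 0.95 limits on the odds-ratio scale
ci <- exp(effect + c(-1, 1) * qnorm(0.975) * se * (high - low))
round(c(or_iqr, ci), 2)
```

So the "Effect" row is the coefficient multiplied by Diff. (High minus Low), and the "Odds Ratio" row is its exponential; exp(0.5647) is the odds ratio for a change of one unit only.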
Frank



-
Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: [R] Smooth contour of a map

2011-05-18 Thread Pierre Bruyer

It's perfect, thank you!
I would like to post the final code in case someone needs help with this subject, but I
am first trying to fix one last problem: how can I constrain the contourLines() function
to include the corner points of the map in its result? It does not treat these points
as contour points.

On 18 May 2011, at 15:18, David Winsemius wrote:

 You may be looking for the par settings xaxs="i" and yaxs="i", which, if you
 add them to the plot call, will prevent the default behavior of adding 4%
 padding to the axis ranges.
 
 ?par
 
 -- David.
 
 On May 18, 2011, at 8:27 AM, Pierre Bruyer wrote:
 
 I've practically solved my problem (the code is below), but one last
 thing is not perfect:
 when I call plot() and then polygon(), there is a margin between my raster
 and the window. I think it comes from the axes of plot(), but I have not
 found how to remove it. Does someone have a solution, please?
 
 Pierre Bruyer
 
 ##smooth contour

 contours <- contourLines(V2b, levels = paliers)

 par(mar = c(0,0,0,0))
 plot(1, col = "white", main = "polygon()", asp = 1, axes = FALSE, ann = FALSE,
      xlim = c(0,1), ylim = c(0,1), type = "n", method = c("image"))
 for (i in seq_along(contours)) {
   x <- contours[[i]]$x
   y <- contours[[i]]$y
   c <- contours[[i]]$level
   j <- 1
   tmp <- 0
   while (j < length(level[,1]) && tmp == 0) {
     if (level[j,1] == c) {
       tmp <- j
     }
     j <- j + 1
   }

   polygon(spline(seq_along(x), x)$y, spline(seq_along(y), y)$y,
           col = colgraph[tmp+1], border = NA)
 }
 
 
 
 On 17 May 2011, at 16:44, Pierre Bruyer wrote:
 
 The result is good, thanks a lot, but how can I fill my raster with color
 using this method?
 
 On 17 May 2011, at 15:43, Duncan Murdoch wrote:
 
 I don't think filled.contour gives you access to the contour lines.  If 
 you use contourLines() to compute them, then you can draw them using code 
 like this:
 
 contours <- contourLines(V2b, levels = paliers)
 for (i in seq_along(contours)) {
   x <- contours[[i]]$x
   y <- contours[[i]]$y
   lines(spline(seq_along(x), x)$y, spline(seq_along(y), y)$y)
 }
 
 but as I said, you won't get great results.  A better way is to use a 
 finer grid, e.g. by fitting a smooth surface to your set of points and 
 using predictions from the model to interpolate.
 
 Duncan Murdoch
 
 
 On 17/05/2011 9:35 AM, Pierre Bruyer wrote:
 I work with large datasets (1 points) so I can't post them , but my 
 function is :
 
 create_map <- function(grd, level, map_output, format = c("jpeg"),
                        width_map = 150, height_map = 150, ...)
 {
   ##sp <- spline(x = grd[,1], y = grd[,2])

   grd2 <- matrix(grd[,3], nrow = sqrt(length(grd[,3])),
                  ncol = sqrt(length(grd[,3])), byrow = FALSE)

   V2b <- grd2

   ##creation of breaks for colors
   i <- 1
   paliers <- c(-1.0E300)
   while (i <= length(level[,1]))
   {
     paliers <- c(paliers, level[i,1])
     i <- i + 1
   }
   paliers <- c(paliers, 1.0E300)

   ##scale color creation
   i <- 1
   colgraph <- c(rgb(255,255,255, maxColorValue = 255))
   while (i <= length(level[,2]))
   {
     colgraph <- c(colgraph, rgb(level[i,2], level[i,3], level[i,4],
                                 maxColorValue = 255))
     i <- i + 1
   }

   ##user can choose the output format (default is jpeg)
   switch(format,
     png = png(map_output, width = width_map, height = height_map),
     jpeg = jpeg(map_output, width = width_map, height = height_map,
                 quality = 100),
     bmp = bmp(map_output, width = width_map, height = height_map),
     tiff = tiff(map_output, width = width_map, height = height_map),
     jpeg(map_output, width = width_map, height = height_map))

   ## drawing map

   ##delete margin
   par(mar = c(0,0,0,0))
   filled.contour(V2b, col = colgraph, levels = paliers, asp = 1, axes = FALSE,
                  ann = FALSE)
   dev.off()

 }
 
 where grd is a xyz data frame,
 map_output is the path+name of the output image file,
 and level is a matrix like this :
 
 
 level <- matrix(0,10,4)
 level[1,1] <- 1.E+00
 level[2,1] <- 3.E+00
 level[3,1] <- 5.E+00
 level[4,1] <- 1.E+01
 level[5,1] <- 1.5000E+01
 level[6,1] <- 2.E+01
 level[7,1] <- 3.E+01
 level[8,1] <- 4.E+01
 level[9,1] <- 5.E+01
 level[10,1] <- 7.5000E+01
 
 
 level[1,2] <- 102
 level[2,2] <- 102
 level[3,2] <- 102
 level[4,2] <- 93
 level[5,2] <- 204
 level[6,2] <- 248
 level[7,2] <- 241
 level[8,2] <- 239
 level[9,2] <- 224
 level[10,2] <- 153
 
 level[1,3] <- 153
 level[2,3] <- 204
 level[3,3] <- 204
 level[4,3] <- 241
 level[5,3] <- 255
 level[6,3] <- 243
 level[7,3] <- 189
 level[8,3] <- 126
 level[9,3] <- 14
 level[10,3] <- 0
 
 level[1,4] <- 153
 level[2,4] <- 204
 level[3,4] <- 153
 level[4,4] <- 107
 level[5,4] <- 102
 level[6,4] <- 33
 level[7,4] <- 59
 level[8,4] <- 63
 level[9,4] <- 14
 level[10,4] <- 51
 
On 17 May 2011 at 15:17, Duncan Murdoch wrote:

[R] assign $y of predict() function output to variable

2011-05-18 Thread Asan Ramzan

Hello R-help

Below is the output from the predict() function. How can I assign $y to a 
variable?

predict(function,df2)

$x
   V1
1   36.28
2   34.73
3   33.74
4   69.87
5   58.88
6   89.44
7   43.97
8   41.94
9   33.34
10  38.47
11  35.16
12  42.94
13  46.76
14  53.24
15  52.43
16  50.40
17  34.42
18  33.22
19  33.24
20  39.60
21  39.32
22  44.71
23  54.03
24  47.48
25  35.42
26  34.78
27  34.31
28  78.60
29  74.43
30 120.80
31  48.35
32  45.40
33  33.95
34  38.27
35  35.16
36  47.10
37  48.10
38  51.79
39  62.10
40  50.95
41  35.75
42  34.62
43  57.99
44  45.09
45  43.93
46  60.98
47  66.64
48  59.84
49  64.81
50  77.52
51 113.40
52  88.12
53  80.36
54 118.80
55 113.00
56 169.50
57  53.04
58  63.39
59  96.04
60 109.80
61  83.74
62 133.10
63 122.30
64 168.30
65  61.89
66  58.58
67  75.98
68  87.66
69  84.01
70 132.80
71 135.60
72 127.70
$y
  V1
1   2.676489
2   2.070236
3   1.682677
4  15.853686
5  11.523969
6  23.030727
7   5.678122
8   4.886343
9   1.526004
10  3.532138
11  2.238484
12  5.276394
13  6.766605
14  9.301601
15  8.983873
16  8.188838
17  1.948910
18  1.478992
19  1.486828
20  3.973300
21  3.864004
22  5.966758
23  9.611797
24  7.047672
25  2.340192
26  2.089803
27  1.905852
28 19.180440
29 17.611545
30 32.357421
31  7.387438
32  6.235922
33  1.764910
34  3.454035
35  2.238484
36  6.899319
37  7.289786
38  8.733040
39 12.798111
40  8.404081
41  2.469258
42  2.027188
43 11.172123
44  6.114989
45  5.662521
46 12.354907
47 14.589716
48 11.903750
49 13.869002
50 18.778316
51 30.387489
52 22.579871
53 19.828694
54 31.838458
55 30.277085
56 42.186268
57  9.223121
58 13.308252
59 25.211889
60 29.379133
61 21.048335
62 35.336657
63 32.740194
64 42.003454
65 12.715023
66 11.405337
67 18.199665
68 22.421596
69 21.144335
70 35.268242
71 35.898715
72 34.073050

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Overlaying maps

2011-05-18 Thread Michael . Laviolette


I'm having difficulty overlaying maps when writing to a file graphics
device. My command sequence has the structure

plot(map1)
par(new = T)
plot(map2)

On the screen device, it works fine. When I attempt something like

png(file = "map.png")
plot(map1)
par(new = T)
plot(map2)
dev.off()

only the last map appears, the previous ones having been cleared. Can
someone clarify?

Thanks,
Michael Laviolette PhD MPH
New Hampshire Department of Health and Human Services


[R] Email out of R (code)

2011-05-18 Thread Wolfgang RAFFELSBERGER

In case you're using Unix/Linux, have a look at
www.r-project.org/doc/Rnews/Rnews_2007-1.pdf
(page 30 - 32)

Wolfgang

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Wolfgang Raffelsberger, PhD
IGBMC,
1 rue Laurent Fries,  67404 Illkirch  Strasbourg,  France
Tel (+33) 388 65 3300 Fax (+33) 388 65 3276
wolfgang.raffelsberger (a t) igbmc.fr


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of 
Kevin Wright [kw.s...@gmail.com]
Sent: Wednesday, 18 May 2011 17:57
To: Daniel Malter
Cc: r-help@r-project.org
Subject: Re: [R] Email out of R (code)

How does this compare to create.post() ?

Kevin


On Tue, May 17, 2011 at 3:44 PM, Daniel Malter dan...@umd.edu wrote:

 Hi all,

 I thought I would post code to send an email out of R. The code uses
 Grothendieck and Bellosta's interface package rJython for executing Python
 from R. The code itself provides basic email functionality for email
 servers
 requiring authentication. It should be easy to extend it (e.g., for sending
 attachments). I hope it's useful.

 require(rJython)
 rJython <- rJython()
 rJython$exec("import smtplib")
 rJython$exec("from email.MIMEText import MIMEText")
 rJython$exec("import email.utils")

 mail <- c(
 #Email settings
 "fromaddr = 'sender email address'",
 "toaddrs  = 'recipient email address'",
 "msg = MIMEText('This is the body of the message.')",
 "msg['From'] = email.utils.formataddr(('sender name', fromaddr))",
 "msg['To'] = email.utils.formataddr(('recipient name', toaddrs))",
 "msg['Subject'] = 'Simple test message'",

 #SMTP server credentials
 "username = 'sender login'",
 "password = 'sender password'",

 #Set SMTP server and send email, e.g., google mail SMTP server
 "server = smtplib.SMTP('smtp.gmail.com:587')",
 "server.ehlo()",
 "server.starttls()",
 "server.ehlo()",
 "server.login(username,password)",
 "server.sendmail(fromaddr, toaddrs, msg.as_string())",
 "server.quit()")

 jython.exec(rJython, mail)



 Best,
 Daniel

 --
 View this message in context:
 http://r.789695.n4.nabble.com/Email-out-of-R-code-tp3530671p3530671.html
 Sent from the R help mailing list archive at Nabble.com.


[R] text mining problem using TM package

2011-05-18 Thread Andy Adamiec

Hi, I'm using R (TM package) for text mining and I'm having problems
filtering articles out of my data set by local meta data.



Here is the code:



data <- ("C:/ /19970331")

rs <- ReutersSource(data, encoding = "UTF-8")

RC <- VCorpus(DirSource(data),
              readerControl = list(reader = readRCV1asPlain,
                                   language = "en_US",
                                   load = TRUE),
              dbControl = list(useDb = TRUE,
                               dbName = "texts.db",
                               dbType = "DB1"))

tm_index(RC, FUN = sFilter, doclevel = F, useMeta = T, "Topics == 'MCAT'")



When I use  sFilter, I can only filter fields in yellow, I want to filter
fields in red, what am I doing wrong?



Thanks, Andy



This is meta data that is attached to each article



Available meta data pairs are:
  Author       :
  DateTimeStamp: 1997-03-31
  Description  :
  Heading      : USA: WHX begins tender offer for Dynamics Corp.
  ID           : 476871
  Language     : en_US
  Origin       : Reuters Corpus Volume 1
User-defined local meta data pairs are:
$Publisher
[1] Reuters Holdings Plc

$Topics
[1] C18  C181 CCAT

$Industries
[1] I22100 I34000

$Countries
[1] USA


Re: [R] strucchange package Linux help

2011-05-18 Thread Achim Zeileis


On Wed, 18 May 2011, Hasan Diwan wrote:


When I run the code below on Macintosh and Windows, the plot comes out
fine. However, on Linux, the png generated is invalid from R console,
and loading strucchange crashes rkward.


I can replicate nothing of this. I ran the script both in a plain R 2.13.0 
and in RKward 0.5.5 on a Debian GNU/Linux machine.


In both cases, the script below yielded the correct outcome and the PNG 
showed the correct graphic. (And neither R or RKward crashed.)



Is this a known issue on Linux and, if so, is there a workaround?


This is almost certainly no general Linux problem with PNG graphics and no 
problem with the specific strucchange example. It's more likely that 
something else goes wrong in your specific setup.

Z


Many thanks!
require(strucchange)
data(RealInt)
bp.ri <- breakpoints(RealInt~1, h=15)
summary(bp.ri)
fac.ri <- breakfactor(bp.ri, breaks = 3, label='seg')
fm.ri <- lm(RealInt~0 + fac.ri)
summary(fm.ri)
vcov.ri <- function(x,...) kernHAC (x, kernel = 'Quadratic Spectral',
prewhite = 1, approx = 'AR(1)', ...)
coef(bp.ri, breaks = 3)
sapply(vcov(bp.ri, breaks = 3, vcov=vcov.ri), sqrt)
confint(bp.ri, breaks = 3, vcov=vcov.ri)

png('SCC2.png')
plot(RealInt)
lines(as.vector(time(RealInt)), fitted(fm.ri), col=4)
lines(confint(bp.ri, breaks = 3, vcov=vcov.ri))
dev.off()
print(paste('Plot in SCC2.png in', getwd()))

--
Sent from my mobile device
Envoyait de mon telephone mobil


[R] Help with Memory Problems (cannot allocate vector of size)

2011-05-18 Thread Amit Patel

While doing PLS I ran into the following problem.

 BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata,
                jackknife = FALSE, validation = "LOO")

Without jackknife the command works fine, but when I enable jackknife
I get the following error:

BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata,
               jackknife = TRUE, validation = "LOO")
Error: cannot allocate vector of size 289.1 Mb

I am dealing with a very large dataset:

str(PLSdata)
'data.frame':   40 obs. of  2 variables:
 $ GroupingList: int  1 1 1 1 1 1 1 1 1 1 ...
 $ PCIList     : AsIs [1:40, 1:94727] 0 0 0 0 0 0 0 0 0 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr  "X" "X.1" "X.12" "X.13" ...
  .. ..$ : NULL

object.size(PLSdata)/1048600
28.9113560938394 bytes

How can I get around this memory shortage?
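
A back-of-the-envelope check rather than a confirmed diagnosis: the size in the error message matches an array holding one coefficient per predictor, per component, per leave-one-out segment. This assumes the jackknife option keeps such per-segment coefficients in double precision (8 bytes each); the variable names are illustrative.

```r
n_obs  <- 40      # rows of PLSdata, hence also the number of LOO segments
n_pred <- 94727   # columns of PCIList
n_comp <- 10      # ncomp in the plsr() call
bytes_per_double <- 8

# The predictor matrix itself, in Mb (2^20 bytes).
mb_data <- n_obs * n_pred * bytes_per_double / 2^20

# Assumed jackknife storage: predictors x 1 response x components x segments.
mb_jack <- n_pred * 1 * n_comp * n_obs * bytes_per_double / 2^20

print(round(c(data = mb_data, jackknife = mb_jack), 1))
```

If that reading is right, the array scales with both the number of components and the number of cross-validation segments, so reducing either should shrink the request proportionally.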

Re: [R] R-code in R-file documentation

2011-05-18 Thread Yihui Xie

I guess what you want is cat(readLines("file.r"), sep = "\n")
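
A small self-contained illustration of the difference, using a temporary file in place of file.r (the file contents and variable names are made up for the example): printing the result of readLines() shows index prefixes and quotes, while cat() writes the bare lines.

```r
# Write a tiny script to a temporary file standing in for file.r.
tmp <- tempfile(fileext = ".r")
writeLines(c("x <- 1:3", "mean(x)"), tmp)

# Auto-printing a character vector adds [1] and quotes.
print(readLines(tmp))

# cat() emits the lines as they are, one per line.
cat(readLines(tmp), sep = "\n")

# Capture what cat() wrote so it can be inspected.
out <- capture.output(cat(readLines(tmp), sep = "\n"))
```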

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



On Wed, May 18, 2011 at 4:03 AM, Brian Oney zenli...@gmail.com wrote:
 Hello List,
 I would like to insert code from .r files into a LaTeX appendix (possibly
 using Sweave).
 I was considering:

 <<results=tex,eval=true,echo=true>>=
 source("file.r")
 @
 but I would just like to echo the code and not evaluate the code within the
 file.
 maybe:
 <<results=tex,eval=true,echo=false>>=
 cat("\\begin{verbatim}")
 readLines("file.r")
 cat("\\end{verbatim}")
 @

 The above works well other than the line numbers which are included (which
 isn't so bad).

 Thanks for the help and ideas!
 Brian


Re: [R] hierarchical clustering within a size limit

2011-05-18 Thread rna seq

Hi Peter,

Thanks for your help. A second simple question that I cannot solve is the
following.

labels = cutree(hc, h=500)
# members of cluster 1:
x[labels==1]
# members of cluster 2:
x[labels==2]

When x has 8 or more elements, the index numbers appear in the output:

[['[1]', '180066408', '180066464', '180066465', '180066483', '180066486',
'180066518', '180066525', '[8]', '180066554', '180066623', '180066638',
'180066652', '180066819', '180066884']]

As opposed to when they are less than 8:

 [['150329963', '150329989', '150330179', '150330299', '150330375',
'150330460']]

Is there a simple way to make these index numbers disappear?
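
One possible approach, sketched with a few of the values above: the bracketed numbers are the index prefixes that print() puts at the start of each output line, so building the text with paste() or writing it with cat() avoids them. The variable names are illustrative.

```r
ids <- c(180066408, 180066464, 180066465, 180066483)

# Auto-printing (or print()) prefixes each output line with an index such as [1].
print(ids)

# paste() builds a single string with no index prefixes.
one_line <- paste(ids, collapse = " ")

# cat() writes the values directly, also without prefixes.
cat(one_line, "\n")
```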

Thanks





On Wed, May 11, 2011 at 10:53 AM, Peter Langfelder 
peter.langfel...@gmail.com wrote:

 On Wed, May 11, 2011 at 10:12 AM, rna seq rna.see...@gmail.com wrote:
  Hello List,
 
  I am trying to implement a hierarchical cluster using the hclust method
  agglomerative single linkage method with a small wrinkle. I would like to
  cluster a set of numbers on a number line only if they are within a
  distance of 500. I would then like to print out the members of this list.
 
  So far I can put a vector:
  x <- c(2,10,200,300,600,700)
  into a distance matrix:
  dist(x, method="manhattan")
      1   2   3   4   5
  2   8
  3 198 190
  4 298 290 100
  5 598 590 400 300
  6 698 690 500 400 100
 
  I can then cluster these distances using:
  hc <- hclust(v, method = "complete")
 
  Next, I believe I set my distance limit in the cluster using the command
 
 cutree(hc, h=500)
  1 1 1 1 2 2 1 3
  [1] 1 1 1 1 2 2
 
  This seems to produce the correct result; however, what I am unable to do
  is go back and extract and print out the members of each cluster. Any help
  would be greatly appreciated.

 Very simple.

 labels = cutree(hc, h=500)
 # members of cluster 1:
 x[labels==1]
 # members of cluster 2:
 x[labels==2]
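
 The same idea can be written for all clusters at once. This is only a
 sketch built on the six example values from the question; split() is a
 base R alternative to indexing each cluster by hand, and the name
 clusters is illustrative.

```r
x <- c(2, 10, 200, 300, 600, 700)

# Complete linkage on Manhattan distances, cut at height 500.
hc <- hclust(dist(x, method = "manhattan"), method = "complete")
labels <- cutree(hc, h = 500)

# One list element per cluster, named by cluster label.
clusters <- split(x, labels)
print(clusters)
```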

 HTH,

 Peter



[R] Grouped bar plot

2011-05-18 Thread jctoll

Hi,

I am trying to produce a grouped bar plot from a data.frame and I'm
having difficulties figuring out how to do so.  My data is 500 rows by
4 columns and basically looks like so:

 head(x)
    V1        V2        V3        V4
1  XOM 0.2317915 0.1610068 1.6941637
2 AAPL 0.6735488 0.7433611 0.1594102
3   GE 1.2554160 0.9237384 1.6767711
4  IBM 1.6296938 0.3730387 0.5858115
5  CVX 0.9194169 0.4785705 0.1803601
6   PG 0.7768241 1.7622060 0.7640163
 . . .

I would like to produce something similar to what is found at:
http://www.statmethods.net/graphs/bar.html  # the grouped barplot example
or
http://had.co.nz/ggplot2/geom_bar.html  # the Dodged bar charts example

Across the X-axis, for each set(row) of 3 data points(V2, V3, V4)
associated with a symbol(V1), I would like to create a group of 3 bars
reflecting their values.  So the Y-axis will represent the magnitude
of values in the columns (V2, V3, V4), and X-axis will have 500 groups
of 3 bars, for a total of 1500 bars.  I would like the color of each
bar to reflect the column of data it represents, and to label each
group of 3 with the corresponding symbol in column V1.

I was trying to get this to work using ggplot but the y-axis in the
example is the count, which is not what I'm after.  Any suggestions,
to get me started down the right path would be appreciated.  Thank
you.

James


[R] data network format and grouping analysis

2011-05-18 Thread Sebastián Daza


Hi everyone,
I have a dataset of friendship with this format:

      ego alter
4746    1     2
9742    1     3
14738   1    NA
4747    2    NA
9743    2     3
14739   2     1
4748    3    13
9744    3     5
14740   3    14
4749    4    NA
9745    4    NA
14741   4    NA
4750    5    NA
9746    5    13
14742   5    10
4751    6    12
9747    6     7
...


NA means that the individual did not select any friend. Does anyone know how 
to format this dataset for use with the sna or igraph packages? I don't know 
how to convert it into a matrix or an edgelist in R without losing isolated 
individuals.
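
One way this could be approached in base R, sketched on the first rows of the table above: build an adjacency matrix over every id that appears, so that individuals with no ties still get a row and column of zeros. The names el, ties, adj and isolates are illustrative, and passing the matrix on to sna or igraph is left out because it depends on the package version in use.

```r
# First twelve rows of the example data (egos 1 to 4).
el <- data.frame(
  ego   = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4),
  alter = c(2, 3, NA, NA, 3, 1, 13, 5, 14, NA, NA, NA)
)

# Real ties are the rows with a non-missing alter.
ties <- el[!is.na(el$alter), ]

# Every individual seen as ego or alter, including those with only NA rows.
ids <- sort(unique(c(el$ego, ties$alter)))

# Square adjacency matrix indexed by id; isolates keep all-zero rows and columns.
adj <- matrix(0L, nrow = length(ids), ncol = length(ids),
              dimnames = list(as.character(ids), as.character(ids)))
adj[cbind(as.character(ties$ego), as.character(ties$alter))] <- 1L

# Individuals who neither send nor receive a nomination.
isolates <- ids[rowSums(adj) == 0 & colSums(adj) == 0]
print(adj)
```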


Next question: does anyone know of a package that performs Moody's 
Crowds routine to identify groups in R, or other algorithms designed 
to search for groups by maximizing modularity scores?


Thank  you in advance!

--
Sebastián Daza
sebastian.d...@gmail.com


Re: [R] Dataset Quasi Poisson

2011-05-18 Thread Jonathan Daily

1) This mailing list is not for homework.

2) I would recommend reading the introduction to R that comes with
every installation of R, since your answer is in there. Alternatively,
you could google R and quasi poisson.

On Wed, May 18, 2011 at 11:42 AM, ilpoeta84 antonioperfe...@gmail.com wrote:
 Hello, I'm looking for a dataset for quasi-Poisson regression. The result must
 be significantly different from the classic Poisson regression.
 Can you help me?
 Please, it is for my last university exam.
 Thanks a lot

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Dataset-Quasi-Poisson-tp3533060p3533060.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
===
Jon Daily
Technician
===
#!/usr/bin/env outside
# It's great, trust me.


[R] Simple ordering or sorting question

2011-05-18 Thread David Kaplan

Greetings,

I'm trying to simply reorder a data frame on the row numbers.  So, for 
example, instead of getting 1,2,3,4,5,6,7,8,9,10,11, ... 100 ...,  I get 
instead
1, 10, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 11, ...   I've 
tried commands such as

df <- df[order(rownames(df)),]

and have substituted the order command with sort and sort.list to no 
avail.  Any advice would be appreciated.  Thanks in advance.

David

-- 

===
David Kaplan, Ph.D.
Professor
Department of Educational Psychology
University of Wisconsin - Madison
Educational Sciences, Room, 1082B
1025 W. Johnson Street
Madison, WI 53706

email: dkap...@education.wisc.edu
homepage:
http://www.education.wisc.edu/edpsych/default.aspx?content=kaplan.html
Phone: 608-262-0836
===





Re: [R] Simple ordering or sorting question

2011-05-18 Thread jim holtman

It looks like your row numbers are characters because that is the
sort sequence you are getting. Try

df <- df[order(as.numeric(rownames(df))), ]
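
A small illustration of the difference on a made-up data frame (the names df, by_char and by_num are only for the example): row names are character strings, so ordering them directly compares text, while converting them to numbers first gives the expected sequence.

```r
# Twelve rows with the default row names "1" to "12".
df <- data.frame(val = seq_len(12))

# Ordering the row names as text puts "10" before "2".
by_char <- df[order(rownames(df)), , drop = FALSE]
print(rownames(by_char))

# Converting to numeric first restores 1, 2, 3, ...
by_num <- by_char[order(as.numeric(rownames(by_char))), , drop = FALSE]
print(rownames(by_num))
```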




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


Re: [R] Simple ordering or sorting question

2011-05-18 Thread David Kaplan

That did it.  Thanks!!

David


===
David Kaplan, Ph.D.
Professor
Department of Educational Psychology
University of Wisconsin - Madison
Educational Sciences, Room, 1082B
1025 W. Johnson Street
Madison, WI 53706

email: dkap...@education.wisc.edu
homepage:
http://www.education.wisc.edu/edpsych/default.aspx?content=kaplan.html
Phone: 608-262-0836
===





[R] Use paste function to select column of data

2011-05-18 Thread John Poulsen

Hello,

I want to build a function to call up a column of a data.frame by the names of 
the columns.  I have column names that are sequentially named (col1, col2, 
etc.). How do I change a character expression into something that will be 
understood as a data.frame column.  For example: 

 example <- data.frame(cbind(col1=1:10, col2=21:30, col3=41:50))
 call.fun <- function(t){
   x <- paste("col", t, sep="")  ## Change this so that it is the data, not a
                                 ## character expression
   example$x}

 call.fun(t=2)

Within the real function, I will continue to do calculations on the column of 
data.  My problem is that I am either getting a character expression or NULL 
from my function.

Thanks for your help on what is probably a very simple question.

John

Re: [R] assign $y of predict() function output to variable

2011-05-18 Thread David Winsemius



On May 18, 2011, at 1:35 PM, Asan Ramzan wrote:


Hello R-help

Below is the output from the predict() function. How can I assign $y  
to a

variable.



Newvar <- predict(function,df2)$y
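
A runnable version of the same idea, assuming the predict() method in question returns a list with components x and y, as predict() does for smooth.spline fits. The model and data here are stand-ins taken from the built-in cars data set, not the poster's objects.

```r
# Stand-in model: a smoothing spline of stopping distance on speed.
fit <- smooth.spline(cars$speed, cars$dist)

# predict() on such a fit returns a list with components x and y.
pred <- predict(fit, x = c(10, 15, 20))
str(pred)

# Pull out the y component, either in two steps or directly.
newvar <- pred$y
newvar_direct <- predict(fit, x = c(10, 15, 20))$y
```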


--
David.



David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Use paste function to select column of data

2011-05-18 Thread David Winsemius



On May 18, 2011, at 3:12 PM, John Poulsen wrote:


Hello,

I want to build a function to call up a column of a data.frame by  
the names of the columns.  I have column names that are sequentially  
named (col1, col2, etc.). How do I change a character expression  
into something that will be understood as a data.frame column.  For  
example:


example <- data.frame(cbind(col1=1:10, col2=21:30, col3=41:50))
 call.fun <- function(t){
   x <- paste("col", t, sep="")  ## Change this so that it is the data,
                                 ## not a character expression

# Right: don't use the $ operator; use [[ instead
example[[x]]}


call.fun(t=2)

Within the real function, I will continue do calculations on the  
column of data.  My problem is that I am either getting a character  
expression or NULL from my function.


Thanks for your help on what is probably a very simple question.

John


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Use paste function to select column of data

2011-05-18 Thread Peter Ehlers


On 2011-05-18 12:12, John Poulsen wrote:

Hello,

I want to build a function to call up a column of a data.frame by the names of 
the columns.  I have column names that are sequentially named (col1, col2, 
etc.). How do I change a character expression into something that will be 
understood as a data.frame column.  For example:

  example <- data.frame(cbind(col1=1:10, col2=21:30, col3=41:50))
  call.fun <- function(t){
    x <- paste("col", t, sep="")  ## Change this so that it is the data, not a
                                  ## character expression
    example$x}

call.fun(t=2)


Get out of the dollar habit.
Replace your

  example$x

with

  example[[x]]

or with

  example[, x]
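
Put together, the corrected function might look like this (a sketch that reuses the poster's names; res is added only to hold the result). The $ operator looks for a column literally named "x", which is why it returned NULL, whereas [[ evaluates x and uses the string it holds.

```r
example <- data.frame(col1 = 1:10, col2 = 21:30, col3 = 41:50)

call.fun <- function(t) {
  x <- paste("col", t, sep = "")  # e.g. "col2"
  example[[x]]                    # look the column up by the name stored in x
}

res <- call.fun(t = 2)
print(res)

# For comparison: there is no column literally called "x".
print(is.null(example$x))
```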

Peter Ehlers



  Within the real function, I will continue do calculations on the column of 
data.  My problem is that I am either getting a character expression or NULL 
from my function.

Thanks for your help on what is probably a very simple question.

John

Re: [R] Grouped bar plot

2011-05-18 Thread Peter Ehlers




Using base barplot() and calling your 6 lines of data 'd':

  barplot(t(d[-1]), names.arg=d[,1], beside=TRUE)

Give a careful reading to the definition of the 'height' argument on
the help page.
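
Spelled out on the six rows shown in the question (the colours, the legend and the names heights and mids are additions for illustration, not part of the suggestion above): barplot() with beside=TRUE draws one group per column of the height matrix, so the data frame is transposed to put V2, V3 and V4 in rows and the symbols in columns.

```r
d <- data.frame(
  V1 = c("XOM", "AAPL", "GE", "IBM", "CVX", "PG"),
  V2 = c(0.2317915, 0.6735488, 1.2554160, 1.6296938, 0.9194169, 0.7768241),
  V3 = c(0.1610068, 0.7433611, 0.9237384, 0.3730387, 0.4785705, 1.7622060),
  V4 = c(1.6941637, 0.1594102, 1.6767711, 0.5858115, 0.1803601, 0.7640163)
)

# Rows V2, V3, V4; one column per symbol.
heights <- t(as.matrix(d[-1]))

# One group of three bars per symbol, one colour per data column.
mids <- barplot(heights, names.arg = as.character(d$V1), beside = TRUE,
                col = c("steelblue", "orange", "darkgreen"),
                legend.text = rownames(heights))
```

With 500 symbols the axis labels will overlap, so plotting subsets or rotating the labels (for example with las = 2) may be worth trying.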

Peter Ehlers




Re: [R] Grouped bar plot

2011-05-18 Thread J Toll



Thank you, that's what I was looking for and it gets me started in the
right direction.  I can now work on refining the layout.  Thanks
again.

James


Re: [R] Overlaying maps

2011-05-18 Thread Ray Brownrigg

Using my mind-reading skills, plot.Map(), while deprecated, does provide an 
add= option.

If this doesn't help, you'll need to read the posting guide and provide (a lot)
more information.

Ray Brownrigg

On Thu, 19 May 2011, michael.laviole...@dhhs.state.nh.us wrote:
 I'm having difficulty overlaying maps when writing to a file graphics
 device. My command sequence has the structure
 
 plot(map1)
 par(new = T)
 plot(map2)
 
 On the screen device, it works fine. When I attempt something like
 
 png(file = "map.png")
 plot(map1)
 par(new = T)
 plot(map2)
 dev.off()
 
 only the last map appears, the previous ones having been cleared. Can
 someone clarify?
 
 Thanks,
 Michael Laviolette PhD MPH
 New Hampshire Department of Health and Human Services
 

Re: [R] data network format and grouping analysis

2011-05-18 Thread Scott Chamberlain

The following works to get an igraph object from a matrix edgelist:

dat2 <- matrix(rep(seq(1,5,1), 4), nrow=10, ncol=2)
graph.edgelist( dat2 )


I tried with NA's but graph.edgelist did not allow NA's. Wouldn't you just 
leave those rows out with NA's in them? An NA means there is no edge, right? 
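
If the isolated individuals have to survive, dropping the NA rows alone is not quite enough, because an ego whose rows are all NA then disappears from the edge list. One base-R sketch is to fix the vertex set first and fill an adjacency matrix from the complete rows only. The toy data are the first twelve rows of the question; the object names are made up for this note.

```r
# First twelve rows of the friendship data from the question.
friends <- data.frame(
  ego   = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4),
  alter = c(2, 3, NA, NA, 3, 1, 13, 5, 14, NA, NA, NA)
)

# Vertex set: everyone who appears as ego or alter (sort() drops the NA).
ids <- sort(unique(c(friends$ego, friends$alter)))

# Keep only rows that describe an actual nomination.
edges <- friends[!is.na(friends$alter), ]

# Directed adjacency matrix; rows are egos, columns are alters.
adj <- matrix(0L, nrow = length(ids), ncol = length(ids),
              dimnames = list(as.character(ids), as.character(ids)))
adj[cbind(match(edges$ego, ids), match(edges$alter, ids))] <- 1L

# Vertices with no outgoing and no incoming ties stay in the matrix.
isolates <- ids[rowSums(adj) == 0 & colSums(adj) == 0]
isolates
```

An adjacency matrix of this kind can be handed to sna directly. For igraph, graph.adjacency(adj) (graph_from_adjacency_matrix(adj) in recent versions) should build the graph with the isolates included, though that call is not run here.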

Scott

On Wednesday, May 18, 2011 at 1:23 PM, Sebastián Daza wrote:
Hi everyone,
 I have a dataset of friendship with this format:
 
  ego alter
 4746 1 2
 9742 1 3
 14738 1 NA
 4747 2 NA
 9743 2 3
 14739 2 1
 4748 3 13
 9744 3 5
 14740 3 14
 4749 4 NA
 9745 4 NA
 14741 4 NA
 4750 5 NA
 9746 5 13
 14742 5 10
 4751 6 12
 9747 6 7
 ...
 
 
 NA means that individuals don't select any friend. Does anyone know how 
 to format this dataset to use sna or igraph packages? I don't know how 
 to convert it into a matrix or a edgelist in R without losing isolated 
 individuals .
 
 Next question, anyone knows if there is a package to perform a Moody's 
 Crowds routine to identify groups using R, or other algorithms designed 
 to search groups by maximizing modularity scores?
 
 Thank you in advance!
 
 -- 
 Sebastián Daza
 sebastian.d...@gmail.com
 

[R] Covariable Logistic Regression In R

2011-05-18 Thread Anamaria Crisan

Hello,

I would like some help figuring out how to run
Covariable Logistic Regression in R. I've been searching for a while on how
to get this done in R (I have had the luck previously of using a software
package that just does it) and I am coming up empty handed. Any experience
or insights would be greatly appreciated.

There is a package that does do exactly what I want, with the exception that
it requires very specific data input, it is called :  mbmdr. My input
consists of various clinical variables measured from patients  as well as
expression data from various genes. What I would like to do is identify
significant genes while considering their interactions with the various
clinical variables.


Thank you,

Ana


Re: [R] Convolution confusion:

2011-05-18 Thread peter dalgaard


On May 18, 2011, at 15:47 , Alex Hofmann wrote:

 Hi,
 
 I'm new to R, and I'm a bit confused with the convolve() function.
 If I do:
 x <- c(1, 2, 3)
 convolve(x, rev(x), TRUE, "open")
 = 9 12 10 4 1
 
 But I expected: 3 8 14 8 3 (like in Octave/MATLAB - conv(x, reverse(x)) )
 
 3 2 1 x 1 2 3
 =  3 2 1
    0 6 4 2
    0 0 9 6 3
 =  3 8 14 8 3
 
 The thing is, that convolve(x, x, TRUE, "open") works.
 For me it feels very confusing, that convolution does the reverse itself but 
 the help suggest to reverse it again.
 
 The help file says: Note that the usual definition of convolution of two 
 sequences x and y is given by convolve(x, rev(y), type = "o").
 
 Thanks for your help,

This confuses me every time as well. One way of putting it is that R's 
convolve is really what others call correlate: the product-sum between x 
and y shifted by k, for k=(1-n):(n-1) (adding appropriate padding):

> z <- 1:3
> crossprod(z,z)
     [,1]
[1,]   14
> crossprod(c(z,0),c(0,z))
     [,1]
[1,]    8
> crossprod(c(z,0,0),c(0,0,z))
     [,1]
[1,]    3

Notice that this always comes out symmetric if x==y. 

However, in convolution you want the sum over j of x_j * y_(k-j), so y is used in reverse order.

One way of spotting the issue is that if x represents the distribution of a 
binary random variable X, then the convolution of x with itself should be the 
distribution of the sum of two independent such variables. 

> x
[1] 0.05 0.95
> convolve(x,x,type="o")
[1] 0.0475 0.9050 0.0475
> convolve(x,rev(x),type="o")
[1] 0.0025 0.0950 0.9025

... and it is pretty obviously not the case that the sum of two highly skewed 
distributions is symmetric, so the 2nd line is right.

> dbinom(0:2,p=.95,size=2)
[1] 0.0025 0.0950 0.9025
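
The same point can be checked numerically against a direct sum of products. This is only a sketch: conv_direct() is a small helper written for this note, and x and y are arbitrary test vectors, chosen asymmetric so that a reversal mistake would show up.

```r
# Textbook convolution: out[k] = sum over i + j - 1 == k of a[i] * b[j].
conv_direct <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) {
    for (j in seq_along(b)) {
      out[i + j - 1] <- out[i + j - 1] + a[i] * b[j]
    }
  }
  out
}

x <- c(1, 2, 3)
y <- c(0, 1, 0.5)

direct  <- conv_direct(x, y)
via_fft <- convolve(x, rev(y), type = "open")   # the form recommended in ?convolve

# convolve() works through fft(), so compare with a tolerance.
all.equal(via_fft, direct)
```

Leaving out the rev() gives the product-sum at each shift (the "correlate" behaviour described above) instead.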

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


[R] ggplot geom_boxplot vertical margins

2011-05-18 Thread Justin Haynes

If you plot:

df <- data.frame(x=factor(1:100),y=rnorm(1000))
ggplot(df,aes(x=x,y=y))+geom_boxplot()

How do I remove those pesky margins on the sides of the plot area?  Or
maybe just reduce their size to something more like the spacing of the
boxes?


Thanks,

Justin


Re: [R] ggplot geom_boxplot vertical margins

2011-05-18 Thread Felipe Carrillo

Is this what you want? You can control how much space you
want to see on the sides of the plot:

df <- data.frame(x=factor(1:100),y=rnorm(1000))
ggplot(df,aes(x=x,y=y))+geom_boxplot() + scale_x_discrete(expand=c(0,0))


 
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx





Re: [R] ggplot geom_boxplot vertical margins

2011-05-18 Thread Justin Haynes

Exactly!

Thanks, I couldn't find that anywhere!


[R] June R /S+ Courses: Nationwide Back2back (1) R/S+ Fundamentals and (2) R/S-Plus Advanced Programming. in San Francisco, New York City, Washington DC

2011-05-18 Thread Sue Turner

XLSolutions has scheduled the first 2011 back2back courses
in New York City, San Francisco, Washington DC.
Taught by top R/S+ gurus!

West Coast ---back2back--- East Coast

(1) R/S-PLUS Fundamentals and Programming Techniques
http://www.xlsolutions-corp.com/coursedetail.asp?id=30

* San Francisco * June 20-21, 2011
* New York * June 9-10, 2011
* Washington, DC * June 16-17, 2011



(2) R/S+ System: Advanced Programming
http://www.xlsolutions-corp.com/coursedetail.asp?id=16

* San Francisco * June 22-23, 2011
* Washington, DC * June 14-15, 2011
* New York * June 13-14, 2011


Ask for group discount and reserve your seat Now - Earlybird Rates.
Payment due after the class! Email Sue Turner: sue@xlsolutions-corp.com

http://www.xlsolutions-corp.com/rplus.asp

Please let us know if you and your colleagues are interested in this
class to take advantage of group discount. Register now to secure your
seat!

Cheers,
Elvis Miller, PhD
Manager Training.
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
el...@xlsolutions-corp.com


Re: [R] Help with Memory Problems (cannot allocate vector of size)

2011-05-18 Thread Jannis

How about reading the posting guide and afterwards searching the list 
archive:


http://r.789695.n4.nabble.com/R-help-f789696.html

Searching for

Error: cannot allocate vector of size

will give you hundreds of results as this question is asked VERY frequently!


Jannis


 On 05/18/2011 07:55 PM, Amit Patel wrote:

While doing pls I found the following problem


BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, jackknife =
FALSE, validation = "LOO")

when not enabling jackknife the command works fine, but when trying to enable
jackknife i get the following error.

BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, jackknife =
TRUE, validation = "LOO")

Error: cannot allocate vector of size 289.1 Mb

I am dealing with a very large dataset


str(PLSdata)

'data.frame':   40 obs. of  2 variables:
  $ GroupingList: int  1 1 1 1 1 1 1 1 1 1 ...
  $ PCIList : AsIs [1:40, 1:94727] 0 0 0 0 0 0 0 0 0 0 ...
   ..- attr(*, "dimnames")=List of 2
   .. ..$ : chr  "X" "X.1" "X.12" "X.13" ...
   .. ..$ : NULL


object.size(PLSdata)/1048600

28.9113560938394 bytes

How can i get around this memory shortage
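
A back-of-the-envelope check of the sizes quoted above may help. The first figure is plain arithmetic on the 40 x 94727 numeric matrix. The second is a guess at where 289.1 Mb comes from: it assumes, without checking the pls source, that jackknifing keeps one coefficient per predictor, per component and per leave-one-out segment.

```r
n_obs  <- 40      # rows in PLSdata (also the number of LOO segments)
n_pred <- 94727   # columns of PCIList
n_comp <- 10      # ncomp in the plsr() call
bytes_per_double <- 8

# Size of the predictor matrix itself.
data_mb <- n_obs * n_pred * bytes_per_double / 2^20

# Assumed size of a predictor x component x segment coefficient array.
jack_mb <- n_pred * n_comp * n_obs * bytes_per_double / 2^20

round(c(data = data_mb, jackknife = jack_mb), 1)
```

If that reading is right, the array scales with ncomp and with the number of cross-validation segments, so reducing either would shrink the allocation proportionally.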

Re: [R] Plots: I've deleted axes, now to delete space

2011-05-18 Thread Jannis


On 05/16/2011 06:18 PM, adele_thomp...@cargill.com wrote:

Re-sizing within the dev command works well. I'm not sure why I would need the 
dev.off(). I have the plot commands run. Then I have the dev.copy2pdf command.
Thanks again for your help.



Well, you really should get into the habit of reading the documentation
of each of the commands you use. To get familiar with the concept of
graphics in R (and to answer your question regarding dev.off()) I would
recommend having a look at some basic textbook about R or one of the
many tutorials on R available on the web. (Googling "graphics R" gives
you a really helpful link with the third entry!)


The answer to your question basically is that you need to tell R that your
figure is finished (by running dev.off()). Then R can create the file.
Usually figures are created by a sequence of calls, so R itself can
never know whether it would be necessary to add some elements to the plot
later or not.


Sorry for being a bit harsh, but quite often many of the questions here
on the list can be easily answered by searching documentation, the web
or getting familiar with the basic R concepts!



HTH
Jannis






-Original Message-
From: greg.s...@imail.org [mailto:greg.s...@imail.org]
Sent: Monday, May 16, 2011 11:11 AM
To: Thompson, Adele - adele_thomp...@cargill.com; r-help@r-project.org
Subject: RE: [R] Plots: I've deleted axes, now to delete space

If your goal is to end up with a pdf file, then I would suggest creating the 
pdf file directly using the pdf function (you can specify height and width in 
the function) then run your commands to create the plot and use dev.off() to 
finish.

You often get different results when writing directly to a file vs doing one of 
the dev.copy because of some different settings.  In general the dev.copy 
approach can be a quick and easy solution for a simple graph, but plotting 
directly to the file tends to work better if you want a quality graph in the 
file.
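
As a sketch of the direct-to-file route described here, applied to the 4 x 7 panel layout from earlier in the thread: the output path and the 14 x 8 inch page size are assumptions chosen for illustration.

```r
out <- file.path(tempdir(), "testing.pdf")   # hypothetical output location

# Open the file device first; width and height are in inches.
pdf(file = out, width = 14, height = 8)

par(mfrow = c(4, 7), mar = c(2, 2, 2, 1.5), oma = c(1, 1, 4, 0))
a <- seq(1, 3, 1)
for (i in 1:28) {
  plot(a, a, ann = FALSE)
}
mtext("Plot of a vs a", side = 3, outer = TRUE)   # overall title in the outer margin

# Closing the device finishes the file; until then it may be incomplete.
dev.off()
```

Because the size is fixed when the device is opened, there is no need to resize a screen window by hand before copying.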

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Schatzi
Sent: Monday, May 16, 2011 8:41 AM
To: r-help@r-project.org
Subject: Re: [R] Plots: I've deleted axes, now to delete space

I am outputting the plot to a pdf file using the code:
dev.copy2pdf(file="testing.pdf")

The plots are too small though unless I first manually increase the size in
R and then use the dev.copy command. Is there a way to automatically
increase the window size? I tried fin and din, but those do not seem to work
or they only increase the size to a certain degree, even though I can
manually increase it to fill my screen.


Schatzi wrote:

Thanks all for the replies. I am getting better slowly but surely. I
imagine that I will get better at figuring out things as well so I don't
have to post as many questions. I do lots of searches, but still cannot
figure out how to do everything that I need.

The new code is as such:
par(mfrow=c(4,7), mar=c(2, 2, 2, 1.5), oma=c(1, 1, 4, 0))
for (i in 1:28) {
a <- seq(1,3,1)
plot(a,a, ann=FALSE, main="plot of a vs a")
}
mtext("Plot of a vs a",side=3,outer=TRUE)


-Original Message-
From: murdoch.dun...@gmail.com [mailto:murdoch.dun...@gmail.com]
Sent: Friday, May 13, 2011 03:25 PM
To: Thompson, Adele - adele_thomp...@cargill.com
Cc: greg.s...@imail.org; r-help@r-project.org
Subject: Re: [R] Plots: I've deleted axes, now to delete space

On 11-05-13 4:21 PM, adele_thomp...@cargill.com wrote:

Easy fix. Under ?par, I don't see where I can enter an overall title.
Should I add a text command or something?

mtext() writes text in the margins; argument outer puts it in the
outer margins.

Duncan Murdoch


-Original Message-
From: greg.s...@imail.org [mailto:greg.s...@imail.org]
Sent: Friday, May 13, 2011 03:17 PM
To: Thompson, Adele - adele_thomp...@cargill.com; r-help@r-project.org
Subject: RE: [R] Plots: I've deleted axes, now to delete space

Look at the help for par, specifically the section on 'mar' to set the
per plot margins smaller and the section on 'oma' to leave room for the
overall title.



Re: [R] Covariable Logistic Regression In R

2011-05-18 Thread Frank Harrell

First of all, use the correct terminology for the statistical method.

It is not productive to identify 'significant' genes unless you use quite
complex methods.  This kind of endeavor would take at least 6 statistics
courses in order to be successful.
Frank



-
Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: [R] Plots: I've deleted axes, now to delete space

2011-05-18 Thread Adele_Thompson

I've never used an R command without reading the documentation first. I think
that would be impractical. I am not an expert at deciphering the documentation
though and I post here because I did in fact read the documentation, do
extensive Google searches, ask friends/colleagues and still find no answer. This
forum is not a first resort for me. There is a very bad rep of R forums being
notoriously harsh on newcomers. I myself do not like when people do not try.
In order to avoid this harshness (I have not found this forum any more harsh
than your everyday educated R user), I do my own research and ask a question
when I get stuck. I realize that people still are upset by my ignorance and
that is their choice. I will get better and better at R and less and less
likely to run into any of these issues. I am not a programmer or statistician
by trade, but I am a learner and thus will continue to improve.

You have not offended me at all as few people have this ability.
Thank you for your help.

Adele


Re: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

2011-05-18 Thread Bill.Venables

Hi Bert,

I think people should know about the Google Style Guide for R because, as I
said, it represents a thoughtful contribution to the debate.  Most of its
advice is very good (meaning I agree with it!) but some is a bit too much (for
example, the blanket advice never to use S4 classes and methods - that's just
resisting progress, in my view).  The advice on using <- for the (normal)
assignment operator rather than = is also good advice (according to me), but
people who have to program in both C and R about equally often may find it a
bit tedious.  We can argue over that one.

I suggest it has a place in the R FAQ but with a suitable warning that this is
just one view, albeit a thoughtful one.  I don't think it need be included in
the posting guide, though.  It would take away some of the fun.  :-)

Bill Venables. 

-Original Message-
From: Bert Gunter [mailto:gunter.ber...@gene.com] 
Sent: Wednesday, 18 May 2011 11:47 PM
To: Venables, Bill (CMIS, Dutton Park)
Cc: r-help@r-project.org
Subject: R Style Guide -- Was Post-hoc tests in MASS using glm.nb

Thanks Bill. Do you and others think that a link to this guide (or
another) should be included in the Posting Guide and/or R FAQ?

-- Bert

On Tue, May 17, 2011 at 4:07 PM,  bill.venab...@csiro.au wrote:
 Amen to all of that, Bert.  Nicely put.  The google style guide (not perfect,
 but a thoughtful contribution on these kinds of issues) has avoiding attach()
 as its very first line.  See
 http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html

 I would add, though, that not enough people seem yet to be aware of 
 within(...), a companion of with(...) in a way, but used for modifying data 
 frames or other kinds of list objects.  It should be seen as a more flexible 
 replacement for transform() (well, almost).

 The difference between with() and within() is as follows:

 with(data, expr, ...)

 allows you to evaluate 'expr' with 'data' providing the primary source for 
 variables, and returns *the evaluated expression* as the result.  By contrast

 within(data, expr, ...)

 again uses 'data' as the primary source for variables when evaluating 'expr',
 but now 'expr' is used to modify the variables in 'data' and returns *the
 modified data set* as the result.
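
 A tiny side-by-side sketch of that difference, on a made-up three-row data frame (the object names are arbitrary):

```r
dat <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# with(): the value of the expression comes back; dat is untouched.
ratio <- with(dat, y / x)

# within(): a modified copy of the data frame comes back.
dat2 <- within(dat, {
  z <- x + y
})

ratio
names(dat)    # still only x and y
names(dat2)   # x, y and the new z
```

 Neither call changes dat itself; the result of within() has to be assigned to be kept.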

 I use this a lot in the data preparation phase of a project, especially, 
 which is usually the longest, trickiest, most important, but least discussed 
 aspect of any data analysis project.

 Here is a simple example using within() for something you cannot do in one 
 step with transform():

 polyData <- within(data.frame(x = runif(500)), {
  x2 <- x^2
  x3 <- x*x2
  b <- runif(4)
  eta <- cbind(1,x,x2,x3) %*% b
  y <- eta + rnorm(x, sd = 0.5)
  rm(b)
 })

 check:

 str(polyData)
 'data.frame':   500 obs. of  5 variables:
  $ x  : num  0.5185 0.185 0.5566 0.2467 0.0178 ...
  $ y  : num [1:500, 1] 1.343 0.888 0.583 0.187 0.855 ...
  $ eta: num [1:500, 1] 1.258 0.788 1.331 0.856 0.63 ...
  $ x3 : num  1.39e-01 6.33e-03 1.72e-01 1.50e-02 5.60e-06 ...
  $ x2 : num  0.268811 0.034224 0.309802 0.060844 0.000315 ...


 Bill Venables.

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of Bert Gunter
 Sent: Wednesday, 18 May 2011 12:08 AM
 To: Peter Ehlers
 Cc: R list
 Subject: Re: [R] Post-hoc tests in MASS using glm.nb

 Folks:

 Only if the user hasn't yet been introduced to the with() function,
 which is linked to on the ?attach page.

 Note also this sentence from the ?attach page:
   attach can lead to confusion.

 I can't remember the last time I needed attach().

 Peter Ehlers

 Yes. But perhaps it might be useful to flesh this out with a bit of
 commentary. To this end, I invite others to correct or clarify the
 following.

 The potential confusion comes from requiring R to search for the
 data. There is a rigorous process by which this is done, of course,
 but it requires that the runtime environment be consistent with that
 process, and the programmer who wrote the code may not have control
 over that environment. The usual example is that one has an object
 named, say, "a" in the formula and in the attached data and another
 "a" also in the global environment. Then the wrong "a" would be found.
 The same thing can happen if another data set gets attached in a
 position before the one of interest. (Like Peter, I haven't used
 attach() in so long that I don't know whether any warning messages are
 issued in such cases).

 Using the "data =" argument when available or the with() function
 when not avoids this potential confusion and tightly couples the data
 to be analyzed with the analysis.
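
 The masking case described above can be reproduced in a few lines. This is a sketch with made-up objects; the only assumption is that it is run in a session where an unrelated "a" already exists in the workspace.

```r
a   <- 100                          # an unrelated object in the workspace
dat <- data.frame(a = c(1, 2, 3))   # the data we actually mean to analyse

attach(dat)                         # dat is placed behind the workspace on the search path
from_attach <- mean(a)              # so the workspace 'a' is found first
detach(dat)

from_with <- with(dat, mean(a))     # evaluated inside dat, no search-path lookup

c(attach = from_attach, with = from_with)
```

 When run at top level, attach() does print a note that "a" is masked by .GlobalEnv, but that is easy to miss in a long script, and the computation carries on with the wrong object.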

 I hope this clarifies the previous posters' comments.

 Cheers,
 Bert


 [... non-germane material snipped ...]


[R] using hglm to fit a gamma GLMM with nested random effects?

2011-05-18 Thread Benjamin Caldwell

Apologies for continuing to ask about this but . .  in my quest to fit a
gamma GLMM model to my data (see partial copy of thread below), I'm
exploring using hglm today. The question of the day has to do with the
errors I'm currently getting from the hglm package. Can hglm handle a model
with nested random effects? I don't see an example of one of those in the
package documentation. If it can, can anyone tell me what these errors are
trying to tell me?

If no, I promise I'll let this rest and just take B. Bolker's advice to go
with a nice safe modified log transform of the data.

Best


> test.gamma <- hglm(fixed=post.f.crwn.length~lg.shigo.av+dbh+leaf.area+
bark.thick.bh+ht.any, random=~1|site/transect/plot, family=Gamma(link=log),
data=rws30.BL)
Error in `contrasts<-`(`*tmp*`, value = contr.treatment) :
  contrasts can be applied only to factors with 2 or more levels
In addition: Warning messages:
1: In Ops.factor(site, transect) : '/' not meaningful for factors
2: In Ops.factor(site/transect, plot) : '/' not meaningful for factors
> test.gamma <- hglm(fixed=post.f.crwn.length~lg.shigo.av+dbh+leaf.area+
bark.thick.bh+ht.any, random=~1|site, family=Gamma(link=log), data=rws30.BL)
Error in hglm.default(X = X, y = Y, Z = z, family = family, rand.family =
rand.family,  :
  Length of X and Z differ.
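
The two Ops.factor warnings suggest that site/transect/plot was evaluated as arithmetic division on factors rather than as nesting syntax. A minimal base-R sketch of that behaviour, using hypothetical toy factors (not the rws30.BL data), next to the explicit nesting construction with interaction():

```r
# Hypothetical toy factors, for illustration only.
site     <- factor(c("A", "A", "B", "B"))
transect <- factor(c("t1", "t2", "t1", "t2"))

# Arithmetic on factors is undefined: R warns that '/' is not meaningful
# for factors and returns a vector of NA.
divided <- suppressWarnings(site / transect)

# Explicit nesting: one level per observed site-by-transect combination.
transect_in_site <- interaction(site, transect, drop = TRUE)
n_nested <- nlevels(transect_in_site)
```

An all-NA grouping variable has no usable levels, which may be what triggers the later contrasts error. Whether hglm accepts a pre-built nested factor in its random argument is not shown here; the package documentation would need to be checked for that.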





* Dennis Murphy djmu...@gmail.com Tue, May 17, 2011 at 6:18 PM
To: Benjamin Caldwell btcaldw...@berkeley.edu
 Hi:

Someone else (Wayne Zhang, CNA) asked a similar question re
hierarchical Gamma models on R-help today and responded to suggestions
as follows:

Hglm does the work! Thanks!

Also, I find that the developing version of lme4, called lme4a, has
the capability to fit Gamma models. And both lme4a and hglm produce
results consistent with the published ones.

Problems solved!

Perhaps you might find success following his lead?

Dennis

* Ben Bolker bbol...@gmail.com Tue, May 17, 2011 at 4:50 PM
To: Benjamin Caldwell btcaldw...@berkeley.edu
Cc: r-sig-mixed-mod...@r-project.org r-sig-mixed-mod...@r-project.org
  [forwarding to r-sig-mixed-models list ...]

 As of today, Gamma models are (still) not feasible in lme4 -- they are
somewhat more numerically challenging than the other families, so Doug
Bates is having to do some re-engineering.
 There is a *possibility* that I can get Gamma fitting to work in the
'alpha'/bleeding-edge development version of glmmADMB, but it will
definitely be bleeding-edge ... if you are interested in trying that,
please contact me off-list.

  In the meantime, my standing advice is to try a LMM on the
log-transformed data (zero values in the response are problematic, but
they would be problematic in a Gamma GLMM in any case if the shape
parameter is ever  1 ...)
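
A minimal sketch of the log-transform approach suggested above, assuming simulated data with a hypothetical nested design (sites, plots within sites) and a strictly positive response. The variable names, effect sizes, and the choice of nlme::lme are illustrative assumptions, not the original analysis.

```r
library(nlme)

set.seed(42)
# Hypothetical nested design: 6 sites, 3 plots per site, 5 trees per plot.
d <- expand.grid(tree = 1:5, plot = 1:3, site = 1:6)
d$site <- factor(d$site)
d$plot <- interaction(d$site, d$plot, drop = TRUE)  # plot ids unique within site
d$x <- runif(nrow(d))

# Simulated site and plot effects on the log scale.
site_eff <- rnorm(nlevels(d$site), sd = 0.4)
plot_eff <- rnorm(nlevels(d$plot), sd = 0.3)
mu <- exp(1 + 0.5 * d$x +
          site_eff[as.integer(d$site)] +
          plot_eff[as.integer(d$plot)])
d$y <- rgamma(nrow(d), shape = 5, rate = 5 / mu)    # strictly positive response

# Linear mixed model on the log scale, with plot nested in site.
fit_log <- lme(log(y) ~ x, random = ~ 1 | site/plot, data = d)
fixef(fit_log)
```

Because log(0) is undefined, real data containing zeros would need some adjustment before this model could be fitted, which is the complication mentioned above.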

 Ben Bolker


On 11-05-17 07:32 PM, Benjamin Caldwell wrote:
 Addendum: I tried a gamma fit in glmmPQL and got the same errors.
 *Ben Caldwell*

 PhD Candidate
 University of California, Berkeley




 On Tue, May 17, 2011 at 3:51 PM, Benjamin Caldwell
 btcaldw...@berkeley.edu wrote:

 Hello
 After seeing this email
 (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/005213.html)
 I thought I would check whether the issue with the gamma family in lme4
 has been fixed: can I fit a hierarchical gamma model in lme4 at this
 time? There doesn't seem to be another package capable of it at this
 time.

 My thought process:
 1. took a look at the response variable and some subsets to see what
 it looked like, (bppfcl and transformed response var), attached
 2. took a look at a gamma and gaussian fit to the response variable.
 3. ran hierarchical gaussian model in nlme to look at residuals
 (more familiar with graphs from that package) (qqnorm and
 residuals)
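
One way a gamma fit and a Gaussian fit to a response variable could be compared, sketched here with simulated data and MASS::fitdistr. This is an assumption about how such a check might be done, not the procedure actually used in step 2; the vector `y` is hypothetical.

```r
library(MASS)

set.seed(1)
y <- rgamma(200, shape = 2, rate = 0.5)   # hypothetical skewed, positive response

# Maximum-likelihood fits of the two candidate distributions.
fit_gamma  <- suppressWarnings(fitdistr(y, "gamma"))
fit_normal <- fitdistr(y, "normal")

# A lower AIC indicates the better-supported distribution for these data.
aic <- c(gamma = AIC(fit_gamma), normal = AIC(fit_normal))
aic
```

This compares marginal fits only; it says nothing about the conditional distribution of the response once predictors and random effects are in the model.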

 Given the residual output for the gaussian model it looks like I
 could remove the values at the end of the distribution and get a
 decent fit. I'd still like to try a gamma model though, if that's
 possible. Is it possible in lme4 or another package I don't know
 about?

 ---This is the code I'm running---

 rws30.BL$site <- factor(rws30.BL$site)
 rws30.BL$transect <- interaction(rws30.BL$site, rws30.BL$transect,
 drop = TRUE)
 rws30.BL$plot <- interaction(rws30.BL$site, rws30.BL$transect,
 rws30.BL$plot, drop = TRUE)
 hist(rws30.BL$post.f.crwn.length)
 rws30.BL$gpost.f.crwn.length

 library(nlme)
 burnedmodel1.3 <- lme(post.f.crwn.length~lg.shigo.av+dbh+leaf.area+
 bark.thick.bh+ht.any+ht.alive,
 random=(~1|site/transect/plot),na.action=na.omit, data=rws30.BL)
 Error: no valid set of coefficients has been found: please supply
 starting values
 In addition: Warning message:
 In log(ifelse(y == 0, 1, y/mu)) : NaNs produced

Re: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

2011-05-18 Thread Rolf Turner



On 19/05/11 10:26, bill.venab...@csiro.au wrote:

SNIP

Most of [the Google style guide's] advice is very good (meaning I agree with 
it!) but some is a bit too much (for example, the blanket advice never to use 
S4 classes and methods - that's just resisting progress, in my view).

SNIP

I must respectfully disagree with this view, and concur heartily with 
the style guide.
S4 classes and methods are a ball-and-chain that one has to drag along.  
See also

fortune("S4 methods"). :-)

cheers,

Rolf

