from:"Mark Kimpel"

[R] auto clustering with Rgraphviz: possible?

2011-05-16 Thread Mark Kimpel

I am working with about 600 nodes in an Rgraphviz graph. Within this graph
there are, when plotted, about 8 obvious clusters that are highly connected
within them but do not share connections between them. I have a wrapper
function that handles a lot of tasks automatically for me like setting
various node and edge attributes. What I would like to do is be able to
auto-generate plots for each of these independent clusters. Is there a way
to programmatically identify these clusters and use this identification to
create either subgraphs or clusters?

# For example
library(graph)
library(Rgraphviz)
g1_gz <- gzfile(system.file("GXL/graphExample-01.gxl.gz", package = "graph"),
                open = "rb")
g11_gz <- gzfile(system.file("GXL/graphExample-11.gxl.gz", package = "graph"),
                 open = "rb")
g1 <- fromGXL(g1_gz)
g11 <- fromGXL(g11_gz)
g1_11 <- join(g1, g11)
plot(g1_11)
# This yields 2 obvious clusters plus 8 nodes with no edges. What I would like
# to be able to do is automatically identify the 2 clusters, so that they can
# be separately plotted.

Thanks,
Mark

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 399-1219 Skype No Voicemail please

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
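
The clusters described in the question above (highly connected inside, no edges between them) are the connected components of the graph. The following is a minimal base-R sketch of finding them by breadth-first search. The node names, the edge list, and the helper connected_components() are hypothetical and are not taken from the thread or from Rgraphviz. The graph package is believed to offer connComp() and subGraph() for doing the same directly on graph objects, but treat that as an assumption to check against its documentation.

```r
# Hypothetical undirected graph: two clusters (a-b-c and d-e) plus isolated node f
nodes <- c("a", "b", "c", "d", "e", "f")
edges <- data.frame(from = c("a", "b", "d"),
                    to   = c("b", "c", "e"),
                    stringsAsFactors = FALSE)

connected_components <- function(nodes, edges) {
  # Build an adjacency list; nodes without edges keep a NULL entry
  adj <- setNames(vector("list", length(nodes)), nodes)
  for (i in seq_len(nrow(edges))) {
    u <- edges$from[i]
    v <- edges$to[i]
    adj[[u]] <- c(adj[[u]], v)
    adj[[v]] <- c(adj[[v]], u)
  }
  # Label every node with the index of its component via breadth-first search
  label <- setNames(rep(NA_integer_, length(nodes)), nodes)
  current <- 0L
  for (start in nodes) {
    if (!is.na(label[[start]])) next
    current <- current + 1L
    label[[start]] <- current
    queue <- start
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      for (nb in adj[[node]]) {
        if (is.na(label[[nb]])) {
          label[[nb]] <- current
          queue <- c(queue, nb)
        }
      }
    }
  }
  split(names(label), label)
}

comps <- connected_components(nodes, edges)
# Keep only components with more than one node, i.e. drop isolated nodes
clusters <- comps[lengths(comps) > 1]
print(clusters)
```

Each element of clusters is a vector of node names, which could then be passed to whatever subgraph or plotting wrapper is already in use.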

[R] ascii or regex code for alt-enter for Excel

2010-10-20 Thread Mark Kimpel

I need to write a table that can be opened in Excel or OpenOffice such that
there are newlines embedded within cells.

After much Googling and futzing, I can't figure out how to do this. The way
to do this within Excel is alt-Enter and I've tried '/n', '/n/r', '/r/n' per
some web suggestions without luck.

Anybody know what character or ASCII code to use for this?

Thanks,

Mark


Re: [R] ascii or regex code for alt-enter for Excel

2010-10-20 Thread Mark Kimpel

Thanks to all for your helpful replies. In my initial email I did mistakenly
write /n when I had correctly been using \n in my code.

The key for me seems to be using the approach Bill suggested, i.e., writing
to a binary file. If I simply do

write.csv(d, "d.text.csv", row.names = FALSE, col.names = FALSE)

Then the newlines are not represented within the cell, but create new cells,
which is the problem I was originally having.

I do wonder what ASCII character is represented in Windows with alt-Enter.
I'm actually working in a Linux environment; it's my boss who uses Windows. I
was trying to find a text-only output that would do the trick. From my web
search I learned that using alt, especially with the number pad, allows
for the entry of all sorts of unusual ASCII characters. I found a couple of
tables of them, but none referenced alt-Enter. Someone on an MS help site
suggested using the ASCII numerics 10 or 13, which I believe are just
line feed and carriage return. Those didn't work for me. I guess I'll be content
with having a working solution and leave the mystery unsolved.

Thanks Bill!

Mark
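
For reference, the character codes discussed above can be inspected from R, and the bytes actually written by the binary-mode approach can be counted. This is a small sketch that assumes the same data frame as in Bill's example but writes to a temporary file instead of c:/temp; the names out_file, con, n_cr and n_lf are made up for illustration. It only shows what ends up in the file, not how any particular Excel version interprets it.

```r
# ASCII codes of the two characters in question
utf8ToInt("\n")  # line feed
utf8ToInt("\r")  # carriage return

d <- data.frame(nLines = c(3, 2, 1),
                entry = c("three\nline\nentry",
                          "two line\nentry",
                          "one line entry"))

# Write in binary mode so that "\n" inside a cell is not translated
out_file <- tempfile(fileext = ".csv")
con <- file(out_file, open = "wb")
write.csv(d, con, row.names = FALSE, eol = "\r\n")
close(con)

# Count carriage returns and line feeds in the raw bytes
bytes <- readBin(out_file, what = "raw", n = file.size(out_file))
n_cr <- sum(bytes == as.raw(13))  # one per row end (header + 3 rows)
n_lf <- sum(bytes == as.raw(10))  # row ends plus the 3 embedded newlines
c(CR = n_cr, LF = n_lf)
```

The difference between the two counts is the number of bare line feeds left inside quoted cells, which is what the approach relies on.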



On Wed, Oct 20, 2010 at 2:13 PM, William Dunlap wdun...@tibco.com wrote:


  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
  Sent: Wednesday, October 20, 2010 10:47 AM
  To: Duncan Murdoch; Mark Kimpel
  Cc: r-help@r-project.org
  Subject: Re: [R] ascii or regex code for alt-enter for Excel
 
  I think Excel wants a \n for newlines
  in a text cell entry but \r\n to separate
  rows of a csv file.  You may have to open
  the file in binary mode and put in the \r\n
  at line ends by hand to achieve this from R,
  as it translates all \ns to \r\ns when
  writing them to a file.

  (\n is not the same as /n in R.)

 I omitted an example:
   d <- data.frame(nLines = c(3, 2, 1),
                   entry = c("three\nline\nentry",
                             "two line\nentry",
                             "one line entry"))
   theFile <- file("c:/temp/d.csv", open = "wb") # write in binary mode
   write.csv(d, theFile, eol = "\r\n")
   close(theFile)

 Now open c:/temp/d.csv in Excel and you should see the
 multiline text entries.  (Expand the cells and/or
 formula entry area to see all the lines in an entry.)

  Bill Dunlap
  Spotfire, TIBCO Software
  wdunlap tibco.com
 
   -Original Message-
   From: r-help-boun...@r-project.org
   [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch
   Sent: Wednesday, October 20, 2010 10:29 AM
   To: Mark Kimpel
   Cc: r-help@r-project.org
   Subject: Re: [R] ascii or regex code for alt-enter for Excel
  
   You may need to ask an Excel expert or MS tech support.  What character
   is Excel looking for?

   (Or it is possible that you have what you need, but used forward slashes
   when you should have used backslashes.  The newline character is \n, not
   /n, in R.)
  
   Duncan Murdoch
  

[R] model set up question

2010-03-31 Thread Mark Kimpel

I need to compare gene expression differences between multiple line pairs of
alcohol preferring and non-preferring rat lines. I have 5 such line pairs, 3
are unrelated but two were derived independently from the same parent stock.
For each line, there are 10 samples. I'll be testing multiple genes, but for
simplicity assume just one gene whose expression is measured as
geneExpression.

Alcohol Preferring   Alcohol Non-Preferring   Line Pair   X or Non-X

Line 1a              Line 1b                  1           Non-X
Line 2a              Line 2b                  2           Non-X
Line 3a              Line 3b                  3           Non-X
Line X4a             Line X4b                 X4          X
Line X5a             Line X5b                 X5          X

If all the line pairs were independently derived, a model could be
geneExpression ~ Line.Pair + AlcoholPreference   with the factor of interest
being Alcohol Preference

but, there is the X factor, with the 2 X strain-pairs being related,
whereas the others are unrelated to each other and also to the 2 X
strain-pairs.

We want to take into account the fact that there are really only 4 parent
populations of these 5 strain-pairs so as to decrease the weighting put on
the X strains in the model.

What would the most appropriate approach to this be and how would the model
be written?

Thanks,

Mark
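
One way to make the "4 parent populations, 5 line pairs" structure explicit is to encode it as an extra grouping factor alongside Line.Pair. The sketch below only builds such a sample sheet in base R. The column name ParentStock, the stock labels S1, S2, S3 and SX, and the level names for AlcoholPreference are hypothetical. The model in the closing comment is one candidate under the assumption that a nested random-intercept model is acceptable; it is not an answer given in the thread.

```r
# Hypothetical sample sheet: 5 line pairs x 2 preference levels x 10 samples
design <- expand.grid(sample = 1:10,
                      AlcoholPreference = c("preferring", "non-preferring"),
                      Line.Pair = c("1", "2", "3", "X4", "X5"),
                      stringsAsFactors = TRUE)

# Line pairs X4 and X5 share one parent stock; the others each have their own
stock_of <- c("1" = "S1", "2" = "S2", "3" = "S3", "X4" = "SX", "X5" = "SX")
design$ParentStock <- factor(unname(stock_of[as.character(design$Line.Pair)]))

# Cross-tabulate to confirm the nesting of line pairs within parent stocks
table(design$ParentStock, design$Line.Pair)

# One candidate model (an assumption, shown as a comment only): random
# intercepts for line pairs nested within parent stock, e.g. with nlme:
#   nlme::lme(geneExpression ~ AlcoholPreference,
#             random = ~ 1 | ParentStock / Line.Pair, data = design)
```

With this encoding the two X line pairs contribute to a single ParentStock level, which is the down-weighting effect the question asks about.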


[R] vectorizing ANOVA over a vectorized linear model

2010-03-07 Thread Mark Kimpel

Is it possible to vectorize anova over the output of a vectorized lm?  I
have a gene expression matrix with each row being a gene and columns for
samples. There are several factors with interactions. I can get p values by
looping over the matrix with lm and anova, but I would like to make this as
computationally efficient as possible. I am able to vectorize the lm
command, but when I try to use anova on the resultant model object I get
just one anova result.

Is what I want to do possible? And, yes, I am quite conversant with Limma
and other BioC packages, I have my reasons for wanting to use lm and anova.

Thanks,

Mark
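
As far as I know, calling anova() on a matrix-response lm fit (class "mlm") produces a single multivariate table, which would explain the one result described above. A per-gene F test can still be computed without looping by comparing the residual sums of squares of a full and a reduced matrix-response fit. The sketch below uses simulated data; the factors A and B, the matrix expr, and all other names are hypothetical. It tests only the interaction term, and the final lines check the first gene against an ordinary anova().

```r
set.seed(1)
n_genes <- 50

# Hypothetical balanced 2 x 3 design with 2 replicates per cell (12 samples)
A <- gl(2, 6, labels = c("a1", "a2"))
B <- gl(3, 2, 12, labels = c("b1", "b2", "b3"))

# Genes in rows, samples in columns, as in the question
expr <- matrix(rnorm(n_genes * 12), nrow = n_genes)
Y <- t(expr)  # lm() expects samples in rows, one response column per gene

full    <- lm(Y ~ A * B)  # with interaction
reduced <- lm(Y ~ A + B)  # without interaction

# Residual sum of squares for every gene at once
rss <- function(fit) colSums(residuals(fit)^2)

df_full <- full$df.residual
df_num  <- reduced$df.residual - df_full
F_stat  <- ((rss(reduced) - rss(full)) / df_num) / (rss(full) / df_full)
p_int   <- pf(F_stat, df_num, df_full, lower.tail = FALSE)

# Check the first gene against a single-gene anova()
check <- anova(lm(Y[, 1] ~ A * B))["A:B", "Pr(>F)"]
all.equal(unname(p_int[1]), check)
```

Other terms can be tested the same way by choosing the reduced model accordingly; note that anova() reports sequential sums of squares, so the comparison has to match the term order.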

Re: [R] vectorizing ANOVA over a vectorized linear model

2010-03-07 Thread Mark Kimpel

Hadley,

Thanks for pointing me to some good articles. Unfortunately, I have already
read Holger's and my main concern is computational efficiency. The buzzword
on this list regarding efficient code is vectorization. I am, frankly,
surprised that there is a way to vectorize analysis of complex models but
not to extract p values from them. Dieter's reply points one towards using
lapply, which in my experience allows for compact code but not an increase
in efficiency (one of Holger's examples demonstrates this). Anyway, I cannot
see how to go from Holger's fairly simple examples to one that involves a
complex model with several factors and interactions.

Limma, which does provide p values if contrasts are used, is blindingly fast
but I believe Gordon Smyth has hard-coded most of this excellent package in
C. I was hoping to achieve something similar without the use of the
moderated t-statistics that Limma uses.

Looks like I am stuck using loops with mclapply. Thank goodness for my
Core i7!

Mark



On Sun, Mar 7, 2010 at 2:08 PM, hadley wickham h.wick...@gmail.com wrote:

 Hi Mark,

 If efficiency is a concern you might want to read "Computing Thousands
 of Test Statistics Simultaneously in R" by Holger Schwender and Tina
 Müller, http://stat-computing.org/newsletter/issues/scgn-18-1.pdf.

 If you just want to do it, see the examples in
 http://had.co.nz/plyr/plyr-intro-090510.pdf.

 Hadley

 --
 Assistant Professor / Dobelman Family Junior Chair
 Department of Statistics / Rice University
 http://had.co.nz/



[R] R CMD SHLIB requesting makefile. Is a makefile required?

2009-12-10 Thread Mark Kimpel

A few years ago I used the following to compile a shared object that I
wanted to call from R and it worked just fine.
R CMD SHLIB -o ~/my_C/R.shared.so/cocite.mat.so cocite.mat.c
Now when it is executed I receive the following error message:
make: *** No rule to make target `cocite.mat.o', needed by
`/home/mkimpel/my_C/R.shared.so/cocite.mat.so'.  Stop.

I've consulted R CMD SHLIB --help and R-exts.pdf and neither indicates
that a makefile is now required by R CMD SHLIB, yet Googling the error
message leads me to lots of threads regarding makefiles.

Has something changed? Am I doing something wrong? Following is my gcc
version and sessionInfo().
Thanks, Mark

kimpel-desktop ~/my_C: gcc --version
gcc (Ubuntu 4.4.1-4ubuntu8) 4.4.1
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

> sessionInfo()
R version 2.10.0 Patched (2009-10-27 r50222)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base


[R] problem with split eating giga-bytes of memory

2009-12-08 Thread Mark Kimpel

I'm having trouble using split on a very large data set with ~1400 levels of
the factor to be split. Unfortunately, I can't reproduce it with the simple
self-contained example below. As you can see, splitting the artificial
data frame of size ~13MB results in a split data frame of ~144MB, a roughly
10-fold increase in memory allocation for the split object. If split scales
linearly, then my actual 52MB data frame should be easily handled by my 12GB
of RAM, but it is not. Instead, when I try to split selectSubAct.df on one
of its factors with 1473 levels, my memory is slowly gobbled up (plus 3 GB
of swap) until I cancel the operation.

Any ideas on what might be happening? Thanks, Mark

myDataFrame <- data.frame(matrix(LETTERS, ncol = 7, nrow = 399000))
mySplitVar <- factor(as.character(1:1400))
myDataFrame <- cbind(myDataFrame, mySplitVar)
object.size(myDataFrame)
## 12860880 bytes # ~ 13MB
myDataFrame.split <- split(myDataFrame, myDataFrame$mySplitVar)
object.size(myDataFrame.split)
## 144524992 bytes # ~ 144MB
object.size(selectSubAct.df)
## 52,348,272 bytes # ~ 52MB

> sessionInfo()
R version 2.10.0 Patched (2009-10-27 r50222)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.0


Re: [R] problem with split eating giga-bytes of memory

2009-12-08 Thread Mark Kimpel

Charles, I suspect you are correct regarding copying of the attributes.
First off, selectSubAct.df is my real data, which turns out to be of the
same dim() as myDataFrame below, but each column is made up of strings, not
simple letters, and there are many levels in each column, which I did not
properly duplicate in my first example. I have amended that below, and with
the split the new object size is now not 10X the size of the original, but
100X. My real data is even more complex than this, so I suspect that is
where the problem lies. I need to search for a better solution to my problem
than split, for which I will start a separate thread if I can't figure
something out.

Thanks for pointing me in the right direction,

Mark

myDataFrame <- data.frame(matrix(paste("The rain in Spain",
    as.character(1:1400), sep = "."), ncol = 7, nrow = 399000))
mySplitVar <- factor(paste("Rainy days and Mondays", as.character(1:1400),
    sep = "."))
myDataFrame <- cbind(myDataFrame, mySplitVar)
object.size(myDataFrame)
## 12860880 bytes # ~ 13MB
myDataFrame.split <- split(myDataFrame, myDataFrame$mySplitVar)
object.size(myDataFrame.split)
## 1,274,929,792 bytes ~ 1.2GB
object.size(selectSubAct.df)
## 52,348,272 bytes # ~ 52MB


On Tue, Dec 8, 2009 at 10:22 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote:



 Each element of myDataFrame.split contains a copy of the attributes of the
 parent data.frame.

 And probably it does scale linearly. But the scaling factor depends on the
 size of the attributes that get copied, I guess.






 Note:

  only.attr <- lapply(myDataFrame.split, function(x) sapply(x, attributes))

 (object.size(myDataFrame.split) - object.size(myDataFrame)) / object.size(only.attr)

 1.03726179240978 bytes




  object.size(selectSubAct.df)
 ## 52,348,272 bytes # ~ 52MB


 What was this??


 Chuck




 Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
 E mailto:cbe...@tajo.ucsd.edu               UC San Diego
 http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901





Re: [R] problem with split eating giga-bytes of memory

2009-12-08 Thread Mark Kimpel

Hadley, Just as you were apparently writing I had the same thought and did
exactly what you suggested, converting all columns except the one that I
want split to character. Executed almost instantaneously without problem.
Thanks! Mark
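
The effect described above can be reproduced on a small scale. When a data frame is split, each piece of a factor column keeps the full set of levels, so a column with many distinct values carries its whole level table into every piece. The sketch below uses made-up data (the column names id and group and the sizes are hypothetical) and compares the reported size of the split with the id column as a factor versus as character. object.size() is only an estimate and does not account for strings shared between objects, so the ratio is indicative rather than exact.

```r
n_rows   <- 10000
n_groups <- 100

# Hypothetical data: one high-cardinality factor column plus the split factor
df <- data.frame(
  id    = factor(paste("The rain in Spain", seq_len(n_rows), sep = ".")),
  group = factor(paste("g", rep(seq_len(n_groups), length.out = n_rows),
                       sep = "."))
)

split_factor <- split(df, df$group)
# Every piece still carries all levels of id, not just the ones it uses
nlevels(split_factor[[1]]$id)

# Convert everything except the split column to character, then split again
df_chr <- df
df_chr$id <- as.character(df_chr$id)
split_chr <- split(df_chr, df_chr$group)

ratio <- as.numeric(object.size(split_factor)) /
  as.numeric(object.size(split_chr))
ratio
```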



On Tue, Dec 8, 2009 at 10:48 PM, hadley wickham h.wick...@gmail.com wrote:

 Hi Mark,

 Why are you using factors?  I think for this case you might find
 characters are faster and more space efficient.

 Alternatively, you can have a look at the plyr package which uses some
 tricks to keep memory usage down.

 Hadley


Re: [R] problem with split eating giga-bytes of memory

2009-12-08 Thread Mark Kimpel

Jim, could you provide a code snippet to illustrate what you mean?

Hadley, good point, I did not know that.

Mark



On Tue, Dec 8, 2009 at 11:00 PM, jim holtman jholt...@gmail.com wrote:

 Also instead of 'splitting' the data frame, I split the indices and then
 use those to access the information in the original dataframe.
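
A minimal sketch of what splitting the indices might look like, assuming a small made-up data frame (the columns group and value are hypothetical and this is not Jim's own code). Only integer row numbers are split, so no copies of the data frame or its attributes are made until a particular group is actually needed.

```r
# Hypothetical data frame with a grouping column
df <- data.frame(group = rep(c("a", "b", "c"), times = c(2, 3, 1)),
                 value = 1:6,
                 stringsAsFactors = FALSE)

# Split the row indices rather than the data frame itself
idx <- split(seq_len(nrow(df)), df$group)

# Use the indices to reach into the original data frame per group
group_sums <- sapply(idx, function(i) sum(df$value[i]))
group_sums

# A single group's rows can still be pulled out on demand
df[idx[["b"]], ]
```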


 On Tue, Dec 8, 2009 at 9:54 PM, Mark Kimpel mwkim...@gmail.com wrote:

 Hadley, Just as you were apparently writing I had the same thought and did
 exactly what you suggested, converting all columns except the one that I
 want split to character. Executed almost instantaneously without problem.
 Thanks! Mark

 Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
 Indiana University School of Medicine

 15032 Hunter Court, Westfield, IN  46074

 (317) 490-5129 Work,  Mobile  VoiceMail
 (317) 399-1219 Skype No Voicemail please


  On Tue, Dec 8, 2009 at 10:48 PM, hadley wickham h.wick...@gmail.com
 wrote:

  Hi Mark,
 
  Why are you using factors?  I think for this case you might find
  characters are faster and more space efficient.
 
  Alternatively, you can have a look at the plyr package which uses some
  tricks to keep memory usage down.
 
  Hadley
 
  On Tue, Dec 8, 2009 at 9:46 PM, Mark Kimpel mwkim...@gmail.com wrote:
   Charles, I suspect your are correct regarding copying of the
 attributes.
   First off, selectSubAct.df is my real data, which turns out to be of
  the
   same dim() as myDataFrame below, but each column is make up of
 strings,
  not
   simple letters, and there are many levels in each column, which I did
 not
   properly duplicate in my first example. I have ammended that below and
  with
   the split the new object size is now not 10X the size of the original,
  but
   100X. My real data is even more complex than this, so I suspect that
 is
   where the problem lies. I need to search for a better solution to my
  problem
   than split, for which I will start a separate thread if I can't figure
   something out.
  
   Thanks for pointing me in the right direction,
  
   Mark
  
   myDataFrame - data.frame(matrix(paste(The rain in Spain,
   as.character(1:1400), sep = .), ncol = 7, nrow = 399000))
   mySplitVar - factor(paste(Rainy days and Mondays,
  as.character(1:1400),
   sep = .))
   myDataFrame - cbind(myDataFrame, mySplitVar)
   object.size(myDataFrame)
   ## 12860880 bytes # ~ 13MB
   myDataFrame.split - split(myDataFrame, myDataFrame$mySplitVar)
   object.size(myDataFrame.split)
   ## 1,274,929,792 bytes ~ 1.2GB
   object.size(selectSubAct.df)
   ## 52,348,272 bytes # ~ 52MB
   Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
   Indiana University School of Medicine
  
   15032 Hunter Court, Westfield, IN  46074
  
   (317) 490-5129 Work,  Mobile  VoiceMail
   (317) 399-1219 Skype No Voicemail please
  
  
   On Tue, Dec 8, 2009 at 10:22 PM, Charles C. Berry 
 cbe...@tajo.ucsd.edu
  wrote:
  
   On Tue, 8 Dec 2009, Mark Kimpel wrote:
  
I'm having trouble using split on a very large data-set with ~1400
  levels
   of
   the factor to be split. Unfortunately, I can't reproduce it with the
   simple
   self-contained example below. As you can see, splitting the
 artificial
   dataframe of size ~13MB results in a split dataframe of ~ 144MB,
 with
  an
   increase memory allocation of ~10 fold for the split object. If
 split
   scales
   linearly, then my actual 52MB dataframe should be easily handled by
 my
   12GB
   of RAM, but it is not. instead, when I try to split selectSubAct.df
 on
  one
   of its factors with 1473 levels, my memory is slowly gobbled up
 (plus 3
  GB
   of swap) until I cancel the operation.
  
   Any ideas on what might be happening? Thanks, Mark
  
  
   Each element of myDataFrame.split contains a copy of the attributes
 of
  the
   parent data.frame.
  
   And probably it does scale linearly. But the scaling factor depends
 on
  the
   size of the attributes that get copied, I guess.
  
  
  
  
   myDataFrame - data.frame(matrix(LETTERS, ncol = 7, nrow = 399000))
   mySplitVar - factor(as.character(1:1400))
   myDataFrame - cbind(myDataFrame, mySplitVar)
   object.size(myDataFrame)
   ## 12860880 bytes # ~ 13MB
   myDataFrame.split - split(myDataFrame, myDataFrame$mySplitVar)
   object.size(myDataFrame.split)
   ## 144524992 bytes # ~ 144MB
  
  
   Note:
  
   only.attr <- lapply(myDataFrame.split, function(x) sapply(x, attributes))

   (object.size(myDataFrame.split) - object.size(myDataFrame)) / object.size(only.attr)

   1.03726179240978 bytes
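
   The point about copied attributes can be explored with a smaller stand-in
   for the data in this thread. The sketch below is only an illustration: the
   sizes, the variable names (pieces, idx, rows_per_group) and the idea of
   splitting row indices instead of the data frame are assumptions, not
   something proposed in the original messages.

```r
# Smaller stand-in for the data frame in the thread (sizes are illustrative).
n_levels <- 140
n_rows <- 39200
df <- data.frame(
  matrix(rep_len(LETTERS, n_rows * 7), ncol = 7),
  stringsAsFactors = TRUE
)
df$mySplitVar <- factor(rep_len(seq_len(n_levels), n_rows))

# Splitting the data frame: every piece carries its own names, row names,
# class and, for factor columns, the full set of levels.
pieces <- split(df, df$mySplitVar)

# Splitting only the row indices keeps a single copy of the data frame.
idx <- split(seq_len(nrow(df)), df$mySplitVar)

size_df <- as.numeric(object.size(df))
size_pieces <- as.numeric(object.size(pieces))
size_idx <- as.numeric(object.size(idx))

# Work on one group at a time instead of materialising all groups at once.
rows_per_group <- vapply(idx, function(i) nrow(df[i, , drop = FALSE]), integer(1))
```

   Comparing size_pieces with size_df and size_idx shows how much of the
   split object is overhead; whether this accounts for the runaway memory use
   with 1473 levels is not established in the thread.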
  
  
  
  
object.size(selectSubAct.df)
   ## 52,348,272 bytes # ~ 52MB
  
  
   What was this??
  
  
   Chuck
  
  
sessionInfo()
  
   R version 2.10.0 Patched (2009-10-27 r50222)
   x86_64-unknown-linux-gnu
  
   locale

Re: [R] package tm fails to remove the with remove stopwords

2009-11-16 Thread Mark Kimpel

Thanks Ingo.

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 399-1219 Skype No Voicemail please


On Sun, Nov 15, 2009 at 11:05 AM, Ingo Feinerer feine...@logic.at wrote:

 On Thu, Nov 12, 2009 at 11:29:50AM -0500, Mark Kimpel wrote:
  I am using code that previously worked to remove stopwords using package
 tm.

 Thanks for reporting. This is a bug in the removeWords() function in
 tm version 0.5-1 available from CRAN:

  require(tm)
  myDocument <- c("the rain in Spain", "falls mainly on the plain",
                  "jack and jill ran up the hill", "to fetch a pail of water")
  text.corp <- Corpus(VectorSource(myDocument))
  #
  text.corp <- tm_map(text.corp, stripWhitespace)
  text.corp <- tm_map(text.corp, removeNumbers)
  text.corp <- tm_map(text.corp, removePunctuation)
  ## text.corp <- tm_map(text.corp, stemDocument)
  text.corp <- tm_map(text.corp, removeWords, c("the", stopwords("english")))
  dtm <- DocumentTermMatrix(text.corp)
  dtm
  dtm.mat <- as.matrix(dtm)
  dtm.mat
 
  dtm.mat
     Terms
 Docs falls fetch hill jack jill mainly pail plain rain ran spain the water
    1     0     0    0    0    0      0    0     0    1   0     1   1     0
    2     1     0    0    0    0      1    0     1    0   0     0   0     0
    3     0     0    1    1    1      0    0     0    0   1     0   0     0
    4     0     1    0    0    0      0    1     0    0   0     0   0     1

 The function removeWords() fails to remove patterns at the beginning or at
 the end of a line.

 This bug is fixed in the latest development version on R-Forge, and
 the fix will be included in the next CRAN release.

 Please see

 https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/inst/NEWS?root=tm&view=markup
 for a list of all bug fixes and changes between each tm version.
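
 To illustrate the kind of matching involved, here is a minimal base-R sketch
 that removes whole words with word-boundary anchors, so that matches at the
 start or end of a line are handled as well. It is not the tm implementation;
 remove_words is a hypothetical helper, and it assumes the words contain no
 regular-expression metacharacters.

```r
# Hypothetical helper, not part of tm: remove whole words using \b anchors.
remove_words <- function(x, words) {
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  gsub(pattern, "", x, ignore.case = TRUE, perl = TRUE)
}

docs <- c("the rain in Spain", "falls mainly on the plain")
cleaned <- remove_words(docs, c("the", "in", "on"))

# Collapse the whitespace left behind by the removed words.
cleaned <- gsub("\\s+", " ", trimws(cleaned))
```

 Because of the \b anchors, "in" is removed as a word but left alone inside
 "rain", "Spain", "mainly" and "plain".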

 Best regards, Ingo Feinerer

 --
 Ingo Feinerer
 Vienna University of Technology
 http://www.dbai.tuwien.ac.at/staff/feinerer



Re: [R] package tm fails to remove the with remove stopwords

2009-11-13 Thread Mark Kimpel

Sam,

Thanks for the example. Removing stop words after the DocumentTermMatrix has
been created works fine if one is working with single words, but what if one
is creating a dtm of possible combinations of words? Wouldn't one want to
remove them from the corpus?
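
The difference can be seen in a small base-R sketch. It does not use tm;
make_bigrams and the short stop-word list are assumptions made up for this
illustration.

```r
# Hypothetical helper: build adjacent word pairs from a token vector.
make_bigrams <- function(tokens) {
  if (length(tokens) < 2) return(character(0))
  paste(head(tokens, -1), tail(tokens, -1))
}

tokens <- strsplit("jack and jill ran up the hill", " ", fixed = TRUE)[[1]]
stops <- c("and", "up", "the")

# Stop words removed from the text first, then bigrams formed.
bigrams_before <- make_bigrams(tokens[!tokens %in% stops])

# Bigrams formed first; filtering the terms afterwards removes nothing,
# because no bigram is itself equal to a stop word.
bigrams_after <- make_bigrams(tokens)
bigrams_after <- bigrams_after[!bigrams_after %in% stops]
```

Here bigrams_before contains only content-word pairs, while bigrams_after
still contains pairs such as "up the".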

Mark

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 399-1219 Skype No Voicemail please


On Thu, Nov 12, 2009 at 12:04 PM, Sam Thomas sam.tho...@revelanttech.com wrote:

  I'm not sure what's wrong with your approach, but this seems to strip
 the



 require(tm)

 params <- list(minDocFreq = 1,
                removeNumbers = TRUE,
                stemming = TRUE,
                stopwords = TRUE,
                weighting = weightTf)

 myDocument <- c("the rain in Spain", "falls mainly on the plain",
                 "jack and jill ran up the hill", "to fetch a pail of water")
 text.corp <- Corpus(VectorSource(myDocument))
 dtm <- DocumentTermMatrix(text.corp, control = params)
 dtm
 dtm.mat <- as.matrix(dtm)
 dtm.mat





 *From:* Mark Kimpel [mailto:mwkim...@gmail.com]
 *Sent:* Thursday, November 12, 2009 11:30 AM
 *To:* r-help@r-project.org; feine...@logic.at; Sam Thomas
 *Subject:* package tm fails to remove the with remove stopwords



 I am using code that previously worked to remove stopwords using package
 tm. Even manually adding "the" to the list does not work to remove "the".
 This package has undergone extensive redevelopment with changes to the
 function syntax, so perhaps I am just missing something.



 Please see my simple example, output, and sessionInfo() below.



 Thanks!

 Mark



 require(tm)
 myDocument <- c("the rain in Spain", "falls mainly on the plain",
                 "jack and jill ran up the hill", "to fetch a pail of water")
 text.corp <- Corpus(VectorSource(myDocument))
 #
 text.corp <- tm_map(text.corp, stripWhitespace)
 text.corp <- tm_map(text.corp, removeNumbers)
 text.corp <- tm_map(text.corp, removePunctuation)
 ## text.corp <- tm_map(text.corp, stemDocument)
 text.corp <- tm_map(text.corp, removeWords, c("the", stopwords("english")))
 dtm <- DocumentTermMatrix(text.corp)
 dtm
 dtm.mat <- as.matrix(dtm)
 dtm.mat



 dtm.mat
     Terms
 Docs falls fetch hill jack jill mainly pail plain rain ran spain the water
    1     0     0    0    0    0      0    0     0    1   0     1   1     0
    2     1     0    0    0    0      1    0     1    0   0     0   0     0
    3     0     0    1    1    1      0    0     0    0   1     0   0     0
    4     0     1    0    0    0      0    1     0    0   0     0   0     1



 R version 2.10.0 Patched (2009-10-27 r50222)

 x86_64-unknown-linux-gnu



 locale:

  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C

  [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8

  [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8

  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C

  [9] LC_ADDRESS=C   LC_TELEPHONE=C

 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C



 attached base packages:

 [1] stats graphics  grDevices datasets  utils methods   base



 other attached packages:

 [1] chron_2.3-33 RWeka_0.3-23 tm_0.5-1



 loaded via a namespace (and not attached):

 [1] grid_2.10.0  rJava_0.8-1  slam_0.1-6   tools_2.10.0





 Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
 Indiana University School of Medicine

 15032 Hunter Court, Westfield, IN  46074

 (317) 490-5129 Work,  Mobile  VoiceMail
 (317) 399-1219 Skype No Voicemail please



[R] package tm fails to remove the with remove stopwords

2009-11-12 Thread Mark Kimpel

I am using code that previously worked to remove stopwords using package
tm. Even manually adding "the" to the list does not work to remove "the".
This package has undergone extensive redevelopment with changes to the
function syntax, so perhaps I am just missing something.

Please see my simple example, output, and sessionInfo() below.

Thanks!
Mark

require(tm)
myDocument <- c("the rain in Spain", "falls mainly on the plain",
                "jack and jill ran up the hill", "to fetch a pail of water")
text.corp <- Corpus(VectorSource(myDocument))
#
text.corp <- tm_map(text.corp, stripWhitespace)
text.corp <- tm_map(text.corp, removeNumbers)
text.corp <- tm_map(text.corp, removePunctuation)
## text.corp <- tm_map(text.corp, stemDocument)
text.corp <- tm_map(text.corp, removeWords, c("the", stopwords("english")))
dtm <- DocumentTermMatrix(text.corp)
dtm
dtm.mat <- as.matrix(dtm)
dtm.mat

> dtm.mat
    Terms
Docs falls fetch hill jack jill mainly pail plain rain ran spain the water
   1     0     0    0    0    0      0    0     0    1   0     1   1     0
   2     1     0    0    0    0      1    0     1    0   0     0   0     0
   3     0     0    1    1    1      0    0     0    0   1     0   0     0
   4     0     1    0    0    0      0    1     0    0   0     0   0     1

R version 2.10.0 Patched (2009-10-27 r50222)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] chron_2.3-33 RWeka_0.3-23 tm_0.5-1

loaded via a namespace (and not attached):
[1] grid_2.10.0  rJava_0.8-1  slam_0.1-6   tools_2.10.0


Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 399-1219 Skype No Voicemail please


[R] how to model a numeric factor as a non-ordinal factor

2009-10-28 Thread Mark Kimpel

I am analyzing an experiment in which time is a factor, represented by
numbers indicating time since last treatment, but in this particular case
there is no reason to think that time has a numeric meaning in the sense
that 24 would be greater than 6. We have no idea which genes will be
increasing or decreasing at different times.

So I have the following model (to be applied over many genes):

mod <- lm(gene.expression ~ Treatment + Time)

If I wanted to just hack my way around this I could just paste a character
to all of the times, but I'm curious as to the right way to do this.

What would be the correct syntax or transformation? Would
Time <- factor(as.character(Time)) do it? I want to make sure that it does
not get coerced back to numeric.
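
As a minimal sketch of the idea in the question, the example below converts a
numeric time variable to a factor and fits the model on simulated data. The
data frame d, the column TimeF and the simulated expression values are
assumptions for illustration only; they are not from the experiment described.

```r
set.seed(1)

# Simulated stand-in data: two treatments, three time points (hours).
d <- data.frame(
  Treatment = rep(c("control", "drug"), each = 6),
  Time = rep(c(6, 12, 24), times = 4)
)
d$gene.expression <- rnorm(nrow(d))

# factor() on a numeric vector gives one unordered level per distinct value.
d$TimeF <- factor(d$Time)

# Each non-reference time point now gets its own coefficient,
# with no assumption that 24 is "greater than" 6.
mod <- lm(gene.expression ~ Treatment + TimeF, data = d)
coef_names <- names(coef(mod))
```

Once a column is stored as a factor, lm() treats it as categorical; it is
only turned back into numbers if it is converted explicitly, for example with
as.numeric(as.character(TimeF)).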

Thanks,

Mark

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 399-1219 Skype No Voicemail please


[R] help with the use of mtext to create main title over multiple plots

2009-10-12 Thread Mark Kimpel

I'm trying to use mtext to create a main title over multiple plots. Below is
a simple self-contained example and my sessionInfo (I should note I've also
tried this with R-2.8.1 with the same results). When I execute the code
chunk below, I get the plots, but no title. I've tried this using the screen
driver, pdf, and postscript. I've used different sizes of paper. I suspect I
am making an elementary error but searching the help files and help archives
hasn't provided me an answer.

Thanks for any help, Mark

#
setwd("~/Desktop")
pdf("my.test.plots.pdf", paper = "letter")
par(mfrow=c(2,2))
for (i in 1:4){
  plot(1:6, 1:6)
}
mtext(text = "my test plots", side = 3, outer = TRUE)
dev.off()
#

R version 2.10.0 Under development (unstable) (2009-09-21 r49771)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] car_1.2-15

loaded via a namespace (and not attached):
[1] tools_2.10.0

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 399-1219 Skype No Voicemail please


Re: [R] help with the use of mtext to create main title over multiple plots

2009-10-12 Thread Mark Kimpel

Thanks Tony (and others). Setting oma corrects the problem. Mark
Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 399-1219 Skype No Voicemail please


On Mon, Oct 12, 2009 at 1:41 PM, Tony Plate tpl...@acm.org wrote:

 Try playing around with the oma setting in par() -- it sets the outer
 margins, which by default are zero.

 The following shows the mtext label for me, using the windows device:

 par(mfrow=c(2,2))
 par("oma")

 [1] 0 0 0 0

 par(oma=c(0,0,2,0))
 for (i in 1:4) plot(0:1,0:1)
 mtext(text = "my test plots", side = 3, outer = TRUE)
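
 Applied to the PDF example from the question, the same fix might look like
 the sketch below. Writing to a temporary file (out_file) instead of
 ~/Desktop is an assumption made so the example runs anywhere.

```r
# Write to a temporary file so the sketch does not depend on ~/Desktop.
out_file <- tempfile(fileext = ".pdf")

pdf(out_file, paper = "letter")
# Reserve two lines of outer margin at the top for the common title.
par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))
for (i in 1:4) {
  plot(1:6, 1:6)
}
# outer = TRUE places the text in the outer margin, above all four panels.
mtext(text = "my test plots", side = 3, outer = TRUE)
dev.off()
```

 The oma setting has to be made after the device is opened, because par()
 settings apply to the current device.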


 Mark Kimpel wrote:

 I'm trying to use mtext to create a main title over multiple plots. Below
 is a simple self-contained example and my sessionInfo (I should note I've
 also tried this with R-2.8.1 with the same results). When I execute the
 code chunk below, I get the plots, but no title. I've tried this using the
 screen driver, pdf, and postscript. I've used different sizes of paper. I
 suspect I am making an elementary error but searching the help files and
 help archives hasn't provided me an answer.

 Thanks for any help, Mark

 #
 setwd("~/Desktop")
 pdf("my.test.plots.pdf", paper = "letter")
 par(mfrow=c(2,2))
 for (i in 1:4){
   plot(1:6, 1:6)
 }
 mtext(text = "my test plots", side = 3, outer = TRUE)
 dev.off()
 #

 R version 2.10.0 Under development (unstable) (2009-09-21 r49771)
 x86_64-unknown-linux-gnu

 locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
  [9] LC_ADDRESS=C   LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] car_1.2-15

 loaded via a namespace (and not attached):
 [1] tools_2.10.0

 Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
 Indiana University School of Medicine

 15032 Hunter Court, Westfield, IN  46074

 (317) 490-5129 Work,  Mobile  VoiceMail
 (317) 399-1219 Skype No Voicemail please


[R] problems with strsplit using a split of ' \\\ ' : a regex problem

2009-08-27 Thread Mark Kimpel

I have a vector of gene symbols, some of which have multiple aliases. In the
case of an alias, they are separated by ' \\\ '.
Here is a real world example, which would represent one element of my
vector:
Eif4g2 /// Eif4g2-ps1 /// LOC678831

What I would like to do is input the vector into a function and output a
vector with just the first alias of each element (or, if there are no
aliases, just the one symbol).

So I wrote a simple little function to do this:
get.first.id.func <- function(vec, splitter){
  vec.lst <- strsplit(vec, splitter)
  first.func <- function(vec1){vec1[1]}
  vec.out <- sapply(vec.lst, first.func)
  vec.out
}

For a trivial example, this works:
> a <- c("a_b", "c_d")
> get.first.id.func(a, "_")
[1] "a" "c"

I am running into problems, however, with the real world split of ' \\\ '
I'm not even able to construct a sample vector of my own! Here is what I
get:
> a <- c('a \\\ b', 'a \\\ b')
> a
[1] "a \\ b" "a \\ b"
> a <- c('a \\\\\\ b', 'a \\\\\\ b')
> a
[1] "a \\\\\\ b" "a \\\\\\ b"

I KNOW this is related to R's peculiarities with \ escapes, but I don't have
the expertise to know how to get around it.

I would be very interested to learn:
1. how to construct a vector such that a == c('a \\\ b', 'a \\\ b')
2. how to properly input my split into my function so that I get the split
desired.

Thanks regex experts!
Mark


Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail

The real problem is not whether machines think but whether men do. -- B.
F. Skinner
**


Re: [R] problems with strsplit using a split of ' \\\ ' : a regex problem

2009-08-27 Thread Mark Kimpel

Thanks Henrique. I had actually tried using 6 back-slashes but didn't know
to use 'cat' to see the non-escaped representation (see below to see my
original confusion). Your strsplit, of course, works great. Thanks again!

> a
[1] "a \\\\\\ b" "a \\\\\\ b"
> cat(a)
a \\\ b a \\\ b

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail

The real problem is not whether machines think but whether men do. -- B.
F. Skinner
**


On Thu, Aug 27, 2009 at 9:15 PM, Henrique Dallazuanna www...@gmail.com wrote:

 You need an escape before each backslash:

 a <- c('a \\\\\\ b', 'a \\\\\\ b')
 cat(a, "\n")

 You can write in this form:

 strsplit(a, " .*\\\\.* ")
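
 An alternative that sidesteps regular-expression escaping is to split on a
 fixed string. The sketch below is an illustration, not Henrique's code:
 get_first_id is a hypothetical rewrite of the helper from the question, and
 the second element of each test vector is invented. It covers both the
 " /// " separator from the real-world example and a separator made of three
 backslashes.

```r
# Hypothetical rewrite of the helper: split on a literal string, keep piece 1.
get_first_id <- function(vec, splitter) {
  pieces <- strsplit(vec, splitter, fixed = TRUE)
  vapply(pieces, function(p) p[1], character(1))
}

# Separator as it appears in the real-world example (forward slashes).
symbols <- c("Eif4g2 /// Eif4g2-ps1 /// LOC678831", "Actb")
first_slash <- get_first_id(symbols, " /// ")

# Separator made of three backslashes: each one is typed as \\ in R source.
bs <- c("a \\\\\\ b", "c")
n_chars <- nchar(bs[1])  # a, space, three backslashes, space, b
first_bs <- get_first_id(bs, " \\\\\\ ")
```

 With fixed = TRUE only R's string escaping applies, so three literal
 backslashes need six in the source rather than twelve.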



 On Thu, Aug 27, 2009 at 10:03 PM, Mark Kimpel mwkim...@gmail.com wrote:

 I have a vector of gene symbols, some of which have multiple aliases. In
 the
 case of an alias, they are separated by ' \\\ '.
 Here is a real world example, which would represent one element of my
 vector:
 Eif4g2 /// Eif4g2-ps1 /// LOC678831

 What I would like to do is input the vector into a function and output a
 vector with just the first alias of each element (or, if there are no
 aliases, just the one symbol).

 So I wrote a simple little function to do this:
 get.first.id.func <- function(vec, splitter){
   vec.lst <- strsplit(vec, splitter)
   first.func <- function(vec1){vec1[1]}
   vec.out <- sapply(vec.lst, first.func)
   vec.out
 }

 For a trivial example, this works:
 > a <- c("a_b", "c_d")
 > get.first.id.func(a, "_")
 [1] "a" "c"

 I am running into problems, however, with the real world split of ' \\\ '
 I'm not even able to construct a sample vector of my own! Here is what I
 get:
 > a <- c('a \\\ b', 'a \\\ b')
 > a
 [1] "a \\ b" "a \\ b"
 > a <- c('a \\\\\\ b', 'a \\\\\\ b')
 > a
 [1] "a \\\\\\ b" "a \\\\\\ b"

 I KNOW this is related to R's peculiarities with \ escapes, but I don't
 have
 the expertise to know how to get around it.

 I would be very interested to learn:
 1. how to construct a vector such that a == c('a \\\ b', 'a \\\ b')
 2. how to properly input my split into my function so that I get the split
 desired.

 Thanks regex experts!
 Mark

 
 Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
 Indiana University School of Medicine

 15032 Hunter Court, Westfield, IN  46074

 (317) 490-5129 Work,  Mobile  VoiceMail

 The real problem is not whether machines think but whether men do. -- B.
 F. Skinner
 **





 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O



[R] help with regular expressions in R

2009-08-20 Thread Mark Kimpel

I'm having trouble achieving the results I want using a regular expression.
I want to eliminate all characters that fall within square brackets as well
as the brackets themselves, returning an "". I'm not sure if it's R's use of
double slash escapes or something else that is tripping me up. If I only use
one slash I get
1: '\[' is an unrecognized escape in a character string
2: '\]' is an unrecognized escape in a character string
3: unrecognized escapes removed from \[*.\]

Below is my self-contained code followed by sessionInfo().

Thanks in advance for your help. I'm going to be doing a lot of text mining
in the near future. I have an excellent O'Reilly book on regex's. What is
the best reference for R's special treatment of these animals?
Mark


myCharVec <- c("[the rain in spain]", "(the rain in spain)")
gsub('\\[*.\\]', '', myCharVec)

#what I get
# [1] "[the rain in spai"   "(the rain in spain)"

#what I want
# [1] ""   "(the rain in spain)"

 sessionInfo()
R version 2.10.0 Under development (unstable) (2009-08-12 r49193)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] RWeka_0.3-20 tm_0.4

loaded via a namespace (and not attached):
[1] grid_2.10.0 rJava_0.6-3 slam_0.1-3



Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail

The real problem is not whether machines think but whether men do. -- B.
F. Skinner
**


Re: [R] help with regular expressions in R

2009-08-20 Thread Mark Kimpel

Well, I guess I'm not quite there yet. What I gave earlier was a simplified
example, and did not accurately reflect the complexity of the task.

This is my real world example. As you can see, what I need to do is delete
an arbitrary number of characters, including brackets and parens enclosing
them, multiple times within the same string. Help?

myCharVec <- "medicare [link  220.30.05]  ssa (1-800-772-1213). 2008 [link 145.30.05] amounts  (2d) gross income (magi) here. (2e)"
myCharVec
myCharVec <- gsub('\\[.*\\]', '', myCharVec)
myCharVec
myCharVec <- gsub('\\(.*\\)', '', myCharVec)
myCharVec

#what I want
# medicare  ssa . 2008  amounts gross income here.

> myCharVec <- "medicare [link  220.30.05]  ssa (1-800-772-1213). 2008 [link 145.30.05] amounts  (2d) gross income (magi) here. (2e)"
> myCharVec
[1] "medicare [link  220.30.05]  ssa (1-800-772-1213). 2008 [link 145.30.05] amounts  (2d) gross income (magi) here. (2e)"
> myCharVec <- gsub('\\[.*\\]', '', myCharVec)
> myCharVec
[1] "medicare  amounts  (2d) gross income (magi) here. (2e)"
> myCharVec <- gsub('\\(.*\\)', '', myCharVec)
> myCharVec
[1] "medicare  amounts  "

 #what I want
 # medicare  ssa . 2008  amounts gross income here.

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail

The real problem is not whether machines think but whether men do. -- B.
F. Skinner
**


On Thu, Aug 20, 2009 at 11:39 AM, William Dunlap wdun...@tibco.com wrote:


  -Original Message-
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of Mark Kimpel
  Sent: Thursday, August 20, 2009 8:31 AM
  To: r-help@r-project.org
  Subject: [R] help with regular expressions in R
  ...
  myCharVec <- c("[the rain in spain]", "(the rain in spain)")
  gsub('\\[*.\\]', '', myCharVec)

 Change the '*.' to '.*'.

 Your expression matches 0 or more left square brackets,
 followed by 1 character, followed by a right squared bracket.

 '\\[.*\\]' matches a left square bracket, followed by 0 or more
 characters, followed by a right square bracket.
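
 The two patterns can be compared side by side. This is only a small
 illustration of the explanation above; the variable names x, wrong and right
 are made up for the sketch.

```r
x <- c("[the rain in spain]", "(the rain in spain)")

# '*' applies to the bracket: zero or more '[', then one character, then ']'.
# Only the final "n]" of the first string matches.
wrong <- gsub("\\[*.\\]", "", x)

# '*' applies to '.': a '[', then any run of characters, then ']'.
right <- gsub("\\[.*\\]", "", x)
```

 With these inputs, wrong still contains "[the rain in spai" while right
 reduces the first element to an empty string and leaves the second untouched.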

 Bill Dunlap
 TIBCO Software Inc - Spotfire Division
 wdunlap tibco.com

 
  #what I get
  # [1] "[the rain in spai"   "(the rain in spain)"

  #what I want
  # [1] ""   "(the rain in spain)"
 
   sessionInfo()
  R version 2.10.0 Under development (unstable) (2009-08-12 r49193)
  x86_64-unknown-linux-gnu
 
  locale:
   [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
   [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
   [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
   [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
   [9] LC_ADDRESS=C   LC_TELEPHONE=C
  [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
 
  attached base packages:
  [1] stats graphics  grDevices datasets  utils methods   base
 
  other attached packages:
  [1] RWeka_0.3-20 tm_0.4
 
  loaded via a namespace (and not attached):
  [1] grid_2.10.0 rJava_0.6-3 slam_0.1-3
 
 
  
  Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
  Indiana University School of Medicine
 
  15032 Hunter Court, Westfield, IN  46074
 
  (317) 490-5129 Work,  Mobile  VoiceMail
 
  The real problem is not whether machines think but whether
  men do. -- B.
  F. Skinner
  **
 

Re: [R] help with regular expressions in R

2009-08-20 Thread Mark Kimpel

Thanks guys. I've pulled my O'Reilly book and will begin reviewing it.

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail

The real problem is not whether machines think but whether men do. -- B.
F. Skinner
**


On Thu, Aug 20, 2009 at 12:37 PM, Phil Spector spec...@stat.berkeley.edu wrote:

 Mark -
   It looks like you're running into the greediness of regular expressions.
 When R sees .* it tries to find the longest match,  which also grabs
 some of the stuff you want.  You can either replace .* with something
 like [^\\])]* (i.e. zero or more of any character *except* ] or ) ),
 or use perl=TRUE, which allows the question mark (?) to mean the shortest
 match instead of the longest.  Here's what I'd use:

  gsub('[\\[(].*?[\\])]','',myCharVec,perl=TRUE)

 In English:  substitute the shortest string starting with [ or ( and
 ending with ] or ) with nothing.
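
 A short sketch comparing the approaches on a string like the one in the
 question (spacing simplified). The perl = TRUE call is the one given above;
 the negated-class pattern and the whitespace tidy-up are assumptions added
 for illustration.

```r
x <- "medicare [link 220.30.05] ssa (1-800-772-1213). 2008 [link 145.30.05] amounts (2d) gross income (magi) here. (2e)"

# Greedy: .* runs from the first '[' to the last ']'.
greedy <- gsub("\\[.*\\]", "", x)

# Lazy (needs perl = TRUE): .*? stops at the nearest closing ']' or ')'.
lazy <- gsub("[\\[(].*?[\\])]", "", x, perl = TRUE)

# Negated character classes: no perl = TRUE needed, one alternative per
# bracket type, each stopping at its own closing character.
negated <- gsub("\\[[^]]*\\]|\\([^)]*\\)", "", x)

# Optional tidy-up of the spaces left behind.
tidy <- trimws(gsub(" +", " ", lazy))
```

 The lazy and negated-class versions agree on this input. They would differ on
 text with mismatched pairs such as "[a)", which the lazy pattern removes and
 the negated-class pattern leaves alone.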

   Hope this helps.
 - Phil




 On Thu, 20 Aug 2009, Mark Kimpel wrote:

  Well, I guess I'm not quite there yet. What I gave earlier was a
 simplified
 example, and did not accurately reflect the complexity of the task.

 This is my real world example. As you can see, what I need to do is delete
 an arbitrary number of characters, including brackets and parens enclosing
 them, multiple times within the same string. Help?

 myCharVec <- "medicare [link  220.30.05]  ssa (1-800-772-1213). 2008 [link 145.30.05] amounts  (2d) gross income (magi) here. (2e)"
 myCharVec
 myCharVec <- gsub('\\[.*\\]', '', myCharVec)
 myCharVec
 myCharVec <- gsub('\\(.*\\)', '', myCharVec)
 myCharVec

 #what I want
 # medicare  ssa . 2008  amounts gross income here.

 > myCharVec <- "medicare [link  220.30.05]  ssa (1-800-772-1213). 2008 [link 145.30.05] amounts  (2d) gross income (magi) here. (2e)"
 > myCharVec
 [1] "medicare [link  220.30.05]  ssa (1-800-772-1213). 2008 [link 145.30.05] amounts  (2d) gross income (magi) here. (2e)"
 > myCharVec <- gsub('\\[.*\\]', '', myCharVec)
 > myCharVec
 [1] "medicare  amounts  (2d) gross income (magi) here. (2e)"
 > myCharVec <- gsub('\\(.*\\)', '', myCharVec)
 > myCharVec
 [1] "medicare  amounts  "


 #what I want
 # medicare  ssa . 2008  amounts gross income here.

 
 Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
 Indiana University School of Medicine

 15032 Hunter Court, Westfield, IN  46074

 (317) 490-5129 Work,  Mobile  VoiceMail

 The real problem is not whether machines think but whether men do. -- B.
 F. Skinner
 **


 On Thu, Aug 20, 2009 at 11:39 AM, William Dunlap wdun...@tibco.com
 wrote:


  -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On Behalf Of Mark Kimpel
 Sent: Thursday, August 20, 2009 8:31 AM
 To: r-help@r-project.org
 Subject: [R] help with regular expressions in R
 ...
 myCharVec <- c("[the rain in spain]", "(the rain in spain)")
 gsub('\\[*.\\]', '', myCharVec)


 Change the '*.' to '.*'.

 Your expression matches 0 or more left square brackets,
 followed by 1 character, followed by a right squared bracket.

 '\\[.*\\]' matches a left square bracket, followed by 0 or more
 characters, followed by a right square bracket.

 Bill Dunlap
 TIBCO Software Inc - Spotfire Division
 wdunlap tibco.com


 #what I get
 # [1] "[the rain in spai"   "(the rain in spain)"

 #what I want
 # [1] ""   "(the rain in spain)"

  sessionInfo()

 R version 2.10.0 Under development (unstable) (2009-08-12 r49193)
 x86_64-unknown-linux-gnu

 locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
  [9] LC_ADDRESS=C   LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats graphics  grDevices datasets  utils methods   base

 other attached packages:
 [1] RWeka_0.3-20 tm_0.4

 loaded via a namespace (and not attached):
 [1] grid_2.10.0 rJava_0.6-3 slam_0.1-3


 
 Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
 Indiana University School of Medicine

 15032 Hunter Court, Westfield, IN  46074

 (317) 490-5129 Work,  Mobile  VoiceMail

 The real problem is not whether machines think but whether
 men do. -- B.
 F. Skinner
 **


[R] reading in MS Word files

2009-08-18 Thread Mark Kimpel

I am familiar with packages that read and write Excel files on both Windows
and Linux platforms.

Do any packages provide similar functionality for MS Word files? I have a
lot of text processing to do and the text is embedded in ~200 different Word
files (.doc format Office 2003). All I need to do is read, not write.

Thanks,
Mark

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail

The real problem is not whether machines think but whether men do. -- B.
F. Skinner
**


Re: [R] reading in MS Word files

2009-08-18 Thread Mark Kimpel

Thanks guys. As I wanted to do a little preprocessing before importing into
tm (the files have all sorts of stuff in them that I don't need), I used a
system() call to invoke AbiWord and do the batch conversions. Mark

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail

The real problem is not whether machines think but whether men do. -- B.
F. Skinner
**


On Tue, Aug 18, 2009 at 10:56 AM, Ingo Feinerer <feine...@logic.at> wrote:

 On Tue, Aug 18, 2009 at 12:00:07PM +0200, Mark Kimpel wrote:
  I am familiar with packages that read and write Excel files on both
  Windows and Linux platforms.

  Do any packages provide similar functionality for MS Word files? I have a
  lot of text processing to do and the text is embedded in ~200 different
  Word files (.doc format Office 2003). All I need to do is read, not write.

 See readDOC in package tm. E.g., something like

 Corpus(DirSource(aDirectoryContainingTheWordFiles), readerControl =
 list(reader = readDOC))

 Note that you need antiword (http://www.winfield.demon.nl/) in your
 path such that readDOC can use it.

 Best regards, Ingo

 --
 Ingo Feinerer
 Vienna University of Technology
 http://www.dbai.tuwien.ac.at/staff/feinerer



Re: [R] using package tm to find phrases

2009-08-14 Thread Mark Kimpel

Thanks, the pointer to the tokenizer helped.



On Thu, Aug 13, 2009 at 6:11 PM, Ingo Feinerer <feine...@logic.at> wrote:

 On Thu, Aug 13, 2009 at 03:36:22PM -0400, Mark Kimpel wrote:
  I am using the package tm for text-mining of abstracts and would like
  to use it to find instances of gene names that may contain white space.
  For instance "gene regulatory protein 1". The default behavior of tm is
  to parse this into 4 separate words, but I would like to use the class
  constructor dictionary to define phrases such as just mentioned.

  Is this possible? If so, how?

 Yes.

 * In case you only need to find instances, you could use full text
  search on your corpus, e.g.

  R> tmIndex(yourCorpus, "gene regulatory protein 1")

  would return the indices of all documents in your corpus containing
  this phrase.

 * If you need tokens (in a term-document matrix) of length 4, you could
  use an n-gram tokenizer (n = 4). See e.g.,
  http://tm.r-forge.r-project.org/faq.html#Bigrams. Then you can use
  the dictionary argument to store only your selection of gene
  names. I.e., something like

  R> yourTokenizer <- function(x) RWeka::NGramTokenizer(x, Weka_control(min
 = 4, max = 4))
  R> TermDocumentMatrix(crude, control = list(tokenize = yourTokenizer,
 dictionary = yourDictionary))

  where yourDictionary contains the gene names (a character vector
  suffices) to be included in the term-document matrix.

 * If you want to extract arbitrary patterns of different length that
  could match some gene names (and build a dictionary from that), you
  need some custom functionality. Regular expressions might be a good
  starting point ...

 Best regards, Ingo

 --
 Ingo Feinerer
 Vienna University of Technology
 http://www.dbai.tuwien.ac.at/staff/feinerer
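
The regular-expression starting point mentioned in the last bullet could look roughly like the base-R sketch below. It does not use tm at all; the toy abstracts and the helper names phrase_pattern and find_phrase are made up for illustration, and the phrases are assumed to contain no regex metacharacters.

```r
# Toy abstracts; in practice these would come from the corpus.
abstracts <- c(
  "We cloned gene regulatory protein 1 from rat brain.",
  "No gene names are mentioned in this abstract.",
  "Expression of Gene  Regulatory Protein 1 was reduced."
)
phrases <- c("gene regulatory protein 1")

# Turn a phrase into a pattern that tolerates runs of whitespace.
# Assumes the phrase itself contains no regex metacharacters.
phrase_pattern <- function(phrase) {
  gsub("[[:space:]]+", "[[:space:]]+", phrase)
}

# Return the indices of the documents that contain the phrase.
find_phrase <- function(phrase, docs) {
  grep(phrase_pattern(phrase), docs, ignore.case = TRUE)
}

hits <- lapply(phrases, find_phrase, docs = abstracts)
names(hits) <- phrases
```

Here hits is a named list with one integer vector of document indices per phrase.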



[R] using package tm to find phrases

2009-08-13 Thread Mark Kimpel

I am using the package tm for text-mining of abstracts and would like to
use it to find instances of gene names that may contain white space. For
instance "gene regulatory protein 1". The default behavior of tm is to parse
this into 4 separate words, but I would like to use the class constructor
dictionary to define phrases such as just mentioned.

Is this possible? If so, how?

Thanks,
Mark


[R] problem with heatmap.2 in package gplots generating non-finite breaks

2009-07-21 Thread Mark Kimpel

I have written a wrapper for heatmap.2 called
heatmap.w.row.and.col.clust which auto-generates breaks using
breaks <- round((c(seq(from=(-20 * stddev), to=(20 * stddev))))/20,
digits = 2)  #(stddev in this case = 2.5)

This has always worked well in the past but now I am getting an error
that non-finite breaks are being generated. Drilling down, it seems
that my wrapper is generating finite breaks but for some reason
heatmap.2 is putting a NaN into the first and last positions in the
vector.
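
A minimal sketch of the break generation described above, assuming the expression in the post divides an integer sequence running from -20 * stddev to 20 * stddev by 20. That is a reading of the post's garbled code, not something the author confirms.

```r
stddev <- 2.5

# Integer sequence -50..50 divided by 20 gives steps of 0.05 from -2.5 to 2.5.
breaks <- round(seq(from = -20 * stddev, to = 20 * stddev) / 20, digits = 2)

n.breaks   <- length(breaks)
all.finite <- all(is.finite(breaks))
```

With stddev = 2.5 this yields the same 101 finite values shown in the browser output of the post.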

Is it obvious using the breaks my wrapper has generated why this
should be so? My sessionInfo() follows.

Thanks, Mark

Browse[1]> c

Enter a frame number, or 0 to exit

1: heatmap.w.row.and.col.clust(iqa.corp.sparse.rem)
2: heatmap.func.R#29: heatmap.2(as.matrix(dataframe), col = color.palette, bre
3: image(z = matrix(z, ncol = 1), col = col, breaks = tmpbreaks, xaxt = n, y
4: image.default(z = matrix(z, ncol = 1), col = col, breaks = tmpbreaks, xaxt

Selection: 1
Called from: eval(expr, envir, enclos)
Browse[1]> ls()
[1] breaks             col.labels         color.palette
[4] dataframe          dendrogram.options remove.mean
[7] row.labels         stddev
Browse[1]> breaks
  [1] -2.50 -2.45 -2.40 -2.35 -2.30 -2.25 -2.20 -2.15 -2.10 -2.05 -2.00 -1.95
 [13] -1.90 -1.85 -1.80 -1.75 -1.70 -1.65 -1.60 -1.55 -1.50 -1.45 -1.40 -1.35
 [25] -1.30 -1.25 -1.20 -1.15 -1.10 -1.05 -1.00 -0.95 -0.90 -0.85 -0.80 -0.75
 [37] -0.70 -0.65 -0.60 -0.55 -0.50 -0.45 -0.40 -0.35 -0.30 -0.25 -0.20 -0.15
 [49] -0.10 -0.05  0.00  0.05  0.10  0.15  0.20  0.25  0.30  0.35  0.40  0.45
 [61]  0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95  1.00  1.05
 [73]  1.10  1.15  1.20  1.25  1.30  1.35  1.40  1.45  1.50  1.55  1.60  1.65
 [85]  1.70  1.75  1.80  1.85  1.90  1.95  2.00  2.05  2.10  2.15  2.20  2.25
 [97]  2.30  2.35  2.40  2.45  2.50
Browse[1]> is.finite(breaks)
  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
Browse[1]> c

Enter a frame number, or 0 to exit

1: heatmap.w.row.and.col.clust(iqa.corp.sparse.rem)
2: heatmap.func.R#29: heatmap.2(as.matrix(dataframe), col = color.palette, bre
3: image(z = matrix(z, ncol = 1), col = col, breaks = tmpbreaks, xaxt = n, y
4: image.default(z = matrix(z, ncol = 1), col = col, breaks = tmpbreaks, xaxt

Selection: 2
Called from: eval(expr, envir, enclos)
Browse[1]> heatmap.func.R
Error during wrapup: object 'heatmap.func.R' not found
Browse[1]> ls()
 [1] add.expr      breaks        cellnote      cexCol
 [5] cexRow        col           colInd        colsep
 [9] ColSideColors Colv          ddc           ddr
[13] dendrogram    densadj       denscol       density.info
[17] di            distfun       hcc           hclustfun
[21] hcr           hline         iy            key
[25] keysize       labCol        labRow        lhei
[29] linecol       lmat          lwid          main
[33] margins       max.breaks    max.raw       max.scale
[37] min.breaks    min.raw       min.scale     mmat
[41] na.color      na.rm         nbr           nc
[45] ncol          notecex       notecol       nr
[49] op            retval        revC          rm
[53] rowInd        rowsep        RowSideColors Rowv
[57] scale         scale01       sepcolor      sepwidth
[61] sx            symbreaks     symkey        symm
[65] tmpbreaks     trace         tracecol      vline
[69] x             xlab          x.scaled      x.unscaled
[73] ylab          z
Browse[1]> tmpbreaks
  [1]   NaN -2.45 -2.40 -2.35 -2.30 -2.25 -2.20 -2.15 -2.10 -2.05 -2.00 -1.95
 [13] -1.90 -1.85 -1.80 -1.75 -1.70 -1.65 -1.60 -1.55 -1.50 -1.45 -1.40 -1.35
 [25] -1.30 -1.25 -1.20 -1.15 -1.10 -1.05 -1.00 -0.95 -0.90 -0.85 -0.80 -0.75
 [37] -0.70 -0.65 -0.60 -0.55 -0.50 -0.45 -0.40 -0.35 -0.30 -0.25 -0.20 -0.15
 [49] -0.10 -0.05  0.00  0.05  0.10  0.15  0.20  0.25  0.30  0.35  0.40  0.45
 [61]  0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95  1.00  1.05
 [73]  1.10  1.15  1.20  1.25  1.30  1.35  1.40  1.45  1.50  1.55  1.60  1.65
 [85]  1.70  1.75  1.80  1.85  1.90  1.95  2.00  2.05  2.10  2.15  2.20  2.25
 [97]  2.30  2.35  2.40  2.45   NaN
Browse[1]> sessionInfo()
R version 2.10.0 Under development (unstable) (2009-05-31 r48697)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base

Re: [R] problem with heatmap.2 in package gplots generating non-finite breaks

2009-07-21 Thread Mark Kimpel

Never mind, the problem seems to be that I have ignored the warning
"Using scale="row" or scale="column" when breaks are specified can
produce unpredictable results. Please consider using only one or the
other."

I just stopped specifying the breaks and it works fine.

Mark
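
One plausible mechanism, offered only as a guess that the thread does not confirm and that has not been checked against the gplots source: row scaling turns a zero-variance row into NaN, and any maximum taken over the scaled values without na.rm = TRUE then becomes NaN as well. The base-R sketch below shows that behaviour on a toy matrix and does not call heatmap.2.

```r
# Toy matrix whose second row has zero variance.
m <- rbind(c(1, 2, 3), c(5, 5, 5))

# Row-wise z-scores, similar in spirit to scale = "row".
# The constant row becomes 0 / 0 = NaN.
z <- t(scale(t(m)))

# A maximum over the scaled data propagates the NaN unless it is removed.
max.plain <- max(abs(z))
max.na.rm <- max(abs(z), na.rm = TRUE)
```

If something like this is involved, checking the data for constant rows or columns before scaling would be a cheap first test.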


[R] can't get rJava to install on Linux

2009-07-07 Thread Mark Kimpel

Having difficulties getting rJava to install on my Debian Squeeze box.
Perused the R-help list and tried some things that have worked for
others but not for me. Below is the output of my attempted build, R
CMD javareconf -e, and sessionInfo(). Note I tried the R CMD
javareconf also as root, restarted R after each of these, all no help.


* Installing *source* package ‘rJava’ ...
checking for gcc... gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -std=gnu99 -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/wait.h that is POSIX.1 compatible... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for string.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking for an ANSI C-conforming const... yes
checking whether time.h and sys/time.h may both be included... yes
configure: checking whether gcc -std=gnu99 supports static inline...
yes
checking Java support in R... present:
interpreter : '/usr/bin/java'
archiver    : '/usr/bin/jar'
compiler    : '/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javac'
header prep.: '/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javah'
cpp flags   : ''
java libs   : '-L/usr/lib/../lib/gcj-4.3-90 -L/usr/lib/jni -ljvm'
configure: error: One or more Java configuration variables are not set.
Make sure R is configured with full Java support (including JDK). Run
R CMD javareconf
as root to add Java support to R.

If you don't have root privileges, run
R CMD javareconf -e
to set all Java-related variables and then install rJava.

ERROR: configuration failed for package ‘rJava’
* Removing ‘/home/mkimpel/R_HOME/site-library-2.9.0/rJava’

The downloaded packages are in
‘/tmp/Rtmpfp9kiG/downloaded_packages’
Warning message:
In install.packages("rJava") :
  installation of package 'rJava' had non-zero exit status

 ./R CMD javareconf -e
Java interpreter : /usr/bin/java
Java version     : 1.5.0
Java home path   : /usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre
Java compiler    : /usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javac
Java headers gen.: /usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javah
Java archive tool: /usr/bin/jar
Java library path: /usr/lib/../lib/gcj-4.3-90:/usr/lib/jni
JNI linker flags : -L/usr/lib/../lib/gcj-4.3-90 -L/usr/lib/jni -ljvm
JNI cpp flags    :

The following Java variables have been exported:
JAVA_HOME JAVA JAVAC JAVAH JAR JAVA_LIBS JAVA_CPPFLAGS JAVA_LD_LIBRARY_PATH

> sessionInfo()
R version 2.9.1 (2009-06-26)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices datasets  utils methods   base

loaded via a namespace (and not attached):
[1] tcltk_2.9.1 tools_2.9.1


Re: [R] can't get rJava to install on Linux

2009-07-07 Thread Mark Kimpel

Switching to Sun's Java definitely did not help; rJava still does not build.
Below is the output of R CMD javareconf.

mkimpel-m90 /home/mkimpel/bin# ./R CMD javareconf
*** JAVA_HOME is not a valid path, ignoring
Java interpreter : /usr/bin/java
Java version     : 1.6.0_14
Java home path   : /usr/lib/jvm/java-6-sun-1.6.0.14/jre
/home/mkimpel/R_HOME/R-2.9.1/R-build/lib64/R/bin/javareconf: line 150:
/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javac: No such file
or directory
Java compiler    : not functional
Java headers gen.: /usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javah
Java archive tool: /usr/bin/jar
Java library path: /usr/lib/../lib/gcj-4.3-90:/usr/lib/jni
JNI linker flags : -L/usr/lib/../lib/gcj-4.3-90 -L/usr/lib/jni -ljvm
JNI cpp flags    :

Updating Java configuration in /home/mkimpel/R_HOME/R-2.9.1/R-build/lib64/R
Done.
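
The output above reports a Java home under java-6-sun but a compiler path still under the old gcj directory. A small base-R check like the sketch below can confirm whether a javac binary actually exists next to a given Java home. The helper name has_javac is hypothetical, and looking in both bin/ and ../bin/ is an assumption modelled on the paths printed by javareconf.

```r
# Does a javac binary exist in <java_home>/bin or <java_home>/../bin?
# (javareconf above reports the compiler as <jre>/../bin/javac.)
has_javac <- function(java_home) {
  candidates <- file.path(java_home, c("bin", file.path("..", "bin")), "javac")
  any(file.exists(candidates))
}

# Example call against whatever JAVA_HOME the current session sees.
java.home.env <- Sys.getenv("JAVA_HOME")
javac.found   <- nzchar(java.home.env) && has_javac(java.home.env)
```

A FALSE result for the Java home that javareconf reports would be consistent with the "Java compiler: not functional" line above.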





On Tue, Jul 7, 2009 at 9:43 PM, Godmar Back <god...@gmail.com> wrote:
 This is just a guess: looks like you have GNU's Java version in your
 path (aka gcj).
 Perhaps rJava relies on Sun's Java version.

 If so, install Sun's Java first. apt-get install sun-java6-jdk might do it.

  - Godmar



[R] help with dealing with integer(0) returns from grep used within a conditional loop

2009-07-04 Thread Mark Kimpel

I am using grep to locate colnames to automate a report build and have
run into a problem when a colname is not found. The use of integer(0)
in a conditional statement seems to be a no no as it has length 0.
Below is a self-contained trivial example. I would like to get
something like NA or -1 for the position when it is not found OR
learn a way to use integer(0) or some cast of it in a logical
statement. Example, output, and sessionInfo follow. Thanks, Mark

#3
vec.1 <- c("a", "b", "c")
vec.2 <- c("a", "c", "d")
for (i in 1:length(vec.1)){
  for (j in 1:length(vec.2)){
    print(paste("i:", i, " j:", j, sep = ""))
    pos <- grep(vec.1[i], vec.2[j])
    if (pos > 0){
      print("pos identified")
    }
    else{
      print("pos not found")
    }
  }
}


> #3
> vec.1 <- c("a", "b", "c")
> vec.2 <- c("a", "c", "d")
> for (i in 1:length(vec.1)){
+   for (j in 1:length(vec.2)){
+     print(paste("i:", i, " j:", j, sep = ""))
+     pos <- grep(vec.1[i], vec.2[j])
+     if (pos > 0){
+       print("pos identified")
+     }
+     else{
+       print("pos not found")
+     }
+   }
+ }
[1] "i:1 j:1"
[1] "pos identified"
[1] "i:1 j:2"
Error in if (pos > 0) { : argument is of length zero
No suitable frames for recover()
> ###
> sessionInfo()
R version 2.9.1 (2009-06-26)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] tcltk     stats     graphics  grDevices datasets  utils     methods
[8] base

other attached packages:
 [1] sfsmisc_1.0-7       KEGG.db_2.2.11      GO.db_2.2.11
 [4] rat2302.db_2.2.11   GOstats_2.10.0      RSQLite_0.7-1
 [7] DBI_0.2-4           graph_1.22.2        Category_2.10.1
[10] AnnotationDbi_1.6.1 qvalue_1.18.0       limma_2.18.2
[13] affy_1.22.0         Biobase_2.4.1

loaded via a namespace (and not attached):
 [1] affyio_1.12.0        annotate_1.22.0      genefilter_1.24.2
 [4] GSEABase_1.6.1       preprocessCore_1.6.0 RBGL_1.20.0
 [7] splines_2.9.1        survival_2.35-4      tools_2.9.1
[10] XML_2.5-3            xtable_1.5-5


Re: [R] help with dealing with integer(0) returns from grep used within a conditional loop

2009-07-04 Thread Mark Kimpel

Thanks, embarrassed that I didn't think of that myself :)




On Sat, Jul 4, 2009 at 2:47 PM, Allan Engelhardt <all...@cybaea.com> wrote:


 On 04/07/09 18:56, Mark Kimpel wrote:

 I am using grep to locate colnames to automate a report build and have
 run into a problem when a colname is not found. The use of integer(0)
 in a conditional statement seems to be a no no as it has length 0.
 Below is a self-contained trivial example. I would like to get
 something like NA or -1 for the position when it is not found OR
 learn a way to use integer(0) or some cast of it in a logical
 statement. Example, output, and sessionInfo follow. Thanks, Mark

 #3
 vec.1 <- c("a", "b", "c")
 vec.2 <- c("a", "c", "d")
 for (i in 1:length(vec.1)){
   for (j in 1:length(vec.2)){
     print(paste("i:", i, " j:", j, sep = ""))
     pos <- grep(vec.1[i], vec.2[j])
     if (pos > 0){


 Try: if ( length(pos) > 0 ) {

 Allan.
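
Applied to the example from the question, the suggested length() test could look like the sketch below. The hit matrix and the match() alternative are additions for illustration and are not part of the thread.

```r
vec.1 <- c("a", "b", "c")
vec.2 <- c("a", "c", "d")

# hit[i, j] is TRUE when vec.1[i] matches somewhere in vec.2[j].
hit <- matrix(FALSE, nrow = length(vec.1), ncol = length(vec.2),
              dimnames = list(vec.1, vec.2))
for (i in seq_along(vec.1)) {
  for (j in seq_along(vec.2)) {
    pos <- grep(vec.1[i], vec.2[j])
    # grep() returns integer(0) on no match, so test its length, not its value.
    if (length(pos) > 0) {
      hit[i, j] <- TRUE
    }
  }
}

# For exact (non-regex) matching, match() gives NA instead of integer(0).
first.pos <- match(vec.1, vec.2)
```

The match() form gives the NA-for-not-found behaviour asked for in the question, but only for exact string equality rather than regular-expression matching.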



[R] problem with scan recognizing newline '\n'

2009-06-16 Thread Mark Kimpel

I'm using R to do some file processing in Linux and am trying to read
in the output of
find . -type f -print > ~/Music_Archives_search_problem/ls.output.find.txt

This command yields a text file with each line representing the full
path name of all files in the directory and subdirs. Unfortunately,
there seem to be some special characters that interfere with scan
recognizing '\n' as newline. At least that's what I assume the problem
is, but I can't identify which those might be or how to correct the
problem. Below is my code and the problem output followed by
sessionInfo(). This is executed in a loop, with i starting from zero.
I also tried with 'allowEscapes = TRUE', but that made no difference.
As you can see, the first FLAC file is followed by a '\n', which is
ignored. This seems to happen about once in every 20 file names, so it
does work properly most of the time. Also, when the file is opened in
emacs, the newlines are recognized.

current.line <- scan("~/Music_Archives_search_problem/ls.output.find.txt",
   skip = i, nlines = 1, what = 'character', sep = "@",
   allowEscapes = FALSE)

[1] ./Christian/Christian Gospel/Chanticleer/Chanticleer - How Sweet
the Sound; Spirituals  Traditional Gosp - 04 - Soon One Mornin
Medley; Soon One Mornin-What You Gon Do When the
flac\n./Christian/Christian Gospel/Chanticleer/Chanticleer - How
Sweet the Sound; Spirituals  Traditional Gosp - 05 - Didnt It
Rain.flac
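
One possible cause, which is an assumption and not established in this thread, is that scan() treats quote characters inside file names as string delimiters. A sketch of two ways to read such a listing without any quote processing follows, using a made-up two-line temporary file in place of the real find output.

```r
# Stand-in for the find output: one path per line, one containing an apostrophe.
tmp <- tempfile(fileext = ".txt")
writeLines(c("./a/Soon One Mornin' Medley.flac",
             "./a/Didnt It Rain.flac"), tmp)

# readLines() returns one element per line and never interprets quotes.
paths.rl <- readLines(tmp)

# scan() can do the same when quoting is switched off explicitly.
paths.sc <- scan(tmp, what = "character", sep = "\n", quote = "",
                 quiet = TRUE)
```

Reading the whole file once and indexing into the result also avoids re-scanning the file with skip = i on every loop iteration.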

> sessionInfo()
R version 2.9.0 (2009-04-17)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices datasets  utils methods   base


Re: [R] 'options=utils::recover' not working in .Rprofile or within R

2009-05-31 Thread Mark Kimpel

options(error=utils::recover)

Does indeed work, at least with the new install of R-devel (to be 2.10.0)
that I am running right now. I was sure I checked this with 2.9.0 last
night, but I am probably mistaken.

One point, the ?options help page is misleading in that the example is
"Note that these need to specified as e.g. 'options=utils::recover' in
startup files such as '.Rprofile'."

Since the use of utils:: is a new requirement, I think stemming from when
utils is loaded, this help page should be corrected as the example is
confusing/incorrect.

So, stick with what is in the first line above and, for now, ignore the help
page.

Mark
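
As a small sketch of the working form, the call below sets the error handler the way the first line of this message does, checks that it took effect, and then restores the previous settings. The variable names are made up for illustration.

```r
# options() returns the previous values of the options it changes,
# so they can be restored later.
op <- options(error = utils::recover)

# Confirm that the error handler is now utils::recover.
recover.is.set <- identical(getOption("error"), utils::recover)

# Restore the earlier setting.
options(op)
```

In a startup file such as .Rprofile only the options(error = utils::recover) line would be needed.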



On Sat, May 30, 2009 at 10:49 PM, David Winsemius <dwinsem...@comcast.net> wrote:

 You are wiping out all of the default options with that approach.

 Try (after restarting R to get the other options back to what they should
 be):

 op=options()   # so you can reset back to baseline
 options(error=utils::recover)  # do not think the utils:: is needed
  my.func <- function(x){
  y <- x + 12
  nonsense
  y
  }

  my.func(14)
 Error in my.func(14) : object "nonsense" not found

 Enter a frame number, or 0 to exit

 1: my.func(14)

 Selection:



 On May 30, 2009, at 10:24 PM, Mark Kimpel wrote:

  Duncan,

 I've pared down my .Rprofile so that it has just the options line, started
 R
 from terminal (instead of using ESS-emacs) and I still have the problem.
 Am
 I specifying the options incorrectly? I believe I took this directly from
 the help page.


 Not what the examples look like on my machine.


  See my output of .Rprofile, the code example that doesn't
 work as we think it ought, and my sessionInfo().  Thanks, Mark

 Type 'demo()' for some demos, 'help()' for on-line help, or
 'help.start()' for an HTML browser interface to help.
 Type 'q()' to quit R.

  read.table("~/.Rprofile")

 V1
 1 options=utils::recover

 my.func <- function(x){

 + y <- x + 12
 + nonsense
 + y
 + }

 my.func(14)

 Error in my.func(14) : object 'nonsense' not found

 sessionInfo()

 R version 2.9.0 (2009-04-17)
 x86_64-unknown-linux-gnu

 locale:

 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base




 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 'options=utils::recover' not working in .Rprofile or within R

2009-05-30 Thread Mark Kimpel

Duncan,

I've pared down my .Rprofile so that it has just the options line, started R
from terminal (instead of using ESS-emacs) and I still have the problem. Am
I specifying the options incorrectly? I believe I took this directly from
the help page. See my output of .Rprofile, the code example that doesn't
work as we think it ought, and my sessionInfo().  Thanks, Mark

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> read.table("~/.Rprofile")
                      V1
1 options=utils::recover
> my.func <- function(x){
+ y <- x + 12
+ nonsense
+ y
+ }
> my.func(14)
Error in my.func(14) : object 'nonsense' not found
> sessionInfo()
R version 2.9.0 (2009-04-17)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base





On Sat, May 30, 2009 at 5:44 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:

 [Sent this before completing my last sentence; here's another attempt]

 On 29/05/2009 11:45 PM, Mark Kimpel wrote:

 For years I have been using options(error = recover) either in .Rprofile
 or
 from within R for debugging purposes. The functionality of this appears to
 have changed and I can't recover it (no pun intended) using the ?options
 help page. How can I get the old functionality back, particularly from
 within .Rprofile? A specific line entry would be appreciated. An example,
 the help page, and sessionInfo() follow. Thanks, Mark


 I don't think there were any substantial changes in 2.9.0, so I would
 guess that you have a local object named "recover" or "options", and it
 is causing your problems.  When I run options(error=recover) and your
 two lines below, I get this output:

  > options(error=recover)
  > b.func <- function(x) {y <- x + 2; nonsense; y}
  > b.func(3)
 Error in b.func(3) : object 'nonsense' not found

 Enter a frame number, or 0 to exit

 1: b.func(3)

 Selection: 0
  

 which is what you wanted.

 To put this into your .Rprofile, you need to use utils::recover (the
 utils package hasn't been attached yet).  That also works for me.
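
 As a minimal sketch of the point above (an illustration added here, not part
 of the original exchange): the handler is passed through the error argument
 of options(). The variable names old and handler_installed are hypothetical
 and only exist so the sketch can check and then undo the setting.

```r
# Set the post-mortem debugger the way a startup file needs to:
# utils is not attached yet when .Rprofile runs, hence the utils:: prefix.
old <- options(error = utils::recover)

# Confirm that the handler is now installed.
handler_installed <- identical(getOption("error"), utils::recover)
print(handler_installed)

# Restore whatever was set before (assumption: this is run as a check in a
# session, not left in .Rprofile itself).
options(old)
```

 In an actual .Rprofile only the options(error = utils::recover) call would be
 needed; the rest is there to make the sketch self-checking.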

 There have been changes to recover in R-devel (to become 2.10.0), and will
 likely be more, but what you did shouldn't appear much different than what I
 showed from 2.9.0 above.  If you had sourced the code from a file, 2.10.0
 should tell you which line of the file contained the error.

 Duncan Murdoch



 > b.func <- function(x) {y <- x + 2; nonsense; y}

 > b.func(3)

 Error in b.func(3) : object 'nonsense' not found ## in the past this would
 be a menu with numbers for what level I want to go to (in this case just
 1)

 This help page states:
 'error': either a function or an expression governing the handling
  of non-catastrophic errors such as those generated by 'stop'
  as well as by signals and internally detected errors... The
  functions 'dump.frames' and 'recover' provide alternatives
  that allow post-mortem debugging.  Note that these need to
  specified as e.g. 'options=utils::recover' in startup files
  such as '.Rprofile'.

  sessionInfo()

 R version 2.9.0 (2009-04-17)
 x86_64-unknown-linux-gnu

 locale:

 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

 attached base packages:
 [1] stats graphics  grDevices datasets  utils methods   base
 

[R] 'options=utils::recover' not working in .Rprofile or within R

2009-05-29 Thread Mark Kimpel

For years I have been using options(error = recover) either in .Rprofile or
from within R for debugging purposes. The functionality of this appears to
have changed and I can't recover it (no pun intended) using the ?options
help page. How can I get the old functionality back, particularly from
within .Rprofile? A specific line entry would be appreciated. An example,
the help page, and sessionInfo() follow. Thanks, Mark

> b.func <- function(x) {y <- x + 2; nonsense; y}
> b.func(3)
Error in b.func(3) : object 'nonsense' not found ## in the past this would
be a menu with numbers for what level I want to go to (in this case just 1)

This help page states:
'error': either a function or an expression governing the handling
  of non-catastrophic errors such as those generated by 'stop'
  as well as by signals and internally detected errors... The
  functions 'dump.frames' and 'recover' provide alternatives
  that allow post-mortem debugging.  Note that these need to
  specified as e.g. 'options=utils::recover' in startup files
  such as '.Rprofile'.

 sessionInfo()
R version 2.9.0 (2009-04-17)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices datasets  utils methods   base


[R] question about the Y of R article in the latest R news

2008-11-08 Thread Mark Kimpel

I found the article "the Y of R" in the latest R news to be very
interesting. It is certainly challenging me to learn more about how R works
"under the hood", as the author states. What is less clear to me is whether
this approach is primarily for teaching purposes or has a real-world
application. What is meant by the "fragility of reliance on the function
name defined as a global variable" as a downside to the classical recursive
formulation of function s? How can that impact the average R programmer?

Beyond that, empiricist that I am, I decided to put the examples to the
test. My source code and output are below, but the bottom line consists of 3
observations:

   - The Y function approach using csum is consistently slower on my machine
   than the s function approach
   - The Y function using csum gives a recursion error with high input values
   just like the s function does
   - The Y function in fact reaches the limit of recursion BEFORE the s
   function does

Given that it is slower, is more cumbersome to write, and has a lower
nesting limit than the classical approach, I wonder about its utility for
the average programmer (or somewhat below average programmer like me).

Okay, here's my code, output, and sessionInfo()

s <- function(n) {
  if (n == 1) return(1)
  return(s(n-1)+n)
}


Y <- function(f) {
  g <- function(h) function(x) f(h(h))(x)
  g(g)
}

csum <- function(f) function(n) {
  if (n < 2) return(1);
  return(n+f(n-1))
}

recurs.time <- matrix(0, ncol = 3, nrow = 100)
Y.time <- matrix(0, ncol = 3, nrow = 100)

for (i in 1:100) recurs.time[i,] <- unclass(system.time(a <- s(996)))[1:3]
ave.recurs.time <- colSums(recurs.time)
ave.recurs.time

for (i in 1:100) Y.time[i,] <- unclass(system.time(b <- Y (csum)(996)))[1:3]
ave.Y.time <- colSums(Y.time)
ave.Y.time

u <- s(1000)
u

v <- Y (csum)(1000)
v

sessionInfo()

> s <- function(n) {
+   if (n == 1) return(1)
+   return(s(n-1)+n)
+ }
>
>
> Y <- function(f) {
+   g <- function(h) function(x) f(h(h))(x)
+   g(g)
+ }
>
> csum <- function(f) function(n) {
+   if (n < 2) return(1);
+   return(n+f(n-1))
+ }
>
> recurs.time <- matrix(0, ncol = 3, nrow = 100)
> Y.time <- matrix(0, ncol = 3, nrow = 100)
>
> for (i in 1:100) recurs.time[i,] <- unclass(system.time(a <- s(996)))[1:3]
> ave.recurs.time <- colSums(recurs.time)
> ave.recurs.time
[1] 0.356 0.004 0.355
>
> for (i in 1:100) Y.time[i,] <- unclass(system.time(b <- Y (csum)(996)))[1:3]
> ave.Y.time <- colSums(Y.time)
> ave.Y.time
[1] 0.652 0.000 0.640
>
> u <- s(1000)
> u
[1] 500500
> v <- Y (csum)(1000)
Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?
Error during wrapup: evaluation nested too deeply: infinite recursion /
options(expressions=)?
> v
Error: object "v" not found
No suitable frames for recover()
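
The following is a small sketch added for illustration, not something from
the article or from this post. It checks that the two formulations agree on a
small input and shows one plausible reading of the "fragility" remark: the
classical version looks up the global name s on every call, so rebinding that
name changes what an existing copy computes. The names s_copy, small_s,
small_Y and broken are hypothetical.

```r
# Classical recursion: relies on the global name 's' still pointing here.
s <- function(n) {
  if (n == 1) return(1)
  s(n - 1) + n
}

# Y combinator and the non-recursive 'csum', as defined in the post.
Y <- function(f) {
  g <- function(h) function(x) f(h(h))(x)
  g(g)
}
csum <- function(f) function(n) {
  if (n < 2) return(1)
  n + f(n - 1)
}

# Both formulations agree on a small input.
small_s <- s(10)
small_Y <- Y(csum)(10)

# Copy s, then rebind the global name to something unrelated.
s_copy <- s
s <- function(n) 0
broken <- s_copy(10)   # the inner call s(9) now reaches the new 's'

print(c(small_s = small_s, small_Y = small_Y, broken = broken))
```

Each level of the Y version goes through several nested calls (g, h(h), f and
the inner function) instead of one, which is consistent with it reaching the
nesting limit sooner; the error message itself names options(expressions=) as
the relevant setting.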


[R] how to list variables enclosed in an environment

2008-10-17 Thread Mark Kimpel

I'm having trouble with a Bioconductor package: a variable expected in an
environment does not seem to be there. As part of my investigation of the
problem (most likely on my end) I'd like to list the variables contained in
an environment. If you have an environment loaded, let's call it 'pkgEnv',
how does one find what it contains? Mark


Re: [R] how to list variables enclosed in an environment

2008-10-17 Thread Mark Kimpel

Never mind, I got the brilliant idea to ls(pkgEnv) and of course it worked.
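
A short sketch of that approach, added for illustration. The environment
built here is a hypothetical stand-in for the package's environment, and the
object names alpha, .hidden and beta are made up.

```r
# Hypothetical stand-in for the package environment from the question.
pkgEnv <- new.env()
assign("alpha", 1, envir = pkgEnv)
assign(".hidden", 2, envir = pkgEnv)

# ls() skips names that start with a dot unless all.names = TRUE.
visible <- ls(pkgEnv)
everything <- ls(pkgEnv, all.names = TRUE)

# exists() with inherits = FALSE tests this environment only, not its parents.
has_beta <- exists("beta", envir = pkgEnv, inherits = FALSE)

print(visible)
print(everything)
print(has_beta)
```

If an expected variable is missing from ls(pkgEnv), checking with
all.names = TRUE rules out the case where it is merely dot-prefixed.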


[R] XML_1.98-0 fails to build on Debian Lenny with gcc 4.3.2 and R-beta 2.8.0

2008-10-14 Thread Mark Kimpel

Subject pretty much says it all. I wonder if there is some code in
XML that the new gcc doesn't like? See output below:

* Installing *source* package 'XML' ...
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -E
checking for sed... /bin/sed
checking for xml2-config... no
Cannot find xml2-config
ERROR: configuration failed for package 'XML'
** Removing '/home/mkimpel/R_HOME/site-library-2.8.0/XML'



Re: [R] XML_1.98-0 fails to build on Debian Lenny with gcc 4.3.2 and R-beta 2.8.0

2008-10-14 Thread Mark Kimpel

Dirk,

Please let me know when you will support nightly builds of R-devel with all
R and BioConductor packages. I would also need your help syncing my home
setup with the one that I use on a remote Linux cluster using RHEL 5. For my
needs, the packages on these two setups need to be exactly the same. I would
also need your help being able to load either the current release of R and
the current R-devel with 2 different site-libraries.

If you are set up to help with this, let me know and I'll get started.

Lastly, do you think non-Debian Linux users would NOT benefit from the answers
to my question? It does seem that the problem was my lack of understanding
of the need for the libxml2-dev library, not a Debian-specific issue. Or do
you believe that if I were using another distribution this problem would not
have occurred? I don't think so, but if you believe that to be the case, it
would be of interest.

Ready to get started when you are,

Mark



On Tue, Oct 14, 2008 at 3:00 PM, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:

 On Tue, Oct 14, 2008 at 02:34:57PM -0400, Mark Kimpel wrote:
  Subject pretty much says it all. Wonder if there is there is some code in
  XML that the new gcc doesn't like? See output below:

 You are wondering wrongly.

  * Installing *source* package 'XML' ...
  checking for gcc... gcc
  checking for C compiler default output file name... a.out
  checking whether the C compiler works... yes
  checking whether we are cross compiling... no
  checking for suffix of executables...
  checking for suffix of object files... o
  checking whether we are using the GNU C compiler... yes
  checking whether gcc accepts -g... yes
  checking for gcc option to accept ISO C89... none needed
  checking how to run the C preprocessor... gcc -E
  checking for sed... /bin/sed
  checking for xml2-config... no
  Cannot find xml2-config
  ERROR: configuration failed for package 'XML'
  ** Removing '/home/mkimpel/R_HOME/site-library-2.8.0/XML'

 You seem to be

 a) missing the libxml2-dev package for Debian:

sudo apt-get install libxml2-dev

 b) once again ignoring the fact that XML is available for you as a
   binary Debian package via

sudo apt-get install r-cran-xml

 c) also ignoring the fact that, should you still insist on building it
   yourself, that

sudo apt-get build-dep r-cran-xml

   would do step a) for you

 d) forgetting that we repeatedly recommended r-sig-debian as a more
   suitable mailing list to you.

 Stunned,  Dirk



 
  

 --
 Three out of two people have difficulties with fractions.



[R] using assign with lists

2008-10-07 Thread Mark Kimpel

I am performing many permutations on a data-set with each permutation
producing a variable number of results. I thought that the best way to keep
track of all this in one object would be with a list ('res.lst'). To address
these variable results for each permutation I attempted to construct this
list using 'assign'. There is even more nesting than indicated below, but
this is a simple example that, if addressed, will answer my question.
The code chunk below clearly does not produce the desired results because,
instead of assigning a new vector to the list, it creates a new variable
'res.lst$contrast.i.j'. In the last two lines I show what I really want to
happen. Can I use assign in this context by using it differently?

Thanks, Mark

res.lst <- list()
for (i in 1:2){
  for (j in 1:2){
    assign(paste("res.lst$contrast", i, j, sep = "."), paste(i,j,sep="."))
  }
}
res.lst

ls(pattern = "res.lst..?")

res.lst$contrast.5.5 <- 5.5
res.lst
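
No reply to this question appears in this part of the archive. The sketch
below is one common alternative, offered as an illustration rather than as
the list's answer: assign() creates variables in an environment, so it cannot
target a list element, whereas indexing the list with [[ ]] and a constructed
name can. The variable name key is hypothetical.

```r
res.lst <- list()
for (i in 1:2) {
  for (j in 1:2) {
    # Build the element name as a string, then index the list with [[ ]].
    key <- paste("contrast", i, j, sep = ".")
    res.lst[[key]] <- paste(i, j, sep = ".")
  }
}

# The elements are now reachable with $ as in the last two lines of the post.
print(names(res.lst))
print(res.lst$contrast.2.1)
```

The same pattern nests: res.lst[[key1]][[key2]] <- value works for deeper
levels, provided res.lst[[key1]] is itself a list (or does not exist yet).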


[R] efficient use of lm over a matrix vs. using apply over rows

2008-10-05 Thread Mark Kimpel

I have a large matrix, each row of which needs lm applied. I am certain that
I read an article in R-news about this within the last year or two that
discussed the application of lm to matrices but I'll be darned if I can find
it with Google. Probably using the wrong search terms.

Can someone steer me to this article or just tell me if this is possible
and, if so, how to do it? My simplistic attempts have failed.

Mark


Re: [R] efficient use of lm over a matrix vs. using apply over rows

2008-10-05 Thread Mark Kimpel

Sorry for the vagueness of my question; your interpretation, however, was
spot on. Correct me if I am wrong, but my impression is that apply is a more
compact way of writing a for loop, and that R handles the two the same way
computationally. In the article I seem to remember, there was a significant
increase in speed with your second approach, presumably because function
calls are avoided in R and the heavy lifting is done in C. I will use your
second approach anyway, but can I expect increased computational efficiency
with it and, if so, is my reasoning in the prior sentence correct?

BTW, it appears as though my own attempt was almost correct, but I did not
transpose the matrix. In genomics, our response variables (genes) are the
rows and the predictor values are the column names. The BioConductor
packages I routinely use are very good at hiding this and it just didn't come
to mind.

Mark



On Sun, Oct 5, 2008 at 10:28 AM, Duncan Murdoch [EMAIL PROTECTED] wrote:

 On 05/10/2008 10:08 AM, Mark Kimpel wrote:

 I have a large matrix, each row of which needs lm applied. I am certain
 than
 I read an article in R-news about this within the last year or two that
 discussed the application of lm to matrices but I'll be darned if I can
 find
 it with Google. Probably using the wrong search terms.

 Can someone steer me to this article of just tell me if this is possible
 and, if so, how to do it? My simplistic attempts have failed.


 You don't give a lot of detail on what you mean by applying lm to a row of
 a matrix, but I'll assume you have fixed predictor variables, and each row
 is a different response vector.  Then you can use apply() like this:

 x <- 1:10
 mat <- matrix(rnorm(200), nrow=20, ncol=10)

 resultlist <- apply(mat, 1, function(y) lm(y ~ x))
 resultcoeffs <- apply(mat, 1, function(y) lm(y ~ x)$coefficients)


 resultlist will contain a list of 20 different lm() results,
 resultcoeffs will be a matrix holding just the coefficients.

 lm() also allows the response to be a matrix, where the columns are
 considered different components of a multivariate response.  So if you
 transpose your matrix you can do it all in one call:

 resultmulti <- lm(t(mat) ~ x)

 The coefficients of resultmulti will match resultcoeffs.
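
 The claim above can be checked with a short sketch (added for illustration;
 the seed and the name same_coefs are arbitrary choices, not from the thread).
 It fits the row-wise and matrix-response versions on the same data and
 compares the two coefficient matrices, ignoring dimnames.

```r
set.seed(1)   # arbitrary seed, only so the run is repeatable
x <- 1:10
mat <- matrix(rnorm(200), nrow = 20, ncol = 10)

# One lm() fit per row; apply() binds the coefficient vectors into a 2 x 20 matrix.
resultcoeffs <- apply(mat, 1, function(y) lm(y ~ x)$coefficients)

# One call with a matrix response: after t(), each original row is a column.
resultmulti <- lm(t(mat) ~ x)

# Compare the two coefficient matrices numerically, ignoring row/column names.
same_coefs <- isTRUE(all.equal(unname(coef(resultmulti)), unname(resultcoeffs)))
print(same_coefs)
```

 Wrapping each approach in system.time() on one's own data is a direct way to
 see whether the single call is faster; no timings are claimed here.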

 Duncan Murdoch




[R] creating overall title for plots made with par(mfrow=c(2,2))

2008-08-05 Thread Mark Kimpel

I'm making some plots on the same page and would like to include an overall
title instead of individual main titles as they are similar and their x and
y axis labels are sufficient to distinguish them.

Is there a way to assign an overall main to this page of plots?

Mark

-- 
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN 46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 663-0513 Home (no voice mail please)

**


Re: [R] creating overall title for plots made with par(mfrow=c(2, 2))

2008-08-05 Thread Mark Kimpel

Ouch! I had searched the archives and read over ?par and ?plot, but sure
missed the post of today. Thanks, Mark

On Tue, Aug 5, 2008 at 7:27 PM, Marc Schwartz [EMAIL PROTECTED] wrote:

 on 08/05/2008 06:05 PM Mark Kimpel wrote:

 I'm making some plots on the same page and would like to include an
 overall
 title instead of individual main titles as they are similar and their x
 and
 y axis labels are sufficient to distinguish them.

 Is there a way to assign an overall main to this page of plots?

 Mark


 Mark,

 Hate to do this, but see this post from Prof. Ripley from earlier today:

 https://stat.ethz.ch/pipermail/r-help/2008-August/169974.html
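
 The linked post is not reproduced in this archive. The sketch below shows a
 common base-graphics way to get an overall title, offered as an assumption
 rather than as a summary of that post: reserve outer margin space with
 par(oma = ...) and write into it with mtext(outer = TRUE). The title text,
 the random data and the names layout_used and outer_margin are placeholders.

```r
# Reserve two lines of outer margin at the top for the shared title.
par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))

# Four panels without individual main titles.
for (k in 1:4) {
  plot(rnorm(20), rnorm(20), xlab = paste("x", k), ylab = paste("y", k))
}

# outer = TRUE writes into the outer margin instead of the current panel.
mtext("Overall title (placeholder text)", side = 3, outer = TRUE, cex = 1.2)

# Record the settings that were in effect, for inspection.
layout_used <- par("mfrow")
outer_margin <- par("oma")
```

 The third element of oma controls how much room the title gets; a larger cex
 may need a larger value there.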

 :-)

 Regards,

 Marc Schwartz





Re: [R] setting editor environment variable EDITOR either when configuring R for installation or in .Rprofile

2008-07-30 Thread Mark Kimpel

Marc,

That is exactly what I was looking for, although I am surprised that this
variable does not appear to be configurable during installation from source
as are some of the other environmental variables.

Mark

On Wed, Jul 30, 2008 at 2:04 PM, Marc Schwartz [EMAIL PROTECTED] wrote:

 on 07/30/2008 12:54 PM Mark Kimpel wrote:

 I'm running R on Linux and use emacs as my editor. When doing
 edit(vignette("foo.vignette")) I would like to invoke emacs rather than
 the default vi. I am able to manually set this by editing
 $R_HOME/etc/Renviron but would like to avoid doing this with each install.
 I assume this can be accomplished with a flag to configure or in .Rprofile
 but I can't find the syntax in R-admin. Editor is not listed as an
 environment variable in appendix B of that manual.

 So, help is appreciated as I've probably missed something.
 Mark


 Mark,

 See ?options for 'editor'. This is also referenced in ?edit, where the
 default value for the 'editor' argument is getOption("editor").

 Thus, in your .Rprofile, put:

  options(editor="emacs")
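
 A quick way to confirm the setting took effect, sketched here for
 illustration (the names old and current_editor are hypothetical, and
 restoring the previous value is only for the sake of the check):

```r
# The option value is a character string naming the editor command.
old <- options(editor = "emacs")

# edit() uses getOption("editor") as the default for its 'editor' argument,
# so this is the value it would pick up.
current_editor <- getOption("editor")
print(current_editor)

# Put the previous value back; in .Rprofile this step would be omitted.
options(old)
```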

 HTH,

 Marc Schwartz





[R] source code for R-dev packages

2008-07-14 Thread Mark Kimpel

Where is the link on www.r-project.org or CRAN to download source code for
development versions of packages? This is straightforward for BioConductor
packages but I can't seem to find it for R packages.

Mark


Re: [R] help with cube3d cube size

2008-06-26 Thread Mark Kimpel

Duncan & Ben,

Thanks for giving me some tips as to how I can best investigate
packages. I've been using ESS-Emacs and the help pages are HTML. Looks
like I need to figure out how to enable HTML help.

As for a draft vignette, sure, I would be happy (and honored) to
contribute as best I can. I would need a lot of help from the two of
you, but it would be a good learning experience for me and give me a
chance to give something back to the community which has been so good
to me.

Speaking of vignettes, I wonder if there should be a vignette written
called "navigating new functions and packages for the R beginner", or
whether it should be included as a chapter in "An Introduction to R",
which focuses on R-base functionality. I'm just musing here, but I often
think that if I need to learn something perhaps others do too.

One thing at a time, however, so I'll work on a draft and send it to
you for comments and input. Ben, if you get such a list of packages
with functions that depend on rgl, send it to me.

Mark

On Thu, Jun 26, 2008 at 7:02 AM, Duncan Murdoch [EMAIL PROTECTED] wrote:
 Mark Kimpel wrote:

 Thanks for the pointers. I think the package is great, just want to
 use it to its full potential without driving the list crazy with
 questions.  Below is my output to help and sessionInfo(). I don't see
 the scaling functions, although I now see that they can be retrieved
 with ?matrices. You guys are the experts as to what should go where,
 but for someone unfamiliar with how rgl works, having them print out
 with help(package = "rgl") would make some of these functions more
 obvious to the newbie.  Mark


 The issue here is that several of those are documented on the same page, and
 that index lists the names of help pages, not all of the aliases that point
 there.  The index in the HTML help version (or the CHM help in Windows)
 repeats each alias, so you'll see things like

 scale3d        Work with homogeneous coordinates
 scaleMatrix    Work with homogeneous coordinates

 as well as the entry for matrices as below.  I can see arguments for both:
  repetition is bad, full coverage is good.  The index in the pdf document is
 another approach:  it lists all the aliases, and tells you which topic
 documents them.

 All of this is common to all R packages; rgl isn't doing anything special to
 produce these lists.

 Duncan Murdoch

 aspect3d                Set the aspect ratios of the current plot
 axes3d                  Draw boxes, axes and other text outside the data
 ellipse3d               Make an ellipsoid
 grid3d                  Add a grid to a 3D plot
 matrices                Work with homogeneous coordinates
 par3d                   Set or Query RGL Parameters
 par3dinterp             Interpolator for par3d parameters
 persp3d                 Surface plots
 play3d                  Play animation of rgl scene
 plot3d                  3D Scatterplot
 points3d                add primitive set shape
 qmesh3d                 3D Quadrangle Mesh objects
 r3d                     Generic 3D interface
 rgl-package             3D visualization device system
 rgl.bbox                Set up Bounding Box decoration
 rgl.bg                  Set up Background
 rgl.bringtotop          Assign focus to an RGL window
 rgl.clear               scene management
 rgl.light               add light source
 rgl.material            Generic Appearance setup
 rgl.postscript          export screenshot
 rgl.primitive           add primitive set shape
 rgl.setMouseCallbacks   User callbacks on mouse events
 rgl.snapshot            export screenshot
 rgl.spheres             add sphere set shape
 rgl.surface             add height-field surface shape
 rgl.texts               add text
 rgl.user2window         Convert between rgl user and window coordinates
 rgl.viewpoint           Set up viewpoint
 select3d                Select a rectangle in an RGL scene
 spin3d                  Create a function to spin a scene at a fixed rate
 sprites3d               add sprite set shape
 subdivision3d           generic subdivision surface method
 surface3d               add height-field surface shape




 sessionInfo()


 R version 2.7.1 (2008-06-23)
 i686-pc-linux-gnu

 locale:

 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] rgl_0.79 graph_1.18.1

 loaded via a namespace (and not attached):
 [1] cluster_1.11.11 tools_2.7.1

 On Wed, Jun 25, 2008 at 5:25 PM, Duncan Murdoch [EMAIL PROTECTED]
 wrote:


 Mark Kimpel wrote:


 Ben and Duncan,

 Thanks for your helpful suggestions. I'm having some difficulty
 navigating this really good package using my normal learning
 techniques. When I do help(package = rgl) it seems only a very

Re: [R] help with cube3d cube size

2008-06-25 Thread Mark Kimpel

Thanks for the pointers. I think the package is great, just want to
use it to its full potential without driving the list crazy with
questions.  Below is my output to help and sessionInfo(). I don't see
the scaling functions, although I now see that they can be retrieved
with ?matrices. You guys are the experts as to what should go where,
but for someone unfamiliar with how rgl works, having them print out
with help(package = rgl) would make some of these functions more
obvious to the newbie.  Mark

aspect3d                Set the aspect ratios of the current plot
axes3d                  Draw boxes, axes and other text outside the data
ellipse3d               Make an ellipsoid
grid3d                  Add a grid to a 3D plot
matrices                Work with homogeneous coordinates
par3d                   Set or Query RGL Parameters
par3dinterp             Interpolator for par3d parameters
persp3d                 Surface plots
play3d                  Play animation of rgl scene
plot3d                  3D Scatterplot
points3d                add primitive set shape
qmesh3d                 3D Quadrangle Mesh objects
r3d                     Generic 3D interface
rgl-package             3D visualization device system
rgl.bbox                Set up Bounding Box decoration
rgl.bg                  Set up Background
rgl.bringtotop          Assign focus to an RGL window
rgl.clear               scene management
rgl.light               add light source
rgl.material            Generic Appearance setup
rgl.postscript          export screenshot
rgl.primitive           add primitive set shape
rgl.setMouseCallbacks   User callbacks on mouse events
rgl.snapshot            export screenshot
rgl.spheres             add sphere set shape
rgl.surface             add height-field surface shape
rgl.texts               add text
rgl.user2window         Convert between rgl user and window coordinates
rgl.viewpoint           Set up viewpoint
select3d                Select a rectangle in an RGL scene
spin3d                  Create a function to spin a scene at a fixed rate
sprites3d               add sprite set shape
subdivision3d           generic subdivision surface method
surface3d               add height-field surface shape


 sessionInfo()
R version 2.7.1 (2008-06-23)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] rgl_0.79 graph_1.18.1

loaded via a namespace (and not attached):
[1] cluster_1.11.11 tools_2.7.1

On Wed, Jun 25, 2008 at 5:25 PM, Duncan Murdoch [EMAIL PROTECTED] wrote:
 Mark Kimpel wrote:

 Ben and Duncan,

 Thanks for your helpful suggestions. I'm having some difficulty
 navigating this really good package using my normal learning
 techniques. When I do help(package = rgl) it seems only a very
 small subset of the available functions shows up.

 I think the full list shows up there, if you're using a current version.
  What specific function is missing?

  Perusing the rgl.pdf
 downloaded from CRAN demonstrates the same lack of documentation.


 All of the functions intended for users are documented, and they show up in
 rgl.pdf.

 There is no vignette. In addition, I have found at least one other
 package with 3d functions (emdbook::curve3d()).


 A vignette would be nice, but there isn't one.  Our paper from useR 2007 is
 the most recent reference (see
 http://www.r-project.org/conferences/useR-2007/program/presentations/murdoch.pdf);
 it cites Daniel's 2003 thesis and the 2003 paper about the package.
 After those, the NEWS file lists some recent additions.

 emdbook makes use of rgl and some other 3d engines, as does misc3d.
  scatterplot3d does its own drawing.  rggobi is a completely different
 interactive package.

 What is the best resource for learning about all the foo3d() and lower
 level functionality that rgl and its dependents provide? I saw a book
 at BN just last week on openGL. Would that be helpful?


 It might, but probably not.  rgl is intended to be a higher level R-style
 interface to the things described in a book like that.  So if you have a
 particular question about how to do something, you'd never find it there.
  On the other hand, if you want to know if something is possible, then that
 might be a place to look for ideas.

 Duncan Murdoch

 Mark

 On Tue, Jun 24, 2008 at 10:54 PM, Ben Bolker [EMAIL PROTECTED] wrote:


 Mark Kimpel mwkimpel at gmail.com writes:



 I'm using the command below on an open3d() object to create a shaded
 cube. Changes to myScalingFactor do not effect changes in the size of
 the cube. What is the correct approach? Mark


  how about scale3d() ?

 shade3d(translate3d(scale3d(cube3d(),5,5,5),-6,1
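
 The rgl help index files scale3d() under "Work with homogeneous
 coordinates". As a rough illustration of what that means, here is a minimal
 base-R sketch. It assumes the usual 4x4 homogeneous representation and a
 cube with corners at +/-1; scale_matrix() and translate_matrix() are
 hypothetical helpers written for this note, not rgl functions, and the
 translation offsets are arbitrary.

```r
# Hypothetical helpers for illustration only; these are not rgl functions.
scale_matrix <- function(sx, sy, sz) diag(c(sx, sy, sz, 1))
translate_matrix <- function(tx, ty, tz) {
  m <- diag(4)
  m[1:3, 4] <- c(tx, ty, tz)
  m
}

# Eight corners of a cube at +/-1, one vertex per column, plus w = 1.
corners <- t(as.matrix(expand.grid(x = c(-1, 1), y = c(-1, 1), z = c(-1, 1))))
verts <- rbind(corners, 1)

# Scale by 5 along each axis, then shift by an arbitrary offset.
scaled <- scale_matrix(5, 5, 5) %*% verts
moved <- translate_matrix(1, 2, 3) %*% scaled

edge_before <- diff(range(verts[1, ]))    # side length along x before scaling
edge_after <- diff(range(scaled[1, ]))    # side length along x after scaling
centre_after <- rowMeans(moved[1:3, ])    # centre of the shifted cube
print(c(before = edge_before, after = edge_after))
```

 The point of the sketch is only that the size change comes from multiplying
 the vertex coordinates by a scaling matrix, so a scaling factor has to be
 applied to the object itself before it is drawn.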

[R] question on rgl.surface

2008-06-21 Thread Mark Kimpel

I'd like to use rgl.surface (or some other function if more
appropriate) to create a horizontal and vertical transparent grey
slice (plane) running through both the x and y origins and extending
across the z axis, i.e. the 3-d equivalent of the normal 2-d
coordinate axes we are all familiar with. The examples for rgl.surface
are a bit more complex than what I need and I am having trouble
understanding them.

Here is the code I came up with, which obviously doesn't work.

require(rgl)
set.seed(123)
d3.mat <- matrix(runif(30, min = -5, max = 5), ncol = 3, nrow = 10)
open3d()
plot3d(x = d3.mat, type = "s", col = "blue", size = 0.33, xlab = "x",
       ylab = "y", zlab = "z")
x <- 0
y <- 0
z <- matrix(c(floor(min(d3.mat[,1])), ceiling(max(d3.mat[,1])),
              floor(min(d3.mat[,3])), ceiling(max(d3.mat[,3]))),
            nrow = 2, ncol = 2)
rgl.surface(x, z, y, color = "grey", back = "lines")

What am I doing wrong?

And, while I'm at it, there is another minor question I have, which is
how can I exaggerate the size difference in the spheres between front
and back?

Thanks,
Mark

-- 
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN 46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 663-0513 Home (no voice mail please)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] trouble installing Rmpi on 64-bit Ubuntu 8.04 with openmpi

2008-06-10 Thread Mark Kimpel

Thanks to all for the advice. Took me a bit to get back to this, but
the following worked just fine for me with my 64-bit Ubuntu 8.04 OS:
 R CMD INSTALL Rmpi_0.5-5.tar.gz --configure-args=--with-mpi=/usr/lib64/openmpi

On Thu, Jun 5, 2008 at 3:23 AM, Paul Hewson [EMAIL PROTECTED] wrote:
 Or (more simply?) install it from the bash prompt using:

 R CMD INSTALL Rmpi_0.5-5.tar.gz 
 --configure-args=--with-mpi=/opt/openmpi/include/
 (or whatever your path to openmpi might be).

 I missed this as well and confused myself for a bit, but it is mentioned in 
 the R news article on Rmpi H.Yu (2002) Rmpi: Parallel Statistical Computing 
 in R Vol 2(2) page 10-14

 Our cluster has a variety of forms of mpi running, and I also had to follow 
 the readme in the Rmpi library carefully to make sure it didn't spawn stray 
 lam-mpi processes as well.

 Best

 Paul

 -=-=-==-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Paul Hewson
 Lecturer in Statistics
 University of Plymouth
 Drake Circus
 Plymouth PL4 8AA

 tel ++44(0)1752 232778
 email [EMAIL PROTECTED]
 web http://www.plymouth.ac.uk/staff/phewson

 
 From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of tub78 [EMAIL PROTECTED]
 Sent: 04 June 2008 21:11
 To: r-help@r-project.org
 Subject: Re: [R] trouble installing Rmpi on 64-bit Ubuntu 8.04 with openmpi

 The problem here is that the compiler cannot find the include files
 for mpi.  Notice that the first checks that fail are:

 checking mpi.h usability... no
 checking mpi.h presence... no
 checking for mpi.h... no

 One solution is to create a file named ~/.R/Makevars with the
 following line:

 PKG_CPPFLAGS = -I/opt/openmpi/include

 ... where /opt/openmpi/include/ contains the necessary mpi.h include
 file.

 Then, try to recompile.
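
 Before editing ~/.R/Makevars it can help to confirm which directory really
 holds mpi.h. The following is a small R sketch of that check. The function
 names find_mpi_include() and makevars_line() are hypothetical, and apart
 from /opt/openmpi/include (mentioned above) the candidate directories are
 guesses that will differ between systems.

```r
# Return the first candidate directory that contains mpi.h, or NA if none does.
find_mpi_include <- function(candidates) {
  hits <- candidates[file.exists(file.path(candidates, "mpi.h"))]
  if (length(hits) == 0) NA_character_ else hits[1]
}

# Build the line that would go into ~/.R/Makevars for a given include directory.
makevars_line <- function(include_dir) {
  paste0("PKG_CPPFLAGS = -I", include_dir)
}

# Candidate locations; only the first comes from this thread, the rest are assumptions.
candidates <- c("/opt/openmpi/include",
                "/usr/lib64/openmpi/include",
                "/usr/include/openmpi")

inc <- find_mpi_include(candidates)
if (is.na(inc)) {
  message("mpi.h not found in any candidate directory")
} else {
  cat(makevars_line(inc), "\n")
}
```

 If no candidate matches, the header is probably not installed at all, or it
 lives somewhere the list does not cover.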

 Hope this helps,
 - Stu



 On May 6, 12:52 pm, Mark Kimpel [EMAIL PROTECTED] wrote:
 Subject pretty much says it all. I am running 64-bit Ubuntu 8.04, i.e. Hardy
 Heron, have openmpi installed, and get the following error message with
 attempted install of Rmpi. sessionInfo() follows.

 Mark

 checking for ANSI C header files... yes
 checking for sys/types.h... yes
 checking for sys/stat.h... yes
 checking for stdlib.h... yes
 checking for string.h... yes
 checking for memory.h... yes
 checking for strings.h... yes
 checking for inttypes.h... yes
 checking for stdint.h... yes
 checking for unistd.h... yes
 checking mpi.h usability... no
 checking mpi.h presence... no
 checking for mpi.h... no
 Try to find libmpi.so or libmpich.a
 checking for main in -lmpi... yes
 checking for openpty in -lutil... yes
 checking for main in -lpthread... yes
 configure: creating ./config.status
 config.status: creating src/Makevars
 ** libs
 gcc -std=gnu99 -I/home/mkimpel/R_HOME/R-patched/R-build/lib64/R/include
 -DPACKAGE_NAME=\\ -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\
 -DPACKAGE_STRING=\\ -DPACKAGE_BUGREPORT=\\ -DSTDC_HEADERS=1
 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1
 -DHAVE_UNISTD_H=1   -DUNKNOWN -fPIC -I/usr/local/include-fpic  -g -O2 -c
 conversion.c -o conversion.o
 In file included from conversion.c:18:
 Rmpi.h:1:17: error: mpi.h: No such file or directory
 In file included from conversion.c:18:
 Rmpi.h:14: error: expected '=', ',', ';', 'asm' or '__attribute__' before
 'mpitype'
 make: *** [conversion.o] Error 1
 chmod: cannot access `/home/mkimpel/R_HOME/site-library-2.7.0/Rmpi/libs/*':
 No such file or directory
 ERROR: compilation failed for package 'Rmpi'
 ** Removing '/home/mkimpel/R_HOME/site-library-2.7.0/Rmpi'

 The downloaded packages are in
 /tmp/RtmppcK0FI/downloaded_packages
 Warning message:

  sessionInfo()

 R version 2.7.0 Patched (2008-05-04 r45620)
 x86_64-unknown-linux-gnu

 locale:
 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] graph_1.18.0

 loaded via a namespace (and not attached):
 [1] cluster_1.11.10 tcltk_2.7.0 tools_2.7.0


Re: [R] strange (to me) p-value distribution

2008-06-08 Thread Mark Kimpel

Wolfgang,

Thank you for both the explanation and the beautiful R code to
demonstrate your point. Even after seeing the empirical evidence,
however, I couldn't get the underlying mechanism into my head. I
tweaked your code a bit to make the batch effect even bigger, to the
point where, ah ha, the distribution no longer approximates normal but
is clearly bimodal (additional histograms).

I went back to my original data and looked at the histogram of logged
expression values. Although not as clear cut, the distribution is not
normal and indeed there is a hint of several humps corresponding to
different batches. I need to see what effect their inclusion into my
model is.

Lesson relearned about the importance of visualizing data before
starting an analysis.

My slightly tweaked code is below, in case anyone wants to look at it.

Mark


library(genefilter)

nr <- 31000
nc <- 20

x.1 <- matrix(rnorm(nr*nc), nrow = nr, ncol = nc)

fact <- factor(1:nc %% 2)

sapply(split(x.1, fact), mean)
sapply(split(x.1, fact), sd)

rt1 <- rowttests(x.1, fact)

## add a batch effect
x.2 <- x.1
x.2[, 1:10] <- x.2[, 1:10] + pi

sapply(split(x.2, fact), mean)
sapply(split(x.2, fact), sd)

rt2 <- rowttests(x.2, fact)

par(mfrow = c(2,2))
hist(x.1, breaks = 50, col = "mintcream")
hist(x.2, breaks = 50, col = "mistyrose")
hist(rt1$p.value, breaks = 100, col = "mintcream")
hist(rt2$p.value, breaks = 100, col = "mistyrose")
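
For anyone without genefilter installed, the same effect can be sketched in
base R. This is only an illustration under stated assumptions:
row_t_pvalues() is a hypothetical stand-in for rowttests(), assumed here to
be an equal-variance two-sample t-test per row, and the matrix is smaller
than in the code above to keep it quick.

```r
set.seed(1)
nr <- 2000
nc <- 20
grp <- factor(seq_len(nc) %% 2)   # odd versus even columns

# Hypothetical helper: pooled-variance two-sample t-test for every row.
row_t_pvalues <- function(x, grp) {
  lev <- levels(grp)
  a <- x[, grp == lev[1], drop = FALSE]
  b <- x[, grp == lev[2], drop = FALSE]
  na <- ncol(a)
  nb <- ncol(b)
  ma <- rowMeans(a)
  mb <- rowMeans(b)
  df <- na + nb - 2
  pooled_var <- (rowSums((a - ma)^2) + rowSums((b - mb)^2)) / df
  tstat <- (ma - mb) / sqrt(pooled_var * (1 / na + 1 / nb))
  2 * pt(-abs(tstat), df)
}

# Pure noise: p-values should be roughly uniform.
x <- matrix(rnorm(nr * nc), nrow = nr)
p_null <- row_t_pvalues(x, grp)

# Batch effect on columns 1:10, which cuts across both groups.
x_batch <- x
x_batch[, 1:10] <- x_batch[, 1:10] + pi
p_batch <- row_t_pvalues(x_batch, grp)

frac_null <- mean(p_null < 0.05)
frac_batch <- mean(p_batch < 0.05)
print(c(null = frac_null, batch = frac_batch))
```

Because the shifted columns are split evenly between the two groups, the
shift inflates the within-group variance without moving the group means
apart, so the t statistics shrink and small p-values become rare.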


On Sat, Jun 7, 2008 at 6:40 PM, Wolfgang Huber [EMAIL PROTECTED] wrote:

 Dear Mark,

 try out the example code below. Such a p-value distribution often occurs  if
 you have batch effects, i.e. if the between-group variability is in fact
 less than the within-group variability.

 In the example below, I do, for each row of x, a t-test between the values
 in the even and odd columns; for rt2, a batch effect has been added to
 columns 1:10.

  hope this helps
Wolfgang


 library(genefilter)

 nr = 31000
 nc = 20

 x  = matrix(rnorm(nr*nc), nrow=nr, ncol=nc)

 rt1 = rowttests(x, factor(1:nc %% 2))

 ## add a batch effect
 x[, 1:10] = x[, 1:10] + pi/2
 rt2 = rowttests(x, factor(1:nc %% 2))

 par(mfrow=c(2,1))
 hist(rt1$p.value, breaks=100, col="mistyrose")
 hist(rt2$p.value, breaks=100, col="mistyrose")


 --
 Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber


 Mark Kimpel wrote on 07/06/2008 18:39:

 I'm working with a genomic data-set with ~31k end-points and have
 performed an F-test across 5 groups for each end-point. The QA
 measurements on the individual micro-arrays all look good. One of the
 first things I do in my work-flow is take a look at the p-value
 distribution. It is my understanding that, if the findings are due to
 chance alone, the p-value distribution should be uniform. In this case
 the histogram, even with 1000 break points, starts low on the left and
 climbs almost linearly to the right. In other words, very skewed
 towards high p-values. I understand that this could be happening by
 chance alone, but the same behavior is seen in the two contrasts of
 interest I looked at and I have seen it in a couple of our other
 genomic, high-dimensional experiments as well. I might also add that I
 looked at the actual numbers of genes with p-val < X and indeed, for
 each X < 0.05, there are far fewer sig. genes than one would expect by
 chance.

 I can't figure out what is causing this and, if there is a cause, I'd
 like to be able to tell the experimenter if it indicates a technical
 factor. I've had other experiments where the p-value dist approximates
 normal and of course those that have nice spikes at low p-values
 indicating we have some significant genes.

 I'm addressing this here rather than to BioC because I suspect there
 is some basic statistical mechanism that could explain this. Is there?

 Mark








[R] strange (to me) p-value distribution

2008-06-07 Thread Mark Kimpel

I'm working with a genomic data-set with ~31k end-points and have
performed an F-test across 5 groups for each end-point. The QA
measurements on the individual micro-arrays all look good. One of the
first things I do in my work-flow is take a look at the p-value
distribution. It is my understanding that, if the findings are due to
chance alone, the p-value distribution should be uniform. In this case
the histogram, even with 1000 break points, starts low on the left and
climbs almost linearly to the right. In other words, very skewed
towards high p-values. I understand that this could be happening by
chance alone, but the same behavior is seen in the two contrasts of
interest I looked at and I have seen it in a couple of our other
genomic, high-dimensional experiments as well. I might also add that I
looked at the actual numbers of genes with p-val < X and indeed, for
each X < 0.05, there are far fewer sig. genes than one would expect by
chance.

I can't figure out what is causing this and, if there is a cause, I'd
like to be able to tell the experimenter if it indicates a technical
factor. I've had other experiments where the p-value dist approximates
normal and of course those that have nice spikes at low p-values
indicating we have some significant genes.

I'm addressing this here rather than to BioC because I suspect there
is some basic statistical mechanism that could explain this. Is there?

Mark


Re: [R] rggobi is crashing R-2.7.0

2008-05-07 Thread Mark Kimpel

Hard as it is for me to imagine, the ggobi windows stay open and functional
while R (in emacs) has crashed in the background after throwing the error
messages. As a perhaps naive Linux user, I thought that if a parent process
crashed any processes it spawned would crash too. I guess not in this case.

Ubuntu currently is distributing graphviz 2.16.1

Thanks,
Mark

On Wed, May 7, 2008 at 12:14 AM, Michael Lawrence [EMAIL PROTECTED]
wrote:



 On Tue, May 6, 2008 at 6:06 PM, Mark Kimpel [EMAIL PROTECTED] wrote:

  R crashed just after the warnings were issued, but ggobi kept running
  (if that makes sense).


 I am not sure if that makes sense; GGobi should exit when the R process
 does.

 I have Graphviz and Rgraphviz installed and use Rgraphviz regularly without
  a problem, so I'm not sure why it didn't load. Mark
 

 Which version of graphviz? I am not sure which version the Ubuntu binary
 expects, but this may be a binary compatibility issue.

 That said, I am not sure if the GraphLayout plugin is the reason for R
 crashing...


 
 
  On Tue, May 6, 2008 at 4:37 PM, Michael Lawrence [EMAIL PROTECTED]
  wrote:
 
  
  
   On Tue, May 6, 2008 at 10:32 AM, Mark Kimpel [EMAIL PROTECTED]
   wrote:
  
I am running 64-bit Ubuntu 8.04 and when I invoke rggobi the
interactive
graph displays but R crashes. See my sessionInfo() and a short
example
below. Ggobi and rggobi installed without complaints. Mark
   
 sessionInfo()
R version 2.7.0 Patched (2008-05-04 r45620)
x86_64-unknown-linux-gnu
   
locale:
   
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
   
attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base
   
other attached packages:
[1] rggobi_2.1.9   RGtk2_2.12.5-3 graph_1.18.0
   
loaded via a namespace (and not attached):
[1] cluster_1.11.10 tools_2.7.0
   
   
 a <- matrix(rnorm(1000), nrow = 10)

 g <- ggobi(a)
   
** (R:25146): CRITICAL **: Error on loading plugin library
plugins/GraphLayout/plugin.la: libgvc.so.3: cannot open shared
object file:
No such file or directory
   
** (R:25146): CRITICAL **: Error on loading plugin library
plugins/GraphLayout/plugin.la: libgvc.so.3: cannot open shared
object file:
No such file or directory
   
** (R:25146): CRITICAL **: can't locate required plugin routine
addToToolsMenu in GraphLayout

   
  
   It's not clear to me - did R crash or did you just receive these
   warnings? These warnings are due to a missing graphviz, so the GraphLayout
   plugin fails to load.
  
  
   
   
  
  
 
 
 





Re: [R] rggobi is crashing R-2.7.0

2008-05-07 Thread Mark Kimpel

Uninstalling and reinstalling ggobi via Synaptic solved the problem, at
least for the demo data mtcars. Rotation works fine. No crashes on exit.

Thanks for the good advice.

Mark

On Wed, May 7, 2008 at 2:12 PM, Paul Johnson [EMAIL PROTECTED] wrote:

 On Tue, May 6, 2008 at 12:32 PM, Mark Kimpel [EMAIL PROTECTED] wrote:
  I am running 64-bit Ubuntu 8.04 and when I invoke rggobi the interactive
   graph displays but R crashes. See my sessionInfo() and a short example
   below. Ggobi and rggobi installed without complaints. Mark
 
sessionInfo()
   R version 2.7.0 Patched (2008-05-04 r45620)
   x86_64-unknown-linux-gnu
 

 In the R 2.7 release notes, there is a comment about a change in the
 GUI libraries and it says that one must recompile everything that
 relies on R.  If your R 2.7 was an upgrade, not a fresh install, it
 could explain why this is happening.  If there's some old library or R
 package sitting around, it could account for this.

 The part that concerned me about the R release note is that they don't
 give a very clear guide on how far back in the toolchain we are
 supposed to go.  Certainly, ggobi has to be rebuilt from scratch.  But
 do any of the things on which ggobi depends need recompilation as
 well?

 pj


 --
 Paul E. Johnson
 Professor, Political Science
 1541 Lilac Lane, Room 504
 University of Kansas





[R] trouble installing Rmpi on 64-bit Ubuntu 8.04 with openmpi

2008-05-06 Thread Mark Kimpel

Subject pretty much says it all. I am running 64-bit Ubuntu 8.04, i.e. Hardy
Heron, have openmpi installed, and get the following error message with
attempted install of Rmpi. sessionInfo() follows.

Mark

checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking mpi.h usability... no
checking mpi.h presence... no
checking for mpi.h... no
Try to find libmpi.so or libmpich.a
checking for main in -lmpi... yes
checking for openpty in -lutil... yes
checking for main in -lpthread... yes
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/home/mkimpel/R_HOME/R-patched/R-build/lib64/R/include
-DPACKAGE_NAME=\\ -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\
-DPACKAGE_STRING=\\ -DPACKAGE_BUGREPORT=\\ -DSTDC_HEADERS=1
-DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
-DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1
-DHAVE_UNISTD_H=1   -DUNKNOWN -fPIC -I/usr/local/include-fpic  -g -O2 -c
conversion.c -o conversion.o
In file included from conversion.c:18:
Rmpi.h:1:17: error: mpi.h: No such file or directory
In file included from conversion.c:18:
Rmpi.h:14: error: expected '=', ',', ';', 'asm' or '__attribute__' before
'mpitype'
make: *** [conversion.o] Error 1
chmod: cannot access `/home/mkimpel/R_HOME/site-library-2.7.0/Rmpi/libs/*':
No such file or directory
ERROR: compilation failed for package 'Rmpi'
** Removing '/home/mkimpel/R_HOME/site-library-2.7.0/Rmpi'

The downloaded packages are in
/tmp/RtmppcK0FI/downloaded_packages
Warning message:

 sessionInfo()
R version 2.7.0 Patched (2008-05-04 r45620)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] graph_1.18.0

loaded via a namespace (and not attached):
[1] cluster_1.11.10 tcltk_2.7.0 tools_2.7.0



[R] rggobi is crashing R-2.7.0

2008-05-06 Thread Mark Kimpel

I am running 64-bit Ubuntu 8.04 and when I invoke rggobi the interactive
graph displays but R crashes. See my sessionInfo() and a short example
below. Ggobi and rggobi installed without complaints. Mark

 sessionInfo()
R version 2.7.0 Patched (2008-05-04 r45620)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] rggobi_2.1.9   RGtk2_2.12.5-3 graph_1.18.0

loaded via a namespace (and not attached):
[1] cluster_1.11.10 tools_2.7.0


 a <- matrix(rnorm(1000), nrow = 10)

 g <- ggobi(a)

** (R:25146): CRITICAL **: Error on loading plugin library
plugins/GraphLayout/plugin.la: libgvc.so.3: cannot open shared object file:
No such file or directory

** (R:25146): CRITICAL **: Error on loading plugin library
plugins/GraphLayout/plugin.la: libgvc.so.3: cannot open shared object file:
No such file or directory

** (R:25146): CRITICAL **: can't locate required plugin routine
addToToolsMenu in GraphLayout



Re: [R] [BioC] RCurl loading problem with 64 bit linux distribution

2008-05-06 Thread Mark Kimpel

 _Jv_RegisterClasses
00208008 d __CTOR_END__
00208000 d __CTOR_LIST__
00208018 d __DTOR_END__
00208010 d __DTOR_LIST__
7e58 r __FRAME_END__
00208020 d __JCR_END__
00208020 d __JCR_LIST__
0020aee0 A __bss_start
 w __cxa_finalize@@GLIBC_2.2.5
6e20 t __do_global_ctors_aux
3890 t __do_global_dtors_aux
00208680 d __dso_handle
 w __gmon_start__
 U __stack_chk_fail@@GLIBC_2.4
 U __strdup@@GLIBC_2.2.5
0020aee0 A _edata
0020aef0 A _end
6e58 T _fini
3188 T _init
5c30 T addFormElement
6050 T buildForm
3870 t call_gmon_start
 U calloc@@GLIBC_2.2.5
4f90 T checkEncoding
0020aee0 b completed.6183
6a30 T createNamedEnum
 U curl_easy_cleanup@@CURL_GNUTLS_3
 U curl_easy_duphandle@@CURL_GNUTLS_3
 U curl_easy_getinfo@@CURL_GNUTLS_3
 U curl_easy_init@@CURL_GNUTLS_3
 U curl_easy_perform@@CURL_GNUTLS_3
 U curl_easy_setopt@@CURL_GNUTLS_3
 U curl_easy_strerror@@CURL_GNUTLS_3
 U curl_escape@@CURL_GNUTLS_3
 U curl_formadd@@CURL_GNUTLS_3
 U curl_formfree@@CURL_GNUTLS_3
 U curl_free@@CURL_GNUTLS_3
 U curl_global_cleanup@@CURL_GNUTLS_3
 U curl_global_init@@CURL_GNUTLS_3
 U curl_multi_add_handle@@CURL_GNUTLS_3
 U curl_multi_fdset@@CURL_GNUTLS_3
 U curl_multi_init@@CURL_GNUTLS_3
 U curl_multi_perform@@CURL_GNUTLS_3
 U curl_multi_remove_handle@@CURL_GNUTLS_3
 U curl_slist_append@@CURL_GNUTLS_3
 U curl_slist_free_all@@CURL_GNUTLS_3
 U curl_unescape@@CURL_GNUTLS_3
 U curl_version@@CURL_GNUTLS_3
 U curl_version_info@@CURL_GNUTLS_3
 U fprintf@@GLIBC_2.2.5
38e0 t frame_dummy
 U free@@GLIBC_2.2.5
4590 T getBinaryDataFromR
43f0 T getCURLPointerRObject
53e0 T getCurlError
5560 T getCurlInfoElement
5900 T getCurlPointerForData
3c40 T getMultiCURLPointerRObject
3ec0 T getRStringsFromNullArray
4190 T makeCURLPointerRObject
47b0 T makeCURLcodeRObject
3d80 T makeMultiCURLPointerRObject
 U malloc@@GLIBC_2.2.5
 U memcpy@@GLIBC_2.2.5
002080a0 d names.7400
00208688 d p.6181
 U realloc@@GLIBC_2.2.5
 U select@@GLIBC_2.2.5
 U sprintf@@GLIBC_2.2.5
 U stderr@@GLIBC_2.2.5
 U strcpy@@GLIBC_2.2.5
 U strlen@@GLIBC_2.2.5
 U strncpy@@GLIBC_2.2.5
mkimpel-m90 ~/bin/curl-7.18.1:




On Tue, May 6, 2008 at 10:36 PM, Martin Morgan [EMAIL PROTECTED] wrote:

 Hi Mark...

 A couple of shots in the dark, as no one else seems to be leaping in...

 The symbol Curl_base64_encode should be defined in
 /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so.  What
 does

 nm /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so

 say? Mine says

 3980 T Curl_base64_encode

 with the 'T' indicating that the symbol is defined (make sure nm spits
 out a bunch of lines before concluding that Curl_base64_encode is not
 defined).
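
 The same check can be scripted. Below is a minimal R sketch that reads
 nm-style lines and reports the type letter for a symbol. parse_nm() and
 symbol_type() are hypothetical helpers, the sample lines are taken from
 this thread, and only the last two whitespace-separated fields of each line
 (type letter and symbol name) are assumed to matter.

```r
# Split nm-style output into type letter and symbol name.
# Lines look like "5c30 T addFormElement" or " U curl_easy_init@@CURL_GNUTLS_3".
parse_nm <- function(lines) {
  fields <- strsplit(trimws(lines), "\\s+")
  fields <- fields[lengths(fields) >= 2]   # skip blank or malformed lines
  data.frame(
    type = vapply(fields, function(f) f[length(f) - 1], character(1)),
    # drop any "@@VERSION" suffix from the symbol name
    symbol = sub("@@.*$", "", vapply(fields, function(f) f[length(f)], character(1))),
    stringsAsFactors = FALSE
  )
}

# Type letter for one symbol, or NA if the symbol is not listed at all.
symbol_type <- function(nm_table, symbol) {
  hit <- nm_table$type[nm_table$symbol == symbol]
  if (length(hit) == 0) NA_character_ else hit[1]
}

sample_lines <- c(
  "3980 T Curl_base64_encode",
  "5c30 T addFormElement",
  " U curl_easy_init@@CURL_GNUTLS_3",
  "0020aee0 A __bss_start"
)
tab <- parse_nm(sample_lines)
print(symbol_type(tab, "Curl_base64_encode"))
print(symbol_type(tab, "curl_easy_init"))
```

 On a real system the input could come from something like
 system2("nm", path_to_RCurl_so, stdout = TRUE), where path_to_RCurl_so is a
 placeholder for the library path. A "T" means the library defines the
 symbol itself, while "U" means it expects another library to supply it.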

 I retrieved the RCurl source, and one thing I notice is that
 RCurl/src/curl_base64.c has the 'execute' bit set, and perhaps a sane
 system would not compile it. Try

 % chmod -x RCurl/src/curl_base64.c

 and then

 % R CMD INSTALL RCurl

 Martin

 Mark Kimpel [EMAIL PROTECTED] writes:

  I'm having the same problem on Ubuntu 64-bit Hardy Heron. A bunch of security
  patches from Ubuntu came out and I installed them today. After that was when
  I first noted the problem (affycoretools, which I use all the time, won't
  load). Below is my initial output, what follows is my reinstallation output
  followed by the same error messages as obtained initially. I wonder if a
  security patch has changed Curl? Or did RCurl just change? I have been using
  R-2.7.0 since half-way through its development cycle and this is a new
  problem for me.
 
  Mark
 
  require(RCurl)
  Loading required package: RCurl
  Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared library
  '/home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so':
/home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so: undefined
  symbol: Curl_base64_encode
  install.packages("RCurl")
 
  sessionInfo()
  R version 2.7.0 Patched (2008-05-04 r45620)
  x86_64-unknown-linux-gnu
 
  locale:
 
 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8

Re: [R] [BioC] RCurl loading problem with 64 bit linux distribution

2008-05-06 Thread Mark Kimpel

Duncan,

I now have two versions of libcurl on my system, the Ubuntu-installed 7.18.0
and my newly compiled-from-source 7.18.1 (which I installed after my
problems began with RCurl). I was afraid to uninstall 7.18.0 because
Synaptic wanted to uninstall half of my system if I did so via my package
manager. I must not have my PATH set up correctly because when I do curl
--version, I get:

curl 7.18.1 (x86_64-unknown-linux-gnu) libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.1
Protocols: tftp ftp telnet dict ldap ldaps http file https ftps
Features: GSS-Negotiate IDN IPv6 Largefile NTLM SSL libz

So curl is new and libcurl is a version older. This probably isn't ideal,
but may help us figure out what is going on because I ran nm against both
versions of libcurl

for libcurl 7.18.1:
mkimpel-m90 /usr/local/lib: nm libcurl.so | grep Curl_base64_encode
9e30 T Curl_base64_encode

for libcurl 7.18.0
mkimpel-m90 /usr/lib: nm libcurl.so | grep Curl_base64_encode
nm: libcurl.so: no symbols

It looks like RCurl is still trying to link against 7.18.0. Do I
interpret your comments to mean that, if I set up my PATH correctly so that
the newer version is found first, things might work?

Regardless, I hope this is somewhat diagnostic. If I have screwed up the
setup too much to make sense out of, perhaps one of the other guys with
problems could also furnish the info.

BTW, I did cc you on my post to R-help this evening (see below).

Thanks for your help and all your development efforts,
Mark


from: Mark Kimpel [EMAIL PROTECTED]
to: Loyal Goff [EMAIL PROTECTED],
[EMAIL PROTECTED],
[EMAIL PROTECTED]
date: Tue, May 6, 2008 at 9:18 PM


On Wed, May 7, 2008 at 12:43 AM, Duncan Temple Lang [EMAIL PROTECTED]
wrote:



 Hi all

 I'm glad this made it to R-help (or R-devel) so that I saw it,
 as this is the sort of problem that should be at least CC'ed to the
 package maintainer.

 Yes, there was a change to RCurl yesterday, with one of the changes
 being to synchronize code between libcurl and RCurl regarding base64
 encoding, which was causing a segfault with recent versions of libcurl.

 The latest RCurl does not include the code for Curl_base64_encode,
 which was in the curl_base64.c file.  The intent was to link against
 the one in libcurl, but what your reports suggest is that on some
 systems this is not available from libcurl.so. Can you confirm this with
 the nm output from libcurl.so?

   nm libcurl.so | grep Curl_base64_encode

 Precisely where libcurl.so (or libcurl.so.digit...) is will vary,
 but it is probably in /usr/local/lib/ and you can see
 by using

   curl-config --libs

 and seeing if there is -Ldirectory/path in the output, which will
 tell you where it is likely to be.

 If the symbol (Curl_base64_encode) is not there, there will be no output!


 If that is the case, we will have to go back to having our own copy
 of that routine and so we will end up with two versions - one for
 the old and one for the new and the configuration will endeavor to
 determine which is appropriate.

 HTH
   D.


 Mark Kimpel wrote:
 | Martin,
 |
 | Well, thanks for jumping in! We need all the help we can get ;)
 |
 | I changed the execute bit as you suggested and recompiled, no luck, still
 | the same error message.
 |
 | Below is the output you wanted me to look at, it's a bit beyond me so I
 | include both a brief grep summary and then the whole enchilada. I do note
 | that my output is different from yours, but I'm not sure how to interpret.
 |
 | I also thought about removing curl from my system, but when starting to do
 | so with Synaptic, it looked like if I removed libcurl I would trash an awful
 | lot of my system. I did download and install the latest curl 7.18.1 on top
 | of the other one, put /usr/local/ to the start of my PATH, reinstalled
 | RCurl, and still the same error message comes up.
 |
 | So, what does it mean that the output of nm is different on our systems and
 | is it important?
 |
 | Thanks, Mark
 |
 | mkimpel-m90 ~/bin/curl-7.18.1: nm
 | /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so | grep
 | base64_encode
 |  U Curl_base64_encode
 | 3910 T R_base64_encode
 | mkimpel-m90 ~/bin/curl-7.18.1: nm
 | /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so | grep
 | Curl_base64_encode
 |  U Curl_base64_encode
 | mkimpel-m90 ~/bin/curl-7.18.1: nm
 | /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so
 |  U CDR
 | 00208aa0 d CallEntries
 | 00208c00 D CurlErrorNames
 | 0020aac0 D CurlInfoNames
 | 00209740 D CurlOptionNames
 |  U Curl_base64_decode
 |  U Curl_base64_encode
 |  U INTEGER
 |  U LENGTH
 |  U LOGICAL
 | 0020aee8 B OptionMemoryManager
 |  U PRINTNAME
 |  U RAW
 | 3f70 T

[R] interactive rotatable 3d scatterplot

2008-05-02 Thread Mark Kimpel

I would like to create a 3d scatterplot that is interactive in the sense
that I can spin it on its axes to better visualize some PCA results I have.
What are the options in R? I've looked at RGL and perhaps it will suffice
but it wasn't apparent from the documentation I found.

Any demo scripts available for a package that will work?

Mark
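
A minimal sketch of one option, assuming the rgl package is installed and an interactive graphics device is available. The data here are simulated stand-ins for real PCA input, and the names X, grp and scores are illustrative only. rgl's plot3d() opens a window in which the point cloud can be rotated with the mouse.

```r
# Simulated data standing in for a real expression matrix (assumption).
set.seed(1)
X   <- matrix(rnorm(60 * 5), nrow = 60, ncol = 5)
grp <- rep(1:3, each = 20)             # hypothetical group labels for colouring

# First three principal component scores, one row per sample.
scores <- prcomp(X, scale. = TRUE)$x[, 1:3]

# Only open the 3d window when rgl is available and the session is interactive.
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(scores, col = grp, type = "s", size = 1,
              xlab = "PC1", ylab = "PC2", zlab = "PC3")
}
```

Drag with the left mouse button to spin the plot; the scroll wheel zooms.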

-- 
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN 46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 663-0513 Home (no voice mail please)


[R] merge with rownames?

2008-04-03 Thread Mark Kimpel

Can merge be tricked into merging via rownames as opposed to via contents of
a particular column? I have two data.frames with overlapping, but out of
order, rownames, but no column contents in common and would like to merge
without cbinding the rownames to the data.frames.
Mark
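
For reference, a short sketch of how this can be done with base R: merge() accepts by = "row.names" (or by = 0) and then joins on the row names directly. The data frames a and b below are made-up examples. The result carries the keys in a column called Row.names, which can be moved back into the row names if wanted.

```r
# Two made-up data frames with overlapping, out-of-order row names
# and no columns in common.
a <- data.frame(x = 1:4, row.names = c("g1", "g2", "g3", "g4"))
b <- data.frame(y = c(10, 30, 50), row.names = c("g3", "g1", "g5"))

# Join on row names; by default only rows present in both are kept.
m <- merge(a, b, by = "row.names")

# The keys come back as a 'Row.names' column; restore them as row names.
rownames(m) <- as.character(m$Row.names)
m$Row.names <- NULL
```

Adding all = TRUE keeps rows found in only one of the inputs, filling the missing cells with NA.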

-- 
Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 663-0513 Home (no voice mail please)


Re: [R] Journal for R

2008-03-30 Thread Mark Kimpel

As a medical researcher, I keep tabs on the journals Bioinformatics and BMC
Bioinformatics. If your package is for 'omics, those are good journals to
look at.
Mark

On Sun, Mar 30, 2008 at 10:02 AM, Peter Dalgaard [EMAIL PROTECTED]
wrote:

 Gavin Simpson wrote:
  On Sun, 2008-03-30 at 15:01 +0200, Christophe Genolini wrote:
 
  Hi the list
 
  I made up a new statistical procedure. I will publish it in a medical
  journal, but there will be only the way of using it, no calculation or
  algorithm detail.
  So is there a journal (I mean scientific journal) with a selection committee
  to which I could submit an article describing the details of a package?
 
 
  There may be others, perhaps dedicated to a particular area of study
  (e.g. Computers and Geosciences, and Ecological Modelling might be
  appropriate places for the broad area of environmetrics), but the
  Journal of Statistical Software is an obvious choice for what you
  describe.
 
  www.jstatsoft.org
 
  HTH
 
  G
 
 Also, let me remind you and others that R News is also peer reviewed
 (although not yet indexed).
 It specifically solicits short to medium length articles of the
 following kinds:

* Changes in R: new features of the latest release
* Changes on CRAN: new add-on packages, manuals, binary
  distributions, mirrors,...
* Add-on packages: short introductions to or reviews of R extension
  packages
* Programmer's Niche: nifty hints for programming in R (or S)
* Hints for newcomers: Explaining sides of R that might not be so
  obvious from reading the manuals and FAQs.
* Applications: Examples of analyzing data with R

 Lengthier papers are probably better placed in the JSS, JCGS
 (J.Computational and Graphical Statistics), or CSDA (Computational
 Statistics and Data Analysis), depending on contents. For in-depth
 descriptions of packages, JSS has become quite popular in recent years

-pd
(Current associate editor for R News, past AE for JSS)

 --
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907





-- 
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN 46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 663-0513 Home (no voice mail please)


Re: [R] Rating R Helpers

2007-12-01 Thread Mark Kimpel

I'll throw one more idea into the mix. I agree with Bill that a rating
system for respondents is probably not that practical and not of the highest
importance. It also seems like a recipe for creating inter-personal problems
that the list doesn't need.

I do like Bill's idea of a review system for packages, which could be
incorporated into my idea that follows...

What I would find useful would be some sort of tagging system for messages.
I can't count the times I've remembered seeing a message that addresses a
question I have down the road but, when Googled, I can't find it. It would
be so nice, for example, to reliably be able to find all messages related to
a certain package or package function posted within the last X days. This
could be implemented as simply as asking posters to provide keywords at the
end of a message, but it would be great if they could somehow be pulled out
of a message and stored in a DB. For instance keywords could be surrounded
by a sequence of special characters, which a parser could then extract and
store in a DB along with the message.
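
A minimal sketch of the parsing step, assuming keywords are wrapped in double curly braces such as {{merge}}. That delimiter and the function name extract_keywords are purely illustrative choices, not an existing list convention.

```r
# Pull out every keyword wrapped in {{ ... }} from a message body.
# The {{ }} delimiter is a hypothetical convention chosen for this sketch.
extract_keywords <- function(msg) {
  hits <- regmatches(msg, gregexpr("\\{\\{[^{}]+\\}\\}", msg, perl = TRUE))[[1]]
  # Strip the delimiters and any surrounding whitespace.
  trimws(gsub("^\\{\\{|\\}\\}$", "", hits, perl = TRUE))
}

msg  <- "Can merge use row names? {{merge}} {{ data.frame }}"
tags <- extract_keywords(msg)
```

The resulting character vector could then be stored alongside the message ID in whatever database backs the archive.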

Of course, this would be work to set up, but how many of our experts, who
so kindly give of their time, get exasperated when similar questions keep
popping up on the list? Also, if we had a web-accessible DB, the responses,
not the responders, could be rated as to how well a reply takes care of an
issue. Thus, over time, a sort of auto-wiki could be born. I can think of
more uses for this as well. For example, a developer could quickly check to
see what usability problems or suggestions have cropped up for an individual
package.

Mark

On Dec 1, 2007 2:21 AM, [EMAIL PROTECTED] wrote:

 This seems a little impractical to me.  People respond so much at random
 and most only tackle questions with which they feel comfortable.  As
 it's not a competition in any sense, it's going to be hard to rank
 people in any effective way.  But suppose you succeed in doing so, then
 what?

 To me a much more urgent initiative is some kind of user online review
 system for packages, even something as simple as that used by Amazon.com
 for customer review of books.

 I think the need for this is rather urgent, in fact.  Most packages are
 very good, but I regret to say some are pretty inefficient and others
 downright dangerous.  You don't want to discourage people from
 submitting their work to CRAN, but at the same time you do want some
 mechanism that allows users to relate their experience with it, good or
 bad.


 Bill Venables
 CSIRO Laboratories
 PO Box 120, Cleveland, 4163
 AUSTRALIA
 Office Phone (email preferred): +61 7 3826 7251
 Fax (if absolutely necessary):  +61 7 3826 7304
 Mobile: +61 4 8819 4402
 Home Phone: +61 7 3286 7700
 mailto:[EMAIL PROTECTED]
 http://www.cmis.csiro.au/bill.venables/

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 On Behalf Of Doran, Harold
 Sent: Saturday, 1 December 2007 6:13 AM
 To: R Help
 Subject: [R] Rating R Helpers

 Since R is open source and help may come from varied levels of
 experience on R-Help, I wonder if it might be helpful to construct a
 method that can be used to rate those who provide help on this list.

 This is something that is done on other comp lists, like
 http://www.experts-exchange.com/.

 I think some of the reasons for this are pretty transparent, but I
 suppose one reason is that one could decide to implement the advice of
 those with superior or expert levels. In other words, you can trust
 the advice of someone who is more experienced more than someone who is
 not. Currently, there is no way to discern who on this list is really an
 R expert and who is not. Of course, there is R core, but most people
 don't actually know who these people are (at least I surmise that to be
 true).

 If this is potentially useful, maybe one way to begin the development of
 such ratings is to allow the original poster to rate the level of help
 from those who responded. Maybe something like a very simple
 questionnaire on a likert-like scale that the original poster would
 respond to upon receiving help which would lead to the accumulation of
 points for the responders. Higher points would result in higher levels
 of expertise (e.g., novice, ..., wizaRd).

 Just a random thought. What do others think?

 Harold








-- 
Mark W. Kimpel MD
