[R] variable importance in random forest

2005-02-28 Thread Murad Nayal


Hello,

In Breiman's papers on random forests, four variable importance measures are
described. as far as I can tell only two are available in the randomForest
R package: the reduction in accuracy when the variable is permuted,
and the mean decrease in the Gini index due to the variable (no
permutation). is this Gini measure computed on the training set or the
OOB cases? in any event, Breiman actually seems to prefer a different
measure based on average lowering of margin across all cases when the
variable is permuted. is there any way to get this 'margin-based'
variable importance measure from the result returned by the randomForest
function? or do I have to use the original Breiman code to get access to
this measure?

I am using randomForest package release 4.3

many thanks
Murad Nayal

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] unavoidable loop? a better way??

2004-11-12 Thread Murad Nayal



[Fwd: Re: [R] unavoidable loop? a better way??]

2004-11-12 Thread Murad Nayal



Re: [R] 64-bit R on Intel Xeon EM64T running fine

2004-10-12 Thread Murad Nayal


Roger D. Peng wrote:
 
 This is good news.  As far as I know R has built for quite some time
 now on a number of 64 bit platforms (Linux on AMD Opteron/Athlon64,
 Solaris/Sparc) 


I'll add SGI/IRIX 64 bit platform to the list. I've been running a 64
bit-compiled R on an SGI octane 2 for over a year now without a problem.
R sessions often allocate around 8 GB of memory.


but I can't recall seeing a build on Intel with the 64
 bit extensions.  By the way, did you happen to run `make check' just
 for kicks?
 
 -roger
 
 Michael Seewald wrote:
  Dear mailing-list members,
 
  In the days of cheap RAM and microarray applications feasting on memory,
  64-bit computers become more and more useful - to actually make use of memory
  beyond the magic 4GB border. I would like to report the success of running
  64-bit R on an Intel Xeon EM64T machine under Linux. Just like on an AMD
  Opteron, R v2.0.0 compiles fine (and out of the box) and is happily allocating
  memory until RAM and swap reach their limit.
 
  Hardware:
  - HP xw6200 workstation
  - dual Intel Xeon 3.4GHz with hyper-threading enabled
  - 4GB RAM, 4GB swap
 
  System: either
  - Fedora Core 2 x86_64 bit Linux
  or
  - Red Hat Enterprise Linux Workstation 3.0 x86_64 bit
 
  R:
  - v2.0.0
 
  Really, no problems at all during setup, a big thank you to the R developers
  making this possible!
 
  Best wishes,
  Michael
 

-- 
Murad Nayal M.D. Ph.D.
Department of Biochemistry and Molecular Biophysics
College of Physicians and Surgeons of Columbia University
630 West 168th Street. New York, NY 10032
Tel: 212-305-6884   Fax: 212-305-6926



[R] debugging non-visible functions

2004-10-12 Thread Murad Nayal


Hi,

I would like to step-through a non-visible function. but apparently I
don't know enough about namespaces to get that to work:

> methods(predict)
 ... deleted lines ...
[27] predict.rpart* predict.smooth.spline*
[31] predict.survreg.penal*

Non-visible functions are asterisked


> debug(predict.rpart)
Error: Object "predict.rpart" not found


> getAnywhere(predict.rpart)
A single object matching 'predict.rpart' was found
It was found in the following places
  registered S3 method for predict from namespace rpart
  namespace:rpart
with value

function (object, newdata = list(), type = c("vector", "prob", 
"class", "matrix"), ...) 
{
... deleted code ...
}
<environment: namespace:rpart>


> debug(predict.rpart,pos="package:rpart")
Error: Object "predict.rpart" not found


how can I 'debug' non-visible functions, like predict.rpart?
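
One approach that seems to work is to hand debug() the function object
itself, reached through the ::: operator, instead of a name that has to be
found on the search path. A minimal sketch follows. It uses predict.loess
from the stats namespace purely as a stand-in so that it runs without rpart
attached (that substitution is an assumption of the example); the analogous
call for the case above would be debug(rpart:::predict.rpart).

```r
# Stand-in for predict.rpart: predict.loess is an S3 method defined in
# the stats namespace (chosen only so the sketch runs in a bare session).

# Flag the function inside the namespace for step-through debugging.
debug(stats:::predict.loess)
flagged <- isdebugged(stats:::predict.loess)

# The next call that dispatches to predict.loess would open the browser.
# Remove the flag again so later calls run normally.
undebug(stats:::predict.loess)
cleared <- !isdebugged(stats:::predict.loess)
```

The flag is set on the function object, so it does not matter that the name
is not visible at the prompt.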

many thanks
Murad Nayal



Re: [R] silhoutte.default bugs

2004-01-22 Thread Murad Nayal

Martin Maechler wrote:
 
  Murad == Murad Nayal [EMAIL PROTECTED]
  on Wed, 21 Jan 2004 15:19:28 -0500 writes:
 
 Murad This might have been fixed in later versions (I am
 Murad using R1.7.0),
 
 yes, the bug has been fixed long ago,
 from my ChangeLog (!), it was 2003-07-18.

sorry about that. I have been reluctant to upgrade recently for fear of
disrupting my environment while in the middle of a project. as I
mentioned I searched the archive and found posts citing this problem but
no replies stating that it has been fixed (the Nj=1 case).


 
 I'm still willing to consider your *feature request* (as opposed
 to bug fix) of allowing inputs where the grouping vector does
 contain other than 1:g .

that would be great. it is straightforward to do and will broaden the
utility of silhouette. I'll send you the suggested patch privately.

best regards,
Murad



[R] silhoutte.default bugs

2004-01-21 Thread Murad Nayal


Hello all,


This might have been fixed in later versions (I am using R1.7.0); the r-help
archive contains messages reporting similar problems but no reports of
code fixes. I have encountered a couple of problems using the
silhouette function. one occurs when the clustering contains clusters
composed of 1 element (Martin Maechler posted code a few months ago that
fixes a similar problem that occurs when clusters have only 2 elements
but not the case with 1 element). the other problem is due to
silhouette's assumption that the clusters are numbered sequentially
starting at 1. one of the clustering programs I use (snob) assigns more
or less arbitrary integer ids to clusters starting from 3! (clusters 1
and 2 have special meaning in snob). the modified code fixing both
problems is included below, changes are commented.

best
Murad

silhouette.default <-
function (x, dist, dmatrix, ...) 
{
    cll <- match.call()
    if (!is.null(cl <- x$clustering)) 
        x <- cl
    n <- length(x)
    if (!all(x == round(x))) 
        stop("`x' must only have integer codes")
    k <- length(clid <- sort(unique(x)))
    if (k <= 1 || k >= n) 
        return(NA)
    if (missing(dist)) {
        if (missing(dmatrix)) 
            stop("Need either a dissimilarity `dist' or diss.matrix `dmatrix'")
        if (is.null(dm <- dim(dmatrix)) || length(dm) != 2 || 
            !all(n == dm)) 
            stop("`dmatrix' is not a dissimilarity matrix compatible to `x'")
    }
    else {
        dist <- as.dist(dist)
        if (n != attr(dist, "Size")) 
            stop("clustering `x' and dissimilarity `dist' are incompatible")
        dmatrix <- as.matrix(dist)
    }
    wds <- matrix(NA, n, 3, dimnames = list(names(x), c("cluster", 
        "neighbor", "sil_width")))
    for (j in 1:k) {
        Nj <- sum(iC <- x == clid[j])
        #
        # the following line changed from  wds[iC, "cluster"] <- j
        #
        wds[iC, "cluster"] <- clid[j]
        a.i <- if (Nj > 1) 
            colSums(dmatrix[iC, iC])/(Nj - 1)
        else 0
        #
        # the following line changed from 
        # diC <- rbind(apply(dmatrix[!iC, iC], 2, function(r) tapply(r,
        #     x[!iC], mean)))
        #
        diC <- rbind(apply(cbind(dmatrix[!iC, iC]), 2, function(r) tapply(r, 
            x[!iC], mean)))
        minC <- max.col(-t(diC))
        wds[iC, "neighbor"] <- clid[-j][minC]
        #
        # the following line changed from 
        # b.i <- diC[cbind(minC, seq(minC))]
        #
        b.i <- diC[cbind(minC, seq(along=minC))]
        s.i <- (b.i - a.i)/pmax(b.i, a.i)
        wds[iC, "sil_width"] <- s.i
    }
    attr(wds, "Ordered") <- FALSE
    attr(wds, "call") <- cll
    class(wds) <- "silhouette"
    wds
}
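
For anyone who would rather not patch the function, the two situations can
also be illustrated outside of it. The sketch below uses made-up cluster ids
(an assumption, not snob output). It recodes arbitrary ids to 1..g with a
lookup table, and shows why a one-member cluster needs care: a single-column
subset of a matrix collapses to a plain vector unless the matrix shape is
kept, which is what the cbind() in the patch is working around.

```r
# Hypothetical cluster ids, not numbered 1..g
ids <- c(7, 7, 3, 12, 3)

# Recode to consecutive integers while keeping a lookup table
clid <- sort(unique(ids))        # 3 7 12
recoded <- match(ids, clid)      # position of each id in the lookup table

# Results computed on the recoded ids can be mapped back afterwards
original <- clid[recoded]

# One-member clusters: selecting one column drops the matrix dimensions
m <- matrix(1:6, nrow = 3)
dropped <- m[, 1]                # plain vector, dim() is NULL
kept <- m[, 1, drop = FALSE]     # still a 3 x 1 matrix
```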


[R] model-based clustering

2004-01-14 Thread Murad Nayal


Hello,

I was wondering whether a Poisson mixture modeler/cluster analysis
package is available for R. I scanned CRAN packages and couldn't find
anything but I thought I'd ask. If not could anyone recommend a non-R
open source package. I have found 'snob' but this program seems a bit
hard to use in an automated, non interactive fashion.

regards,
Murad




Re: [R] model-based clustering

2004-01-14 Thread Murad Nayal

Hello Murray,

thanks for the response. I would actually love to hear alternative
suggestions about the problem I am trying to solve. I just thought a
short question will be less of a burden on people's time and have a
higher chance of being answered.

basically the data sets I need to analyze contain 2000-1 objects.
each characterized by, depending on the data set, 9-20 attributes. all
integers greater than zero, typically the range is [0,1000] with numbers
< 5 particularly common. there is no a priori reason why these objects
should cluster into discrete groups. and in fact when the data is
explored graphically (xgobi) it doesn't show an obvious clustering
pattern. however, with 9-20 dimensions involved, it is probably easy to
miss subtle patterns. I have tried clustering the data using a number of
standard approaches including hclust,kmeans,fanny etc. but these methods
didn't seem to be able to generate convincingly distinct, homogeneous
clusters. of course given the type of the data involved Poisson mixtures
seem like the natural choice.

I have experimented a bit with snob using contrived data sets (where you
know which class objects really belong to) and it has been fairly
promising, except maybe for snob's tendency to break the known classes
into multiple subclasses. 

I actually would like to try to code this in R. It would be very helpful
to me in fact if you can contribute any code/code fragments/examples
from your earlier work on this, either to the list or privately.
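
As a starting point, here is a minimal sketch of EM for a finite mixture of
univariate Poissons. It is not Murray's code and not what snob does. The
simulated data, the function name poisson_mix_em, the quantile-based
starting values and the fixed number of iterations are all assumptions made
for illustration.

```r
set.seed(1)
# Simulated counts from two Poisson components (illustrative data only)
x <- c(rpois(300, lambda = 2), rpois(200, lambda = 15))

# Hypothetical helper: EM for a k-component univariate Poisson mixture
poisson_mix_em <- function(x, k, iter = 200) {
  # Crude starting values: spread the rates over the data quantiles
  lambda <- as.numeric(quantile(x, probs = (seq_len(k) - 0.5) / k)) + 0.1
  prob <- rep(1 / k, k)
  for (i in seq_len(iter)) {
    # E-step: posterior probability of each component for each case
    dens <- sapply(seq_len(k), function(j) prob[j] * dpois(x, lambda[j]))
    resp <- dens / rowSums(dens)
    # M-step: update mixing proportions and component rates
    prob <- colMeans(resp)
    lambda <- colSums(resp * x) / colSums(resp)
  }
  list(prob = prob, lambda = lambda, cluster = max.col(resp))
}

fit <- poisson_mix_em(x, k = 2)
```

For the multivariate case one simple option, under the assumption that the
attributes are independent within a class, is to replace dpois(x, lambda[j])
by the product of the per-attribute Poisson densities.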

many thanks
Murad



[EMAIL PROTECTED] wrote:
 
 The list could probably be more useful if you gave more details about your
 data and the problem. I have written a bit of R code myself for fitting a
 finite mixture of univariate Poissons by EM and found it very simple to
 program in R. I suspect that your problem is multivariate, but that should
 not present any difficulties.
 
 The Snob program employs a fairly sophisticated model search strategy
 based on the Minimum Message Length criterion. If you do not know much
 about the solution that you are seeking it might be a good way to go. I
 appreciate that Snob can be rather complex to set up and get going but I
 think that you should be able to get quite a bit of help from the Monash
 University people behind the program. They are usually quite keen to
 encourage new users of Snob.
 
 Murray Jorgensen
 
 
 
 



[R] SVM question

2003-11-18 Thread Murad Nayal


Hello all,

I am trying to use svm (from the e1071 package) to solve a binary
classification problem. The two classes in my particular data set are
unequally populated. class 'I' (for important) has about 3000 instances
while class B (for background) has about 20,000. experimenting with
different classifiers I realized that in cases where such an asymmetry
exists there is a danger in trivially inflating accuracy levels by
biasing the classifier towards the more prevalent class. for example,
using the numbers cited above, if the testing set maintains the same
distribution of classes as the original data set then you can get an
accuracy of about 85% by simply classifying everything as a B. an
unsatisfactory classifier given the 'importance' of detecting the I
class.

which brings me to my question: I am trying to adjust for these issues
by 

- using the class.weights parameter of svm: I couldn't quite get a sense
of how to use this parameter from the svm help page (or the introductory
papers on the libsvm web site). Is this supposed to be a vector of the
priors for the two classes, i.e. c(I=.15,B=.85) (which gave me horrible
coverage of the 'I' class)? are there any 'correct' or conventional
values to use for this parameter in cases of unequal sample sizes (for
example, the 'complement' of the priors: c(I=0.85,B=0.15), on the grounds
that these values will give the two classes in the dataset equal
weights)? or is it simply another tunable parameter?

- choosing training sets that contain randomly selected but equal
numbers of cases of each class (and testing on the remaining cases. this
is repeated to assess stability of the accuracy and coverage values).
here i get mediocre accuracy but respectable coverage of I. This is
not strictly an R question, but I thought someone on the list might have
had recent experience with these types of problems and can offer some
comments about such an approach.
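
To make the numbers above concrete, here is a small base-R sketch. It
computes the accuracy of the trivial "everything is B" classifier, the
per-class coverage that exposes it, and one possible set of
inverse-frequency weights. The weighting formula is only one common
convention and is an assumption here, not something taken from the svm
documentation. The svm call itself is left as a comment so the sketch runs
without e1071.

```r
# Class labels with the sizes quoted above
y <- factor(rep(c("I", "B"), times = c(3000, 20000)))
counts <- table(y)

# Trivial classifier: call everything "B"
pred <- factor(rep("B", length(y)), levels = levels(y))
conf <- table(truth = y, predicted = pred)
accuracy <- sum(diag(conf)) / sum(conf)   # share of the majority class
recall <- diag(conf) / rowSums(conf)      # per-class coverage
balanced_accuracy <- mean(recall)

# One convention: weights inversely proportional to class frequency,
# so that both classes carry the same total weight
w <- setNames(as.numeric(sum(counts) / (length(counts) * counts)),
              names(counts))

# These could then be tried as, for example:
#   e1071::svm(x, y, class.weights = w)
```

Reporting the per-class coverage or the balanced accuracy alongside the
overall accuracy makes the bias towards the prevalent class visible.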

many thanks

 


[R] graphics reset

2003-11-16 Thread Murad Nayal


Hello,

Is there a specific command to clear the graphics window? On occasion I
need to construct plots using -only- commands that don't clear the graphics
window (like text, lines and points etc.) and hence need to clear
the graphics completely beforehand.

also, is there a way to restore the graphics parameters to default
values, say in those cases where you forgot to save the original values
and want to restore the graphics to some sane state after a long R
session?
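
A minimal sketch of the idioms I would try, assuming base graphics and
whatever device happens to be current: plot.new() starts an empty frame,
par() returns the previous values so they can be put back, and closing the
device discards any changed settings because a newly opened device starts
from the defaults.

```r
# Start a new, empty frame; text(), lines(), points() can then draw on it
plot.new()

# par() returns the old values of whatever it changes
op <- par(mfrow = c(2, 2), mar = c(1, 1, 1, 1))
plot.new()

# Restore exactly the parameters that were changed
par(op)
restored <- par("mfrow")

# If the originals were never saved: close all devices; the next plot
# opens a fresh device with default parameters
graphics.off()
```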

many thanks




Re: [R] kmeans error (bug?)

2003-11-10 Thread Murad Nayal



Prof Brian Ripley wrote:
 
 This is not a bug.  It just means that the algorithm sometimes finds an
 empty cluster, and as you asked for 34 clusters and it had 33 or less it
 stops.
 
 What to do in this situation is currently under discussion, but the advice
 given is good: try another set of initial centres.

I am running kmeans in a loop for a range of possible cluster numbers.
The error terminates the loop. is there a mechanism by which I can
'trap' the error so that I can rerun kmeans with another set of initial
centers and hence allow the loop to run to completion. something like
try {} catch() mechanism of C++ for example. A flag for kmeans that
would have it return say a NULL value rather than an error would also
help in this type of application.
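
A sketch of the kind of trapping described above, using tryCatch() from
current R (try() is the older alternative). The wrapper name safe_kmeans,
the number of attempts and the random test matrix are assumptions for
illustration; the wrapper simply reruns kmeans with new random centers when
an error is signalled and returns NULL if every attempt fails.

```r
set.seed(34)
# Illustrative data: 200 cases, 3 attributes
x <- matrix(runif(200 * 3), ncol = 3)

# Hypothetical wrapper: retry kmeans a few times, NULL if all attempts fail
safe_kmeans <- function(x, k, attempts = 5) {
  for (i in seq_len(attempts)) {
    fit <- tryCatch(kmeans(x, centers = k),
                    error = function(e) NULL)  # swallow the error, try again
    if (!is.null(fit)) return(fit)
  }
  NULL
}

# The loop over cluster numbers now runs to completion
ks <- 2:6
fits <- lapply(ks, function(k) safe_kmeans(x, k))
ok <- !sapply(fits, is.null)
```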


In fact, I wonder if anyone can point me to research, or better still R
functions/package/recipe, that help in choosing the best number of
clusters for the data. What I have tried so far is to do a manova using
the clustering result from kmeans, plot the approximate F statistic
and/or the p-value and look for cluster numbers where a sharp increase
in F or -log(pvalue) occur. what I would like to do but don't know how
is to formally compare successive clustering models. I know you can
compare models using the R function anova. but anova does not seem to
work with mlm models?


 
 Please do read the description of a bug in the R FAQ, and do not misuse
 the term to mean `something I do not understand'.

This wasn't really a declaration that this behavior is a bug, rather it
was a question of whether it is (hence the question mark). I guess what
I found somewhat confusing is that if kmeans was selecting data points
at random as the initial cluster centers then, at least initially, none
of these clusters would start out empty. It wasn't immediately clear how
further refinement could result in clusters becoming empty.

thanks for the feedback


 
 On Mon, 10 Nov 2003, Murad Nayal wrote:
 
  I have been getting the following intermittent error from kmeans:
 
  > str(cavint.p.r)
   num [1:1967, 1:13] 0.691 0.123 0.388 0.268 0.485 ...
   - attr(*, "dimnames")=List of 2
    ..$ : chr [1:1967] "6" "49" "87" "102" ...
    ..$ : chr [1:13] "HYD" "NEG" "POS" "OXY" ...
  > set.seed(34)
  > kmeans(cavint.p.r,centers=34)
  Error: empty cluster: try a better set of initial centers
 
  the seed being equal to the number of centers in this case is just a
  coincidence. I've encountered the same error with or without setting the
  seed at different numbers of clusters.
 
  there is nothing particularly unusual about cavint.p.r (no NAs, NULLs),
  except maybe for the fact that the rows sum to 1.
 
  > sum(is.na(cavint.p.r))
  [1] 0
  > sum(is.nan(cavint.p.r))
  [1] 0
  
 
  I thought kmeans should select initial centers from the data if not
  given explicitly! any idea what might be going wrong?
 
 And what makes you think it did not?
 
 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] kmeans error (bug?)

2003-11-09 Thread Murad Nayal

Hello,

I have been getting the following intermittent error from kmeans:

> str(cavint.p.r)
 num [1:1967, 1:13] 0.691 0.123 0.388 0.268 0.485 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:1967] "6" "49" "87" "102" ...
  ..$ : chr [1:13] "HYD" "NEG" "POS" "OXY" ...
> set.seed(34)
> kmeans(cavint.p.r,centers=34)
Error: empty cluster: try a better set of initial centers

the seed being equal to the number of centers in this case is just a
coincidence. I've encountered the same error with or without setting the
seed at different numbers of clusters.

there is nothing particularly unusual about cavint.p.r (no NAs, NULLs),
except maybe for the fact that the rows sum to 1.

> sum(is.na(cavint.p.r))
[1] 0
> sum(is.nan(cavint.p.r))
[1] 0
 

I thought kmeans should select initial centers from the data if not
given explicitly! any idea what might be going wrong?

I am running R 1.7.0

many thanks

Murad



[R] clustering distributions

2003-11-07 Thread Murad Nayal


Hi,

I have a dataset where each case is characterized by a histogram. I
would like to cluster these cases using a sensible distance measure,
possibly relative entropy? Is there a way I can use R facilities to do
this (hclust etc.). I couldn't find an alternative to dist that would
compute something like relative entropies. 
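
One way this could be set up, sketched under assumptions: since hclust only
needs a dist object, any pairwise dissimilarity can be computed by hand and
converted with as.dist(). Relative entropy itself is asymmetric and infinite
when a bin is empty in one histogram, so the sketch swaps in the
Jensen-Shannon divergence (a symmetrised, bounded variant). The toy
histograms and the helper names kl and js are made up for illustration.

```r
# Toy histograms, one row per case (bin counts)
h <- rbind(a = c(10, 5, 1, 0),
           b = c(9, 6, 2, 0),
           c = c(0, 1, 5, 10),
           d = c(1, 0, 6, 9))
p <- h / rowSums(h)              # normalise each row to a distribution

# Relative entropy, treating 0 * log(0) as 0
kl <- function(u, v) sum(ifelse(u > 0, u * log(u / v), 0))

# Jensen-Shannon divergence: average relative entropy to the midpoint
js <- function(u, v) {
  m <- (u + v) / 2
  (kl(u, m) + kl(v, m)) / 2
}

# Fill the full pairwise matrix, then hand it to hclust as a dist object
n <- nrow(p)
d <- matrix(0, n, n, dimnames = list(rownames(p), rownames(p)))
for (i in seq_len(n)) for (j in seq_len(n)) d[i, j] <- js(p[i, ], p[j, ])

hc <- hclust(as.dist(d), method = "average")
groups <- cutree(hc, k = 2)
```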

thanks
Murad



[R] anova model refinement/clustering question

2003-10-23 Thread Murad Nayal


Hi,

I am trying to refine models of a continuous response variable and a
number of categorical predictor variables. I know of some model
refinement tools available in R that help in the selection of model
terms like dropterm and addterm from MASS etc. However, I would also
like to try to refine the model by 'coalescing' some levels of some of
the predictor factors. Is there a standard procedure / R-functions that
will allow me to do this.

This might be naive but I thought that one way to do this is to perform
a pairwise comparison between all levels, say using TukeyHSD, and
coalesce levels that do not have a statistically significant difference
in the average of the response variable between them. so in a way this
becomes a clustering problem. is there a relatively easy way to do this
in R, say short of trying to figure out how to make the relevant
TukeyHSD output look like a dist object and trick hclust into using it?
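
Whichever way the candidate levels are chosen, the coalescing step itself
and a formal comparison of the two nested models can be sketched in a few
lines. The simulated data and the choice to merge levels "a" and "b" are
assumptions for illustration only, and this does not address the multiple
testing issues that come with data-driven merging.

```r
set.seed(1)
# Simulated data: levels a and b share a mean, c differs
g <- factor(rep(c("a", "b", "c"), each = 20))
y <- rnorm(60, mean = c(a = 0, b = 0, c = 2)[as.character(g)])

# Coalesce levels a and b into a single level
g2 <- g
levels(g2) <- list(ab = c("a", "b"), c = "c")

# The reduced model is nested in the full one, so anova() can compare them
full <- lm(y ~ g)
reduced <- lm(y ~ g2)
cmp <- anova(reduced, full)
```

A large p-value in cmp suggests the merged factor fits about as well as the
original one.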

I am somewhat of an amateur in the field (and R) and I am probably
making that obvious. any guidance to the 'right' path to approach this
(privately or on the list) is really appreciated.

many thanks
Murad





[R] multiple character matching within a string

2003-10-11 Thread Murad Nayal


Hello all,

I need to count the number of times certain characters occur in a
string. The only way I have found so far to accomplish this is by using
strsplit i.e.

my.string <- "DDDRRHIH"
my.char   <- "D"
num.char <- -1 + length(unlist(strsplit(my.string,my.char)))

now you probably won't be surprised if I say that this has proven to be
extremely slow (I am not sure exactly why though, is it because strsplit
creates a new list for every call?). Is there an alternative way to do
this short of going to compiled code?
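
One alternative that stays in interpreted code, sketched here with a
hypothetical helper name count_char: delete the character with gsub() and
compare string lengths. It is vectorised over the strings, and it also
avoids an edge case of the strsplit version, which undercounts when the
character is the last one in the string (strsplit does not return a trailing
empty piece).

```r
# Hypothetical helper: occurrences of a single character ch in each string
count_char <- function(s, ch) {
  nchar(s) - nchar(gsub(ch, "", s, fixed = TRUE))
}

my.string <- "DDDRRHIH"
n_d <- count_char(my.string, "D")
n_h <- count_char(my.string, "H")

# Works on a whole vector of strings in one call
n_vec <- count_char(c("DDD", "RRD", "HIH"), "D")

# The strsplit approach on a string ending in the character
n_split <- -1 + length(unlist(strsplit("RRD", "D")))
```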

many thanks,







[R] EMACS/ESS problems

2003-10-02 Thread Murad Nayal


Hello all,

since we're on the topic of R-editors. I am using emacs/ess on a unix
workstation to interact with R and have been having a little problem. I
usually write the R commands I need to run in a separate buffer then
copy and paste them into the *R* buffer for evaluation. The problem is,
if any command is spread over multiple lines emacs/R hangs when I paste
it in the R buffer for evaluation. if I use a debugger to see what's
going on in both programs they're usually waiting on a select statement
(input/output). Has anybody had to deal with a similar situation? any
advice for a workaround? both emacs/ess are relatively recent versions
(installed a few months ago). I tried using ess-eval-buffer/region
instead of cutting and pasting and the same thing happens for me.

many thanks



Re: [R] EMACS/ESS problems

2003-10-02 Thread Murad Nayal

Hi,


A.J. Rossini wrote:
 
 1. I've never seen this behavior, ever.  Do you get the same with C-c C-r
 (highlight region, then C-c C-r sends to the R process in Emacs).  Or,
 if you use C-c C-n to step through the lines?

maybe my environment is not set up correctly. C-c C-r doesn't do
anything:

highlight: (either in a text buffer or *ESS*)

v = c(1,
  2,
  3)

C-c C-r
switch to *R*
v
Error: Object v not found

OR

ess-eval-region
prints in the one line buffer at the bottom: 
Starting evalutions...
and hangs (I have to stop it with C-g)

now if v was defined on one line 
v=c(1,2,3)
ess-eval-region returns with Finished evaluation. but the v vector is
still not defined in R

I am sure I am doing something silly. just can't figure out what?

 
 2. [EMAIL PROTECTED] might be a better place to send this.

thanks for the advice. I'll try that too.

 
 
 
 --
 [EMAIL PROTECTED]http://www.analytics.washington.edu/
 Biomedical and Health Informatics   University of Washington
 Biostatistics, SCHARP/HVTN  Fred Hutchinson Cancer Research Center
 UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
 FHCRC  (M/W): 206-667-7025 FAX=206-667-4812 | use Email



Re: [R] EMACS/ESS problems

2003-10-02 Thread Murad Nayal


that was exactly what I was missing, Everything now works as advertised.
Thank you all so much for the help. you just turned my already very
satisfying experience using R into an even more enjoyable one.

all the best

Rich Heiberger wrote:
 
 It looks like you are not using ESS correctly.  ESS is designed to
 work from a buffer containing a file whose name has the .r extension.
 Thus, open a file, for example,
C-x C-f myfile.r
 and then start using R.
 
 My diagnosis is based on your line
  highlight: (either in a text buffer or *ESS*)
 
 ESS won't work from a text buffer because R code has different
 requirements from ordinary written paragraphs in English or another
 natural language.
 
 The *ESS* buffer is part of the background mechanism that makes ESS work.
 It is not intended that a user ever look at the *ESS* buffer.
 
 One other issue that your original email suggests.  We do not
 recommend the statement
v=c(1,2,3)
 for assignment.  It is much better to use the assignment arrow
v <- c(1,2,3)
 (with spaces on both sides of the arrow for legibility).  It is true
 that the = will usually do what you expect, but there are some subtle
 differences (mostly in argument lists to functions).  While I can expand on
 the reasons, for the moment I just want to suggest that you get into the
 habit of using the assignment arrow.
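
A small sketch of the argument-list difference mentioned above. The function
total and the names vals_arg and w_vals are made up for illustration. Inside
a call, = names an argument, whereas <- performs an assignment in the
calling environment and then passes the value.

```r
# Hypothetical function with one argument
total <- function(vals_arg) sum(vals_arg)

# '=' inside a call matches the argument by name; nothing is assigned
r1 <- total(vals_arg = 1:3)

# '<-' inside a call assigns w_vals in the caller, then passes its value
r2 <- total(w_vals <- 1:3)

made_arg <- exists("vals_arg", inherits = FALSE)  # no such variable created
made_w <- exists("w_vals", inherits = FALSE)      # variable was created
```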
 
 Rich



[R] lattice question

2003-08-29 Thread Murad Nayal


Hello,

I am using lattice to plot histograms of one variable conditioned on
another continuous variable. for this I am using equal.count on the
conditioning variable to get the appropriate shingle. I would like to
have in my plot a representation of the shingle's intervals including
the min/max values and maybe tick marks. some sort of axis for the
conditioning variable. the 'strips' of the lattice plot do
represent the shingle intervals as a darkly shaded region, but I can't find a
way to also include in the plot the actual min/max numbers corresponding
to the shingle's intervals. is that possible?

regards



[R] help on barplot

2003-07-20 Thread Murad Nayal


Hello,

I am trying to compare two histograms using barplot. the idea is to plot
the histograms as pairs of columns side by side for each x value. I was
able to do it using barplot before but I can't remember for the life
of me how I did it in the past:

> d
  [,1] [,2]
-37.5 0.00 2.789396e-05
-32.5 0.0001394700 5.578801e-05
-27.5 0.0019804742 1.732218e-02
-22.5 0.0217294282 1.380474e-01
-17.5 0.0938912134 4.005579e-02
-12.5 0.0630683403 4.351464e-03
-7.5  0.0163179916 8.368201e-05
-2.5  0.0025941423 5.578801e-05
2.5   0.0002789400 0.00e+00
7.5   0.00 0.00e+00

> barplot(d,beside=TRUE)

barplot here plots two separate 'sets' of columns, on the left side a
bar plot of d[,1] is plotted while on the right side a separate bar plot
of d[,2] is plotted. how can I combine the two?
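
For the record, a sketch of what is probably going on: with beside=TRUE,
barplot draws one group of bars per column of the matrix, so a matrix with
one column per histogram gives two separate sets. Transposing it makes each
x value a group holding one bar from each histogram. The small matrix and
the legend labels below are made up for illustration.

```r
# Two toy histograms as columns, x values as row names
d <- cbind(c(0.1, 0.6, 0.3), c(0.2, 0.5, 0.3))
rownames(d) <- c("-2.5", "2.5", "7.5")

# One group per column of the argument, so transpose: each x value
# becomes a group with one bar from each histogram
mids <- barplot(t(d), beside = TRUE,
                legend.text = c("first", "second"))
```

The returned matrix of bar midpoints has one row per histogram and one
column per x value, which is handy for adding text or lines afterwards.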

actually, while on the subject of histograms: is it possible to plot a
3D-histogram in R (a true 3D bar plot, without using image)?

many thanks
Murad




Re: [R] Formal definitions of R-language.

2003-07-17 Thread Murad Nayal


some comments. I am still learning S/R so please let me know if I am
missing something.

M.Kondrin wrote:
 
 Hello!
 Some CS-guys (the type who knows what Church formalism is) keep asking
 me questions about formal definitions of R-language that I can not
 answer (or even understand). Is there some freely available papers which
 I can throw at them where it would be explained is R
 functional/OOP/procedural language,

R is object oriented in the sense that the basic entities of the
language (data elements, language elements etc.) are complex entities
with type and defined behavior (objects). it has inheritance,
polymorphism, operator overloading etc. you can define constructors for
objects, but as far as I know no destructors (you can define clean up
routines on libraries though).
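
As a rough illustration of the constructor, polymorphism and inheritance points, here is a small sketch using old-style (S3) classes. The names circle, square, area and describe are made up for the example and are not part of R:

```r
# Constructors: plain functions that build a list and attach a class vector.
circle <- function(r)    structure(list(r = r),       class = c("circle", "shape"))
square <- function(side) structure(list(side = side), class = c("square", "shape"))

# A generic function dispatches on the class of its first argument.
area <- function(s, ...) UseMethod("area")
area.circle <- function(s, ...) pi * s$r^2
area.square <- function(s, ...) s$side^2

# A method defined for the parent class "shape" is inherited by both.
describe <- function(s, ...) UseMethod("describe")
describe.shape <- function(s, ...) paste("shape with area", format(area(s)))

a1 <- area(circle(1))      # dispatches to area.circle
a2 <- area(square(2))      # dispatches to area.square
msg <- describe(square(2)) # falls through to describe.shape
```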

it has elements of a functional language, like the fact that
programs are composed of expressions that are turned into function
objects that get evaluated. it also supports lambda expressions
(anonymous functions). it is not a pure functional language because
functions can and do have side effects, it has persistent state and
assignments, and it has flow of control statements. also, recursion, as
far as I know, is inefficient in S/R, which tends to discourage purely
functional programming.
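
A short sketch of both sides of this, an anonymous function passed to another function and a function with persistent state and a side effect (make_counter is a made-up name for the example):

```r
# Functional flavour: an anonymous function passed as an argument.
squares <- sapply(1:3, function(i) i^2)

# Not purely functional: a closure that keeps and modifies state.
make_counter <- function() {
  n <- 0
  function() {
    n <<- n + 1   # assignment in the enclosing environment (a side effect)
    n
  }
}

counter <- make_counter()
first  <- counter()
second <- counter()   # same call, different result
```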

 does it use weak/strong,

weak typing. variables are not typed; it is the objects that keep their
type information once constructed. conversion between types is often
automatic or can be programmed to be so, hence operations on disparate
types can often be carried out.
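
A few examples of the automatic conversions, as a sketch. The last line is a caveat rather than a contradiction: not every combination is converted silently, so how "weak" the typing is depends on the operation.

```r
x <- 1
x <- "now a string"          # the variable itself has no fixed type

t1 <- typeof(1L + 2.5)       # integer promoted to double
t2 <- typeof(TRUE + TRUE)    # logicals promoted to integer
t3 <- typeof(c(1, "a"))      # numbers coerced to character when combined

# Arithmetic on a character string is refused rather than converted.
res <- tryCatch("1" + 1, error = function(e) "error")
```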

 dynamic/static typization,

it is dynamically typed. objects carry and supply type information at
run time. types (as well as behavior) can (only) be defined at run time.
the S-evaluator has to start first, it then constructs class and
function definitions in the run-time environment. object type can be
changed, modified and augmented at run time, at least with old style
classes, (can you add or remove slots in the new style classes?).
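
A small sketch of what this looks like with old-style classes (the class name "reading" and the components are made up for the example):

```r
obj <- list(value = 1)
before <- class(obj)          # "list"

class(obj) <- "reading"       # change the object's class at run time
obj$unit <- "mV"              # add a component at run time

# Behaviour can also be defined at run time, after the object exists.
format.reading <- function(x, ...) paste(x$value, x$unit)
shown <- format(obj)
```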

 does it use lazy or ...(do not know what)
 evaluation,

uses lazy evaluation of function arguments. argument expressions are
passed to the function unevaluated and are not evaluated until their
value is actually needed.
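
A minimal demonstration, assuming nothing beyond base R: an argument that would raise an error is harmless as long as the function never touches it (ignore_arg and use_arg are made-up names).

```r
# The argument is never used, so stop() is never run.
ignore_arg <- function(x) "x was never touched"
result <- ignore_arg(stop("this would fail if evaluated"))

# force() evaluates the argument, so here the error does occur.
use_arg <- function(x) {
  force(x)
  "done"
}
outcome <- tryCatch(use_arg(stop("boom")),
                    error = function(e) "error raised")
```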

 what sort of garbage collector it uses?

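R does have a garbage collector: objects that are no longer reachable
are reclaimed automatically, and gc() forces a collection by hand. it is
a generational collector rather than plain reference counting.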


 Thanks.
 



Re: [R] help on R programming.

2003-06-24 Thread Murad Nayal


thank you all for the replies, they have been very helpful.

regards

Prof Brian Ripley wrote:
 
 On Mon, 23 Jun 2003, Murad Nayal wrote:
 
  - what is the correct way to -remove- a component from a list. this
  seems to do the trick: list[[1]] = NULL, however, you'd think this
  should simply attach a NULL object at the first component position?
 
 This is in the FAQ, section 3.3.3, and is an S/R difference that catches
 people quite often.  It's related to the difference between [] and [[]].
 
 Generally you will find that it is better to program by generating whole
 lists with lapply() or to copy lists retaining what you want (which does
 not copy the components, in general, and so is cheap).
 
 As for your comments on books: `S Programming' does discuss the design of
 classes (both informal and formal), the main data structures in R. As
 others have said, the Green Book (Chambers, 1998) is by no means out of
 date, except in the sense that the precise language it describes has never
 been available: it is not a description of any version of S-PLUS nor R.
 
 Generally, though, you need to make sure you have at your fingertips the
 resources which come with R: the various manuals (including R-lang) and
 the on-line help.  For example, I have just spent several days documenting
 in the help pages exactly how subscripting of data frames works (and
 correcting dozens of anomalies and bugs).
 
 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UK  Fax:  +44 1865 272595
 



[R] help on R programming.

2003-06-23 Thread Murad Nayal


Hello all,

I am looking for books to help me gain a firmer grasp of the S/R
programming language, programming, data structures etc. it seems that
for this purpose two books are typically recommended:

Programming with Data: A Guide to the S Language, by John M. Chambers, and
S Programming, by Venables & Ripley.

- The Chambers book was published in 1998. is it a bit dated at this point?
- is Venables and Ripley's book a good source on the design and
manipulation of data structures in R (it seems mostly focused on R
extensions)?
- are there any other books, possibly published more recently, that you
could recommend?


I also have a couple of particular programming questions:

- coming from a C++/Java programming background, I found that I often end
up in R with lists of objects (each constructed, in turn, as a list, say
list(x=x,y=y,z=z)). often, these individual objects have recursive
'attributes', so a matrix representation of this set of objects is not an
option, although a data.frame might be. I typically need to access
certain attributes of these objects for plotting or analysis etc.
however, I have not been able to come up with a clean way to do this.
e.g.

object.list = list(o1=list(x=1,y=2,z=3), o2=list(x=11,y=22,z=33))

what I would like to do is say get a vector of x values for the objects
in object.list, but something like
object.list[[1:length(object.list)]]$x, for example, returns NULL.

is there a better way to set up such an object list data structure that
will allow me to do this?
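
One common idiom for this, sketched here with the object.list from above, is to apply an extractor over the list with sapply() (or lapply() when the components are not simple scalars). Note that [[ with a vector index does not loop over the elements: it indexes recursively, so object.list[[1:2]] means object.list[[1]][[2]].

```r
object.list <- list(o1 = list(x = 1, y = 2, z = 3),
                    o2 = list(x = 11, y = 22, z = 33))

# Pull the x component out of every object; the result is a named vector.
xs <- sapply(object.list, function(o) o$x)

# Equivalent, passing the extractor function `[[` directly.
xs2 <- sapply(object.list, `[[`, "x")

# Recursive indexing: first element of the list, then its second component.
nested <- object.list[[1:2]]
```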

- what is the correct way to -remove- a component from a list. this
seems to do the trick: list[[1]] = NULL, however, you'd think this
should simply attach a NULL object at the first component position?  
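
The two behaviours can be seen side by side in a short sketch: assigning NULL with [[ ]] (or $) drops the component, while assigning list(NULL) with single brackets keeps the position and stores a NULL there. The list lst is made up for the example.

```r
lst <- list(a = 1, b = 2, c = 3)

# Assigning NULL with [[ ]] removes the component entirely.
removed <- lst
removed[["a"]] <- NULL

# Assigning list(NULL) with [ ] keeps the slot and stores NULL in it.
kept <- lst
kept["a"] <- list(NULL)

# Copying only the wanted components is another way to drop one.
subset_copy <- lst[c("b", "c")]
```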

many thanks for any help
