Re: [R] Month end calculations

2007-08-29 Thread Jim Porzak
Hi Shubha,


By using the tautology that the end of a month is immediately followed
by the first of a month, the following returns a TRUE when the date is
the last day of a month

IsMonthEnd <- format(MyDates + 1, "%d") == "01"

where MyDates is a vector, or column in a data frame, typed as Date
(eg with as.Date)
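
A minimal sketch of the idea, using made-up dates. The second flag (IsLastObs) is my assumption about what SAS-style last. processing would need when the series skips calendar days (e.g. trading-day data); it is not part of the one-liner above.

```r
# Hypothetical daily dates, sorted ascending (note the leap-year February).
MyDates <- as.Date(c("2007-01-30", "2007-01-31", "2007-02-27",
                     "2007-02-28", "2008-02-29", "2008-03-01"))

# Calendar month end: the following day is the first of a month.
IsMonthEnd <- format(MyDates + 1, "%d") == "01"
MyDates[IsMonthEnd]

# Last *observed* date within each month (assumes MyDates is sorted):
# flag the final occurrence of each year-month key.
IsLastObs <- !duplicated(format(MyDates, "%Y-%m"), fromLast = TRUE)
MyDates[IsLastObs]
```

Either logical vector can then be used to subset a data frame before doing the month-end calculation.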

-- 
HTH,
Jim Porzak
Responsys, Inc.
San Francisco, CA
http://www.linkedin.com/in/jimporzak

On 8/29/07, Shubha Vishwanath Karanth [EMAIL PROTECTED] wrote:
 Hi R users,



 Is there a function in R, which does some calculation only for the month
 end in a daily data?... In other words, is there a command in R,
 equivalent to last. function in SAS?



 BR, Shubha



 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] Text Mining

2007-07-06 Thread Jim Porzak
Also see Ingo Feinerer's tm package and his nice vignette.

On 7/6/07, LE PAPE Gilles [EMAIL PROTECTED] wrote:

 Hi everybody,
 I am a new R user. Is there any package devoted to text mining analysis
 in
 R ?
 Thanks
 Gilles
 [EMAIL PROTECTED]





-- 
HTH,
Jim Porzak
Responsys, Inc.
San Francisco, CA
http://www.linkedin.com/in/jimporzak



Re: [R] Random Forest

2007-04-23 Thread Jim Porzak
Ruben,

I'm assuming you really do want to do a classification?

check out
?factor

I'm guessing you have coded MMS_ENABLED_HANDSET as 0, 1; or some such
numeric coding.

suggest you do:
dat$MMS_ENABLED_HANDSET <- factor(dat$MMS_ENABLED_HANDSET)
to force your response variable to be a factor (AKA categorical)

And, perhaps, label your levels with something like:
levels(dat$MMS_ENABLED_HANDSET) <- c("Not Enabled", "MMS Enabled")
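
A small self-contained sketch of that conversion, on a made-up data frame (the 0/1 coding is my guess at how your column looks):

```r
# Hypothetical stand-in for the real data: response coded 0/1 as numeric.
dat <- data.frame(MMS_ENABLED_HANDSET = c(0, 1, 1, 0, 1))

# Force the response to be categorical, then relabel the levels.
# Levels sort as "0", "1", so the labels must be given in that order.
dat$MMS_ENABLED_HANDSET <- factor(dat$MMS_ENABLED_HANDSET)
levels(dat$MMS_ENABLED_HANDSET) <- c("Not Enabled", "MMS Enabled")

is.factor(dat$MMS_ENABLED_HANDSET)
table(dat$MMS_ENABLED_HANDSET)
```

The same thing in one step is factor(x, levels = c(0, 1), labels = c("Not Enabled", "MMS Enabled")), which makes the mapping explicit.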

On 4/23/07, Ruben Feldman [EMAIL PROTECTED] wrote:
 Hi R-wizards,

 I ran a random forest on a dataset where the response variable had two
 possible values. It returned a warning telling me that it did regression and
 if that was really what I wanted.
 Does anybody know what is being done in terms of the algorithm when it does a
 regression? (the random forest is used as a regression, how does that work?)

 Thanks for your time!

 Ruben





-- 
HTH,
Jim Porzak
San Francisco, CA
http://www.linkedin.com/in/jimporzak



Re: [R] Random Forest

2007-04-23 Thread Jim Porzak
Google "random forests" & see Leo Breiman's site, Wikipedia, & esp. the link
at the bottom of the Wikipedia page to Andy & Matt's article in R News.

I did a DMA/AC webinar in January. Slides are at:
http://www.porzak.com/JimArchive/JimPorzak_RFwithR_DMAAC_Jan07_webinar.pdf


On 4/23/07, Ron Michael [EMAIL PROTECTED] wrote:
 Dear all R gurus,

 I am really sorry if my query embarrasses anyone. Can anyone give me some 
 introductory papers or suggestions about what Random Forest is?

 Thanks and regards,

 - Original Message 
 From: Weiwei Shi [EMAIL PROTECTED]
 To: Ruben Feldman [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Sent: Monday, April 23, 2007 8:56:29 PM
 Subject: Re: [R] Random Forest

 Hi, Ruben:

 fit$confusion

 if you provide your test data, then you can also access the confusion
 matrix of test data by

 fit$test$confusion

 there are details of how to use randomForest by reading:
 ?randomForest

 HTH,

 Weiwei

 On 4/22/07, Ruben Feldman [EMAIL PROTECTED] wrote:
  Hi,
 
  I am trying to print out my confusion matrix after having created my random
  forest.
  I have put in this command:
  fit <- randomForest(MMS_ENABLED_HANDSET~.,data=dat,ntree=500,mtry=14,
  na.action=na.omit,confusion=TRUE)
   but I can't get it to give me the confusion matrix, anyone know how this
  works?
 
  Thanks!
 
  Ruben
 
 


 --
 Weiwei Shi, Ph.D
 Research Scientist
 GeneGO, Inc.

 Did you always know?
 No, I did not. But I believed...
 ---Matrix III











-- 
HTH,
Jim Porzak
San Francisco, CA
http://www.linkedin.com/in/jimporzak



Re: [R] Random Forest

2007-04-22 Thread Jim Porzak
Ruben,

just type fit or print(fit). confusion = TRUE will not be recognized by
randomForest.

If you are not seeing the confusion matrix, MMS_ENABLED_HANDSET is not a
factor and, thus, a regression fit is being done, not classification as you
apparently desire.
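
To illustrate the difference, here is a sketch on the built-in iris data rather than your dat (which I don't have). It assumes the randomForest package is installed; iris_num is just an illustrative name.

```r
library(randomForest)
set.seed(1)

# Factor response: classification, so a confusion matrix is available.
fit <- randomForest(Species ~ ., data = iris, ntree = 500)
print(fit)       # prints the OOB error estimate and the confusion matrix
fit$type
fit$confusion

# Same data with the response coded as numbers: a regression forest is
# grown instead (with a warning about few unique values), and there is
# no confusion matrix in the result.
iris_num <- transform(iris, Species = as.numeric(Species))
fit_reg <- randomForest(Species ~ ., data = iris_num, ntree = 100)
fit_reg$type
is.null(fit_reg$confusion)
```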

On 4/22/07, Ruben Feldman [EMAIL PROTECTED] wrote:

 Hi,

 I am trying to print out my confusion matrix after having created my
 random
 forest.
 I have put in this command:
 fit <- randomForest(MMS_ENABLED_HANDSET~.,data=dat,ntree=500,mtry=14,
 na.action=na.omit,confusion=TRUE)
 but I can't get it to give me the confusion matrix, anyone know how this
 works?

 Thanks!

 Ruben





-- 
HTH,
Jim Porzak
San Francisco, CA
http://www.linkedin.com/in/jimporzak



Re: [R] any way to append a table in SQL server

2007-03-21 Thread Jim Porzak
see...
 library(RODBC)
 ?sqlSave

and rest of RODBC docs. You could also pass a SQL INSERT statement
through sqlQuery()
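
A rough sketch of the sqlSave() route. The DSN "my_sqlserver_dsn", the table "my_table" and the helper name append_rows are all hypothetical; substitute your own, and check ?sqlSave for the column-type options.

```r
# Append the rows of a data frame to an existing table over ODBC.
# Nothing here touches a database until the function is actually called.
append_rows <- function(dsn, df, table) {
  channel <- RODBC::odbcConnect(dsn)
  on.exit(RODBC::odbcClose(channel))
  # append = TRUE adds to the existing table instead of creating it;
  # rownames = FALSE avoids writing R's row names as an extra column.
  RODBC::sqlSave(channel, df, tablename = table,
                 append = TRUE, rownames = FALSE)
}

# Example call (not run here, needs a real data source):
# append_rows("my_sqlserver_dsn", new_rows, "my_table")
```

The sqlQuery() alternative would pass a literal statement such as "INSERT INTO my_table (a, b) VALUES (1, 2)" over the same channel.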

On 3/20/07, Wensui Liu [EMAIL PROTECTED] wrote:
 Dear Lister,
 Is there an interface in R with SQL server that allows me to append
 records to table in the DB? Might I do that using RODBC?
 Thanks a lot.

 --
 WenSui Liu
 A lousy statistician who happens to know a little programming
 (http://spaces.msn.com/statcompute/blog)




-- 
HTH/Best,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA
http://www.linkedin.com/in/jimporzak



Re: [R] plot.randomForest default mtry values

2007-03-15 Thread Jim Porzak
Joe,

I'm guessing you are doing a 2-category problem. The three lines are
OOB errors for overall error and each of the two categories.

There is only one default value of mtry. You can specify a different
mtry when the forest is built (in your call to randomForest()), but it
applies to the entire forest.
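
A sketch of both points on a two-class subset of iris (an assumption, since I don't have your data; iris2 and oob are illustrative names). The three plotted series are the columns of err.rate, and comparing mtry values means growing one forest per value:

```r
library(randomForest)
set.seed(1)

# Two-class problem: drop one species and its unused factor level.
iris2 <- droplevels(subset(iris, Species != "setosa"))

fit <- randomForest(Species ~ ., data = iris2, ntree = 200)
colnames(fit$err.rate)   # overall OOB error plus one column per class
plot(fit)                # one line per column of err.rate

# mtry is fixed per forest, so fit a separate forest for each value
# and compare the final OOB error.
oob <- sapply(1:4, function(m) {
  f <- randomForest(Species ~ ., data = iris2, ntree = 200, mtry = m)
  f$err.rate[f$ntree, "OOB"]
})
names(oob) <- paste("mtry", 1:4, sep = "=")
oob
```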

On 3/15/07, Joseph Retzer [EMAIL PROTECTED] wrote:
 When using the plot.randomForest method, 3 error series (by number of trees) 
 are plotted. I suspect they are associated with the 3 default values of mtry 
 that are used, for example, in the tuneRF method but I'm not sure. Could 
 someone confirm?

 Also, is it possible to force different values of mtry to be used when 
 creating the plots? I specified them explicitly in the randomForest statement 
 but it did not seem to have an effect.
 Many thanks,
 Joe Retzer




-- 
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA
http://www.linkedin.com/in/jimporzak



Re: [R] R book advice

2007-02-15 Thread Jim Porzak
Hi Paul,

All three are excellent choices, so you won't go wrong with random
choice. Here is your first R lesson:

RBooks <- c("Verzani", "Crawley", "Dalgaard")
sample(RBooks, 1)

Seriously, I expect you will end up with all three. Here are my
mini-reviews (in order of publication):

Peter Dalgaard's book came out just before I first discovered R in the
winter of 2002. It was my intro to R and a good stats refresher.
Charles' assessment is correct. At only ~250 pages, it is not at all
intimidating; however, Peter does build up to some intermediate topics
like logistic regression and survival analysis. My copy is now
somewhat tattered & I should get a replacement!

John Verzani had, and still has, a preliminary version of his book on
CRAN: http://cran.cnr.berkeley.edu/doc/contrib/Verzani-SimpleR.pdf so
I was very excited when it came out in hard copy - much expanded - as
"Using R". He has more visualization examples - which I like. I do
wish John would have used <- instead of = for assignment. It's
important to start thinking in R; = drags me back to my FORTRAN
days.

Being a mid-western American, I love Michael Crawley's British view of
the world! He really forces you to get an intuitive feel for what is
going on. Also good visualization emphasis. My only criticism is he
suggests using Word to save your work. You should really use a more
serious text editor/environment. I generally use JGR today, having
moved from RWinEdt and TextPad. The Linux folks love ESS, but that is
how they were brought up.

On 2/15/07, Paul Lynch [EMAIL PROTECTED] wrote:
 I'm looking for a book for someone completely ignorant of statistics
 who wishes to learn both statistics and R.  I've found three
 possibilities, one by Verzani (Using R for Introductory Statistics),
 one by Crawley (Statistics: An Introduction using R), and one by
 Dalgaard (Introductory Statistics with R).  Do these books have
 different emphases, perspectives, or strengths?  Should I just pick
 one at random and buy it?

 Thanks,
 --Paul




-- 
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA
http://www.linkedin.com/in/jimporzak



Re: [R] JGR data editor question

2007-02-10 Thread Jim Porzak
Hi Bob,

I cannot reproduce your problem, with a possible exception in your step 2:
In the data editor, you need to click off of the last cell you edited for
the changes to take.

On 2/10/07, Muenchen, Robert A (Bob) [EMAIL PROTECTED] wrote:
 Hi All,

 I'm learning JGR 1.4-15 with R 2.4.1 in Windows XP (all patches
 applied). JGR looks great but I'm having trouble getting the data editor
 to save my results. I don't see anything in R-help about it. Here are
 the steps I followed:

 1. I chose Tools > Object Browser & double-clicked on a data frame,
 mydata.
 2. A spreadsheet editor popped up and allowed me to make changes.
 3. I clicked Update at the bottom right of the data editor screen.
 4. It asked, Export to R? and has Export as: mydata filled in.
 5. I clicked Yes and then closed the window by clicking the usual [X]
 in the top right corner.
 6. Double-clicking the data file again opened it back up but the changes
 were gone.

 Am I missing a step?

 Thanks,
 Bob

 =
 Bob Muenchen (pronounced Min'-chen), Manager
 Statistical Consulting Center
 U of TN Office of Information Technology
 200 Stokely Management Center, Knoxville, TN 37996-0520
 Voice: (865) 974-5230
 FAX: (865) 974-4810
 Email: [EMAIL PROTECTED]
 Web: http://oit.utk.edu/scc,
 News: http://listserv.utk.edu/archives/statnews.html




-- 
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA
http://www.linkedin.com/in/jimporzak



Re: [R] R in Industry - new SIG

2007-02-08 Thread Jim Porzak
Thanks Max (& Martin)!

I was about to encourage this. Once the head hunters get wind of this,
I expect a lot of activity - hopefully most will be relevant.

Max, I'd be willing to chip in if you need admin help.

-- 
Best,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA
http://www.linkedin.com/in/jimporzak


On 2/8/07, Kuhn, Max [EMAIL PROTECTED] wrote:
 Martin Maechler called my bluff on this suggestion. I'm now the admin
 for the new special interest group for R related job postings:

https://stat.ethz.ch/mailman/listinfo/r-sig-jobs

 Please send appropriate emails to this list. There are some simple rules
 for postings (e.g. no attachments etc).

 Max

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Kuhn, Max
 Sent: Tuesday, February 06, 2007 5:10 PM
 To: Doran, Harold; R-help@stat.math.ethz.ch
 Subject: Re: [R] R in Industry

 As someone who has (reluctantly) sent job postings to R Help, I think
 that a SIG would be a good idea.

 Max

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Doran, Harold
 Sent: Tuesday, February 06, 2007 2:08 PM
 To: R-help@stat.math.ethz.ch
 Subject: [R] R in Industry

 The other day, CNN had a story on working at Google. Out of curiosity, I
 went to the Google employment web site (I'm not looking, but just
 curious). In perusing their job posts for statisticians, preference is
 given to those who use R and python. Other languages, S-Plus and
 something called SAS were listed as lower priorities.

 When I started using Python, I noted they have a portion of the web site
 with job postings. CRAN does not have something similar, but think it
 might be useful. I think R is becoming more widely used in industry and
 I wonder if helping it move along a bit, the maintainer of CRAN could
 create a section of the web site devoted to jobs where R is a
 requirement.

 Hence, we could have our own little monster.com kind of thing going
 on. Of the multitude of ways the gospel can be spread, this is small.
 But, I think every small step forward is good.

 Anyone think this is useful?

 Harold






Re: [R] comparing random forests and classification trees

2007-01-31 Thread Jim Porzak
Amy, et al,

I agree with you and the group that comparing test set classification
errors between the two methods is the way to go.
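
A minimal sketch of such a comparison, on iris with an arbitrary 100/50 train/test split (both are my assumptions, as are the object names; it also assumes rpart and randomForest are installed):

```r
library(rpart)
library(randomForest)
set.seed(1)

# Hold out 50 rows that neither model sees during fitting.
test_idx <- sample(nrow(iris), 50)
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

tree_fit <- rpart(Species ~ ., data = train, method = "class")
rf_fit   <- randomForest(Species ~ ., data = train, ntree = 500)

tree_pred <- predict(tree_fit, newdata = test, type = "class")
rf_pred   <- predict(rf_fit, newdata = test)

# Observed vs predicted on the same test rows for both methods.
table(observed = test$Species, predicted = tree_pred)
table(observed = test$Species, predicted = rf_pred)

# Test-set misclassification rates, directly comparable.
err <- c(rpart        = mean(tree_pred != test$Species),
         randomForest = mean(rf_pred != test$Species))
err
```

Because both rates come from the same held-out rows, the difference is a fair estimate of what the single tree gives up.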

On interpretation, I find the partial dependence plots from
randomForest are useful - especially when talking to clients about
what the forest means. See slides 32 to 38 in my recent DMA
presentation below for some examples. (When looking at the plots for
continuous variables, it's really important to pay attention to the
decile rug plot on the x-axis so as to not get distracted by the edges
which apply to a small part of the population)
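
For anyone who has not tried them, a sketch of one such plot on iris (not the client data from the slides; the variable and class chosen here are arbitrary):

```r
library(randomForest)
set.seed(1)

fit <- randomForest(Species ~ ., data = iris, ntree = 200)

# Partial dependence of the "versicolor" class on Petal.Width.
# rug = TRUE draws the decile marks of Petal.Width along the x-axis.
pp <- partialPlot(fit, pred.data = iris, x.var = "Petal.Width",
                  which.class = "versicolor", rug = TRUE)

# The x grid and the partial dependence values are returned invisibly.
str(pp)
```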

I would argue that, except for simple textbook examples, a full
classification tree is not all that easy to interpret. Sure, anyone
can walk through each branch, but the overall meaning gets lost in the
trees.

http://loyaltymatrix.com/JimPorzak_RFwithR_DMAAC_Jan07_webinar.pdf


On 1/30/07, Darin A. England [EMAIL PROTECTED] wrote:
 Amy,

 I have also had this issue with randomForest, that is, you lose the
 ability to explain the classifier in a simple way to
 non-specialists (everyone can understand the single decision tree.)
 As far as comparing the accuracy of the two, I think that you are
 correct in comparing them by the actual vs predicted tables.
 randomForest reports this as the confusion matrix, and it also
 reports the out-of-bag error, which I think you are referring to. I
 would not compare the rf out-of-bag error with the rpart relative
 error (or cross-validated error if you are doing cross validation.)

 So, for what it's worth I think you are correct. Also, do you know
 about ctree in the party package? If you want to retain the
 explanatory power of a single tree and have a nice accurate
 classifier, I have found ctree to work quite well.

 HTH,

 Darin

 On Mon, Jan 29, 2007 at 11:34:51AM +1100, Amy Koch wrote:
  Hi,
 
  I have done an analysis using 'rpart' to construct a Classification Tree. I
  am wanting to retain the output in tree form so that it is easily
  interpretable. However, I am wanting to compare the 'accuracy' of the tree
  to a Random Forest to estimate how much predictive ability is lost by using
  one simple tree. My understanding is that the error automatically displayed
  by the two functions is calculated differently so it is therefore incorrect
  to use this as a comparison. Instead I have produced a table for both
  analyses comparing the observed and predicted response.
 
  E.g. table(data$dependent,predict(model,type="class"))
 
  I am looking for confirmation that (a) it is incorrect to compare the error
  estimates for the two techniques and (b) that comparing the
  misclassification rates is an appropriate method for comparing the two
  techniques.
 
  Thanks
 
  Amy
 
 
 
 
 
  Amelia Koch
 
  University of Tasmania
 
  School of Geography and Environmental Studies
 
  Private Bag 78 Hobart
 
  Tasmania, Australia 7001
 
  Ph: +61 3 6226 7454
 
  [EMAIL PROTECTED]
 
 
 
 




-- 
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] help with RandomForest classwt option

2007-01-28 Thread Jim Porzak
See Andy's previous post on this.

-- 
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA


===
Liaw, Andy [EMAIL PROTECTED]  Thu, Oct 27, 2005 at 8:37 AM
To: David L. Van Brunt, Ph.D. [EMAIL PROTECTED], Gabor
Grothendieck [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
classwt in the current version of the randomForest package doesn't work
too well.  (It's what was in version 3.x of the original Fortran code by
Breiman and Cutler, not the one in the new Fortran code.)  I'd advise
against using it.

sampsize and strata can be used in conjunction.  If strata is not
specified, the class labels will be used.  Take the iris data as an example:

randomForest(Species ~ ., iris, sampsize=c(10, 30, 10))

says to randomly draw 10, 30 and 10 from the three species (with
replacement) to grow each tree.  If you are unsure of the labels, use a named
vector, e.g.,

randomForest(Species ~ ., iris,
sampsize=c(setosa=10, versicolor=30, virginica=10))

Now, if you want the stratified sampling to be done using a different
variable than the class labels; e.g., for multi-centered clinical trial
data, you want to draw the same number of patients per center to grow each
tree (I'm just making things up, not that that necessarily makes any sense),
you can do something like:

randomForest(..., strata=center,
sampsize=rep(min(table(center)), nlevels(center)))

which draws the same number of patients (minimum at any center) from each
center to grow each tree.
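
For the unbalanced two-class case asked about below, the same idea amounts to down-sampling the majority class. A sketch on simulated data (the data, the roughly 8% rare class, and all object names are made up for illustration):

```r
library(randomForest)
set.seed(1)

# Simulated unbalanced problem: "rare" is a small minority.
n <- 1000
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- factor(ifelse(x$x1 + rnorm(n) > 2, "rare", "common"))
table(y)

# Draw the same number of cases from each class for every tree:
# as many as the minority class has.
n_min <- min(table(y))
fit <- randomForest(x, y, strata = y,
                    sampsize = rep(n_min, nlevels(y)), ntree = 200)
fit$confusion
```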

Hope that's clear.  Eventually all such things will be in the yet to be
written package vignette...

Andy


On 1/28/07, Betty Health [EMAIL PROTECTED] wrote:
 Hello there,

 I am working on an extremely unbalanced two class classification problems. I
 wanna use classwt with down sampling together. By checking the rfNews()
 in R, it looks that classwt is not working yet. Then I looked at the
 software from Salford. I did not find the down sampling option.  I am
 wondering if you have any experience to deal with this problem. Do you know
 any method or softwares can handle this problem?

 Thank you very much!!

 Betty





[R] xlab, ylab in balloonplot(tab)?

2006-07-03 Thread Jim Porzak
I'm not understanding something.

I'm trying to add xlab & ylab to a balloon plot of a table object. From docs
I thought the following should work:

require(gplots)
# From balloonplot example:
 # Create an example using table
 xnames <- sample( letters[1:3], 50, replace=2)
 ynames <- sample( 1:5, 50, replace=2)

 tab <- table(xnames, ynames)

 balloonplot(tab)

# Try xlab, ylab:
balloonplot(tab, xlab = "MyX", ylab = "MyY")


But second plot is no different from first.

R.version.string: Version 2.3.1 (2006-06-01)
gplots version: 2.3.0
on WinXP SP1

-- 
TIA,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] distribution of daily rainfall values in binned categories

2006-06-27 Thread Jim Porzak
?hist

 read about breaks
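
A sketch of what that looks like with the categories from the question, on simulated rainfall (the rexp() data and the object names are made up). cut() plus table() gives the counts directly; hist() needs a finite top break, so the last one is set just above the maximum:

```r
set.seed(42)
rain <- rexp(365, rate = 1/4)   # hypothetical daily rainfall, mm/day

# Open-ended top category via Inf; intervals are [lower, upper).
breaks <- c(0, 1, 2.5, 5, 10, 20, Inf)
bins <- cut(rain, breaks = breaks, right = FALSE,
            labels = c("0-1", "1-2.5", "2.5-5", "5-10", "10-20", "20+"))
freq <- table(bins)
freq
barplot(freq, xlab = "mm/day", ylab = "Number of days")

# The same counts from hist(), without drawing the plot.
h <- hist(rain, breaks = c(0, 1, 2.5, 5, 10, 20, max(20, rain) + 1),
          right = FALSE, plot = FALSE)
h$counts
```

A bar plot of the table is usually easier to read here than the histogram itself, since hist() plots densities when the bins have unequal widths.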

On 6/27/06, etienne [EMAIL PROTECTED] wrote:

 Hi,

 I'm a newbie in using R and I would like to have a few
 clues as to how I could compute and plot a
 distribution of daily rainfall intensity in different
 categories.  I have daily values (mm/day) for several
 years and I need to show the frequency of 0-1, 1-2.5,
 2.5-5, 5-10, 10-20, 20+ mm/day.  Can this be done
 easily?

 Thanks,
 Etienne





-- 
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



[R] Get list of ODBC data sources?

2006-05-22 Thread Jim Porzak
Hello R Helpers,

Before setting up a connection with RODBC, I would like to present my
users with a pick list of ODBC data sources available in their
environment. I may be missing something, but don't see anything in
RODBC itself to return list of sources for use in select.list(). Any
hints?
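
For reference, more recent RODBC releases than the one I am running include odbcDataSources(), which returns a named character vector of the available DSNs. A sketch of the pick list built on it, assuming such a release is installed (pick_dsn is a hypothetical helper name):

```r
# Present the available ODBC data sources and return the chosen DSN name.
# Nothing is queried until the function is called.
pick_dsn <- function() {
  sources <- RODBC::odbcDataSources()   # names are DSNs, values are driver descriptions
  select.list(names(sources), title = "ODBC data source")
}

# Example use (interactive, not run here):
# channel <- RODBC::odbcConnect(pick_dsn())
```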

I'm running 2.3.0 on Win XP SP2.

-- 
TIA,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] data management on R

2006-03-23 Thread Jim Porzak
C <- cbind(A[, 1], B[, 2])
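
Note that cbind() on two bare vectors returns a matrix without column names. A sketch that rebuilds the two datasets from the question as data frames and keeps the names x and n (C2 is just an illustrative name):

```r
# The two datasets from the question.
A <- data.frame(x = c(1, 3), y = c(2, 4))
B <- data.frame(m = c(1, 7), n = c(2, 8))

# Take column x from A and column n from B, keeping their names.
C <- data.frame(x = A$x, n = B$n)
C

# Equivalent: single-bracket indexing keeps each piece a data frame.
C2 <- cbind(A["x"], B["n"])
```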


On 3/23/06, zhijie zhang [EMAIL PROTECTED] wrote:
 Dear friends,
  i have two dataset: A and B
 A:
  x  y
 1  2
 3  4

 B:
  m  n
  1   2
 7   8

 How to generate datasetC:
 C:
  x  n
  1   2
 3   8
  i know sas can do it easily, what about R?

 --
 Kind Regards,




--
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] Plotting FAQ?

2006-03-03 Thread Jim Porzak
Then there is the Graphing section of Paul Johnson's Rtips ...

http://pj.freefaculty.org/R/Rtips.html#5

On 3/3/06, Dan Bolser [EMAIL PROTECTED] wrote:
 Hi,

 Since I started to make some 'final' plots of my data I found that I
 have tons of questions related to 'the little things'. Rather than
 bother the list with all the questions (ahem), or search the archives
 for similar questions and translate the context, I would like to find a
 FAQ for plotting in particular (and R programming in general). I know
 for sure (searching the list) that my questions have been answered many
 times and in many different contexts, however, I can't find any list of
 generic (best) solutions to common problems.

 For example, (a bit on the 'details' side, but...)

 How do I make my y axis labels / names appear horizontally?

 How do I put a plot within a plot?

 How do I scale the legend text in barplot(...,legend=T)?

 How do I generate a legend just like barplot(...,legend=T) using legend()?

 How do I give my axis labels a bit more space? (Shift the left/bottom of
 the plotting area right/up from the left/bottom of the device area)?


 These questions spring to mind because they are problems I am trying to
 deal with, I am sure you could imagine loads of more basic plotting
 questions.

 An FAQ is a great place to archive all the best community knowledge
 about what library is good for what functionality and what 'tips 
 tricks' have the coolest code.

 Where should I look?

 Thanks for any help,
 Dan.




--
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] fft

2006-02-10 Thread Jim Porzak
Bill,

?fft

Do you have a specific question?
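
In the meantime, a minimal sketch of a typical use: finding the dominant frequency of a sampled signal. The signal, the 100 Hz sampling rate and the object names are all made up for illustration.

```r
# Hypothetical signal: 1 second sampled at 100 Hz,
# a 5 Hz sine plus a weaker 12 Hz sine.
fs <- 100
n  <- 100
t  <- (0:(n - 1)) / fs
x  <- sin(2 * pi * 5 * t) + 0.5 * sin(2 * pi * 12 * t)

X <- fft(x)

# One-sided amplitude spectrum: keep the first n/2 bins and rescale.
amp  <- Mod(X)[1:(n / 2)] / (n / 2)
freq <- (0:(n / 2 - 1)) * fs / n

plot(freq, amp, type = "h", xlab = "Frequency (Hz)", ylab = "Amplitude")
freq[which.max(amp)]   # frequency of the strongest component
```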

On 2/9/06, Bill Hunsicker [EMAIL PROTECTED] wrote:
 R-help:

 I need to do a fft on a data set.  I was wondering if any guidance may
 be available.

 Regards,
 Bill

 Bill Hunsicker
 RF Micro Devices
 7625 Thorndike Road
 Greensboro, NC  27409
 336-678-5260
 610-597-9985(m)




--
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] Glossay of available R functions

2006-02-04 Thread Jim Porzak
Alexandre & Patricia,

As Bert Gunter periodically points out:
Newbies (and others!) may find useful the R Reference Card made available by
Tom Short and Rpad at http://www.rpad.org/Rpad/Rpad-refcard.pdf  or through
the Contributed link on CRAN (where some other reference cards are also
linked). It categorizes and organizes a bunch of R's basic, most used
functions so that they can be easily found. For example, paste() is under
the Strings heading and expand.grid() is under Data Creation. For
newbies struggling to find the right R function as well as veterans who
can't quite remember the function name, it's very handy.

I still keep a hard copy of Tom Short's reference card handy, as do
most of my colleagues at Loyalty Matrix.
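
From within R itself, a couple of base functions also help when you half-remember a name. A small sketch (the search terms are just examples):

```r
# Functions on the search path whose names match a pattern.
apropos("subset")

# Everything exported by the base package, as a character vector.
base_funs <- ls("package:base")
length(base_funs)
"subset" %in% base_funs

# Narrow the list with a regular expression, e.g. names starting with "paste".
grep("^paste", base_funs, value = TRUE)
```

help.search("some topic") does the same kind of lookup across the help pages of all installed packages.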

--
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA

On 1/30/06, Patricia J. Hawkins [EMAIL PROTECTED] wrote:
  ASA == Alexandre Santos Aguiar [EMAIL PROTECTED] writes:

 ASA I am new to R and read this list to learn. It is amazing how
 ASA frequently new functions pop in messages. Useful and timesaving
 ASA functions like subset (above) must be documented somewhere.

 ASA Is there a glossary of functions?

 I'm also new to R, and was wondering the same thing.  Took a bunch of
 tries, but if you run help.start() and then choose Packages, then
 Base, you will get the list of functions.

 As a newcomer, I hesitate to suggest this, but maybe there should be a
 comment on the index page to that effect?

 --
 Patricia J. Hawkins
 Hawkins Internet Applications
 www.hawkinsia.com





Re: [R] R training courses

2006-02-02 Thread Jim Porzak
Internally, we did an R Programming Study Group using Thomas Lumley's
excellent slides

http://faculty.washington.edu/tlumley/Rcourse/

over 2 1/2 weeks. Individual success depended on individual motivation. The
couple of folks that needed to come up to speed quickly did so.

I broke Thomas's course into 7 sections.  Assigned sections every other day
(Mon, Wed, Fri, ...). We used our internal wiki to exchange questions,
comments, hints, etc.

Our need was really R programming. If your needs are more statistically
focused, you could use one of the introductory texts.

--
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA

On 2/2/06, Wei Qiu [EMAIL PROTECTED] wrote:

 I am also looking for this kind of courses. Any suggestion will be greatly
 appreciated. Lucy

 On Thu, 2 Feb 2006, Walker, Russell wrote:

  Hi All,
 
  I am interested in learning about people's experience with R training or
  courses. What worked, what didn't? What do you recommend?
 
  Also, if there are any groups or individuals that have and can offer R
  training courses, please contact me directly. I would like to learn
  about your services.
 
  Thanks for your input and help. Please feel free to contact me directly.
 
  Russ
 
  Russell Walker, Ph.D.
  Senior Strategist
  CapitalOne Financial, Inc.
  1500 Capital One Drive
  Richmond, VA 23238 USA
  Internal Zip 12074-0340
  External +1 804 855-3512
  [EMAIL PROTECTED]
 
  The information contained in this e-mail is confidential and...{{dropped}}
 
 
 
 





Re: [R] In which application areas is R used?

2006-01-23 Thread Jim Porzak
 my favorites: customer/marketing analytics

--
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] reg peak detection

2005-12-08 Thread Jim Porzak
Ram,
See the excellent thread here last month.
Search for "finding peaks".
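
For archive readers who cannot locate that thread, one common base-R idiom is sketched below. It is not taken from the thread referred to above. The function name find_peaks and the sample vector are made up, the sketch handles one-dimensional data only, and it ignores flat plateaus. For two-dimensional data it could be applied to each row or column in turn.

```r
# A strict local maximum is a point where the slope changes from
# positive to negative, i.e. the sign of the first difference
# drops from +1 to -1.
find_peaks <- function(x) {
  which(diff(sign(diff(x))) == -2) + 1
}

x <- c(1, 3, 2, 5, 7, 6, 4, 8, 1)   # made-up data
peaks <- find_peaks(x)
peaks        # positions of the local maxima
x[peaks]     # their values
```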

On 12/8/05, SHRIRAM R SAMPAT [EMAIL PROTECTED] wrote:

 Hallo everybody,

 I am doing a thesis in video extensometry and one of my
 approaches requires peak detection in
 two-dimensional data.

 I would be grateful if anyone can throw some light on
 this for me by giving me some hints on how to do it or
 give me some links for it.

 Thank you very much in advance.

 Ram





--
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] getting started, reading listing and saving data

2005-11-22 Thread Jim Porzak
Ronnie, try

?head
?tail

and one I also find useful after loading:
?str
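
A minimal sketch of the whole round trip Ronnie describes, for later readers. The file names are temporary files and the data frame is invented so that the example is self-contained. The "bad restore file magic number" error commonly appears when load() is pointed at a file that was not written by save(), for example the original text file.

```r
# 1. Make a small CSV to stand in for the real file, then read it
csv_file <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, value = (1:10)^2), csv_file,
          row.names = FALSE)
mydata <- read.csv(csv_file)

# 2. List a few observations and check the structure
head(mydata, 3)
tail(mydata, 3)
str(mydata)

# 3. Save the object (note the quoted file name in real use)
rdata_file <- tempfile(fileext = ".RData")
save(mydata, file = rdata_file)

# 4. Load it back, here after removing it to prove the point
rm(mydata)
load(rdata_file)
nrow(mydata)
```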


On 11/22/05, Ronnie Babigumira [EMAIL PROTECTED] wrote:
 Dear List
 I am new to R and to the list and will try best as I can be clear and
 concise. My apologies if anything I write contravenes the posting code
 on this list. I would also like to say I have run through most of the
 material on the R website before writing this email however, I am
 stuck.

 Here is what I want to do and what I have done

 1. Read a comma separated text file into R
 I have used read.csv and it seems to have worked

 2. List a few observations to make sure the right stuff came in
 I have failed to find a command that allows me to list a few
 observations and would appreciate some help on this (I have used edit
 which pops up a spreadsheet however, I would prefer a command that
 allows me to list a few observations for inspection)

 3. Save this data as an R dataset
 I can't seem to figure this out (I tried  save(mytextfile, file =
 myrdata) but when I try to load what I saved, I get an error message
 Error: bad restore file magic number (file may be corrupted) -- no data
 loaded

 4. Load this r dataset and proceed to work on it

 I would appreciate some help.




--
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] sorting of data

2005-10-26 Thread Jim Porzak
One of the most useful functions:
?subset

as in

nsmSubData <- subset(nsmalldata, SESSIONID==7757513)
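
A self-contained sketch of the same idea, for archive readers without Bill's file. The small data frame below is invented and only borrows his column names. The bracket form on the last line is the base indexing equivalent.

```r
# Invented stand-in for the real nsmalldata
nsmalldata <- data.frame(
  BOARDNUMBER = "LB0DC",
  SESSIONID   = c(7757513, 7757515, 7757513, 7757520),
  PS1         = c(3.65, 3.60, 3.70, 3.65)
)

# Keep only the rows where SESSIONID matches
nsmSubData <- subset(nsmalldata, SESSIONID == 7757513)
nsmSubData

# Equivalent with logical indexing
nsmSubData2 <- nsmalldata[nsmalldata$SESSIONID == 7757513, ]
```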


On 10/26/05, Bill Hunsicker [EMAIL PROTECTED] wrote:
 R-Help,



 I am trying to reduce a data set to rows where a specified value occurs
 in a specified column.  Below is a screen capture of Rgui:



  nsmalldata <- read.csv("c:\\DATA\\UNITY\\PASS0_DOWNFADE\\nsmall.csv")

  nsmalldata

     BOARDNUMBER SESSIONID MATRIXID ARRAYPOINT Temperature PS1 PS2 PS13 PS14 PS15
 1   LB0DC  3043  7757513  1  -  -  -  3.65  -5
 2   LB0DC  3043  7757515  1  -  -  -  3.65  -5
 3   LB0DC  3043  7757517  1  -  -  -  3.65  -5
 4   LB0DC  3043  7757520  1  -  -  -  3.65  -5
 5   LB0DC  3043  7757522  1  -  -  -  3.65  -5
 6   LB0DC  3043  7757524  1  -  -  -  3.65  -5
 7   LB0DC  3043  7757526  1  -  -  -  3.65  -5
 8   LB0DC  3043  7757528  1  -  -  -  3.65  -5
 9   LB0DC  3043  7757531  1  -  -  -  3.65  -5
 10  LB0DC  3043  7757533  1  -  -  -  3.65  -5
 11  LB0DC  3043  7757535  1  -  -  -  3.65  -5
 12  LB0DC  3043  7757537  1  -  -  -  3.65  -5
 13  LB0DC  3043  7757540  1  -  -  -  3.65  -5
 14  LB0DC  3043  7757542  1  -  -  -  3.65  -5
 15  LB0DC  3043  7757544  1  -  -  -  3.65  -5
 16  LB0DC  3043  7757547  1  -  -  -  3.65  -5
 17  LB0DC  3043  7757549  1  -  -  -  3.65  -5
 18  LB0DC  3043  7757551  1  -  -  -  3.65  -5
 19  LB0DC  3043  7757554  1  -  -  -  3.65  -5
 20  LB0DC  3043  7757556  1  -  -  -  3.65  -5

 



 For Example

 I would like to reduce nsmalldata to only the rows where
 SESSIONID==7757513

 I have spent quite a bit of time with R manuals and still have not
 gotten there, can you help me?



 Thanks in advance.



 Regards,

 Bill



 Bill Hunsicker

 RF Micro Devices

 7625 Thorndike Road

 Greensboro, NC  27409

 336-678-5260(W)

 610-579-9985(M)







--
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] R OLAP engines, an integration?

2005-10-14 Thread Jim Porzak
Hi Emmanuel,

We are doing some work along these lines. See www.OpenI.org for
details of our open source OLAP solution & contact our CTO, Sandeep
Giri (via link on that site), for details. We haven't released any R
integration yet, but we are doing some things internally & it is on
the OpenI development roadmap.

I think OLAP definitely needs integration with some hard analytics.
Sure, our business analysts are very comfortable moving around in OLAP
space & come up with some amazing insights, but OLAP just provides
simple counts, sums etc. No sense of estimated errors or tests of
significance. And, of course, no advanced techniques.

One technical point: some methods would require a "drill-through" to
the underlying data set. For example, while a mosaic plot can be
generated from a 2-dimensional OLAP result table, generating a box
plot corresponding to an OLAP bar chart (my personal favorite) needs
the raw data points.
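
To make the distinction concrete, here is a rough sketch using the built-in mtcars data as a stand-in for a fact table. Nothing in it talks to an OLAP engine. The table of counts plays the role of a 2-dimensional OLAP result.

```r
# Aggregated counts: the kind of result an OLAP query returns.
# A mosaic plot needs nothing more than this table.
counts <- table(cyl = mtcars$cyl, gear = mtcars$gear)
mosaicplot(counts, main = "Mosaic plot from counts only")

# An OLAP-style bar chart would show one summary value per group ...
means <- tapply(mtcars$mpg, mtcars$cyl, mean)
barplot(means, main = "Mean mpg by cyl")

# ... but the matching box plot needs every underlying row,
# which is where the drill-through comes in.
boxplot(mpg ~ cyl, data = mtcars, main = "mpg by cyl, raw rows")
```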

We welcome everyone interested in this idea to join in the discussion
& effort. The OpenI forum
http://sourceforge.net/forum/?group_id=142873 is probably a better
place than here.


On 10/14/05, Emmanuel Maroye [EMAIL PROTECTED] wrote:
 Hi.

 I am a consultant at KAE: Marketing Intelligence (http://www.kae.co.uk) 
 working on market evaluation and forecasting.  Working on large datasets I am 
 looking for a solution to use R on datasets stored in an OLAP engine (like 
 MIS Alea, Applix TM1 or Mondrian).  Have you ever heard about such a solution?

 The idea is to apply R methods directly on data stored in an OLAP.  The 
 results being part of the OLAP as well (results write back)...

 Thanks in advance.

 Best regards,

 Emmanuel




 _
 Emmanuel Maroye

 kae: marketing intelligence
 209 - 215 Blackfriars Road
 London SE1 8NL
 United Kingdom
 D +44 20 7960  3358
 M +44 7914 010 728
 F +44 20 7960  3301
 E [EMAIL PROTECTED]





--
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA



Re: [R] adding 1 month to a date

2005-10-12 Thread Jim Porzak
 OTOH,

  seq(as.Date("2004-01-31"), by = "month", length = 14)
  [1] "2004-01-31" "2004-03-02" "2004-03-31" "2004-05-01" "2004-05-31"
  [6] "2004-07-01" "2004-07-31" "2004-08-31" "2004-10-01" "2004-10-31"
 [11] "2004-12-01" "2004-12-31" "2005-01-31" "2005-03-03"

 I would prefer to see dates forced to be within each month, not
 leaking into next month.

 IOW:
  [1] "2004-01-31" "2004-02-29" "2004-03-31" "2004-04-30" "2004-05-31", etc.
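
 One way to get that month-end behaviour with base R, sketched under the
 assumption that the starting date is itself the last day of a month:
 build the sequence from the first of the following month, then step back
 one day. The variable names are made up.

```r
# First day of each month that follows the month ends we want
firsts <- seq(as.Date("2004-02-01"), by = "month", length.out = 14)

# The day before the first of a month is the last day of the previous one
month_ends <- firsts - 1
format(month_ends[1:5])
```

 Because the first of the month exists in every month, seq() never has to
 roll a date forward, so nothing leaks into the next month.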


 --
 Jim Porzak
 Loyalty Matrix Inc.
 San Francisco, CA


 On 10/12/05, Marc Schwartz [EMAIL PROTECTED] wrote:
  Thanks to Prof. Ripley for pointing this out.
 
  One of the approaches that I had considered here was to set up a vector
  of the number of days in each month (adjusting of course for leap
  years), and use day arithmetic to add/subtract the appropriate number
  of days.
 
  However, it was easier to use seq.Date() and to further consider putting
  a wrapper around it to make it yet even easier to use.
 
  Marc
 
  On Wed, 2005-10-12 at 13:23 +0100, Prof Brian Ripley wrote:
   On Wed, 12 Oct 2005, bogdan romocea wrote:
  
Simple addition and subtraction works as well:
 as.Date("1995/12/01",format="%Y/%m/%d") + 30
If you have datetime values you can use
 strptime("1995-12-01 08:00:00",format="%Y-%m-%d %H:%M:%S") + 30*24*3600
where 30*24*3600 = 30 days expressed in seconds.
  
   Sorry, not in general, as a month is not generally of 30 days (including
   in your example).
  
   seq.Date is a good way to do this.
  
   
   
-Original Message-
From: Marc Schwartz [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 11, 2005 10:16 PM
To: t c
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] adding 1 month to a date
   
   
On Tue, 2005-10-11 at 16:26 -0700, t c wrote:
Within an R dataset, I have a date field called date_.
(The dates are
in the format YYYY-MM-DD, e.g. 1995-12-01.)
   
How can I add or subtract 1 month from this date, to get
1996-01-01 or
1995-11-01.
   
There might be an easier way to do this, but using seq.Date(), you can
increment or decrement from a Time 0 by months:
   
Add 1 month:
   
This takes your Time 0, generates a 2 element sequence (which begins
with Time 0) and then takes the second element:
   
seq(as.Date("1995-12-01"), by = "month", length = 2)[2]
[1] "1996-01-01"
   
   
   
Subtract 1 month:
   
Same as above, but we use 'by = "-1 month"' and take the
second element:
   
seq(as.Date("1995-12-01"), by = "-1 month", length = 2)[2]
[1] "1995-11-01"
   
   
See ?as.Date and ?seq.Date for more information. The former
function is
used to convert from a character vector to a Date class object. Note
that in your case, the date format is consistent with the default. Pay
attention to the 'format' argument in as.Date() if your dates
should be
in other formats.
   
HTH,
   
Marc Schwartz
 
 




Re: [R] Replicate

2005-09-16 Thread Jim Porzak
Hi Marc,

 x = c(1,1,1,2,2,2,3,3,3,3)
 unique(x)
[1] 1 2 3


Being a database guy myself, it took me a while to think "unique"
rather than "distinct".

-- 
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA


On 9/16/05, Marc Bernard [EMAIL PROTECTED] wrote:
 Dear All,
 
 I have a vector x = (1,1,1,2,2,2,3,3,3,3)
 I am looking for a function to return a vector containing  the distinct 
 elements of x i,e y = (1,2,3)
 
 The following code gives the desired results:
 
 as.numeric(levels(as.factor(x)))
 
 Is there any other elegant  way?
 
 Thanks,
 
 B
 
 
 
 




Re: [R] [handling] Missing [values in randomForest]

2005-09-11 Thread Jim Porzak
On 9/11/05, Uwe Ligges [EMAIL PROTECTED] wrote:
 Jan-Paul Roodbol wrote:
 
  Does anyone know if randomForest in R can handle
  dataset with missings?
 
 See ?randomForest, you can omit observations including NAs by specifying
 na.action=na.omit

Uwe, 
While strictly true, this tells randomForest to ignore any rows with
one or more NAs in the predictor variables.

Since randomForest is often used for problems with a lot of
(candidate) predictors, na.omit can result in a lot of rows being
discarded. Hence, my reply to Jan-Paul's original posting suggesting
the impute functions in randomForest.
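
For archive readers: the impute functions in the randomForest package are na.roughfix() and rfImpute(). The sketch below is not the package code. It is a plain base-R approximation of the rough-fix idea (median for numeric columns, most frequent level for factors), with a made-up function name and data, shown next to what na.omit would keep.

```r
# Base-R approximation of median/mode imputation (hypothetical helper)
rough_fix <- function(df) {
  for (nm in names(df)) {
    col  <- df[[nm]]
    miss <- is.na(col)
    if (!any(miss)) next
    if (is.numeric(col)) {
      col[miss] <- median(col, na.rm = TRUE)       # numeric: median
    } else if (is.factor(col)) {
      col[miss] <- names(which.max(table(col)))    # factor: modal level
    }
    df[[nm]] <- col
  }
  df
}

# Made-up data with one NA in each column
d <- data.frame(x = c(1, 2, NA, 4, 100),
                g = factor(c("a", "b", "b", NA, "b")))

nrow(na.omit(d))        # rows left after discarding incomplete cases
d_fixed <- rough_fix(d)
nrow(d_fixed)           # all rows kept, NAs filled in
```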

Jim Porzak

 Please do not cross-post!
 Please specify a sensible subject!
 
 Uwe Ligges
 
 
  Thank you
 
  Kind regards
 
  Jan-Paul
 
 




Re: [R] Plot of multiple data sets

2005-09-09 Thread Jim Porzak
Paul's book has been available in the states since mid-Aug. It is on my
local bookseller's shelf (&, of course, mine).

I would recommend it to anyone doing more than "off the shelf"
graphics in R. As expected, an especially good look at grid.

Hopefully it will be available in Europe soon.

-- 
Best,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA


On 9/9/05, Chris Buddenhagen [EMAIL PROTECTED] wrote:
 I found a lot of answers at this type of problem website wrt graphics and
 multiple plots- I bet the book will be useful when it comes out.
 
 
 
 http://www.stat.auckland.ac.nz/~paul/RGraphics/rgraphics.html
 
 
 
 Chris Buddenhagen, Botany Department, Charles Darwin Research Station, Santa
 Cruz,Galapagos. Mail: Charles Darwin Foundation, Casilla 17-01-3891 Avenida
 6 de Diciembre N36-109 y Pasaje California Quito, ECUADOR
 
 
 
 
 
 
 




Re: [R] data manipulation

2005-09-08 Thread Jim Porzak
Also see Hadley Wickham's reshape package for more bells & whistles.
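
For later readers, a minimal sketch with base reshape() (the function Thomas points to in the quoted reply), applied to a data frame typed in from Bernard's example. The object names wide and long are made up.

```r
# Bernard's data in its original (wide) layout
wide <- data.frame(
  ID   = c(1, 1, 2, 2, 2),
  time = c(1, 2, 1, 2, 3),
  X1   = c("x111", "x112", "x121", "x122", "x123"),
  X2   = c("x211", "x212", "x221", "x222", "x223")
)

# Stack X1 and X2 into a single column X, recording the source in 'type'
long <- reshape(wide, direction = "long",
                varying = list(c("X1", "X2")), v.names = "X",
                timevar = "type", times = c("X1", "X2"),
                idvar = c("ID", "time"))

# Sort to match the layout asked for and tidy up
long <- long[order(long$ID, long$type, long$time), ]
long$type <- factor(long$type)
rownames(long) <- NULL
long
```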
-- 
HTH!
Jim Porzak
Loyalty Matrix Inc.



On 9/8/05, Thomas Lumley [EMAIL PROTECTED] wrote:
 
 This is what reshape() does.
 
 -thomas
 
 On Thu, 8 Sep 2005, Marc Bernard wrote:
 
  Dear All,
 
  I would be grateful if you can help me. My problem is the following:
  I have a data set like:
 
  ID  time  X1  X2
  11  x111  x211
  12  x112  x212
  21  x121  x221
  22  x122  x222
  23  x123  x223
 
  where X1 and X2 are 2 covariates and time is the time of observation and 
  ID indicates the cluster.
 
  I want to merge the above data by creating a new variable  X and type 
  as follows:
 
  ID   time   X      type
  1    1      x111   X1
  1    2      x112   X1
  1    1      x211   X2
  1    2      x212   X2
  2    1      x121   X1
  2    2      x122   X1
  2    3      x123   X1
  2    1      x221   X2
  2    2      x222   X2
  2    3      x223   X2
 
 
  Where type is a factor variable indicating if the observation is related 
  to X1 or X2...
 
  Many thanks in advance,
 
  Bernard
 
 
 
 
 Thomas Lumley   Assoc. Professor, Biostatistics
 [EMAIL PROTECTED]University of Washington, Seattle
 




Re: [R] Order of boxes in boxplot()

2005-04-27 Thread Jim Porzak
On Thu, 7 Apr 2005, michael watson (IAH-C) wrote: 
 Sorry for such an inane question - how do I control the order in which 
 the boxes are plotted using boxplot() when I pass it a formula and a 
 data.frame? It seems that the groups are plotted in alphabetical 
 order... I want to change this 

Mick,
Here's the code I use to order boxes by decreasing median value. 
SubtDays is variable of interest
ConChnl is original grouping factor.
tMedians is a temp data frame 
dConChnl is new grouping factor with desired order


boxplot(SubtDays ~ ConChnl, ...)  ### Default ordering of boxes

tMedians <- aggregate(SubtDays, list(ConChnl), median, na.rm = TRUE)
### decreasing = TRUE so that levels run from largest to smallest median
dConChnl <- factor(ConChnl, levels = tMedians[order(tMedians$x, decreasing = TRUE), 1])

boxplot(SubtDays ~ dConChnl, ...) ### Ordered by decreasing median
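
A self-contained variant for archive readers, since the calls above depend on data not shown here. The data frame is simulated and only reuses the variable names from the description. tapply() stands in for aggregate() to keep it short.

```r
set.seed(1)
# Simulated data: three channels with clearly different centres
d <- data.frame(
  ConChnl  = rep(c("a", "b", "c"), each = 20),
  SubtDays = c(rnorm(20, mean = 10), rnorm(20, mean = 30),
               rnorm(20, mean = 20))
)

# Median per group, then rebuild the factor with levels in that order
med <- tapply(d$SubtDays, d$ConChnl, median, na.rm = TRUE)
d$dConChnl <- factor(d$ConChnl,
                     levels = names(sort(med, decreasing = TRUE)))

boxplot(SubtDays ~ dConChnl, data = d)   # boxes run b, c, a
```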


HTH,
Jim Porzak
Director of Analytics
Loyalty Matrix, Inc.
(415) 296-1141 x210
R.LoyaltyMatrix.com
www.LoyaltyMatrix.com



[R] Opening for a Statistics Practitioner in San Francisco

2005-02-03 Thread Jim Porzak
Statistics Practitioner Fluent in R - San Francisco CA

Loyalty Matrix Inc., in downtown San Francisco, is expanding our team. We
are a young, dynamic and growing team of multidisciplinary marketing and
technical professionals. We deliver value to our clients by discovering
actionable tactical and strategic insights in actual customer data augmented
with demographics and research.

You will work in the Client Services team doing EDA, basic statistics, data
mining & modeling. You will also assist R&D to develop and integrate new
methods into our proprietary customer intelligence platform
MatrixOptimizerR.

You will have a degree in statistics, be fluent in R and the Microsoft
Office suite (especially Excel, Word and PowerPoint). SQL query skills are
very helpful as is real-world business and marketing experience.

The successful candidate will demonstrate creative ability to solve
practical problems, juggle multiple projects, work in a multidisciplinary
team and have fun.

For more information on us see www.LoyaltyMatrix.com and
R.LoyaltyMatrix.com.

If interested, reply to [EMAIL PROTECTED] by February 18, 2005.

In addition to your resume, please include a cover letter stating how your
training, experience and skills would specifically contribute to our
clients' success.

We look forward to hearing from you.


Best,
Jim Porzak
Director of Analytics
Loyalty Matrix, Inc.
R.LoyaltyMatrix.com
www.LoyaltyMatrix.com
(415) 296-1141 x 210



[R] My useR! slides: Doing Customer Intelligence with R

2004-06-11 Thread Jim Porzak
FYI for useR! attendees & others.
The slides for my talk "Doing Customer Intelligence with R" are up on our R
weblog: R.LoyaltyMatrix.com

Also note any books ordered through the links on the blog will benefit the 
R Foundation - see blog for details.

I will be using the blog as an informal log of our adventures using R for 
data mining, business intelligence and, in particular, customer 
intelligence. Comments on my blog posts are encouraged. Guest authors are 
welcome - contact me directly.

Once again, many thanks to the useR! organizers and the local team in 
Vienna for the great meeting!

Jim Porzak
Director of Analytics
Loyalty Matrix, Inc.
R.LoyaltyMatrix.com
www.LoyaltyMatrix.com


[R] Data Analyst Intern position in San Francisco

2004-02-20 Thread Jim Porzak

   We've sent this position out to SF Bay Area schools. Since we have
   standardized on R as our preferred analytics platform it seemed
   appropriate to post here.
   We also have a full time Data Analyst position open. Search for
   Loyalty Matrix on [1]www.craigslist.org for details.
   Jim Porzak
   Director of Analytics
   Loyalty Matrix, Inc.
   [2]www.LoyaltyMatrix.com

   Data Analyst Intern
   Loyalty Matrix is a successful & profitable two-year-old start-up
   company based in downtown San Francisco. We are seeking a bright,
   organized, and motivated team player who appreciates the challenges
   and rewards of working in an environment with a strong client focus.
   Primary Responsibilities:
   ·   Brainstorm with business analysts to develop and customize
       OLAP models on customer intelligence
   ·   Conduct customer intelligence analysis using analytical and
       statistical models
   ·   Perform gap analysis of client infrastructure on developing
       customer intelligence
   ·   Report directly to the Director of Analytics
   Qualifications:
   ·   Graduating senior or college grad with degree in Computer
       Science, Engineering, Math or equivalent
   ·   Basic knowledge of statistical principles, methodologies &
       techniques
   ·   Intermediate knowledge of Microsoft Excel and PowerPoint
   Preferred Qualifications:
   ·   Knowledge of statistical tools (like R, SAS, JMP, or SPSS)
   ·   Experience doing SQL queries
   ·   Experience in OLAP
   ·   Microsoft SQL Server & Analysis Services
   What You Will Learn:
   ·   Statistical tools (like R, SAS, JMP, or SPSS)
   ·   OLAP concepts and practice
   ·   Microsoft SQL Server & Analysis Services
   ·   Implementing direct response marketing principles and
       techniques, including audience selection recommendations, data
       extractions and manipulation, and program measurement and evaluation
   ·   CRM systems (E.Piphany, Siebel, SAP, etc.)
   ·   Reporting tools
   ·   ETL (Extraction, Transformation and Load) tools
   Please note that this is a three-month internship that may lead to a
   permanent position. Submit your resume to [EMAIL PROTECTED] with
   "[R] Internship" in the subject heading.

References

   1. http://www.craigslist.org/
   2. http://www.loyaltymatrix.com/


RE: [R]Running R remotely in Windows Environment? Thanks!

2004-01-30 Thread Jim Porzak
Thanks to Prof Ripley, Arne, Andy & Bill for unambiguous suggestions!
Linux box is on order.
I'll take notes on our experience & post a follow-up
in a few weeks. May be useful to other folks stuck in the Windows world.
-Jim

At 06:24 AM 1/29/2004, Pikounis, Bill wrote:
Jim,
I would really like to reiterate Professor Ripley's and Arne Henningsen
comments. The problem goes for any analytic software or system you might
want to use, not just R. My impression is that at least for part of it, you
want the individual users to use R as they would on their own desktops.  (If
that is not the case, much of the rest of this note is pure FYI.) Even in
its most advanced 2003 Server edition, Windows is simply not designed to be
a multi-user system.  Sure, it can reliably host a web server that may need
to run quick bursts of R batch-type jobs (analytics) and return results to
a client (e.g. web browser), but that does not sound like what you are
looking for (at least in part). And beyond the technical limitations, use of
Windows Terminal Server (Remote Desktop) / Citrix, etc. will cost much money
and implementation hassle and probably even legal headaches.  We have had
colleagues here at Merck (over my and Andy Liaw's disbelief) that have tried
to shoehorn Windows this way, and even the speed of single, small jobs by 1
logged-on took longer on the server than on their much less powerful
laptops.
A Linux solution is very flexible, in our experiences (we have Windows XP as
corporate desktop standard).  As stated, with Samba, you can map directories
that look like just another drive in Windows Explorer.  Printing is just as
transparent in either direction.  VNC (Virtual Network Computing) is very,
very nice to provide the individual user's Linux environment as just another
window on their Windows desktop. With the free utility of autocutsel,
clipboards can be synchronized for ease of cutting and pasting. And KDE, one
of several window manager analogues to Windows, is very sophisticated and
shares a lot in common with the Windows GUI from a user operations
standpoint. While it may sound like a hassle to get up and running now if
your shop is currently 99% Windows, the benefit will absolutely be clear
later.
Hope that helps,
Bill

Bill Pikounis, Ph.D.
Biometrics Research Department
Merck Research Laboratories
PO Box 2000, MailDrop RY33-300
126 E. Lincoln Avenue
Rahway, New Jersey 07065-0900
USA
Phone: 732 594 3913
Fax: 732 594 1565
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Arne Henningsen
 Sent: Thursday, January 29, 2004 3:45 AM
 To: Jim Porzak
 Cc: [EMAIL PROTECTED]
 Subject: Re: [R]Running R remotely in Windows Environment?


 Hi,

 I also suggest to use a Linux Server. You can work on this
 machine via ssh
 (e.g. with PuTTY) and transfer the input and output files
 with scp or a samba
 server (which is easy to install and very convenient to use
 for windows
 users).

 Arne

 On Thursday 29 January 2004 08:53, Prof Brian Ripley wrote:
  On Wed, 28 Jan 2004, Jim Porzak wrote:
   We are considering setting up a fast, RAM loaded machine
 as an R-server
   to handle the big problems not suitable for individual
 desktops and,
   also, to process ad hoc analysis requests via our portal.
 We are 99% a
   Windows shop, so first choice is a windows server. We'll
 use (D)COM for
   the portal interface and understand that.
  
   What has me stumped is how to easily interface individual
 analyst's
   Windows desktops to the R-server. I haven't seen anything in the
   archives, but I can't imagine this hasn't been done. What
 am I missing?
 
  R is not designed to be client-server on Windows.  People I
 know who do
  this use Windows Terminal Server or Citrix.
 
  I would question the value of this approach.  Unless you
 propose to run
  64-bit Windows, a `RAM loaded' machine isn't `loaded', and
 R under Windows
  handles large amounts of memory much less effectively than
 under Linux.
  64-bit Windows is uncharted territory for R, whereas 64-bit
 Unix/Linux is
  well trodden.

 --
 Arne Henningsen
 Department of Agricultural Economics
 University of Kiel
 Olshausenstr. 40
 D-24098 Kiel (Germany)
 Tel: +49-431-880 4445
 Fax: +49-431-880 1397
 [EMAIL PROTECTED]
 http://www.uni-kiel.de/agrarpol/ahenningsen/




[R]Running R remotely in Windows Environment?

2004-01-28 Thread Jim Porzak
We are considering setting up a fast, RAM loaded machine as an R-server 
to handle the big problems not suitable for individual desktops and, also, 
to process ad hoc analysis requests via our portal. We are 99% a Windows 
shop, so first choice is a windows server. We'll use (D)COM for the portal 
interface and understand that.

What has me stumped is how to easily interface individual analyst's Windows 
desktops to the R-server. I haven't seen anything in the archives, but I 
can't imagine this hasn't been done. What am I missing?

TIA!

Jim Porzak
Director of Analytics
Loyalty Matrix, Inc.
www.LoyaltyMatrix.com