[R] extract column's from different dataframe

2013-10-17 Thread catalin roibu
Dear R users,

I want to extract columns from different data frames that have different
numbers of rows.
How can I do this in R?

Thank you very much!

best regards!

CR

-- 
---
Catalin-Constantin ROIBU
Lecturer PhD, Forestry engineer
Forestry Faculty of Suceava
Str. Universitatii no. 13, Suceava, 720229, Romania
office phone +4 0230 52 29 78, ext. 531
mobile phone   +4 0745 53 18 01
   +4 0766 71 76 58
FAX:+4 0230 52 16 64
silvic.usv.ro

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot time series data irregularly hourly-spaced

2013-10-17 Thread Charles Novaes de Santana
Wow!! Thank you so much for your suggestions! For now, A.K's suggestion #1
is perfect for me!

Thank you very much!

Best,

Charles


On Thu, Oct 17, 2013 at 2:34 AM, William Dunlap wdun...@tibco.com wrote:

 You could bump up the day each time an hour was less than the previous
 one.  E.g.,
   testtime <- c("20:00:00","22:10:00","22:20:00","23:15:00","23:43:00","00:00:00","00:51:00","01:00:00")
   var <- seq_along(testtime) # so you know what the plot should look like
   # turn it into a POSIXlt object so you can do arithmetic on it
   t <- strptime(testtime, format="%H:%M:%S")
   # now add a day each time t[i] < t[i-1]
   td <- t + .difftime(cumsum(c(FALSE, diff(t) < 0)), units="days")
   # compare plots
   par(mfrow=c(2,1))
   plot(t, var, type="b", xlab="Time", ylab="Var")
   plot(td, var, type="b", xlab="Time", ylab="Var")
 This is dicey because you may have skipped more than one day.
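
 The day-rollover idea above can be sketched as a small helper. This is only an illustration, assuming consecutive readings are never more than 24 hours apart; the function name unwrap_times and the start date are hypothetical and not part of the thread. The clock times are converted by hand so the result does not depend on today's date or the local time zone.

```r
# Hypothetical helper: add one day every time the clock time goes backwards.
unwrap_times <- function(hms, start_date = "2013-10-16", tz = "UTC") {
  # Seconds since midnight for each "HH:MM:SS" string
  secs <- vapply(strsplit(hms, ":", fixed = TRUE),
                 function(p) sum(as.numeric(p) * c(3600, 60, 1)),
                 numeric(1))
  # A negative step means midnight was crossed, so bump the day counter
  day_offset <- cumsum(c(0, diff(secs) < 0))
  as.POSIXct(start_date, tz = tz) + day_offset * 86400 + secs
}

testtime <- c("20:00:00", "22:10:00", "22:20:00", "23:15:00",
              "23:43:00", "00:00:00", "00:51:00", "01:00:00")
td <- unwrap_times(testtime)
# plot(td, seq_along(td), type = "b", xlab = "Time", ylab = "Var")
```

 The returned times are strictly increasing, so a plot of them starts at 20:00 and ends at 01:00 on the following day.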

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com


  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf
  Of Law, Jason
  Sent: Wednesday, October 16, 2013 5:04 PM
  To: Charles Novaes de Santana; r-help@r-project.org
  Subject: Re: [R] Plot time series data irregularly hourly-spaced
 
   You just need the date, otherwise how would it know what time comes first?  In
  strptime(), a date is being assumed.
 
  Try this:
 
  testtime <- c("20:00:00","22:10:00","22:20:00","23:15:00","23:43:00","00:00:00","00:51:00","01:00:00")
  testday <- rep(Sys.Date() - c(1,0), times = c(5,3))
  plot(as.POSIXct(paste(testday, testtime)), var)
 
  Jason
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf
  Of Charles Novaes de Santana
  Sent: Wednesday, October 16, 2013 2:58 PM
  To: r-help@r-project.org
  Subject: [R] Plot time series data irregularly hourly-spaced
 
  Dear all,
 
  I have a time series of data that I would like to represent in a plot, but I am
  having trouble because the time is given in hours, the series can start on one
  day and end on the next, and it is not regularly spaced.
 
  My problem is that when I plot my data, the X-axis always starts from the lowest
  value of my time data. For example, I would like to plot data that starts at
  20:00:00 and ends at 01:00:00, but R considers 01:00:00 to be lower than 21:00:00,
  so my plot wraps around in time.
 
  Please try this example to see it graphically:
 
  testtime <- c("20:00:00","22:10:00","22:20:00","23:15:00","23:43:00","00:00:00","00:51:00","01:00:00")
  var <- runif(length(testtime),0,1)
  plot(strptime(testtime,format="%H:%M:%S"),var,type="b",xlab="Time",ylab="Var")
 
  In this case, I would like to have a plot that starts at 20:00:00 and ends at
  01:00:00.
 
  Does anybody know how to make R understand that 00:00:00 comes after 20:00:00 in
  this case? Or at least does anybody know a tip to make a plot with this kind of
  X-axis?
 
  Thanks for your time and thanks in advance for any help.
 
  Best regards,
 
  Charles
  --
  Um axé! :)
 
  --
  Charles Novaes de Santana, PhD
  http://www.imedea.uib-csic.es/~charles
 

-- 
Um axé! :)

--
Charles Novaes de Santana, PhD
http://www.imedea.uib-csic.es/~charles



Re: [R] saveXML() prefix argument

2013-10-17 Thread Milan Bouchet-Valat
On Wednesday, 16 October 2013 at 23:45 -0400, Earl Brown wrote:
 I'm using the XML package and specifically the saveXML() function but I 
 can't get the prefix argument of saveXML() to work:
 
 library(XML)
 concepts <- c("one", "two", "three")
 info <- c("info one", "info two", "info three")
 root <- newXMLNode("root")
 for (i in 1:length(concepts)) {
   cur.concept <- concepts[i]
   cur.info <- info[i]
   cur.tip <- newXMLNode("tip", attrs = c(id = i), parent = root)
   newXMLNode("h1", cur.concept, parent = cur.tip)
   newXMLNode("p", cur.info, parent = cur.tip)
 }
 
 # None of the following output a prefix on the first line of the exported
 # document
 saveXML(root)
 saveXML(root, file = "test.xml")
 saveXML(root, file = "test.xml", prefix = '<?xml version="1.0"?>\n')
 
 Am I missing something obvious? Any ideas?
It looks like the function XML:::saveXML.XMLInternalNode() does not use
the 'prefix' parameter at all. So it won't be taken into account when
calling saveXML() on objects of class XMLInternalNode.

I think you should report this to Duncan Temple Lang, as this is
probably an oversight.


Regards


 Thanks in advance. Earl Brown
 
 -
 Earl K. Brown, PhD
 Assistant Professor of Spanish Linguistics
 Advisor, TEFL MA Program
 Department of Modern Languages
 Kansas State University
 www-personal.ksu.edu/~ekbrown
 


[R] Constraint on regression parameters

2013-10-17 Thread Robert U
Dear all,

I have been trying to find a simple solution to my problem without success, 
though I have a feeling a simple syntax detail could do the job.

I am fitting a polynomial linear regression with two independent variables, such as:

lm(A ~ B + I(B^2) + I(B^3) + C, data = Dataset)

R returns one coefficient per term, and I need the coefficient of C to 
equal 1. 

I have been looking at parameter constraints on the internet, but it is always 
much more complicated than simply not fitting a coefficient (or setting 
it to 1). 

I know many packages let you drop the intercept with a -1 term in the formula 
syntax; does something similar exist for independent variables?

Regards,


Re: [R] Constraint on regression parameters

2013-10-17 Thread S Ellison


 -Original Message-
 I am fitting a polynomial linear regression with two independent variables,
 such as:
 
 lm(A ~ B + I(B^2) + I(B^3) + C, data = Dataset)
 
 R returns one coefficient per term, and I need the coefficient of C to
 equal 1.

Leaving aside the question of fitting simple polynomial coefficients instead of 
orthogonal polynomials - generally frowned upon, but not always serious - the 
problem you describe is one in which you are not fitting C at all; you're 
assuming C adds exactly. What you're really fitting is the difference between A 
and C. 

Try fitting 
A - C ~ B + I(B^2) + I(B^3) 

to obtain the coefficients you're looking for. But be aware that you will still 
have a constant intercept, so the model you will have fitted is

A = b0 + b1.B + b2.B^2 + b3.B^3 + C + error
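
A minimal sketch of that model on simulated data, in case it helps. The data frame and its coefficients are made up for illustration. The second fit uses offset(), which is a standard way in lm() to include a term with its coefficient fixed at 1; it was not mentioned in the reply and is shown here only as an equivalent formulation.

```r
set.seed(42)
# Made-up data in which C enters with a coefficient of exactly 1
Dataset <- data.frame(B = runif(50, -2, 2), C = rnorm(50))
Dataset$A <- 1 + 0.5 * Dataset$B - 0.3 * Dataset$B^2 + 0.2 * Dataset$B^3 +
  Dataset$C + rnorm(50, sd = 0.1)

# Approach from the reply: move C to the left-hand side
fit_diff <- lm(A - C ~ B + I(B^2) + I(B^3), data = Dataset)

# Equivalent: keep A as the response and fix the coefficient of C at 1
fit_offset <- lm(A ~ B + I(B^2) + I(B^3) + offset(C), data = Dataset)

coef(fit_diff)
coef(fit_offset)
```

Both fits estimate only b0 to b3; C never receives a fitted coefficient. The offset() form keeps A as the response, which can make predict() and residual plots easier to read.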

S Ellison




Re: [R] Extract a predictors form constparty object (CHAID output) in R

2013-10-17 Thread Christiaan Pauw
For the record, I have found a possible solution:

nn <- nodeapply(z)
n.names = names(unlist(nn[[1]]))
ext <- unlist(sapply(n.names, function(x) grep("split.varid.", x, value=TRUE)))
ext <- gsub("kids.split.varid.", "", ext)
ext <- gsub("split.varid.", "", ext)
dep.var <- as.character(terms(z)[1][[2]])
plus = paste(ext, collapse=" + ")
mul = paste(ext, collapse=" * ")
shortform <- as.formula(paste(dep.var, plus, sep = " ~ "))
satform <- as.formula(paste(dep.var, mul, sep = " ~ "))
mosaic(shortform, data = ContraceptiveChoice)
#stp <- step(glm(satform, data=ContraceptiveChoice, family=binomial), direction="both")
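
The formula-building step above could also be written with base R's reformulate(), which avoids pasting the " ~ " separator by hand. A small sketch, with ext and dep.var set to stand-in values rather than extracted from a chaid() result:

```r
# Stand-ins for the values extracted from the chaid() result above
ext <- c("wifes_education", "husbands_occupation")
dep.var <- "contraceptive_method_used"

# Main-effects formula: response ~ term1 + term2
shortform <- reformulate(ext, response = dep.var)

# Saturated formula: response ~ term1 * term2
satform <- reformulate(paste(ext, collapse = " * "), response = dep.var)

shortform
satform
```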


On 16 October 2013 20:18, Christiaan Pauw cjp...@gmail.com wrote:

 I have a large dataset (questionnaire results) of mostly categorical
 variables. I have tested for dependency between the variables using
 chi-square test. There are an incomprehensible number of dependencies.
 I used the chaid() function in the CHAID package to detect
 interactions and separate out (what I hope to be) the underlying
 structure of these dependencies for each variable. What typically
 happens is that the chi-square test will reveal a large number of
 dependencies (say 10-20) for a variable and the chaid function will
 reduce this to something much more comprehensible (say 3-5). What I
 want to do is to extract the names of those variables that were shown
 to be relevant in the chaid() results.

 The chaid() output is in the form of a constparty object. My question
 is how to extract the variable names associated with the nodes in such
 an object.

 Here is a self contained code example:

 library(evtree) # for the ContraceptiveChoice dataset
 library(CHAID)
 library(vcd)
 library(MASS)

 data(ContraceptiveChoice)
 longform <- formula(contraceptive_method_used ~ wifes_education +
     husbands_education + wifes_religion + wife_now_working +
     husbands_occupation + standard_of_living_index +
     media_exposure)
 z <- chaid(longform, data = ContraceptiveChoice)
 # plot(z)
 z
 # This is the part I want to do programmatically
 shortform <- formula(contraceptive_method_used ~ wifes_education +
     husbands_occupation)
 # The thing I want is a programmatic way to extract 'shortform' from 'z'

 # Examples of use of 'shortform'
 loglm(shortform, data = ContraceptiveChoice)

 Thanks in advance
 Christiaan
 --
 Christiaan Pauw
 Nova Institute
 www.nova.org.za




-- 
Christiaan Pauw
Nova Institute
www.nova.org.za



Re: [R] extract column's from different dataframe

2013-10-17 Thread Jim Lemon

On 10/17/2013 06:17 PM, catalin roibu wrote:

Dear R users,

I want to extract column's from different data frame with different row
length.
How can I do this in R?


Hi catalin,
If I understand your question, which I think is:

I want to extract columns from different data frames with differing 
numbers of rows and store them in a single object.


The answer is probably to use a list:

datalist<-list()
datalist[[1]]<-dataframe1[,"variable1"]
datalist[[2]]<-dataframe2[,"variable3"]
...

where each element of datalist may have different numbers of values.
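
A self-contained sketch of the list approach. The two data frames and their contents are invented for illustration. The NA-padding step at the end is an optional extra not in the reply, for cases where a rectangular data frame is needed after all.

```r
# Two made-up data frames with different numbers of rows
dataframe1 <- data.frame(variable1 = 1:3, variable2 = c("a", "b", "c"))
dataframe2 <- data.frame(variable3 = c(2.5, 3.5, 4.5, 5.5, 6.5))

# Store the extracted columns in a named list; lengths may differ
datalist <- list(variable1 = dataframe1[, "variable1"],
                 variable3 = dataframe2[, "variable3"])
lengths(datalist)

# Optional: pad the shorter columns with NA to build one data frame
n <- max(lengths(datalist))
padded <- as.data.frame(lapply(datalist, function(v) v[seq_len(n)]))
padded
```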

Jim



Re: [R] How would i sum the number of NA's in multiple vectors

2013-10-17 Thread Carl Witthoft
mattbju2013 wrote
 Hi guys this is my first post, i need help summing the number of NA's in a
 few vectors
 
 for example..
 
 c1<-c(1,2,NA,3,4)
 c2<-c(NA,1,2,3,4)
 c3<-c(NA,1,2,3,4)
 
 how would i get a result that only sums the number of NA's in the vector?
 the.result.i.want<-c(2,0,1,0,0)

See ?is.na .   
Now, if I can interpret your question correctly, you're actually looking for
the number of NA per *position* in the vectors, so let's make them into a
matrix first.

cmat<-rbind(c1,c2,c3)
then use apply over columns
apply(cmat,2,function(k)sum(is.na(k)))







[R] match values in dependence of ID and Date

2013-10-17 Thread Mat
Hello everyone,

I have a little problem; maybe you can help me.

I have a data.frame like this one:

ID  Name
1   Andy
2   John
3   Amy

and a data.frame like this:

ID  Date        Value
1   2013-10-01  10
1   2013-10-02  15
2   2013-10-01  7
2   2013-10-03  10
2   2013-10-04  15
3   2013-10-01  10

The result should be this one:

ID  Name  First  Second  Third
1   Andy  10     15
2   John  7      10      15
3   Amy   10

Could you help me do this?
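
One possible base R sketch for this kind of reshaping, assuming that "First", "Second" and "Third" mean the values ordered by Date within each ID and that no ID has more than three dates. The object names people, values and labels are made up for illustration.

```r
people <- data.frame(ID = 1:3, Name = c("Andy", "John", "Amy"))
values <- data.frame(
  ID    = c(1, 1, 2, 2, 2, 3),
  Date  = as.Date(c("2013-10-01", "2013-10-02", "2013-10-01",
                    "2013-10-03", "2013-10-04", "2013-10-01")),
  Value = c(10, 15, 7, 10, 15, 10)
)

# Number the rows within each ID in date order
values <- values[order(values$ID, values$Date), ]
values$Pos <- ave(values$ID, values$ID, FUN = seq_along)

# Spread the values into one column per position; empty cells become NA
labels <- c("First", "Second", "Third")
wide <- tapply(values$Value,
               list(values$ID, factor(labels[values$Pos], levels = labels)),
               function(v) v[1])
wide_df <- data.frame(ID = as.integer(rownames(wide)), wide,
                      check.names = FALSE)

# Attach the names by ID
result <- merge(people, wide_df, by = "ID")
result
```

Missing positions show up as NA rather than blanks, which is usually easier to work with downstream.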

Thank you.

Mat





Re: [R] How would i sum the number of NA's in multiple vectors

2013-10-17 Thread Joshua Wiley
Or faster (both computational speed and amount of code):

colSums(is.na(rbind(c1, c2, c3)))
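
A short runnable comparison of the two suggestions in this thread, using the vectors from the original question. Both count the NA values per position.

```r
c1 <- c(1, 2, NA, 3, 4)
c2 <- c(NA, 1, 2, 3, 4)
c3 <- c(NA, 1, 2, 3, 4)

# One row per vector, one column per position
cmat <- rbind(c1, c2, c3)

# apply() over columns, counting NA in each
via_apply <- apply(cmat, 2, function(k) sum(is.na(k)))

# colSums() on the logical matrix gives the same counts
via_colsums <- colSums(is.na(cmat))

via_apply
via_colsums
```

Both give 2 0 1 0 0, which matches the result requested in the question.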


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com



Re: [R] How to obtain restricted estimates from coxph()?

2013-10-17 Thread Andrews, Chris
Consider the function f(x) = x on the open interval (0,1).  It does not have a 
maximum.
That is what your likelihood function will look like.  The MLE does not exist.
Chris

(Although if everything is continuous and you are okay with limits there is an 
extension that gets you to Terry's original answer.)

-Original Message-
From: Y [mailto:yuhan...@gmail.com] 
Sent: Wednesday, October 16, 2013 7:08 PM
To: Göran Broström
Cc: r-help@r-project.org
Subject: Re: [R] How to obtain restricted estimates from coxph()?

Thanks very much for your help, Terry and Göran!

As pointed out by Göran, the difficult part is that it's an open set. How
to obtain a valid MLE in this case?


Thanks,
YH







On Wed, Oct 16, 2013 at 9:55 AM, Göran Broström goran.brost...@umu.se wrote:



 On 2013-10-16 14:33, Terry Therneau wrote:



 On 10/16/2013 05:00 AM, r-help-requ...@r-project.org wrote:

 Hello,

 I'm trying to use coxph() function to fit a very simple Cox proportional
 hazards regression model (only one covariate) but the parameter space is
 restricted to an open set (0, 1). Can I still obtain a valid estimate by
 using coxph function in this scenario? If yes, how? Any suggestion would
 be
 greatly appreciated. Thanks!!!


 Easily:
  1.  Fit the unrestricted model.  If the solution is in 0-1 you are done.
  2.  If it is outside, fix the coefficient.  Say that the solution is 1.73,
 then the optimal solution under constraint is 1.


 OK, except for the small annoyance that 1 is not a member of the open set
 (interval) (0, 1). Maybe the answer is No in this case? Depends on what
 lies in the word 'valid'. If 'MLE', the answer is No.

   Redo the fit adding the parameters  init=1, iter=0.  This forces the
 program to give the loglik etc. for the fixed coefficient of 1.0.
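
 The two steps above might look like the following sketch. It assumes the survival package is installed and uses simulated data invented for illustration. The iteration limit is written as control = coxph.control(iter.max = 0), which is the spelled-out form of the iter=0 shorthand. Note that the boundary value belongs to the closed interval [0, 1], not the open set, as discussed in this thread.

```r
library(survival)

set.seed(1)
n <- 200
d <- data.frame(x = runif(n))
d$time <- rexp(n, rate = exp(2 * d$x))  # simulated: true coefficient is 2
d$status <- 1                           # no censoring, for simplicity

# Step 1: unrestricted fit
fit_free <- coxph(Surv(time, status) ~ x, data = d)
b <- unname(coef(fit_free))

# Step 2: if the estimate falls outside (0, 1), hold the coefficient at the
# nearest boundary and evaluate the likelihood there without iterating
if (b > 0 && b < 1) {
  fit_used <- fit_free
} else {
  bound <- if (b >= 1) 1 else 0
  fit_used <- coxph(Surv(time, status) ~ x, data = d, init = bound,
                    control = coxph.control(iter.max = 0))
}

coef(fit_used)
fit_used$loglik
```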

 Terry Therneau



[R] S4 base class

2013-10-17 Thread Michael Meyer
Greetings,

Meanwhile I have figured out how to do it, only to find out that I have more 
serious problems.
Generally, calling Base::f on the base class object is not what you want; 
instead you want to call Base::f on the full object, for the following reason:

If the base class is virtual, then Base::f might use virtual functions 
(not defined in Base but defined in derived classes).

If you then call Base::f on an object of class Base, the call will fail.

Is it possible in R to call Base::f from within Derived (when there is also 
Derived::f) on the full object 'this'?

I suspect not, which would be a serious drawback to the R class mechanism.


Thanks,


Michael Meyer



[R] S4 base class

2013-10-17 Thread Michael Meyer
Sorry if the previous message seems to lack context.
The first message was bounced by the filtering rules (triggered by the subject 
heading, than which nothing could be more benign or less liable to suspicion). 
It was:

Greetings,

I have an S4 class B (Base) which defines a function f=f(this=B,...). 
Derived from B we have a class D which also defines a function 
f=f(this=D,...).

In the definition of D::f we want to call the version B::f and could do this by 
simply calling

f(baseClassObject(this),...)

The question is the following:

How do I refer to the base class object from the derived class?



Many thanks 


Michael Meyer



[R] flatten a list of lists

2013-10-17 Thread Michael Friendly
I have functions that generate list objects of class foo and lists of 
lists of these, of class foolist, similar to what is shown below.

How can I flatten something like this to remove the top-level list 
structure, i.e., return a single-level list of foo objects, of class 
foolist?

foo <- function(n) {
    result <- list(x=sample(1:10,n), y=sample(1:10,n))
    class(result) <- "foo"
    result
}

multifoo <- function(vec, label, ...) {
    result <- lapply(vec, foo, ...)
    names(result) <- paste0(label, vec)
    class(result) <- "foolist"
    result
}

foo1 <- multifoo(1:2, "A")
foo2 <- multifoo(1:2, "B")

mfoo <- list(A=foo1, B=foo2)

str(mfoo, 2)

> str(mfoo, 2)
List of 2
 $ A:List of 2
  ..$ A1:List of 2
  .. ..- attr(*, "class")= chr "foo"
  ..$ A2:List of 2
  .. ..- attr(*, "class")= chr "foo"
  ..- attr(*, "class")= chr "foolist"
 $ B:List of 2
  ..$ B1:List of 2
  .. ..- attr(*, "class")= chr "foo"
  ..$ B2:List of 2
  .. ..- attr(*, "class")= chr "foo"
  ..- attr(*, "class")= chr "foolist"

In this case, what is wanted is a single-level list of 4 foo objects, 
A1, A2, B1, B2, all of class foolist.

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



Re: [R] S4 base class

2013-10-17 Thread Duncan Murdoch

On 17/10/2013 9:01 AM, Michael Meyer wrote:

How do I refer to the base class object from the derived class?


You're asking the wrong question.  You should be asking how to call the 
method for the inherited class.  callNextMethod() is the answer to that 
question.


By the way, your use of the syntax D::f and B::f suggests that you're 
thinking from a C++ point of view.  That's very likely to lead to 
frustration:  the S4 object system is very different from C++.  Methods 
don't belong to classes, they belong to generics. There is no such thing 
as D::f or B::f, only f methods with different signatures.
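
A minimal sketch of callNextMethod() in this setting. The class names B and D and the generic f follow the question; the slots x and y and the returned strings are invented for illustration. The inherited method receives the full D object, not a stripped-down B object, which speaks to the concern raised elsewhere in this thread.

```r
library(methods)

setClass("B", representation(x = "numeric"))
setClass("D", contains = "B", representation(y = "numeric"))

setGeneric("f", function(this, ...) standardGeneric("f"))

# Method for signature "B"; reports whether it was handed a D object
setMethod("f", "B", function(this, ...) {
  paste("B method; still the full D object:", is(this, "D"))
})

# Method for signature "D"; delegates to the inherited method
setMethod("f", "D", function(this, ...) {
  paste("D method ->", callNextMethod())
})

obj <- new("D", x = 1, y = 2)
res <- f(obj)
res
```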


Duncan Murdoch



Re: [R] flatten a list of lists

2013-10-17 Thread Ista Zahn
unlist(mfoo, recursive = FALSE) gets you pretty close.

Best,
Ista
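
A small sketch of what "pretty close" means here, reusing foo() and multifoo() from the question. The non-recursive unlist() keeps the individual foo objects intact, but it prefixes the names with the top-level names and drops the foolist class.

```r
foo <- function(n) {
  result <- list(x = sample(1:10, n), y = sample(1:10, n))
  class(result) <- "foo"
  result
}

multifoo <- function(vec, label, ...) {
  result <- lapply(vec, foo, ...)
  names(result) <- paste0(label, vec)
  class(result) <- "foolist"
  result
}

mfoo <- list(A = multifoo(1:2, "A"), B = multifoo(1:2, "B"))

flat <- unlist(mfoo, recursive = FALSE)
names(flat)   # top-level names are prepended: "A.A1", "A.A2", ...
class(flat)   # plain "list"; the foolist class has to be set again
```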



Re: [R] flatten a list of lists

2013-10-17 Thread Duncan Murdoch

On 17/10/2013 9:15 AM, Michael Friendly wrote:

I have functions that generate lists objects of class foo and lists of
lists of these, of class
foolist, similar to what is shown below.


You can use c() to join lists.  So in the example below,

c(mfoo$A, mfoo$B)

will give you a list with the components you want, though the class 
won't be set.   More generally, do.call(c, unname(mfoo)) will join any 
number of components.  (Without unname(), the names at the top level 
will be combined with the component names; maybe you'd actually want 
that, but your example didn't do it.)

This won't work if your list doesn't have the regular list of lists 
structure, e.g. if it mixes foo objects with foolist objects at the same 
level.  Then you probably need a more complicated recursive approach.  
You might be able to do it with rapply().
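
For the irregular case, one possible recursive sketch is shown below. It uses a plain loop rather than rapply(), and the helper name flatten_foo is hypothetical. It assumes every leaf is an object of class foo and that anything else is a container to descend into.

```r
foo <- function(n) {
  result <- list(x = sample(1:10, n), y = sample(1:10, n))
  class(result) <- "foo"
  result
}

multifoo <- function(vec, label, ...) {
  result <- lapply(vec, foo, ...)
  names(result) <- paste0(label, vec)
  class(result) <- "foolist"
  result
}

# Hypothetical helper: collect foo objects from any nesting depth
flatten_foo <- function(x) {
  nm <- if (is.null(names(x))) rep("", length(x)) else names(x)
  out <- list()
  for (i in seq_along(x)) {
    item <- x[[i]]
    if (inherits(item, "foo")) {
      # Leaf: keep it whole, under the name it had in its parent
      out <- c(out, setNames(list(item), nm[i]))
    } else {
      # Container: descend and splice in whatever it holds
      out <- c(out, flatten_foo(item))
    }
  }
  out
}

# A foolist and a bare foo object at the same level
mixed <- list(A = multifoo(1:2, "A"), B1 = foo(1))
res <- flatten_foo(mixed)
class(res) <- "foolist"
names(res)
```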


Duncan Murdoch





Re: [R] flatten a list of lists

2013-10-17 Thread David Carlson
Does this get you the rest of the way?

> mfoo2 <- unlist(mfoo, recursive = FALSE)
> names(mfoo2) <- unlist(lapply(mfoo, names))
> class(mfoo2) <- "foolist"
> str(mfoo2)
List of 4
 $ A1:List of 2
  ..$ x: int 3
  ..$ y: int 10
  ..- attr(*, "class")= chr "foo"
 $ A2:List of 2
  ..$ x: int [1:2] 6 4
  ..$ y: int [1:2] 8 9
  ..- attr(*, "class")= chr "foo"
 $ B1:List of 2
  ..$ x: int 2
  ..$ y: int 2
  ..- attr(*, "class")= chr "foo"
 $ B2:List of 2
  ..$ x: int [1:2] 3 6
  ..$ y: int [1:2] 4 2
  ..- attr(*, "class")= chr "foo"
 - attr(*, "class")= chr "foolist"

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352






Re: [R] flatten a list of lists

2013-10-17 Thread Michael Friendly

Thanks to all who replied.

Here are two versions of a function (sans sanity checks) that do what I 
want:


foo1 <- multifoo(1:2, "A")
foo2 <- multifoo(1:2, "B")

mfoo <- list(A=foo1, B=foo2)
class(mfoo) <- c("foolist", "list")

#' flatten a list of lists

# from Duncan Murdoch
flatten <- function(list, unname=TRUE) {
    res <- do.call(c, if(unname) unname(list) else list)
    class(res) <- class(list)
    res
}

# from David Carlson
flatten2 <- function(list, unname=TRUE) {
    res <- unlist(list, recursive = FALSE)
    if(unname) names(res) <- unlist(lapply(list, names))
    class(res) <- class(list)
    res
}

mflat1 <- flatten(mfoo)
mflat2 <- flatten2(mfoo)
all.equal(mflat1,mflat2)

> all.equal(mflat1,mflat2)
[1] TRUE

-Michael

On 10/17/2013 9:39 AM, David Carlson wrote:

Does this get you the rest of the way?


mfoo2 <- unlist(mfoo, recursive = FALSE)
names(mfoo2) <- unlist(lapply(mfoo, names))
class(mfoo2) <- "foolist"
str(mfoo2)

List of 4
  $ A1:List of 2
   ..$ x: int 3
   ..$ y: int 10
   ..- attr(*, "class")= chr "foo"
  $ A2:List of 2
   ..$ x: int [1:2] 6 4
   ..$ y: int [1:2] 8 9
   ..- attr(*, "class")= chr "foo"
  $ B1:List of 2
   ..$ x: int 2
   ..$ y: int 2
   ..- attr(*, "class")= chr "foo"
  $ B2:List of 2
   ..$ x: int [1:2] 3 6
   ..$ y: int [1:2] 4 2
   ..- attr(*, "class")= chr "foo"
  - attr(*, "class")= chr "foolist"

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352




-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Ista Zahn
Sent: Thursday, October 17, 2013 8:23 AM
To: Michael Friendly
Cc: R-help
Subject: Re: [R] flatten a list of lists

unlist(mfoo, recursive = FALSE) gets you pretty close.

Best,
Ista

On Thu, Oct 17, 2013 at 9:15 AM, Michael Friendly
frien...@yorku.ca wrote:

I have functions that generate list objects of class foo and lists of
lists of these, of class foolist, similar to what is shown below.

How can I flatten something like this to remove the top-level list
structure, i.e., return a single-level list of foo objects, of class
foolist?

foo <- function(n) {
 result <- list(x=sample(1:10,n), y=sample(1:10,n))
 class(result) <- "foo"
 result
}

multifoo <- function(vec, label, ...) {
 result <- lapply(vec, foo, ...)
 names(result) <- paste0(label, vec)
 class(result) <- "foolist"
 result
}

foo1 <- multifoo(1:2, "A")
foo2 <- multifoo(1:2, "B")

mfoo <- list(A=foo1, B=foo2)

> str(mfoo, 2)
List of 2
  $ A:List of 2
   ..$ A1:List of 2
   .. ..- attr(*, "class")= chr "foo"
   ..$ A2:List of 2
   .. ..- attr(*, "class")= chr "foo"
   ..- attr(*, "class")= chr "foolist"
  $ B:List of 2
   ..$ B1:List of 2
   .. ..- attr(*, "class")= chr "foo"
   ..$ B2:List of 2
   .. ..- attr(*, "class")= chr "foo"
   ..- attr(*, "class")= chr "foolist"

In this case, what is wanted is a single-level list of 4 foo objects, A1,
A2, B1, B2, all of class foolist

--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.  Chair, Quantitative Methods
York University  Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA





--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.  Chair, Quantitative Methods
York University  Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



[R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Rich Shepard

  When I specify pch = 19 for a scatter plot the points are filled circles.
Despite reading ?points and trial-and-error experimentation I have not found
how to have the legend symbols (now open circles) filled.

  An example command is:

xyplot(pct.quant ~ sampdate, data = ffg.st, groups = func_feed_grp, type =
'p', pch = 19, key = simpleKey(text = levels(ffg.st$func_feed_grp), space =
'right', points = T, lines = F),par.settings = list(superpose.points =
list(col = rainbow(7)), superpose.lines = list(col = rainbow(7))), main =
'Functional Feeding Groups (Individuals)', xlab = 'Year', ylab = 'Proportion
of Individuals')

  Please pass me a pointer on how to fill the legend points.

TIA,

Rich



Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Richard M. Heiberger
put the pch into the par.settings

On Thu, Oct 17, 2013 at 11:17 AM, Rich Shepard rshep...@appl-ecosys.com wrote:
   When I specify pch = 19 for a scatter plot the points are filled circles.
 Despite reading ?points and trial-and-error experimentation I have not found
 how to have the legend symbols (now open circles) filled.

   An example command is:

 xyplot(pct.quant ~ sampdate, data = ffg.st, groups = func_feed_grp, type =
 'p', pch = 19, key = simpleKey(text = levels(ffg.st$func_feed_grp), space =
 'right', points = T, lines = F),par.settings = list(superpose.points =
 list(col = rainbow(7)), superpose.lines = list(col = rainbow(7))), main =
 'Functional Feeding Groups (Individuals)', xlab = 'Year', ylab = 'Proportion
 of Individuals')

   Please pass me a pointer on how to fill the legend points.

 TIA,

 Rich



Re: [R] saveXML() prefix argument

2013-10-17 Thread Duncan Temple Lang
Milan is correct.
The prefix is used when saving the XML content that is represented in
a different format in R.

To get the prefix 
 <?xml version="1.0"?>
on the XML content that you save, use a document object

doc = newXMLDoc()
root = newXMLNode("foo", doc = doc)

saveXML(doc)


<?xml version="1.0"?>
<foo/>

Sorry for the confusion.
 D

On 10/17/13 2:36 AM, Milan Bouchet-Valat wrote:
 Le mercredi 16 octobre 2013 à 23:45 -0400, Earl Brown a écrit :
 I'm using the XML package and specifically the saveXML() function but I 
 can't get the prefix argument of saveXML() to work:

 library(XML)
 concepts <- c("one", "two", "three")
 info <- c("info one", "info two", "info three")
 root <- newXMLNode("root")
 for (i in 1:length(concepts)) {
  cur.concept <- concepts[i]
  cur.info <- info[i]
  cur.tip <- newXMLNode("tip", attrs = c(id = i), parent = root)
  newXMLNode("h1", cur.concept, parent = cur.tip)
  newXMLNode("p", cur.info, parent = cur.tip)
 }

 # None of the following output a prefix on the first line of the exported document
 saveXML(root)
 saveXML(root, file = "test.xml")
 saveXML(root, file = "test.xml", prefix = '<?xml version="1.0"?>\n')

 Am I missing something obvious? Any ideas?
 It looks like the function XML:::saveXML.XMLInternalNode() does not use
 the 'prefix' parameter at all. So it won't be taken into account when
 calling saveXML() on objects of class XMLInternalNode.
 
 I think you should report this to Duncan Temple Lang, as this is
 probably an oversight.
 
 
 Regards
 
 
 Thanks in advance. Earl Brown

 -
 Earl K. Brown, PhD
 Assistant Professor of Spanish Linguistics
 Advisor, TEFL MA Program
 Department of Modern Languages
 Kansas State University
 www-personal.ksu.edu/~ekbrown



Re: [R] Help with combining functions

2013-10-17 Thread Carl Witthoft
This is the world-famous fizzbuzz problem.   You should be able to find
lots of implementations by Googling that word.  Here's a pointless
collection I wrote once:

# a really dumb fizzbuzz alg competition
# fbfun1 is 2.5x faster than fbfun2
# fbfun3 is 10x faster than fbfun1
# fbfun1 is 2x faster than fbfun4
# fbfun5 is 20x faster than fbfun3
# Those are user times; in most cases the system time is very small indeed.
# Note: mod() is not a base R function; mod <- function(x, m) x %% m is assumed.

fbfun1 <- function(xfoo) {
xfoo <- 1:xfoo
fbfoo <- 1+(!as.logical(mod(xfoo,3)))*(as.logical(mod(xfoo,5))) +
2*(as.logical(mod(xfoo,3)))*(!as.logical(mod(xfoo,5)))+3*(!as.logical(mod(xfoo,3)))*(!as.logical(mod(xfoo,5)))

fbbar <- unlist(lapply(fbfoo, function(x)
switch(x,0,'fizz','buzz','fizzbuzz')))
return(fbbar)
}


fbfun3 <- function(xfoo) {
xfoo <- 1:xfoo
fbfoo <- 1+(!as.logical(mod(xfoo,3)))*(as.logical(mod(xfoo,5))) +
2*(as.logical(mod(xfoo,3)))*(!as.logical(mod(xfoo,5)))+3*(!as.logical(mod(xfoo,3)))*(!as.logical(mod(xfoo,5)))
fbtab <- cbind(1:4,c('','fizz','buzz','fizzbuzz'))
fbbar <- fbtab[fbfoo,2]
return(fbbar)
}

# can I do it with recycled vectors, e.g. c('','','fizz') and
# c('','','','','buzz') ?
fbfun4 <- function(xfoo) {
fiz <- rep(c('','','fizz'),length.out=xfoo)
buz <- rep(c('','','','','buzz'),length.out=xfoo)
fbbar <- unlist(lapply(1:xfoo, function(j)paste(fiz[j],buz[j]) ) )
return(fbbar)
}

# or completely sleazy:
fbfun5 <- function(xfoo) {
fiz <-
rep(c('','','fizz','','buzz','fizz','','','fizz','buzz','','fizz','','','fizzbuzz'),length.out=xfoo)
return(fiz)
}
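
For comparison, here is a short sketch that relies only on base R's %% operator and logical indexing. The name fizzbuzz_sketch is hypothetical, and unlike the functions above it prints the number itself for non-multiples (an assumption about the usual statement of the problem, not something taken from this thread).

```r
# Sketch using only base R; fizzbuzz_sketch is a hypothetical name.
fizzbuzz_sketch <- function(n) {
  i <- seq_len(n)
  out <- as.character(i)            # default: the number itself
  out[i %% 3 == 0] <- "fizz"
  out[i %% 5 == 0] <- "buzz"
  out[i %% 15 == 0] <- "fizzbuzz"   # assigned last so it overrides the two above
  out
}

fb <- fizzbuzz_sketch(15)
fb
```

The order of the three assignments matters: multiples of 15 are first labelled fizz, then buzz, and only the final assignment gives them the combined label.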





--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-combining-functions-tp4678212p4678272.html
Sent from the R help mailing list archive at Nabble.com.



[R] S4 base class

2013-10-17 Thread Michael Meyer
Quote

By the way, your use of the syntax D::f and B::f suggests that you're 
thinking from a C++ point of view.  That's very likely to lead to 
frustration:  the S4 object system is very different from C++.  Methods 
don't belong to classes, they belong to generics. There is no such thing 
as D::f or B::f, only f methods with different signatures.

Duncan Murdoch 


#---#

I am aware of this.
We can probably agree that we should use S4 classes and generic functions to 
duplicate more usual object oriented architecture as far as possible while 
remaining conscious of the regrettable differences.

For example we can pretend we are defining a virtual function in class Base by 
writing:

setGeneric("F",
function(this) standardGeneric("F")
)

where the code for Base  is, even though it has nothing to do with the class 
Base. 
We can even use it in other functions defined in class Base by writing 


setGeneric("G",
function(this) standardGeneric("G")
)
setMethod("G",
signature(this="Base"),
definition=function(this){

F(this)
})

which will work on all derived classes which implement F in some fashion:

setMethod("F",
signature(this="Derived"),
definition=function(this){

# do something appropriate for derived.
})

With this we can reproduce some semblance of object oriented programming.
However, apparently we cannot solve in this manner a common problem of object 
oriented programming (from now on C++ parlance):

Suppose you have a base class Base which implements a function Base::F 
which works in most contexts but not in the context of ComplicatedDerived 
class
where some preparation has to happen before this very same function can be 
called.

You would then define

void ComplicatedDerived::F(...){

preparation();
Base::F();
}

You can nearly duplicate this in R via 

setMethod("F",
signature(this="ComplicatedDerived"),
definition=function(this){

preparation(this)
F(as(this,"Base"))
})

but it will fail whenever F uses virtual functions (i.e. generics) which are 
only defined
for derived classes of Base, whereas this is not a problem at all in normal 
object oriented
languages.

This is not a contrived problem but is rather basic.
I wonder if you can do it in R in some other way.
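
One approach that might address this is callNextMethod() from the methods package, which invokes the inherited method without coercing the object. The sketch below is only an illustration under assumed names: the classes, the generics process() and detail(), and the slot x are all hypothetical stand-ins for Base, F and the "virtual" generics described above.

```r
library(methods)

# Hypothetical classes and generics, named only for this sketch.
setClass("Base", representation(x = "numeric"))
setClass("ComplicatedDerived", contains = "Base")

setGeneric("process", function(this) standardGeneric("process"))
setGeneric("detail",  function(this) standardGeneric("detail"))

# detail() is deliberately defined for the derived class only.
setMethod("detail", "ComplicatedDerived",
          function(this) "detail for ComplicatedDerived")

# The Base method relies on the "virtual" generic detail().
setMethod("process", "Base",
          function(this) paste("Base process calls", detail(this)))

setMethod("process", "ComplicatedDerived", function(this) {
  # preparation(this) would go here
  callNextMethod()   # runs the Base method; 'this' keeps its derived class
})

obj <- new("ComplicatedDerived", x = 1)
res <- process(obj)
res
```

Because the object is passed on unchanged rather than converted with as(this, "Base"), the call to detail() inside the Base method still dispatches on ComplicatedDerived.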


Many thanks,

Michael



[R] Subseting a data.frame

2013-10-17 Thread Katherine Gobin
Dear Forum,

I have a data frame as 

mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency = c(0.15, 
0.07, 0.03, 0.001))

 mydat
  basel_asset_class defa_frequency
1                 2          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001


I need to get the subset of this data.frame where no of records for the given 
basel_asset_class is  2, i.e. I need to obtain subset of above data.frame as 
(since there is only 1 record, against basel_asset_class = 2, I want to filter 
it)

 mydat_a
  basel_asset_class defa_frequency
1                 8          0.070
2                 8          0.030
3                 8          0.001

Kindly guide

Katherine


Re: [R] Comparing two groups

2013-10-17 Thread Andrej
So why not start with some statistical textbook? There are plenty of them
available in CRAN. 

I wasn't implying, that I haven't read any textbook, or didn't do any
research. I read some textbooks/Papers/etc. during the research about what
to do and came across the wilcox test. I meant to imply that I could have
problems understanding some of the answers, and that maybe additional
explaining would be necessary.

My doubts stem from the fact, that the wilcox test is a - as far as I know -
ranking test, that states if two groups are different. My assumption is, due
to the fact that the second group has a much higher sample size, it is clear
that it differs from the first group. I performed a t-test (just to see; I
am aware that I am not allowed to perform it, because my samples aren't
normally distributed) and it gave me a p-value of 0.3.
Actually I am not even entirely sure, if wilcox is the right test. I just
want to know if the means of the two groups are significantly different.



--
View this message in context: 
http://r.789695.n4.nabble.com/Comparing-two-groups-tp4678190p4678277.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Rich Shepard

On Thu, 17 Oct 2013, Richard M. Heiberger wrote:


put the pch into the par.settings


Richard,

  Tried this again, but I'm not finding the proper location within
par.settings.

par.settings = list(superpose.points = list(col = rainbow(7)),
superpose.lines = list(col = rainbow(7)), pch = 19)


If I put it prior to the (list ... group there's an error of an extra = ;
when I put it anywhere in the list (the above is one of my tries), it has no
effect on the legend symbols: they remain as outlines.

  What have I missed?

Thanks,

Rich

--
Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
Applied Ecosystem Services, Inc.   |
http://www.appl-ecosys.com Voice: 503-667-4517  Fax: 503-667-8863



Re: [R] Subseting a data.frame

2013-10-17 Thread Charles Determan Jr
Katherine,

There are multiple ways to do this and I highly recommend you look into a
basic R manual or search the forums.  One quick example would be:

mysub <- subset(mydat, basel_asset_class > 2)
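
If the goal is to filter on the number of records per class rather than on the class value itself, a count-based version could look like the sketch below. It assumes the intent is to keep classes that occur at least twice (the comparison operator in the original question did not survive); the names n_per_class and mydat_a are used only for illustration.

```r
mydat <- data.frame(basel_asset_class = c(2, 8, 8, 8),
                    defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# Number of records in each row's class, aligned with the rows of mydat.
n_per_class <- ave(seq_along(mydat$basel_asset_class),
                   mydat$basel_asset_class, FUN = length)

# Assumption: keep classes with at least 2 records.
mydat_a <- mydat[n_per_class >= 2, ]
rownames(mydat_a) <- NULL
mydat_a
```

Unlike a test on the value of basel_asset_class, this keeps working when the data contain other classes with only one record.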

Cheers,
Charles


On Thu, Oct 17, 2013 at 1:55 AM, Katherine Gobin
katherine_go...@yahoo.com wrote:

 Dear Forum,

 I have a data frame as

 mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency =
 c(0.15, 0.07, 0.03, 0.001))

  mydat
   basel_asset_class defa_frequency
 1                 2          0.150
 2                 8          0.070
 3                 8          0.030
 4                 8          0.001


 I need to get the subset of this data.frame where no of records for the
 given basel_asset_class is  2, i.e. I need to obtain subset of above
 data.frame as (since there is only 1 record, against basel_asset_class = 2,
 I want to filter it)

  mydat_a
   basel_asset_class defa_frequency
 1                 8          0.070
 2                 8          0.030
 3                 8          0.001

 Kindly guide

 Katherine




-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota



Re: [R] map with inset

2013-10-17 Thread markw
Hi David,

That worked brilliantly! Many thanks. I also had trouble getting subplot()
to work with either TeachingDemos or Hmisc.

Best,
Mark



--
View this message in context: 
http://r.789695.n4.nabble.com/map-with-inset-tp4678341p4678426.html
Sent from the R help mailing list archive at Nabble.com.



[R] Weighted regression markers on scatter plots

2013-10-17 Thread Msugarman
Hi all,

I'm trying to graph the results of a weighted regression analysis. Is anyone
aware of a way to make my markers appear at different sizes to be consistent
with their respective weights?

Thanks,
-Mike Sugarman
Wayne State University



--
View this message in context: 
http://r.789695.n4.nabble.com/Weighted-regression-markers-on-scatter-plots-tp4678370.html
Sent from the R help mailing list archive at Nabble.com.



[R] S4 base class

2013-10-17 Thread Michael Meyer
Greetings,

I have an S4 class B (Base) which defines a function f=f(this=B,...) 
Derived from B we have a class D which also defines a function 
f=f(this=D,...)

In the definition of D::f we want to call the version B::f and could do this by 
simply calling

f(baseClassObject(this),...)

The question is the following:

How do I refer to the base class object from the derived class?



Many thanks 

 
Michael Meyer



[R] Singular Matrix 'a' in solve

2013-10-17 Thread CL Tee
Hi,


I have a set of matrix data named “invest” consisting of 450 observations (75
countries, 6 years) with 7 variables (set as I, pop, inv, gov, c, life, d;
each of which is “numeric[450]”). The procedure is modified from code provided
by B.E. Hansen at http://www.ssc.wisc.edu/~bhansen/progs/ecnmt_00.html.



*Then the variable is being transformed to*

y <- lag_v(i,0)
cf <- lag_v(c,0)
lpop <- lag_v(pop,0)
linv <- lag_v(inv,0)
lgov <- lag_v(gov,0)
d1 <- lag_v(d,0)
llife <- lag_v(life,0)
yt <- tr(y)
ct <- tr(cf)



y, cf, lpop, linv, lgov, d1, llife each is in “375x1 double matrix”

yt and ct each is “300x1 double matrix”

(I use R Studio so these characteristics are stated).



*The lag_v() and tr() process is as below:*

# n (number of countries) and t (number of years) are assumed to be defined earlier
max_lag <- 1
tt <- t-max_lag
ty <- n*(t-max_lag-1)

lag_v <- function(x,lagn){
  yl <- matrix(c(0),nrow=n,ncol=t)
  for (i in 1:n) {
    yl[i,] <- x[(1+(i-1)*t):(t*i)]
  }
  yl <- yl[,(1+max_lag-lagn):(t-lagn)]
  out <- matrix(t(yl),nrow=nrow(yl)*ncol(yl),ncol=1)
  out
}

tr <- function(y){
  yf <- matrix(c(0),nrow=n,ncol=tt)
  for (i in 1:n) {
    yf[i,] <- y[(1+(i-1)*tt):(tt*i)]
  }
  yfm <- yf - colMeans(t(yf))
  yfm <- yfm[,1:(tt-1)]
  out <- matrix(t(yfm),nrow=nrow(yfm)*ncol(yfm),ncol=1)
  out
}



*Then before the computation, something is being setup*

x - cbind(lpop, linv, lgov, llife, cf)

… (skip as I think is unrelated with the problem encounter)



*And, in the early stage of computation:*

sse_calc <- function(y,x){
  e <- y-x%*%qr.solve(x,y)
  out <- t(e)%*%e
  out
}

…



*It comes out with*

Error in qr.solve(x, y) : singular matrix 'a' in solve



I thought only a square matrix would have this kind of problem. Would qr()
help in this case? Or is there any other possible solution for this problem?
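
As a hedged illustration of where this error can come from: qr.solve() also stops with this message for a non-square matrix when its columns are linearly dependent, i.e. when the numerical rank is smaller than the number of columns. The sketch below uses simulated data (not the invest data set), and the variable names are made up for the example.

```r
set.seed(42)                        # assumption: simulated data, not the poster's
x1 <- rnorm(10)
x2 <- rnorm(10)
x  <- cbind(x1, x2, x3 = x1 + x2)   # third column is an exact linear combination
y  <- rnorm(10)

rank_x <- qr(x)$rank                # numerical rank; smaller than ncol(x) here

# Same call as in sse_calc(); fails although x is 10 x 3, not square.
res <- try(qr.solve(x, y), silent = TRUE)

# lm.fit() pivots the dependent column out; its coefficient is reported as NA.
fit <- lm.fit(x, y)
fit$coefficients
```

Comparing qr(x)$rank with ncol(x) on the real regressor matrix would show whether one of the columns (for example a dummy or a demeaned variable) is a combination of the others.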



Thanks.



[R] representing points in 3D space with trajectories over time

2013-10-17 Thread Umut Toprak
Dear all,

I have a problem where I must represent points with XYZ coordinates
changing over time. I will do a number of operations on this data such as
calculating the YZ-projection distance of the points to the origin over
time, the frequency spectrum of the X-T data etc. I am trying to find a
good way of representing this data with an appropriate data structure.

It appears like higher-dimensional data frames are not allowed and I do not
know if I should use a list of data frames or if there is a better
solution, possibly as part of an external package.
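
One base R option, sketched here under assumed dimensions and made-up data, is a three-dimensional array indexed by time, point and coordinate. The names traj, yz_dist and x_spec, and the synthetic trajectories, are hypothetical and only serve to show the two operations mentioned above.

```r
n_time <- 100                       # assumption: number of time steps
n_pts  <- 3                         # assumption: number of tracked points
tt <- seq(0, 1, length.out = n_time)

# Array indexed as [time, point, coordinate].
traj <- array(NA_real_, dim = c(n_time, n_pts, 3),
              dimnames = list(NULL, paste0("p", seq_len(n_pts)),
                              c("x", "y", "z")))
for (p in seq_len(n_pts)) {         # synthetic trajectories for illustration
  traj[, p, "x"] <- sin(2 * pi * p * tt)
  traj[, p, "y"] <- cos(2 * pi * p * tt)
  traj[, p, "z"] <- tt
}

# Distance of the YZ projection to the origin: an n_time x n_pts matrix.
yz_dist <- sqrt(traj[, , "y"]^2 + traj[, , "z"]^2)

# Amplitude spectrum of the X-T series, one column per point.
x_spec <- apply(traj[, , "x"], 2, function(x) Mod(fft(x)))
```

A long-format data frame with columns for point, time, x, y and z would be an alternative if the points are observed at different times.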

Thank you for your time
Umut Toprak



[R] Incorporate Julia into R

2013-10-17 Thread Timo Schmid
Hi,

I have some code in R with a lot of matrix multiplication and inverting. R can 
be very slow for larger matrices like 5000x5000.
I have seen the new programming language Julia (www.julialang.org) which is 
quite fast in doing matrix algebra. So my idea is to set up the simulations in 
R and start the first calculations, then I want to give some objects to Julia 
and do there some matrix algebra and give the results back to R. 
Is this possible or does anybody know how to do this? Is there a package 
available?
A short example with some lines of code would be also very helpful. 

Thanks in advance,
Timo
  


Re: [R] match values in dependence of ID and Date

2013-10-17 Thread arun
Hi,

I think based on your title, the output you provided is not clear. If it 
depends on Date, there should be four columns.
library(reshape2)

res1 <- dcast(merge(dat,dat2,by="ID"),ID+Name~Date,value.var="Value")
 colnames(res1)[3:6] <- c("First", "Second", "Third", "Fourth")
 rownames(res1) <- 1:nrow(res1)


#or
res2 <- 
reshape(merge(dat,dat2,by="ID"),idvar=c("ID","Name"),timevar="Date",direction="wide")
 dimnames(res2) <- dimnames(res1)

 res2
#  ID Name First Second Third Fourth
#1  1 Andy    10     15    NA     NA
#2  2 John     7     NA    10     15
#3  3  Amy    10     NA    NA     NA


A.K.






On Thursday, October 17, 2013 9:31 AM, arun smartpink...@yahoo.com wrote:
Hi,
Try:
dat <- read.table(text="
ID    Name
1    Andy
2    John
3    Amy",sep="",header=TRUE,stringsAsFactors=FALSE)

dat2 <- read.table(text="
ID  Date    Value
1    2013-10-01    10
1    2013-10-02    15
2    2013-10-01    7
2    2013-10-03    10
2    2013-10-04    15
3    2013-10-01    
10",sep="",header=TRUE,colClasses=c("numeric","Date","numeric"))

library(plyr)

 res <- 
reshape(ddply(merge(dat,dat2,by="ID"),.(ID),mutate,id=((seq_along(ID)-1)%%3+1))[,-3],idvar=c("ID","Name"),timevar="id",direction="wide")
 rownames(res) <- 1:nrow(res)
 colnames(res)[3:5] <- c("First", "Second", "Third")

 res
#  ID Name First Second Third
#1  1 Andy    10     15    NA
#2  2 John     7     10    15
#3  3  Amy    10     NA    NA
A.K.







On Thursday, October 17, 2013 7:42 AM, Mat matthias.we...@fnt.de wrote:
Hello together,

I have a little problem; maybe you can help me.

I have a data.frame like this one:

ID    Name
1     Andy
2     John
3     Amy

and a data.frame like this:

ID   Date            Value
1    2013-10-01    10
1    2013-10-02    15
2    2013-10-01    7
2    2013-10-03    10
2    2013-10-04    15
3    2013-10-01    10

the result should be this one:

ID    Name   First   Second    Third
1     Andy    10     15
2     John     7      10           15
3     Amy     10

maybe you can help me, to do this?

Thank you.

Mat



--
View this message in context: 
http://r.789695.n4.nabble.com/match-values-in-dependence-of-ID-and-Date-tp4678433.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Richard M. Heiberger
par.settings = list(
   superpose.points = list(col = rainbow(7), pch = 19),
   superpose.lines = list(col = rainbow(7))
)

On Thu, Oct 17, 2013 at 11:48 AM, Rich Shepard rshep...@appl-ecosys.com wrote:
 On Thu, 17 Oct 2013, Richard M. Heiberger wrote:

 put the pch into the par.settings


 Richard,

   Tried this again, but I'm not finding the proper location within
 par.settings.


 par.settings = list(superpose.points = list(col = rainbow(7)),
 superpose.lines = list(col = rainbow(7)), pch = 19)


 If I put it prior to the (list ... group there's an error of an extra = ;
 when I put it anywhere in the list (the above is one of my tries), it has no
 effect on the legend symbols: they remain as outlines.

   What have I missed?


 Thanks,

 Rich

 --
 Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
 Applied Ecosystem Services, Inc.   |
 http://www.appl-ecosys.com Voice: 503-667-4517  Fax: 503-667-8863



Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Rich Shepard

On Thu, 17 Oct 2013, Richard M. Heiberger wrote:


par.settings = list(
  superpose.points = list(col = rainbow(7), pch = 19),
  superpose.lines = list(col = rainbow(7))
)


  I had tried that, too. Legend symbols stubbornly remain unfilled.
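
Two things may be worth checking here, offered as a sketch rather than a confirmed fix for this data set. First, lattice's parameter names are superpose.symbol and superpose.line (superpose.points and superpose.lines are not among the names returned by trellis.par.get()). Second, a key built with key = simpleKey(...) is computed when xyplot() is called and so does not pick up par.settings, whereas auto.key does. The data frame d below is made up for illustration.

```r
library(lattice)

set.seed(1)
d <- data.frame(x = rep(1:5, 3),            # hypothetical stand-in data
                y = runif(15),
                g = factor(rep(c("a", "b", "c"), each = 5)))

p <- xyplot(y ~ x, data = d, groups = g, type = "p",
            # auto.key builds the legend from the settings below
            auto.key = list(space = "right", points = TRUE, lines = FALSE),
            par.settings = list(
              superpose.symbol = list(col = rainbow(3), pch = 19)))
p   # at the console this draws the plot with a matching legend
```

With the symbol type and colours set in one place, the panel points and the legend points are drawn from the same settings.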

Thanks, Richard,

Rich



Re: [R] match values in dependence of ID and Date

2013-10-17 Thread arun
Hi,
Try:
dat <- read.table(text="
ID    Name
1    Andy
2    John
3    Amy",sep="",header=TRUE,stringsAsFactors=FALSE)

dat2 <- read.table(text="
ID  Date    Value
1    2013-10-01    10
1    2013-10-02    15
2    2013-10-01    7
2    2013-10-03    10
2    2013-10-04    15
3    2013-10-01    
10",sep="",header=TRUE,colClasses=c("numeric","Date","numeric"))

library(plyr)

 res <- 
reshape(ddply(merge(dat,dat2,by="ID"),.(ID),mutate,id=((seq_along(ID)-1)%%3+1))[,-3],idvar=c("ID","Name"),timevar="id",direction="wide")
 rownames(res) <- 1:nrow(res)
 colnames(res)[3:5] <- c("First", "Second", "Third")

 res
#  ID Name First Second Third
#1  1 Andy    10     15    NA
#2  2 John     7     10    15
#3  3  Amy    10     NA    NA
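
The same reshaping can probably also be done without plyr. The sketch below uses only base R: ave() numbers the records within each ID after sorting by date, and reshape() spreads them into columns. The helper column k and the object names m and wide are assumptions made for this example.

```r
dat <- data.frame(ID = 1:3, Name = c("Andy", "John", "Amy"),
                  stringsAsFactors = FALSE)
dat2 <- data.frame(ID = c(1, 1, 2, 2, 2, 3),
                   Date = as.Date(c("2013-10-01", "2013-10-02", "2013-10-01",
                                    "2013-10-03", "2013-10-04", "2013-10-01")),
                   Value = c(10, 15, 7, 10, 15, 10))

m <- merge(dat, dat2, by = "ID")
m <- m[order(m$ID, m$Date), ]

# k = position of each record within its ID (1, 2, 3, ...), in date order.
m$k <- ave(m$Value, m$ID, FUN = seq_along)

wide <- reshape(m[, c("ID", "Name", "k", "Value")],
                idvar = c("ID", "Name"), timevar = "k", direction = "wide")
names(wide)[3:5] <- c("First", "Second", "Third")
rownames(wide) <- NULL
wide
```

Numbering by position within ID, rather than by calendar date, is what gives John the values 7, 10, 15 in consecutive columns.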
A.K.






On Thursday, October 17, 2013 7:42 AM, Mat matthias.we...@fnt.de wrote:
Hello together,

I have a little problem; maybe you can help me.

I have a data.frame like this one:

ID    Name
1     Andy
2     John
3     Amy

and a data.frame like this:

ID   Date            Value
1    2013-10-01    10
1    2013-10-02    15
2    2013-10-01    7
2    2013-10-03    10
2    2013-10-04    15
3    2013-10-01    10

the result should be this one:

ID    Name   First   Second    Third
1     Andy    10     15
2     John     7      10           15
3     Amy     10

maybe you can help me, to do this?

Thank you.

Mat



--
View this message in context: 
http://r.789695.n4.nabble.com/match-values-in-dependence-of-ID-and-Date-tp4678433.html
Sent from the R help mailing list archive at Nabble.com.



[R] plot - how to vary the distances of the x axis?

2013-10-17 Thread Hermann Norpois
Hello,


my dots at 0 and 2 are quite close to the margin. So I would like to move
the 0 and the 2 both towards the 1. I would like my dots to be more centered.
And: I don't need so much space between 0, 1 and 2.

How does it work?
I tried:

plot (data, axes=FALSE, main=i, ylab= expression (z^2))
  plot.window (xlim=c (0,2), ylim=c(0,80))
  box (lwd=2)
  axis (side=1, at = c (0,1,2))
  axis (side =2)

dput (data)
structure(list(Genotype = c(0, 0, 0, 1, 1, 1, 1, 1, 2), z =
c(0.66429502114682,
0.258444359570075, 0.0702937908415368, 0.694376498254858,
0.0967863570760579,
0.213966209301163, 0.671497050546114, 0.60318070802847, 75.6011068681301
)), .Names = c("Genotype", "z"), row.names = c(NA, 9L), class =
"data.frame")


Thanks
attachment: move.png


Re: [R] Subseting a data.frame

2013-10-17 Thread Bert Gunter
Kindly guide ...

This is a very basic question, so the kindest guide I can give is to read
an Introduction to R (ships with R) or a R web tutorial of your choice so
that you can learn how R works instead of posting to this list.

Cheers,
Bert


On Wed, Oct 16, 2013 at 11:55 PM, Katherine Gobin katherine_go...@yahoo.com
 wrote:

 Dear Forum,

 I have a data frame as

 mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency =
 c(0.15, 0.07, 0.03, 0.001))

  mydat
   basel_asset_class defa_frequency
 1                 2          0.150
 2                 8          0.070
 3                 8          0.030
 4                 8          0.001


 I need to get the subset of this data.frame where no of records for the
 given basel_asset_class is  2, i.e. I need to obtain subset of above
 data.frame as (since there is only 1 record, against basel_asset_class = 2,
 I want to filter it)

  mydat_a
   basel_asset_class defa_frequency
 1                 8          0.070
 2                 8          0.030
 3                 8          0.001

 Kindly guide

 Katherine




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374



Re: [R] plot - how to vary the distances of the x axis?

2013-10-17 Thread Bretschneider (R)

On 17 Oct 2013, at 13:44 , Hermann Norpois wrote:

 Hello,
 
 
 my dots at 0 and 2 are quite close to the margin, so I would like to move
 the 0 and the 2 both towards the 1. I want my dots to be more centered.
 Also, I don't need so much space between 0, 1 and 2.
 
 How does it work?
 I tried:
 
 plot (data, axes=FALSE, main=i, ylab= expression (z^2))
  plot.window (xlim=c (0,2), ylim=c(0,80))
  box (lwd=2)
  axis (side=1, at = c (0,1,2))
  axis (side =2)
 
 dput (data)
 structure(list(Genotype = c(0, 0, 0, 1, 1, 1, 1, 1, 2), z =
 c(0.66429502114682,
 0.258444359570075, 0.0702937908415368, 0.694376498254858,
 0.0967863570760579,
 0.213966209301163, 0.671497050546114, 0.60318070802847, 75.6011068681301
 )), .Names = c("Genotype", "z"), row.names = c(NA, 9L), class =
 "data.frame")
 
 
 Thanks




If I understand what you want, set xlim a bit wider, within the
plot statement: xlim=c(-0.4, 2.4), ylim=c(0, 80)
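
A minimal sketch of this suggestion, using the data from the dput() output in the question. The exact limits are a matter of taste; the point is that xlim and ylim are passed to plot() itself, since a plot.window() call made after plot() does not move points that are already drawn.

```r
# Data as posted in the question
data <- data.frame(
  Genotype = c(0, 0, 0, 1, 1, 1, 1, 1, 2),
  z = c(0.66429502114682, 0.258444359570075, 0.0702937908415368,
        0.694376498254858, 0.0967863570760579, 0.213966209301163,
        0.671497050546114, 0.60318070802847, 75.6011068681301)
)

# Widen the x range so that 0 and 2 sit away from the box edges
plot(data, axes = FALSE, xlim = c(-0.4, 2.4), ylim = c(0, 80),
     ylab = expression(z^2))
box(lwd = 2)
axis(side = 1, at = c(0, 1, 2))
axis(side = 2)
```

A wider xlim pulls the three positions towards the centre; the `at` argument of axis() keeps tick marks only at 0, 1 and 2.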

Hope this helps, 
Best wishes,


Franklin
-




Dr. Franklin Bretschneider
Dept of Biology
Utrecht University
Padualaan 8
3584 CH  Utrecht
The Netherlands
f.bretschnei...@uu.nl





Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Richard M. Heiberger
That should have worked.  I think something else is interfering.
Did you redefine either T or F?

Please send the output from dput(head(ffg.st))
so we can experiment in your setting.
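
A short sketch of how one might check whether T or F has been redefined in the current session. It assumes only base R; the variable names are illustrative.

```r
# T and F are ordinary variables, so they can be reassigned by accident
defaults_intact <- isTRUE(T) && identical(F, FALSE)

# Names among "T" and "F" that are defined in the global environment
# and therefore mask the versions in base
masked <- Filter(function(nm) exists(nm, envir = globalenv(), inherits = FALSE),
                 c("T", "F"))

defaults_intact
masked   # rm(list = masked) would remove the masking definitions
```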

Rich

On Thu, Oct 17, 2013 at 12:12 PM, Rich Shepard rshep...@appl-ecosys.com wrote:
 On Thu, 17 Oct 2013, Richard M. Heiberger wrote:

 par.settings = list(
   superpose.points = list(col = rainbow(7), pch = 19),
   superpose.lines = list(col = rainbow(7))
 )


   I had tried that, too. Legend symbols stubbornly remain unfilled.

 Thanks, Richard,

 Rich




Re: [R] Constraint on regression parameters

2013-10-17 Thread Greg Snow
You want the offset function in the formula:

lm( A ~ B + I(B^2) + offset(C), data=Dataset)

This will force the coefficient on C to be 1, if you wanted a coefficient
of another value then just do the multiplication yourself, e.g. offset( 2 *
C ) for a slope of 2.

Also, you can use poly(B,2) to fit the linear and quadratic terms in B.
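
A small sketch of the offset() approach on simulated data. The data set below is hypothetical and only mirrors the variable names used in the question; note that poly() produces orthogonal polynomials by default, so raw = TRUE is used here to keep the coefficients comparable with B + I(B^2).

```r
set.seed(42)
# Simulated data (hypothetical); the true coefficient on C is 1
Dataset <- data.frame(B = runif(50), C = rnorm(50))
Dataset$A <- 1 + 2 * Dataset$B - 3 * Dataset$B^2 + Dataset$C +
  rnorm(50, sd = 0.1)

# C enters with its coefficient fixed at 1; no coefficient is estimated for it
fit <- lm(A ~ B + I(B^2) + offset(C), data = Dataset)
coef(fit)

# Same estimates obtained by moving C to the left-hand side
fit_check <- lm(I(A - C) ~ B + I(B^2), data = Dataset)
all.equal(unname(coef(fit)), unname(coef(fit_check)))

# Fixed slope of 2 on C, with raw polynomial terms in B
fit2 <- lm(A ~ poly(B, 2, raw = TRUE) + offset(2 * C), data = Dataset)
coef(fit2)
```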


On Thu, Oct 17, 2013 at 3:45 AM, Robert U tacsun...@yahoo.fr wrote:

 Dear all,

 I have been trying to find a simple solution to my problem without
 success, though I have a feeling a simple syntax detail could do the job.

 I am doing a polynomial linear regression with 2 independent variables
 such as:

 lm(A ~ B + I(B^2) + I(lB^3) + C, data=Dataset))

 R returns a coefficient per independent variable, and I need the
 coefficient of the C parameter to equal 1.


 I've been looking at parameter constraints on the internet, but it's
 always much more complicated than just removing the fit of a coefficient
 (or setting it to 1).


 I know many packages allow not fitting an intercept with a -1 term
 in the syntax; does that exist for independent variables?

 Regards,




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



[R] Selecting maximums between different variables

2013-10-17 Thread Tim Umbach
Hi there,

another beginner's question, I'm afraid. Basically I want to select the
maximum of values that correspond to different variables. I have a table
of oil production that looks somewhat like this:

oil <- data.frame( YEAR = c(2011, 2012),
   TX = c(2, 3),
   CA = c(4, 25000),
   AL = c(2, 21000),
   ND = c(21000,6))

Now I want to find out, which state produced most oil in a given year. I
tried this:

attach(oil)
last_year = oil[ c(YEAR == 2012), ]
max(last_year)

Which works, but it doesn't give me the corresponding values (i.e. it just
gives me the maximum output, not which state it's from).
So I tried this:

oil[c(oil == max(last_year)),]
and this:
oil[c(last_year == max(last_year)),]
and this:
oil[which.max(last_year),]
and this:
last_year[max(last_year),]

None of them work, but they don't give error messages either; the output is
just NA. The problem is, in my eyes, that I'm comparing the values of
different variables with each other. If I change the structure of
the data frame (which I can't do with the real data, at least not without
doing it by hand with a huge dataset), it looks like this and works
perfectly:

oil2 <- data.frame (
  names = c('YEAR', 'TX', 'CA', 'AL', 'ND'),
  oil_2011 = c(2011, 2, 4, 2, 21000),
  oil_2012 = c(2012, 3, 25000, 21000, 6)
  )
attach(oil2)
oil2[c(oil_2012 == max(oil_2012)),]

Any help is much appreciated.

Thanks, Tim Umbach



Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Rich Shepard

On Thu, 17 Oct 2013, Richard M. Heiberger wrote:


That should have worked.


  That's what I thought when I first tried it.


I think something else is interfering. Did you redefine either T or F?


  Not intentionally.


Please send the output from dput(head(ffg.st)) so we can experiment in
your setting.


structure(list(sampdate = structure(c(13326, 13326, 13326, 13326,
13326, 13326), class = "Date"), func_feed_grp = structure(c(1L,
2L, 3L, 4L, 6L, 7L), .Label = c("Filterer", "Gatherer", "Grazer",
"Omnivore", "Parasite", "Predator", "Shredder"), class = "factor"),
quant = c(812L, 1880L, 624L, 11L, 948L, 1540L), pct.quant = c(0.14,
0.323, 0.107, 0.002, 0.163, 0.265), num.taxa = c(11L, 28L,
4L, 1L, 12L, 3L), pct.num.taxa = c(0.186, 0.475, 0.068, 0.017,
0.203, 0.051)), .Names = c("sampdate", "func_feed_grp", "quant",
"pct.quant", "num.taxa", "pct.num.taxa"), row.names = 102:107, class =
"data.frame")
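
One thing that may be worth checking, offered as an assumption rather than a diagnosis of this thread: the lattice settings that control grouped points and lines are named superpose.symbol and superpose.line. A minimal sketch with made-up data (the data frame d and its columns are hypothetical) in which the legend picks up filled symbols:

```r
library(lattice)

set.seed(1)
# Hypothetical data: three groups observed at five x positions
d <- data.frame(x = rep(1:5, times = 3),
                y = rnorm(15),
                g = factor(rep(c("Filterer", "Gatherer", "Grazer"), each = 5)))

p <- xyplot(y ~ x, data = d, groups = g, type = "b",
            auto.key = list(points = TRUE, lines = TRUE, columns = 3),
            par.settings = list(
              superpose.symbol = list(col = rainbow(3), pch = 19),
              superpose.line   = list(col = rainbow(3))
            ))
print(p)
```

Because auto.key reads the same par.settings as the panel, the key and the plotted points share the colours and the filled plotting character.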

Rich

--
Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
Applied Ecosystem Services, Inc.   |
http://www.appl-ecosys.com Voice: 503-667-4517  Fax: 503-667-8863



Re: [R] Comparing two groups

2013-10-17 Thread Greg Snow
From your question it is not clear what your question/concerns really are,
and from what we can see it could very well be that you do not understand
the statistics that you are computing (not just the R implementation).  We
ask for a reproducible example because that helps us to help you, just a
couple of boxplots let us make some guesses, but we do not know the data
values or even the means and standard deviations, even the actual sample
sizes could help.

From the graph it is not surprising that the wilcox test says that the 2
groups are different and that the t test says that they are not (but
knowing data values would help even more).  The 2 tests are testing very
different hypotheses.  The wilcox test is testing that the 2 distributions
are identical, and the more specific way it tests that is by looking at all
possible pairs between the 2 groups and seeing what proportion of them have
each group higher; if the null were true then half the time the data point
from mixed would be higher than the data point from monoculture and half
the time the other way.  From the boxplot we can see that the median of
monoculture is below the 1st quartile of mixed, so it is not surprising at
all that the wilcox test rejects the null hypothesis.

The t-test (which version you used you do not say) is testing if the means
are equal; since monoculture is clearly skewed to the right with potential
outliers, it would not be surprising if the sample means were close enough
to each other that the t-test does not see a significant difference.  The 2
tests give different answers because they are answering very different
questions.
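
To make the contrast concrete, here is a small sketch on simulated data. The two samples are hypothetical stand-ins (a small, roughly symmetric group and a larger right-skewed one), not the poster's data, and the two p-values need not agree.

```r
set.seed(123)
# Hypothetical samples: a small symmetric group and a larger right-skewed one
mixed <- rnorm(15, mean = 10, sd = 2)
monoculture <- rlnorm(75, meanlog = 1.8, sdlog = 0.9)

# Rank-based comparison of the two distributions
w <- wilcox.test(mixed, monoculture)

# Comparison of means (Welch's t-test is R's default for two samples)
tt <- t.test(mixed, monoculture)

c(wilcoxon = w$p.value, welch_t = tt$p.value)
c(mean_mixed = mean(mixed), mean_mono = mean(monoculture),
  median_mixed = median(mixed), median_mono = median(monoculture))
```

Comparing the means and medians printed at the end shows how a long right tail can pull the means together while the bulk of the two distributions stays apart.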

You state that "I am not allowed to perform it", referring to the t-test.
This indicates that you don't have a full understanding or appreciation of
the Central Limit Theorem (an important enough theorem that I have a
cross-stitch based on it hanging on my wall (along with 2 other
cross-stitches of Bayes' theorem and the mean value theorem of
integration)).  The plot shows 18 outliers in the monoculture group, which
implies a sample size of at least 72, which means the other group has a
sample size of at least 14 if I interpret "five times as big" correctly.
This is a large enough sample size for the CLT to tell us the t-test will
give a reasonable approximation (provided the other assumptions hold
reasonably well and you are interested in the question being answered).

So, I believe that the advice to read a textbook, or otherwise get some
help in basic understanding of the statistical tools is reasonable.  Once
you have that, then if you still need help then give us a reproducible
example and make it clear what your question really is and you will be much
more likely to receive an answer.


On Tue, Oct 15, 2013 at 6:01 AM, Andrej andrej.g.mil...@web.de wrote:

 So why not start with some statistical textbook? There are plenty of them
 available in CRAN.

 I wasn't implying, that I haven't read any textbook, or didn't do any
 research. I read some textbooks/Papers/etc. during the research about what
 to do and came across the wilcox test. I meant to imply that I could have
 problems understanding some of the answers, and that maybe additional
 explaining would be necessary.

 My doubts stem from the fact that the wilcox test is, as far as I know, a
 ranking test that states whether two groups are different. My assumption is
 that, because the second group has a much higher sample size, it is clear
 that it differs from the first group. I performed a t-test (just to see; I
 am aware that I am not allowed to perform it, because my samples aren't
 normally distributed) and it gave me a p-value of 0.3.
 Actually I am not even entirely sure if wilcox is the right test. I just
 want to know if the means of the two groups are significantly different.







-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



[R] RWeka and multicore package

2013-10-17 Thread Luís Paulo F . Garcia
I work a lot with the packages RWeka and multicore. If you try to run
J48 or any other tree learner from RWeka with multicore, you get errors.

Example I:

library(RWeka);
library(multicore);

mclapply(1:100, function(i) {
J48(Species ~., iris);
});


Output:  "Error in .jcall(o, \"Ljava/lang/Class;\", \"getClass\") : \n
java.lang.ClassFormatError: Incompatible magic value 1347093252 in class
file java/lang/ProcessEnvironment$StringEnvironment\n"


Example II:

library(multicore);

mclapply(1:100, function(i) {
RWeka::J48(Species ~., iris);
});

Output: Error in .jcall(x$classifier, "S", "toString") :
  RcallMethod: attempt to call a method of a NULL object.


Do you know some way to work with parallel processing and RWeka? I tried
MPI and SNOW without success.

R version 3.0.2 (2013-09-25) -- Frisbee Sailing
Ubuntu 12.04 x64


-- 
Luís Paulo Faina Garcia
Engenheiro de Computação - Universidade de São Paulo
São Carlos - SP - Brasil



Re: [R] S4 base class

2013-10-17 Thread Martin Morgan

On 10/17/2013 08:54 AM, Michael Meyer wrote:


Suppose you have a base class Base which implements a function Base::F
which works in most contexts but not in the context of ComplicatedDerived 
class
where some preparation has to happen before this very same function can be 
called.

You would then define

void ComplicatedDerived::F(...){

 preparation();
 Base::F();
}

You can nearly duplicate this in R via

setMethod("F",
signature(this="ComplicatedDerived"),
definition=function(this){

 preparation(this)
 F(as(this,"Base"))
})

but it will fail whenever F uses virtual functions (i.e. generics) which are 
only defined
for derived classes of Base


With

  .A <- setClass("A", representation(a="numeric"))
  .B <- setClass("B", representation(b="numeric"), contains="A")

  setGeneric("f", function(x, ...) standardGeneric("f"))

  setMethod("f", "A", function(x, ...) {
      message("f,A-method")
      g(x, ...)   # generic with methods only for derived classes
  })

  setMethod("f", "B", function(x, ...) {
      message("f,B-method")
      callNextMethod(x, ...)  # earlier response from Duncan Murdoch
  })

  setGeneric("g", function(x, ...) standardGeneric("g"))

  setMethod("g", "B", function(x, ...) {
      message("g,B-method")
      x
  })

one has

 f(.B())
f,B-method
f,A-method
g,B-method

An object of class B
Slot b:
numeric(0)

Slot a:
numeric(0)

?


--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



Re: [R] RWeka and multicore package

2013-10-17 Thread CEO'Riley
I received the following error message with the multicore package:

install.packages("multicore")
Warning in install.packages :
  package ‘multicore’ is not available (for R version 3.0.2)
Warning in install.packages :
  package ‘multicore’ is not available (for R version 3.0.2)
Warning message:
package ‘multicore’ is not available (for R version 3.0.2)
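
For context, mclapply() is also available from the parallel package, which has shipped with R since version 2.14.0 and incorporates code derived from multicore. A minimal sketch, assuming only base R; it shows where mclapply() can be found and does not address the rJava errors reported in this thread. mc.cores is set to 1 so that the snippet also runs on Windows, where forking is unavailable.

```r
library(parallel)

# mclapply() from the base 'parallel' package; mc.cores = 1 runs serially
squares <- mclapply(1:4, function(i) i^2, mc.cores = 1L)
unlist(squares)
```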


With gratitude,
CEO'Riley Jr.
Charles Ellis O'Riley Jr.

Ambition is a state of permanent dissatisfaction with the present




Re: [R] Selecting maximums between different variables

2013-10-17 Thread arun
Hi,
You may try:


unlist(lapply(seq_len(nrow(oil)),function(i) oil[i,-1][which.max(oil[i,-1])]))
 #  CA    ND 
#4 6 
#or
library(reshape2)

datM <- melt(oil,id.var="YEAR")


datM[as.logical(with(datM,ave(value,list(YEAR),FUN= function(x) x%in% max(x)))),]
#  YEAR variable value
#3 2011   CA 4
#8 2012   ND 6
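
A base-R alternative, sketched with max.col(). The production figures below are hypothetical placeholders, since the values in the thread are incomplete; states, top and result are illustrative names.

```r
# Hypothetical production figures, one row per year and one column per state
oil <- data.frame(YEAR = c(2011, 2012),
                  TX = c(20000, 30000),
                  CA = c(40000, 25000),
                  AL = c(20000, 21000),
                  ND = c(21000, 60000))

states <- setdiff(names(oil), "YEAR")
m <- as.matrix(oil[states])

# Column index of the largest value in each row
top <- max.col(m, ties.method = "first")

result <- data.frame(YEAR = oil$YEAR,
                     state = states[top],
                     value = m[cbind(seq_len(nrow(m)), top)],
                     stringsAsFactors = FALSE)
result
```

Dropping the YEAR column before searching keeps the year itself from being compared with the production values.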

A.K.






Re: [R] Weighted regression markers on scatter plots

2013-10-17 Thread Greg Snow
The simplest approach is to specify the cex parameter in the call to plot.
 plot(1:3, 1:3, cex=3:1) for example will plot the 1st point 3 times as
big, the 2nd 2 times as big, and the 3rd at the standard size.

You can get more control by using the symbols function instead of the plot
function and set the diameter of circles directly.  In either case you
probably want to scale by the square root of the weight.

The my.symbols function in the TeachingDemos package is another option if
the symbols function does not include the symbol you want or if you want a
little different level of control.
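
A brief sketch of both base-graphics options. The data and weights are simulated and the scaling constants are arbitrary choices; note that the circles argument of symbols() is interpreted as radii.

```r
set.seed(1)
# Hypothetical data and regression weights
x <- 1:10
y <- 2 * x + rnorm(10)
w <- runif(10, min = 1, max = 10)

fit <- lm(y ~ x, weights = w)

# Marker area proportional to weight: scale cex by the square root
cex_w <- 3 * sqrt(w / max(w))
plot(x, y, cex = cex_w, pch = 16)
abline(fit)

# Alternative with symbols(); 'circles' gives the radii
symbols(x, y, circles = sqrt(w), inches = 0.2)
abline(fit)
```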


On Wed, Oct 16, 2013 at 11:04 AM, Msugarman mike.sugar...@wayne.edu wrote:

 Hi all,

 I'm trying to graph the results of a weighted regression analysis. Is
 anyone aware of a way to make my markers appear at different sizes to be
 consistent with their respective weights?

 Thanks,
 -Mike Sugarman
 Wayne State University







-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Selecting maximums between different variables

2013-10-17 Thread Berend Hasselman

On 17-10-2013, at 18:48, Tim Umbach tim.umb...@hufw.de wrote:

 Hi there,
 
 another beginner's question, I'm afraid. Basically I want to select the
 maximum of values that correspond to different variables. I have a table
 of oil production that looks somewhat like this:
 
 oil <- data.frame( YEAR = c(2011, 2012),
   TX = c(2, 3),
   CA = c(4, 25000),
   AL = c(2, 21000),
   ND = c(21000,6))
 
 Now I want to find out, which state produced most oil in a given year. I
 tried this:
 
 attach(oil)
 last_year = oil[ c(YEAR == 2012), ]
 max(last_year)
 

For a single year do

year <- which(oil[,"YEAR"]==2011)
oil[year,which.max(oil[year,]),drop=FALSE]

In the help look at base::[.data.frame  

Berend




Re: [R] representing points in 3D space with trajectories over time

2013-10-17 Thread Greg Snow
If all your data is numeric then you can use an array instead of a data
frame and arrays can easily be 3, 4, or higher dimensional.  Or you can use
a data frame with a column each for x, y, z, and time; with possible other
columns representing groups or other attributes, essentially a 3
dimensional data frame with the 3rd dimension being stacked rather than
projecting out.
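
Both layouts can be sketched as follows. The array dimensions, names and values are hypothetical; the YZ-projection distance mentioned in the question is computed in each representation.

```r
set.seed(1)
n_points <- 2
n_times <- 5

# points x coordinates x time steps (values are hypothetical)
traj <- array(rnorm(n_points * 3 * n_times),
              dim = c(n_points, 3, n_times),
              dimnames = list(point = c("p1", "p2"),
                              coord = c("x", "y", "z"),
                              time = NULL))

# Distance of the YZ-projection from the origin: a points x time matrix
yz_dist <- sqrt(traj[, "y", ]^2 + traj[, "z", ]^2)

# The same data as a long data frame, one row per point and time step
long <- data.frame(
  point = rep(dimnames(traj)$point, times = n_times),
  time  = rep(seq_len(n_times), each = n_points),
  x = as.vector(traj[, "x", ]),
  y = as.vector(traj[, "y", ]),
  z = as.vector(traj[, "z", ])
)
long$yz_dist <- sqrt(long$y^2 + long$z^2)
```

The array form suits slice-wise arithmetic, while the long data frame is easier to extend with grouping columns or to pass to plotting and modelling functions.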


On Thu, Oct 17, 2013 at 6:59 AM, Umut Toprak umut.top...@unige.ch wrote:

 Dear all,

 I have a problem where I must represent points with XYZ coordinates
 changing over time. I will do a number of operations on this data such as
 calculating the YZ-projection distance of the points to the origin over
 time, the frequency spectrum of the X-T data etc. I am trying to find a
 good way of representing this data with an appropriate data structure.

 It appears like higher-dimensional data frames are not allowed and I do not
 know if I should use a list of data frames or if there is a better
 solution, possibly as part of an external package.

 Thank you for your time
 Umut Toprak





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Subseting a data.frame

2013-10-17 Thread Katherine Gobin
I am sorry, perhaps I was not able to put the question properly. I am not
looking for the subset of the data.frame where the basel_asset_class is > 2. I
do agree that would have been a basic requirement. Let me try to put the
question again.

I have a data frame as 

mydat = data.frame(basel_asset_class = c(4, 8, 8 ,8), defa_frequency = c(0.15, 
0.07, 0.03, 0.001))

# Please note I have changed the basel_asset_class to 4 from 2, to avoid 
confusion.

 mydat
  basel_asset_class defa_frequency
1                 4          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001



This is just a representative example. In reality, I may have any number of
basel asset classes. 4, 8 etc. are IDs and can be anything, so I can't hard code it as
subset(mydat, mydat$basel_asset_class > 2).


What I need is to select only those records for which there are more than two
default frequencies (defa_frequency). Thus, there is only one default frequency
= 0.150 w.r.t. basel_asset_class = 4 whereas there are default frequencies
w.r.t. basel asset class 4; similarly there could be another basel asset class
having, say, 5 default frequencies. Thus, I need to take the subset of the data.frame
s.t. the number of corresponding defa_frequencies is greater than 2.

The idea is that we try to fit an exponential curve Y = A exp( BX ) for each of the
basel asset classes and estimate the values of A and B; mathematically one needs
to have at least two values of X.

I hope I have been able to express my requirement. It's not that I need the subset
of mydat s.t. basel asset class is > 2 (now 4 in the revised example), but the subset
s.t. the number of default frequencies is greater than or equal to 2. This 2 is not
the same as basel asset class 2.
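
One way to express this requirement in base R is to count the records per class with ave() and filter on that count. This is a sketch; min_records is an assumed name, set to 2 following the "at least two values" reasoning above, and can be raised if more records per class are required.

```r
mydat <- data.frame(basel_asset_class = c(4, 8, 8, 8),
                    defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# Minimum number of records a class must have to be kept (assumed here)
min_records <- 2

# Number of records in each row's own basel_asset_class
n_per_class <- ave(mydat$defa_frequency, mydat$basel_asset_class, FUN = length)

mydat_a <- mydat[n_per_class >= min_records, ]
mydat_a
```

Because the count is computed per class, no class ID is hard coded and the same lines work for any number of classes.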

Kindly guide

With warm regards

Katherine Gobin






Re: [R] Subseting a data.frame

2013-10-17 Thread Katherine Gobin
Correction (2nd paragraph, first three lines).

Please read the following lines

What I need is to select only those records for which there are more than two
default frequencies (defa_frequency). Thus, there is only one default frequency
= 0.150 w.r.t. basel_asset_class = 4 whereas there are default frequencies
w.r.t. basel asset class 4,


as

What I need is to select only those records for which there are more than two
default frequencies (defa_frequency). Thus, there is only one default frequency
= 0.150 w.r.t. basel_asset_class = 4 whereas there are THREE default frequencies
w.r.t. basel asset class 8,



I apologize for the inconvenience.

Regards

Katherine








On , Katherine Gobin katherine_go...@yahoo.com wrote:
 
 I am sorry perhaps  was not able to put the question properly. I am not 
looking for the subset of the data.frame where the basel_asset_class is  2. I 
do agree that would have been a basic requirement. Let me try to put the 
question again. 

I have a data frame as 

mydat = data.frame(basel_asset_class = c(4, 8, 8 ,8), defa_frequency = c(0.15, 
0.07, 0.03, 0.001))

# Please note I have changed the basel_asset_class to 4 from 2, to avoid 
confusion.

 mydat
  basel_asset_class defa_frequency
1                 4          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001



This is just an representative example. In reality, I may have no of basel 
asset classes. 4, 8 etc are the IDs can be anything thus I cant hard code it as 
subset(mydat, mydat$basel_asset_class  2).


What I need is to select only those records for which there are more than two 
default frequencies (defa_frequency), Thus, there is only one default frequency 
= 0.150 w.r.t basel_asset_class = 4 whereas there are default frequencies 
w.r.t. basel aseet class 4, similarly there could be another basel asset class 
having say 5 default frequncies. Thus, I need to take subset of the data.frame 
s.t. the no of corresponding defa_frequencies is greater than 2.

The idea is we try to fit exponential curve Y = A exp( BX ) for each of the 
basel asset classes and to estimate values of A and B, mathematically one needs 
to have at least two values of X.

I hope I have been able to express my requirement. It's not that I need the subset 
of mydat s.t. basel asset class is > 2 (now 4 in revised example), but a subset 
s.t. the number of default frequencies is greater than or equal to 2. This 2 is not 
the same as basel asset class 2.

Kindly guide

With warm regards

Katherine Gobin




On Thursday, 17 October 2013 9:33 PM, Bert Gunter gunter.ber...@gene.com 
wrote:
 
Kindly guide ...

This is a very basic question, so the kindest guide I can give is to read An 
Introduction to R (ships with R) or an R web tutorial of your choice so that you 
can learn how R works instead of posting to this list.

Cheers,
Bert




On Wed, Oct 16, 2013 at 11:55 PM, Katherine Gobin katherine_go...@yahoo.com 
wrote:

Dear Forum,

I have a data frame as 

mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency = c(0.15, 
0.07, 0.03, 0.001))

 mydat
  basel_asset_class defa_frequency
1                 2          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001


I need to get the subset of this data.frame where the number of records for the given 
basel_asset_class is > 2, i.e. I need to obtain a subset of the above data.frame as 
(since there is only 1 record against basel_asset_class = 2, I want to filter 
it out)

 mydat_a
  basel_asset_class defa_frequency
1                 8          0.070
2                 8          0.030
3                 8          0.001

Kindly guide

Katherine




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374


Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Richard M. Heiberger
I always get lost in simpleKey.  The approach of directly modifying
the trellis object usually works.

> tmp <- xyplot(pct.quant ~ sampdate, data = ffg.st, groups = func_feed_grp, type =
+ 'p', pch = 19, key = simpleKey(text = levels(ffg.st$func_feed_grp), space =
+ 'right', points = T, lines = F),par.settings = list(superpose.points =
+ list(col = rainbow(7)), superpose.lines = list(col = rainbow(7))), main =
+ 'Functional Feeding Groups (Individuals)', xlab = 'Year',
+ ylab = 'Proportion of Individuals')
> tmp
> str(tmp)
> tmp$legend$right$args$key$points$pch
[1] 1 1 1 1 1 1 1
> tmp$legend$right$args$key$points$pch[] <- 19
> tmp$legend$right$args$key$points$pch
[1] 19 19 19 19 19 19 19
> tmp
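
An alternative that avoids editing the trellis object afterwards is sketched below. It assumes that the symbol settings go through par.settings under the name superpose.symbol, and that the legend is requested with auto.key instead of simpleKey(), so that the key is built from the same settings when the plot is drawn. The data frame ffg_demo is a hypothetical stand-in for ffg.st; only the column names follow the thread.

```r
library(lattice)

# Hypothetical stand-in for ffg.st; only the column names follow the thread.
ffg_demo <- data.frame(
  sampdate      = rep(as.Date(c("2006-06-27", "2007-06-27")), each = 3),
  func_feed_grp = factor(rep(c("Filterer", "Gatherer", "Grazer"), times = 2)),
  pct.quant     = c(0.14, 0.32, 0.11, 0.20, 0.25, 0.15)
)
n_grp <- nlevels(ffg_demo$func_feed_grp)

# superpose.symbol controls the grouped points; auto.key asks lattice to
# build the legend itself rather than from a pre-computed simpleKey().
p <- xyplot(pct.quant ~ sampdate, data = ffg_demo, groups = func_feed_grp,
            type = "p",
            par.settings = list(superpose.symbol =
                                  list(pch = 19, col = rainbow(n_grp))),
            auto.key = list(space = "right", points = TRUE, lines = FALSE),
            xlab = "Year", ylab = "Proportion of Individuals")
print(p)
```

If the legend still shows open circles, the direct modification of the trellis object shown above remains the fallback.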


Rich

On Thu, Oct 17, 2013 at 12:57 PM, Rich Shepard rshep...@appl-ecosys.com wrote:
 On Thu, 17 Oct 2013, Richard M. Heiberger wrote:

 That should have worked.


   That's what I thought when I first tried it.


 I think something else is interfering. Did you redefine either T or F?


   Not intentionally.


 Please send the output from dput(head(ffg.st)) so we can experiment in
 your setting.


 structure(list(sampdate = structure(c(13326, 13326, 13326, 13326, 13326,
 13326), class = "Date"), func_feed_grp = structure(c(1L, 2L, 3L, 4L, 6L,
 7L), .Label = c("Filterer", "Gatherer", "Grazer", "Omnivore", "Parasite",
 "Predator", "Shredder"), class = "factor"),
 quant = c(812L, 1880L, 624L, 11L, 948L, 1540L), pct.quant = c(0.14,
 0.323, 0.107, 0.002, 0.163, 0.265), num.taxa = c(11L, 28L,
 4L, 1L, 12L, 3L), pct.num.taxa = c(0.186, 0.475, 0.068, 0.017,
 0.203, 0.051)), .Names = c("sampdate", "func_feed_grp", "quant",
 "pct.quant", "num.taxa", "pct.num.taxa"), row.names = 102:107, class =
 "data.frame")


 Rich

 --
 Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
 Applied Ecosystem Services, Inc.   |
 http://www.appl-ecosys.com Voice: 503-667-4517  Fax: 503-667-8863



[R] S4 base class

2013-10-17 Thread Michael Meyer
@Martin Morgan, Duncan Murdoch:

OK Thanks.
I did not understand the callNextMethod.
I will investigate this in detail.
This is great!

Thanks again,

 
Michael Meyer



Re: [R] Subseting a data.frame

2013-10-17 Thread arun
You may try:
mydat[with(mydat,ave(seq_along(basel_asset_class),basel_asset_class,FUN=length)>2),]
#  basel_asset_class defa_frequency
#2                 8          0.070
#3                 8          0.030
#4                 8          0.001


#or
library(plyr)
mydat[ddply(mydat,.(basel_asset_class),mutate,L=length(defa_frequency))[,3] > 2,] #assuming it is sorted.
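
A base-R variant of the same idea, counting the records per class with table() and keeping the classes that reach a threshold, might look like the sketch below. The names min_n, n_per_class and keep are introduced here for illustration, and min_n = 2 is an assumption taken from the "at least two values of X" requirement in the question; change it to 3 if "more than two" is what is wanted.

```r
mydat <- data.frame(basel_asset_class = c(4, 8, 8, 8),
                    defa_frequency = c(0.15, 0.07, 0.03, 0.001))

min_n <- 2  # assumed threshold: minimum number of records per class

# Number of records for each class ID; the names of the table are the IDs.
n_per_class <- table(mydat$basel_asset_class)

# IDs of the classes that have enough records.
keep <- names(n_per_class)[n_per_class >= min_n]

# Compare as character because names() of a table are character strings.
mydat_a <- mydat[as.character(mydat$basel_asset_class) %in% keep, ]
mydat_a
```

Unlike the ddply() version, this does not depend on the rows being sorted by class.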

A.K.






Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Rich Shepard

On Thu, 17 Oct 2013, Richard M. Heiberger wrote:


I always get lost in simpleKey.


  As this is my first use of it I take what's offered by those more
experienced than I.


The approach of directly modifying the trellis object usually works.
> tmp <- xyplot(pct.quant ~ sampdate, data = ffg.st, groups = func_feed_grp, type =
+ 'p', pch = 19, key = simpleKey(text = levels(ffg.st$func_feed_grp), space =
+ 'right', points = T, lines = F),par.settings = list(superpose.points =
+ list(col = rainbow(7)), superpose.lines = list(col = rainbow(7))), main =
+ 'Functional Feeding Groups (Individuals)', xlab = 'Year',
+ ylab = 'Proportion of Individuals')
> tmp
> str(tmp)
> tmp$legend$right$args$key$points$pch
[1] 1 1 1 1 1 1 1
> tmp$legend$right$args$key$points$pch[] <- 19
> tmp$legend$right$args$key$points$pch
[1] 19 19 19 19 19 19 19


  OK. More steps but it will get the plots where they need to be.

Many thanks,

Rich

--
Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
Applied Ecosystem Services, Inc.   |
http://www.appl-ecosys.com Voice: 503-667-4517  Fax: 503-667-8863



[R] Newb: How I find random vector index?

2013-10-17 Thread Stock Beaver
# Suppose I have a vector:

myvec = c(1,0,3,0,77,9,0,1,2,0)

# I want to randomly pick an element from myvec
# where element == 0
# and print the value of the corresponding index.

# So, for example I might randomly pick the 3rd 0
# and I would print the corresponding index
# which is 7,

# My initial approach is to use a for-loop.
# Also I take a short-cut which assumes myvec is short:

elm = 1
while (elm != 0) {
  # Pick a random index, (it might be a 0):
  rndidx = round(runif(1, min=1, max=length(myvec)))
  elm = myvec[rndidx]
  if(elm == 0)
    print("I am done")
  else
    print("I am not done")
}
print(rndidx)

# If myvec is large and/or contains no zeros,
# The above loop is sub-optimal/faulty.

# I suspect that skilled R-people would approach this task differently.
# Perhaps they would use features baked into R rather than use a loop?


[R] Goodness of fit for a Rietveld Refinement

2013-10-17 Thread rflood13
Hi folks,

Wondering if anyone might be able to help me on this one. I have just done
some geochemistry with X-ray Diffraction and Rietveld Refinement in order to
quantify the data. I have an observed spectrum from my sample and a calculated
spectrum from the Rietveld Refinement (in the attached image, along with the
background). I was wondering whether there is a package in R that I might be
able to use that would essentially show me how well (or how poorly) the
Rietveld calculated spectrum fits my observed spectrum. It's
essentially a goodness of fit or R-squared value, but I've been having some
difficulty finding the right way to assess the model fit. I'd appreciate any
information or tips anyone might have.

Kind regards,

Rory Flood.

--
Rory Flood
Postgraduate Research Student
Room 02 044, Elmwood Building
School of Geography, Archaeology and Palaeoecology
Queen's University Belfast
Belfast BT7 1NN
Co. Antrim
Northern Ireland

Tel: +44 (0) 28 9097 3929
Email: rfloo...@qub.ac.uk
__

http://r.789695.n4.nabble.com/file/n4678470/Spectra.jpg 





Re: [R] Newb: How I find random vector index?

2013-10-17 Thread Sarah Goslee
Not only does it not require a loop, this is a one-liner:

> myvec <- c(1,0,3,0,77,9,0,1,2,0)
> sample(which(myvec == 0), 1)
[1] 4
> sample(which(myvec == 0), 1)
[1] 7
> sample(which(myvec == 0), 1)
[1] 2

If there's a possibility of not having zeros then you'll need to check
that separately, otherwise sample() will throw an error. For instance:

if(any(myvec == 0)) {
  sample(which(myvec == 0), 1)
}

which() will

Sarah




-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] Newb: How I find random vector index?

2013-10-17 Thread Sarah Goslee
Typo fix below:

On Thu, Oct 17, 2013 at 3:05 PM, Sarah Goslee sarah.gos...@gmail.com wrote:
 Not only does it not require a loop, this is a one-liner:

 > myvec <- c(1,0,3,0,77,9,0,1,2,0)
 > sample(which(myvec == 0), 1)
 [1] 4
 > sample(which(myvec == 0), 1)
 [1] 7
 > sample(which(myvec == 0), 1)
 [1] 2

 If there's a possibility of not having zeros then you'll need to check
 that separately, otherwise sample() will throw an error. For instance:

 if(any(myvec == 0)) {
   sample(which(myvec == 0), 1)
 }

 which() will
  ^  just delete this.


 Sarah





-- 
Sarah Goslee



Re: [R] Weighted regression markers on scatter plots

2013-10-17 Thread David Winsemius

On Oct 16, 2013, at 10:04 AM, Msugarman wrote:

 Hi all,
 
 I'm trying to graph the results of a weighted regression analysis. Is anyone
 aware of a way to make my markers appear a different sizes to be consistent
 with their respective weights?

You have not provided any data or code. If you are using base graphics, then 
`plot.default` accepts a vector for `cex`.
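
Since no data were posted, here is a minimal sketch with simulated data. The variables x, y and w and the scaling rule for cex_w are assumptions made for illustration, not anything from the original question.

```r
set.seed(42)
n <- 30
x <- runif(n)
w <- runif(n, min = 0.5, max = 3)        # hypothetical regression weights
y <- 1 + 2 * x + rnorm(n, sd = 0.3)

fit <- lm(y ~ x, weights = w)

# Scale marker size by sqrt(weight) so that marker area grows roughly in
# proportion to the weight; the largest weight gets cex = 2.
cex_w <- 2 * sqrt(w / max(w))

plot(x, y, cex = cex_w, pch = 19,
     main = "Weighted regression, marker size by weight")
abline(fit)
```

The square root is a design choice: cex scales the marker's linear size, so using the weights directly would exaggerate the visual difference between points.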


 
 Thanks,
 -Mike Sugarman
 Wayne State University
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Weighted-regression-markers-on-scatter-plots-tp4678370.html
 Sent from the R help mailing list archive at Nabble.com.

David Winsemius
Alameda, CA, USA



Re: [R] Newb: How I find random vector index?

2013-10-17 Thread Brian Diggs
On 10/17/2013 11:54 AM, Stock Beaver wrote:
 # Suppose I have a vector:

 myvec = c(1,0,3,0,77,9,0,1,2,0)

 # I want to randomly pick an element from myvec
 # where element == 0
 # and print the value of the corresponding index.

 # So, for example I might randomly pick the 3rd 0
 # and I would print the corresponding index
 # which is 7,

 # My initial approach is to use a for-loop.
 # Also I take a short-cut which assumes myvec is short:

 elm = 1
 while (elm != 0) {
   # Pick a random index, (it might be a 0):
   rndidx = round(runif(1, min=1, max=length(myvec)))
   elm = myvec[rndidx]
   if(elm == 0)
     print("I am done")
   else
     print("I am not done")
 }
 print(rndidx)

It's a little easier if you re-arrange your problem statement. This is 
equivalent: return randomly one index of myvec for which the element of 
myvec equals 0. A direct implementation of this is

sample(which(myvec==0), 1)

which(myvec==0) returns a vector of indexes of myvec for which the value 
of the vector is 0. sample(..., 1) randomly selects one of those.

 # If myvec is large and/or contains no zeros,
 # The above loop is sub-optimal/faulty.

This approach also fails if there are no 0's in the vector. What do you 
want the result to be when that is the case? If we go with the simple 
answer of NA, then you can special-case that (and wrap it up into a 
function):

OneZeroIndex <- function(myvec) {
   zeros <- which(myvec==0)
   if (length(zeros) > 0) {
     sample(zeros, 1)
   } else {
     NA
   }
}
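
One further edge case is worth guarding against: when the first argument of sample() is a single number n >= 1, sample() draws from 1:n rather than returning n itself (this is documented in ?sample). So if the vector contains exactly one zero, say at position 7, sample(7, 1) can return any index from 1 to 7. A hedged variant that sidesteps this by sampling a position within the vector of zero indexes is sketched below; the function name pick_zero_index is made up for this example.

```r
pick_zero_index <- function(x) {
  zeros <- which(x == 0)            # indexes of the elements equal to 0
  if (length(zeros) == 0L) {
    return(NA_integer_)             # no zeros: return NA, as discussed above
  }
  # Sample a position in 'zeros' rather than sampling 'zeros' directly, so a
  # single zero at index n is not mistaken for "sample from 1:n".
  zeros[sample.int(length(zeros), 1L)]
}

myvec <- c(1, 0, 3, 0, 77, 9, 0, 1, 2, 0)
pick_zero_index(myvec)                      # one of 2, 4, 7, 10
pick_zero_index(c(5, 5, 5, 5, 5, 5, 0))     # the only zero is at index 7
pick_zero_index(c(1, 2, 3))                 # NA
```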

 # I suspect that skilled R-people would approach this task differently.
 # Perhaps they would use features baked into R rather than use a loop?
   [[alternative HTML version deleted]]
Please post plain text only.

-- 
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health  Science University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Selecting maximums between different variables

2013-10-17 Thread arun
Hi,

You could also check ?data.table() as it could be faster.

#Speed comparison


set.seed(498) 
oilT <- data.frame(YEAR=rep(rep(1800:2012,50),100),state=rep(rep(state.abb,each=213),100),value=sample(2000:8,1065000,replace=TRUE),stringsAsFactors=FALSE)
system.time(res1 <- oilT[as.logical(with(oilT,ave(value,list(YEAR),FUN=function(x) x%in% max(x)))),])
# user  system elapsed 
#  0.532   0.008   0.540 
 dim(res1) #as some years have duplicated maximums
#[1] 220   3

 res1[duplicated(res1[,1])|duplicated(res1[,1],fromLast=TRUE),]

library(data.table)
dt1 <- data.table(oilT,key='YEAR')
system.time( res2 <- dt1[dt1[,value %in% max(value),'YEAR']$V1])
#   user  system elapsed 
#  0.060   0.000   0.062 
 res1 <- res1[order(res1$YEAR),]
 row.names(res1) <- 1:nrow(res1)
 identical(res1,as.data.frame(res2))
#[1] TRUE


A.K.



On Thursday, October 17, 2013 1:35 PM, arun smartpink...@yahoo.com wrote:
Hi,
You may try:


unlist(lapply(seq_len(nrow(oil)),function(i) oil[i,-1][which.max(oil[i,-1])])) 
 #  CA    ND 
#4 6 
#or
library(reshape2)

datM <- melt(oil,id.var="YEAR")


datM[as.logical(with(datM,ave(value,list(YEAR),FUN= function(x) x%in% max(x)))),]
#  YEAR variable value
#3 2011   CA 4
#8 2012   ND 6

A.K.




On Thursday, October 17, 2013 12:50 PM, Tim Umbach tim.umb...@hufw.de wrote:
Hi there,

another beginner's question, I'm afraid. Basically I want to select the
maximum of values that correspond to different variables. I have a table
of oil production that looks somewhat like this:

oil <- data.frame( YEAR = c(2011, 2012),
                   TX = c(2, 3),
                   CA = c(4, 25000),
                   AL = c(2, 21000),
                   ND = c(21000,6))

Now I want to find out which state produced the most oil in a given year. I
tried this:

attach(oil)
last_year = oil[ c(YEAR == 2012), ]
max(last_year)

Which works, but it doesn't give me the corresponding values (i.e. it just
gives me the maximum output, not what state it's from).
So I tried this:

oil[c(oil == max(last_year)),]
and this:
oil[c(last_year == max(last_year)),]
and this:
oil[which.max(last_year),]
and this:
last_year[max(last_year),]

None of them work, but they don't give error messages either; the output is
just NA. The problem is, in my eyes, that I'm comparing the values of
different variables with each other. Because if I change the structure of
the dataframe (which I can't do with the real data, at least not without
doing it by hand with a huge dataset), it looks like this and works
perfectly:

oil2 <- data.frame (
  names = c('YEAR', 'TX', 'CA', 'AL', 'ND'),
  oil_2011 = c(2011, 2, 4, 2, 21000),
  oil_2012 = c(2012, 3, 25000, 21000, 6)
  )
attach(oil2)
oil2[c(oil_2012 == max(oil_2012)),]

Any help is much appreciated.

Thanks, Tim Umbach



Re: [R] Weighted regression markers on scatter plots

2013-10-17 Thread Jim Lemon

On 10/17/2013 04:04 AM, Msugarman wrote:

Hi all,

I'm trying to graph the results of a weighted regression analysis. Is anyone
aware of a way to make my markers appear a different sizes to be consistent
with their respective weights?


Hi Mike,
Have a look at the size_n_color function in the plotrix package.

Jim



Re: [R] Subseting a data.frame

2013-10-17 Thread Bert Gunter
May I ask why:

count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
basel_asset_class, FUN=length))

should not be more simply done as:

count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class,
FUN=length))

?

-- Bert


On Thu, Oct 17, 2013 at 12:36 PM, William Dunlap wdun...@tibco.com wrote:

  What I need is to select only those records for which there are more
 than two default
  frequencies (defa_frequency),

 Here is one way.  There are many others:
 > dat <- data.frame( # slightly less trivial example
 +    basel_asset_class=c(4,8,8,8,74,3,74),
 +    defa_frequency=(1:7)/8)
 > count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
 +    basel_asset_class, FUN=length))
 > cbind(dat, count_by_class) # see what we just computed
   basel_asset_class defa_frequency count_by_class
 1                 4          0.125              1
 2                 8          0.250              3
 3                 8          0.375              3
 4                 8          0.500              3
 5                74          0.625              2
 6                 3          0.750              1
 7                74          0.875              2
 > dat[count_by_class>1, ] # I think this is what you are asking for
   basel_asset_class defa_frequency
 2                 8          0.250
 3                 8          0.375
 4                 8          0.500
 5                74          0.625
 7                74          0.875

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com



Re: [R] Subseting a data.frame

2013-10-17 Thread William Dunlap
 What I need is to select only those records for which there are more than two 
 default
 frequencies (defa_frequency),

Here is one way.  There are many others:
> dat <- data.frame( # slightly less trivial example
+    basel_asset_class=c(4,8,8,8,74,3,74),
+    defa_frequency=(1:7)/8)
> count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
+    basel_asset_class, FUN=length))
> cbind(dat, count_by_class) # see what we just computed
  basel_asset_class defa_frequency count_by_class
1                 4          0.125              1
2                 8          0.250              3
3                 8          0.375              3
4                 8          0.500              3
5                74          0.625              2
6                 3          0.750              1
7                74          0.875              2
> dat[count_by_class>1, ] # I think this is what you are asking for
  basel_asset_class defa_frequency
2                 8          0.250
3                 8          0.375
4                 8          0.500
5                74          0.625
7                74          0.875

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



Re: [R] Subseting a data.frame

2013-10-17 Thread William Dunlap
> May I ask why:
>   count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
>       basel_asset_class, FUN=length))
> should not be more simply done as:
>   count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class,
>       FUN=length))

The way I did it would work if basel_asset_class were non-numeric.
In ave(x, group, FUN=FUN), FUN's return value should be the same type as x (or
you can get some odd type conversions).  E.g.,

   > num <- c(2,3,2,2) ;  char <- c("Two","Three","Two","Two")
   > ave(num, num, FUN=length) # good
   [1] 3 1 3 3
   > ave(char, char, FUN=length) # bad
   [1] "3" "1" "3" "3"
   > fac <- factor(char, levels=c("One","Two","Three"))
   > ave(fac, fac, FUN=length)
   [1] <NA> <NA> <NA> <NA>
   Levels: One Two Three
   Warning messages:
   1: In `[<-.factor`(`*tmp*`, i, value = 0L) :
     invalid factor level, NA generated
   2: In `[<-.factor`(`*tmp*`, i, value = 3L) :
     invalid factor level, NA generated
   3: In `[<-.factor`(`*tmp*`, i, value = 1L) :
     invalid factor level, NA generated
but x=integer(length(group)) works in all cases:
   > ave(integer(length(fac)), fac, FUN=length)
   [1] 3 1 3 3
   > ave(integer(length(char)), char, FUN=length)
   [1] 3 1 3 3
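
The coercion shown above follows from how the group-wise results are written back into x: whatever FUN returns is assigned into the original vector, so the type of x decides the type of the result. A minimal sketch of that mechanism, assuming a single grouping vector (`simple_ave` is a hypothetical helper written for illustration, not the source of ave()):

```r
# Hypothetical, simplified stand-in for ave() with one grouping vector.
# The values returned by FUN are assigned back into x via `split<-`,
# so the result keeps (or is coerced to) the type of x.
simple_ave <- function(x, g, FUN) {
  g <- factor(g)
  split(x, g) <- lapply(split(x, g), FUN)
  x
}

char <- c("Two", "Three", "Two", "Two")

# Counts are written into a character vector, so they become strings.
res_chr <- simple_ave(char, char, length)

# Counts are written into an integer placeholder, so they stay integer.
res_int <- simple_ave(integer(length(char)), char, length)

print(res_chr)
print(res_int)
```

Passing a fresh integer() or numeric() vector as x therefore sidesteps the question of what type the grouping variable has.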

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Thursday, October 17, 2013 1:06 PM
To: William Dunlap
Cc: Katherine Gobin; r-help@r-project.org
Subject: Re: [R] Subseting a data.frame

May I ask why:

count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
    basel_asset_class, FUN=length))

should not be more simply done as:

count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class,
    FUN=length))

?
-- Bert

On Thu, Oct 17, 2013 at 12:36 PM, William Dunlap wdun...@tibco.com wrote:
> What I need is to select only those records for which there are more than
> two default frequencies (defa_frequency),

Here is one way.  There are many others:
   > dat <- data.frame( # slightly less trivial example
         basel_asset_class=c(4,8,8,8,74,3,74),
         defa_frequency=(1:7)/8)
   > count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
         basel_asset_class, FUN=length))
   > cbind(dat, count_by_class) # see what we just computed
     basel_asset_class defa_frequency count_by_class
   1                 4          0.125              1
   2                 8          0.250              3
   3                 8          0.375              3
   4                 8          0.500              3
   5                74          0.625              2
   6                 3          0.750              1
   7                74          0.875              2
   > dat[count_by_class>1, ] # I think this is what you are asking for
     basel_asset_class defa_frequency
   2                 8          0.250
   3                 8          0.375
   4                 8          0.500
   5                74          0.625
   7                74          0.875
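
For reference, the whole selection can be written as one short script. This is only a sketch: it reuses the example data from this message, reads the requirement as "at least two records per class" (the minimum needed to estimate A and B), and adds a table()-based lookup as an assumed-equivalent alternative.

```r
# Example data from the message above.
dat <- data.frame(
  basel_asset_class = c(4, 8, 8, 8, 74, 3, 74),
  defa_frequency    = (1:7) / 8
)

# Number of records in each row's class; integer() is only a placeholder
# vector that receives the return values of length().
count_by_class <- ave(integer(nrow(dat)), dat$basel_asset_class, FUN = length)

# Keep the classes that have at least two records.
keep <- dat[count_by_class >= 2, ]

# Alternative: look the per-class counts up in a table().
n_per_class <- table(dat$basel_asset_class)
n_per_row   <- as.vector(n_per_class[as.character(dat$basel_asset_class)])
keep2 <- dat[n_per_row >= 2, ]

print(keep)
```

Neither variant hard-codes the class IDs, so it does not matter which values basel_asset_class takes.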

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



Re: [R] Subseting a data.frame

2013-10-17 Thread Bert Gunter
Thanks, Bill.

But ?ave specifically says:

ave(x, ..., FUN = mean)

Arguments:
x

A numeric.

So that it should not be expected to work properly if the argument is
not (coercible to) numeric. Nevertheless, defensive programming is
always wise.

Cheers,
Bert



Re: [R] Subseting a data.frame

2013-10-17 Thread arun
Hi Bill,

# seq_along() worked in the cases you showed.

ave(seq_along(fac), fac, FUN=length)
#[1] 3 1 3 3
ave(seq_along(num), num, FUN=length)
#[1] 3 1 3 3
ave(seq_along(char), char, FUN=length)
#[1] 3 1 3 3


I thought there might be some advantage in speed, but the timings were similar.

set.seed(195)
num1 <- sample(1e3, 1e7, replace=TRUE)
system.time(res1 <- ave(integer(length(num1)), num1, FUN=length))
#   user  system elapsed
#  4.148   0.228   4.382
system.time(res2 <- ave(seq_along(num1), num1, FUN=length))
#   user  system elapsed
#  3.944   0.228   4.181
system.time(res3 <- ave(num1, num1, FUN=length))
#   user  system elapsed
#  3.740   0.264   4.012
identical(res1, res2)
#[1] TRUE
identical(res2, res3)
#[1] TRUE


A.K. 





Re: [R] Subseting a data.frame

2013-10-17 Thread William Dunlap
seq_along(x), integer(length(x)), is.na(x), or anything that produces an integer
(or numeric or logical) vector the length of x would work.  I use integer() or 
numeric()
to indicate I'm not using its value: it is just a vector in which to place the
return values of FUN().
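
A small sketch of that point, using a factor like the one from the earlier example: any vector of the right length gives the same counts, and only the storage type of the result differs.

```r
fac <- factor(c("Two", "Three", "Two", "Two"),
              levels = c("One", "Two", "Three"))

# Four placeholder vectors with the same length as the grouping variable.
n_integer <- ave(integer(length(fac)), fac, FUN = length)
n_seq     <- ave(seq_along(fac),       fac, FUN = length)
n_logical <- ave(is.na(fac),           fac, FUN = length)
n_numeric <- ave(numeric(length(fac)), fac, FUN = length)

# All four hold the group sizes for each element of fac.
print(rbind(n_integer, n_seq, n_logical, n_numeric))

# The storage type of each result follows from the placeholder.
print(sapply(list(n_integer, n_seq, n_logical, n_numeric), typeof))
```

Using integer() or numeric() makes it obvious to a reader that the values of x are irrelevant and only its length is used.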

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



[R] speeding up a loop

2013-10-17 Thread Ye Lin
Hey R professionals,

I have a large dataset and I want to run a loop over it, creating a new
column that gathers information from another reference table.

When I run the code, R just freezes and does not respond even after 30 min,
which is really unusual. I tried sapply as well, but it does not improve
things at all.

I am running R 3.0.2 on Windows 7.  I checked the system: when I run the
code, my CPU usage is about 25%-30%, which is taxing my desktop.

Here is my code:

# df1 is the data set I want to add a new column to
# b is the reference table

for (i in (1:nrow(df1))) {
  begin=which(b$Time2==df1$start[i] & b$Date==df1$Date[i])
  date=unlist(strsplit(as.character(dff$end[i]), " "))[1]
  end=ifelse(date=="2013-10-17",
    which(b$Time2==df1$end[i] & b$Date==df1$Date[i]),
    which(b$Time2==df1$end[i]-3600*24 & b$Date==as.Date(df1$Date[i])+1))
  df1$new[i] <- sum(b[begin:end,]$Power)
}

And here is a mimic sample of df1 & b:

df1 <- structure(list(Date = structure(c(1369699200, 1369699200,
1369699200,
1369699200, 1369699200), tzone = "UTC", class = c("POSIXct",
"POSIXt")), start = structure(c(1381991205, 1381990247, 1382010454,
1382007281, 1381992288), tzone = "UTC", class = c("POSIXct",
"POSIXt")), end = structure(c(1381992405, 1381993727, 1382010694,
1382007461, 1381992468), tzone = "UTC", class = c("POSIXct",
"POSIXt"))), .Names = c("Date", "start", "end"), row.names = c(NA,
-5L), class = "data.frame")


b <- structure(list(Date = structure(c(1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200), tzone = "UTC",
class = c("POSIXct",
"POSIXt")), Time2 = structure(c(1381989634, 1381989694, 1381989754,
1381989814, 1381989874, 1381989934, 1381989994, 1381990054, 1381990114,
1381990174, 1381990234, 1381990294, 1381990354, 1381990414, 1381990474,
1381990534, 1381990594, 1381990654, 1381990714, 1381990774, 1381990834,
1381990894, 1381990954, 1381991014, 1381991074, 1381991134, 1381991194,
1381991254, 1381991314, 1381991374, 1381991434, 1381991494, 1381991554,
1381991614, 1381991674, 1381991734, 1381991794, 1381991854, 1381991914,
1381991974, 1381992034, 1381992094, 1381992154, 1381992214, 1381992274,
1381992334, 1381992394, 1381992454, 1381992514, 1381992574), tzone = "UTC",
class = c("POSIXct",
"POSIXt")), Power = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44,
45, 46, 47, 48, 49, 50)), .Names = c("Date", "Time2", "Power"
), row.names = c(NA, -50L), class = "data.frame")

Thanks for your help!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Incorporate Julia into R

2013-10-17 Thread Suzen, Mehmet
On 17 October 2013 15:38, Timo Schmid timo_sch...@hotmail.com wrote:
> I have some code in R with a lot of matrix multiplication and inverting. R
> can be very slow for larger matrices like 5000x5000.
> I have seen the new programming language Julia (www.julialang.org) which is
> quite fast in doing matrix algebra.

It's not Julia, but LAPACK.
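
R's own dense linear algebra (solve(), chol(), crossprod(), %*%) is also handled by BLAS/LAPACK routines, so how the computation is expressed and which BLAS R is linked against tend to matter more than the language. A small sketch of avoiding an explicit inverse; the matrix size and the data are arbitrary and made up for illustration:

```r
set.seed(1)
n <- 200                                  # small, arbitrary size for illustration
X <- matrix(rnorm(n * n), n, n)
A <- crossprod(X) + diag(n)               # symmetric positive definite matrix
b <- rnorm(n)

# Explicit inverse followed by a multiplication.
x_inv <- drop(solve(A) %*% b)

# Solving the linear system directly, without forming the inverse.
x_solve <- solve(A, b)

# Cholesky factorisation A = t(R) %*% R, then two triangular solves.
R <- chol(A)
x_chol <- backsolve(R, forwardsolve(t(R), b))

print(all.equal(x_inv, x_solve))
print(all.equal(x_solve, x_chol))
```

In recent versions of R, sessionInfo() reports which BLAS and LAPACK libraries are in use, which is worth checking before comparing timings across systems.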



Re: [R] Selecting maximums between different variables

2013-10-17 Thread Law, Jason
See ?pmax for getting the max for each year.

do.call('pmax', oil[-1])

Or equivalently:

pmax(oil$TX, oil$CA, oil$AL, oil$ND)

apply and which.max will give you the index:

i <- apply(oil[-1], 1, which.max)

which you can use to extract the state:

names(oil[-1])[i]
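
Put together, the steps above might look like the sketch below. The production figures are hypothetical (the numbers in the original post are not fully recoverable), and the max.col() variant is an addition that assumes ties should go to the first column.

```r
# Hypothetical production figures, made up for illustration.
oil <- data.frame(
  YEAR = c(2011, 2012),
  TX   = c(20000, 30000),
  CA   = c(40000, 25000),
  AL   = c(20000, 21000),
  ND   = c(21000, 60000)
)

states <- oil[-1]                          # drop the YEAR column

# Row-wise maximum across the state columns.
oil$max_output <- do.call(pmax, states)

# Name of the state that holds the maximum in each row.
oil$top_state <- names(states)[apply(states, 1, which.max)]

# max.col() returns the same column index without apply().
top_state2 <- names(states)[max.col(as.matrix(states), ties.method = "first")]

print(oil[c("YEAR", "max_output", "top_state")])
```

This works on the wide layout directly, so the data frame does not need to be restructured by hand.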

Jason

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Tim Umbach
Sent: Thursday, October 17, 2013 9:49 AM
To: r-help@r-project.org
Subject: [R] Selecting maximums between different variables

Hi there,

another beginner's question, I'm afraid. Basically I want to select the
maximum of values that correspond to different variables. I have a table of
oil production that looks somewhat like this:

oil <- data.frame( YEAR = c(2011, 2012),
   TX = c(2, 3),
   CA = c(4, 25000),
   AL = c(2, 21000),
   ND = c(21000,6))

Now I want to find out which state produced the most oil in a given year. I
tried this:

attach(oil)
last_year = oil[ c(YEAR == 2012), ]
max(last_year)

Which works, but it doesn't give me the corresponding values (i.e. it just
gives me the maximum output, not what state it's from).
So I tried this:

oil[c(oil == max(last_year)),]
and this:
oil[c(last_year == max(last_year)),]
and this:
oil[which.max(last_year),]
and this:
last_year[max(last_year),]

None of them work, but they don't give error messages either; the output is
just NA. The problem is, in my eyes, that I'm comparing the values of
different variables with each other. Because if I change the structure of the
data frame (which I can't do with the real data, at least not without doing it
by hand with a huge dataset), it looks like this and works perfectly:

oil2 <- data.frame (
  names = c('YEAR', 'TX', 'CA', 'AL', 'ND'),
  oil_2011 = c(2011, 2, 4, 2, 21000),
  oil_2012 = c(2012, 3, 25000, 21000, 6)
  )
attach(oil2)
oil2[c(oil_2012 == max(oil_2012)),]

Any help is much appreciated.

Thanks, Tim Umbach



[R] crr question in library(cmprsk)

2013-10-17 Thread Elan InP
Hi all

I do not understand why I am getting the following error message. Can
anybody help me with this? Thanks in advance.

install.packages("cmprsk")
library(cmprsk)
result1 <- crr(ftime, fstatus, cov1, failcode=1, cencode=0 )
one.pout1 = predict(result1,cov1,X=cbind(1,one.z1,one.z2))

predict.crr(result1,cov1,X=cbind(1,one.z1,one.z2))
Error: could not find function "predict.crr"





Re: [R] speeding up a loop

2013-10-17 Thread David Winsemius

On Oct 17, 2013, at 2:56 PM, Ye Lin wrote:

> Hey R professionals,
>
> I have a large dataset and I want to run a loop over it, creating a new
> column that gathers information from another reference table.
>
> When I run the code, R just freezes and does not respond even after 30 min,
> which is really unusual. I tried sapply as well, but it does not improve
> things at all.
>
> I am running R 3.0.2 on Windows 7.  I checked the system: when I run the
> code, my CPU usage is about 25%-30%, which is taxing my desktop.

A guess: It's not your CPU use ... it's your RAM use. You've probably exhausted
your RAM and your system has paged out to virtual memory.

> Here is my code:
>
> # df1 is the data set I want to add a new column to
> # b is the reference table
>
> for (i in (1:nrow(df1))) {
>   begin=which(b$Time2==df1$start[i] & b$Date==df1$Date[i])
>   date=unlist(strsplit(as.character(dff$end[i]), " "))[1]
>   end=ifelse(date=="2013-10-17",
>     which(b$Time2==df1$end[i] & b$Date==df1$Date[i]),
>     which(b$Time2==df1$end[i]-3600*24 & b$Date==as.Date(df1$Date[i])+1))
>   df1$new[i] <- sum(b[begin:end,]$Power)
> }

I get:
Error in strsplit(as.character(dff$end[i]), " ") : object 'dff' not found

If I change the dff to df1, I get:
Error in begin:end : argument of length 0
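
The length-0 error arises because the exact `==` comparison finds no row of b whose Time2 equals the start time (in the sample data the start times fall between the one-minute readings). A minimal sketch of one way around both the matching and the loop, under a loudly stated assumption: the goal is taken to be the sum of Power over all readings with start <= Time2 <= end, and the Date/midnight handling of the original loop is left out. The miniature data below are hypothetical.

```r
# Hypothetical miniature data: one Power reading per minute.
t0  <- as.POSIXct("2013-10-17 06:00:00", tz = "UTC")
b   <- data.frame(Time2 = t0 + 60 * (0:9), Power = 1:10)
df1 <- data.frame(start = t0 + c(30, 200), end = t0 + c(150, 1000))

b <- b[order(b$Time2), ]               # findInterval() needs sorted times
times <- as.numeric(b$Time2)
cum_power <- c(0, cumsum(b$Power))     # cum_power[k + 1] = sum of first k readings

# Number of readings strictly before each start, and at or before each end
# (the left.open argument needs R >= 3.3.0).
n_before_start <- findInterval(as.numeric(df1$start), times, left.open = TRUE)
n_upto_end     <- findInterval(as.numeric(df1$end), times)

# Window sums as differences of cumulative sums; no per-row search of b.
df1$new <- cum_power[n_upto_end + 1] - cum_power[n_before_start + 1]

# Cross-check against a plain loop over the same windows.
check <- sapply(seq_len(nrow(df1)), function(i)
  sum(b$Power[b$Time2 >= df1$start[i] & b$Time2 <= df1$end[i]]))

print(df1$new)
```

Each row of df1 then costs two binary searches instead of a full scan of b, which is where most of the time in the original loop is likely to go.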

-- 
David.

David Winsemius
Alameda, CA, USA
