Re: [R] Problem with new(er) R version's matrix package

2014-04-26 Thread Arne Henningsen
On 25 April 2014 20:15, David Winsemius dwinsem...@comcast.net wrote:

 On Apr 25, 2014, at 9:17 AM, Werner W. wrote:

 Dear Rs,

 I am re-executing some older code. It works in the ancient R 2.12.0, 
 which I still have on my PC, but it no longer works in the new version R 3.1.0 
 (while some other, newer code does not work with 2.12).

 The problem arises with the systemfit package, which uses the Matrix 
 package. In R 3.1.0 the following error is thrown:
 Error in as.matrix(solve(W, tol = solvetol)[1:ncol(xMat), 1:ncol(xMat)]) : 
 error in evaluating the argument 'x' in selecting a method for function 
 'as.matrix': Error in .solve.sparse.dgC(as(a, "dgCMatrix"), b = b, tol = 
 tol) : LU computationally singular: ratio of extreme entries in |diag(U)| = 
 7.012e-39

 However, I have no clue what I can do about this. Was there some change in 
 the defaults of the Matrix package? I couldn't find anything apparent in the 
 changelog. As the same code works in R 2.12.0, I suppose that the problem is 
 not my data.

 You have not told us what version of the Matrix package you were using.
 I would therefore suggest that you review the ChangeLog, which is linked
 from the CRAN page for pkg:Matrix, and go back 4 years or so, since R
 major versions change about once a year.

 http://cran.r-project.org/web/packages/Matrix/ChangeLog

In addition, please provide a minimal, self-contained, reproducible example.

Best,
Arne

-- 
Arne Henningsen
http://www.arne-henningsen.name

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Within ID variable delete all rows after reaching a specific value

2014-04-26 Thread Jeff Newmiller

Jennifer:

a) Don't post in HTML... read the Posting Guide.

b) Don't make data frames by first making matrices... you rarely create 
what you think you are creating. In your case, your code creates a bunch 
of factor columns... use the str() function to verify that your data are 
sensible before analyzing them.


c) The ave and cumsum functions are useful here:

tmp <- data.frame( X1 = rbinom( 1000, 1, .03 )
                 , X2 = array( 1:127, c(1000,1) )
                 , X3 = array( format( seq( ISOdate(1990,1,1)
                                          , by='month'
                                          , length=56 )
                                     , format='%d.%m.%Y')
                             , c( 1000, 1 ) ) )
tmp <- tmp[ with( tmp, order( X2, X3 ) ), ]
tmp2 <- subset( tmp
              , 1 >= ave( X1
                        , X2
                        , FUN=function( x ) {
                            cumsum( cumsum( x ) )
                          } ) )

which generates a vector of increasing values once the first nonzero value 
is found in each group, and then only keeps the rows for which those 
increasing values are zero or one.
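
To make the double cumsum concrete, here is a minimal sketch on a toy data frame. The names visits, id, status, flag and cleaned are made up for illustration, and the rows are assumed to be already sorted by visit date within each id.

```r
# Toy data: two hypothetical patients, rows already in visit order
visits <- data.frame(
  id     = c(1, 1, 1, 1, 2, 2, 2),
  status = c(0, 1, 0, 1, 0, 0, 0)
)

# cumsum(cumsum(x)) is 0 before the first 1, exactly 1 at the first 1,
# and at least 2 on every later row of the same patient
flag <- ave(visits$status, visits$id, FUN = function(x) cumsum(cumsum(x)))

# Keep the rows up to and including the first 1 within each id
cleaned <- visits[flag <= 1, ]
print(cleaned)
```

Patient 1 loses the two rows recorded after the first status of 1, while patient 2, who never reaches 1, keeps all rows.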


On Sat, 26 Apr 2014, Jim Lemon wrote:


On 04/26/2014 12:42 PM, Jennifer Sabatier wrote:

So, I know that's a confusing Subject header.

Here's similar data:


tmp <- data.frame(matrix(
 c(rbinom(1000, 1, .03),
   array(1:127, c(1000,1)),
   array(format(seq(ISOdate(1990,1,1), by='month',
length=56), format='%d.%m.%Y'), c(1000,1))),
 ncol=3))
tmp <- tmp[with(tmp, order(X2, X3)), ]
table(tmp$X1)


X1 is the variable of interest - disease status.  It's a survival-type of
variable, where you are 0 until you become 1.
X2 is the person ID variable.
X3 is the clinic date (here it's monthly, just for example...but in my real
data it's a bit more complicated - definitely not equally spaced nor the
same number of visits to the clinic per ID.).

Some people stay X1 = 0 for all clinic visits.  Only a small proportion
become X1=1.

However, the data has errors I need to clean off.  Once someone becomes
X1=1 they should have no more rows in the dataset.  These are data entry
errors.

In my data I have people who continue to have rows in the data.  Sometimes
the rows show X1=0 and sometimes X1=1.  Sometimes there's just one more row
and sometimes there are many more rows.

How can I go through, find the first X1 = 1, and then delete any rows after
that, for each value of X2?

Thanks!

Jen


Hi Jen,
This might do what you want:

tmp$X3<-as.Date(tmp$X3,"%d.%m.%Y")
tmp<-tmp[order(tmp$X2,tmp$X3),]
first<-TRUE
for(patno in unique(tmp$X2)) {
 cat(patno,"\n")
 tmpbit<-tmp[tmp$X2 == patno,]
 firstone<-which(tmpbit$X1 == 1)[1]
 cat(firstone,"\n")
 if(is.na(firstone)) firstone<-dim(tmpbit)[1]
 newtmpbit<-tmpbit[1:firstone,]
 if(first) {
  newtmp<-newtmpbit
  first<-FALSE
 }
 else newtmp<-rbind(newtmp,newtmpbit)
}

Jim




---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)



Re: [R] Within ID variable delete all rows after reaching a specific value

2014-04-26 Thread arun


Hi,

You may also try:
set.seed(425)

##your code
tmp <- data.frame(

#

tmp1 <- tmp
str(tmp1)
#'data.frame':    1000 obs. of  3 variables:
# $ X1: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
# $ X2: Factor w/ 127 levels "1","10","100",..: 1 1 1 1 1 1 1 1 2 2 ...
# $ X3: Factor w/ 56 levels "01.01.1990","01.01.1991",..: 1 21 17 37 33 51 48 10 11 45 ...


tmp1 <- tmp1[with(tmp1,order(X2, as.Date(X3, "%d.%m.%Y"))),]
tmp2 <- tmp1[with(tmp1,!ave(as.numeric(as.character(X1)),X2, FUN=function(x)
cumsum(cumsum(x)) > 1 )),]

###checking results with Jim's method
tmp2New <- tmp2
tmp2New$X3 <- as.Date(tmp2New$X3, "%d.%m.%Y")
identical(tmp2New,newtmp) ##Jim's result
#[1] TRUE

A.K.






Re: [R] about rect.hclust

2014-04-26 Thread Gregory Jefferis

On 26 Apr 2014, at 06:41, Gregory Jefferis jeffe...@gmail.com wrote:

 # plot dendrogram coloured by cluster:
 ##
 library(dendroextras)
 cluc=colour_clusters(clu, k=3, groupLabels =TRUE)
 plot(cluc)

There was a typo in an argument name for colour_clusters in my earlier response 
(thanks to Dennis Murphy for pointing this out). Correction in above.

Fancier versions are also possible:

plot(colour_clusters(clu, k=3, groupLabels = as.roman))

Best,

Greg.


Re: [R] Simple Loop Counter

2014-04-26 Thread arun
HI,
May be this helps:
for(i in letters) {
  n <- n+1
  x[n,] <- c(i, n)
  cat("loop", n, "\n")
}
x
#or
n <- 0 #reset the counter before running this alternative
for(i in seq_along(letters)) {
  n <- n+1
  x[n,] <- c(letters[i], n)
  cat("loop", i, "\n")
}
x

A.K.


#How can I add the loop counter in #2 to the loop in #1.?
#***
#1.
x <- matrix( , ncol = 2, nrow = 26) # empty matrix
n <- 0 #set n to 0
for(i in letters) {
  n <- n+1
  x[n,] <- c(i, n)
}
x
#***
#2.
for (i in 1:10) {
  cat("loop", i, "\n")
}



Re: [R] about rect.hclust

2014-04-26 Thread Tal Galili
Thank you for mentioning the dendextend package
(http://cran.r-project.org/web/packages/dendextend/), Gregory.

As I mention in the package, dendextend implements a function similar
to the one in dendroextras. The big difference between the two is that
dendextend has an implementation of the cutree function for
dendrograms (so there is no need to go through an hclust object
in order to cut the tree). This is helpful when using non-standard
dendrograms that cannot be turned into an hclust object (via
as.hclust.dendrogram).
Also, the dendextend package includes many other functions for dendrogram
manipulation which might be helpful.

Here is a quick example that can give you a sense of the possibilities:


install.packages("dendextend")
require(dendextend)

# pdf("dendextend_example.pdf")
dend <- as.dendrogram(hclust(dist(USArrests), "ave"))
d1=color_branches(dend, k=5, col = c(3,1,1,4,1))
plot(d1) # selective coloring of branches :)
d2=color_branches(d1,k=5)
plot(d2) # getting rainbow_hcl colors for each cluster (the default)
d3=color_branches(d1,5,groupLabels=TRUE) # adding numbers to the branches
plot(d3)
d4=color_labels(d3, k=5, col = c(3,1,1,4,1)) # color labels by cluster
plot(d4)
d5=color_labels(d4, col = c("red", "orange")) # color labels from left to right
plot(d5)
# dev.off()

Here is the output:
https://www.dropbox.com/s/vii83jd9xnbmuv7/dendextend_example.pdf

Also, there is a rough draft of a vignette for the package here:
https://github.com/talgalili/dendextend/blob/master/vignettes/dendextend-tutorial.pdf?raw=true

Feedback is welcome.

Best,
Tal









Contact Details:
Contact me: tal.gal...@gmail.com |
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)





Re: [R] purtest and missing values (plm-package)

2014-04-26 Thread Katharina Mersmann
Hey arun,
I am thankful for your help and assistance! With the example it works! 

Have a nice day,
Katie

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Friday, 25 April 2014 18:01
To: r-help@r-project.org
Cc: Katharina Mersmann
Subject: Re: [R] purtest and missing values (plm-package)



Hi,

May be this helps:
library(plm)
data(Grunfeld)
Grunfeld$inv[c(2,8)] <- NA
GrunfeldNew <- subset(Grunfeld, !year %in% year[is.na(inv)])

x <- data.frame(split(GrunfeldNew$inv, GrunfeldNew$firm))
purtest(x, pmax = 4, exo = "none", test = "levinlin", lags = "SIC")
#Levin-Lin-Chu Unit-Root Test (ex. var. : None )
#
#data:  x
#z.x1 = 5.0507, p-value = 4.402e-07
#alternative hypothesis: stationarity
#
#Warning message:
#In selectT(l, theTs) : the time serie is short
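
The subset() step above drops every year in which any firm has a missing value, so the panel stays balanced. A minimal sketch of that idea in base R, on a made-up panel (the names panel, bad_years, balanced and wide are illustrative only, and plm is not needed to run it):

```r
# Hypothetical balanced panel: 3 firms observed in 4 years
panel <- data.frame(
  firm = rep(1:3, each = 4),
  year = rep(2001:2004, times = 3),
  inv  = c(10, 12, 11, 13, 20, 22, 21, 23, 30, 32, 31, 33)
)
panel$inv[2] <- NA  # firm 1 is missing its 2002 value

# Drop every year in which at least one firm has a missing value
bad_years <- unique(panel$year[is.na(panel$inv)])
balanced  <- panel[!panel$year %in% bad_years, ]

# One column per firm, the layout passed to purtest() above; no NA remains
wide <- data.frame(split(balanced$inv, balanced$firm))
print(wide)
```

Whether discarding whole periods is acceptable depends on how many periods are affected; with missing values scattered across many quarters this may remove too much data.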


A.K.


On Friday, April 25, 2014 10:48 AM, Katharina Mersmann 
kmers...@smail.uni-koeln.de wrote:
Because my inbox shows a truncated version of my email, here it is once again:

Hello R-Community,

I have quarterly panel data and, not surprisingly, missing values in a few 
variables, which are distributed differently across the panel.
Now I want to run different unit-root tests.

For the ADF test on the pooled data set, I chose the CADF package, which can determine the 
number of lags automatically by SIC and handles missing values. (I hope this is the 
right one.)

Secondly, I want to run an LLC and an IPS test specifically for the panel data 
(the Levin, Lin & Chu test and the Im, Pesaran & Shin test), for which I use the 
purtest function from the plm package.
But I don't know how to apply it if my series contain missing values (only 
an error message is produced):

 purtest(data.plm, data = data.plm, test = "levinlin", exo = "none", lags = "SIC")
Error in lm.fit(X, y) : NA/NaN/Inf in 'x'
In addition: Warning message:
In Ops.factor(object[2:length(object)], object[1:(length(object) -  :
  '-' not meaningful for factors

So my questions are:
1. Is there a way to handle the missing values? 
2. Do I just omit them? And if so, is there a way to integrate this by adding 
an "na.omit" into the function?


To make it easier to explain how to proceed, a reproducible example 
could be:

data(Grunfeld, package = "plm")
y <- data.frame(split(Grunfeld$inv, Grunfeld$firm))
purtest(y, pmax = 4, exo = "none", test = "levinlin", lags = "SIC") # works, no missing data

# add an NA

data(Grunfeld, package = "plm")
Grunfeld$inv[2] <- NA
x <- data.frame(split(Grunfeld$inv, Grunfeld$firm))
purtest(x, pmax = 4, exo = "none", test = "levinlin", lags = "SIC")
# Error in lm.fit(X, x) : NA/NaN/Inf in 'x'

Thanks for your hints and suggestions! 
Katie




Re: [R] Metafor: How to integrate effectsizes?

2014-04-26 Thread Michael Dewey

At 20:34 25/04/2014, Viechtbauer Wolfgang (STAT) wrote:
If you know the d-value and the corresponding 
group sizes for a study, then it's possible to 
add that study to the rest of the dataset. Also, 
if you only know the test statistic from an 
independent samples t-test (or only the p-value 
corresponding to that test), it's possible to 
back-compute what the standardized mean difference is.


I added an illustration of this to the metafor package website:

http://www.metafor-project.org/doku.php/tips:assembling_data_smd


Verena might also like to look at the compute.es 
package available from CRAN to see whether any of 
the conversions programmed there do the job.
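
As a rough sketch of the back-computation Wolfgang describes, assuming an independent-samples t-test with known group sizes: the numbers t_stat, n1 and n2 below are invented, and the variance formula is the usual large-sample approximation, not necessarily the exact one used on the metafor page.

```r
# Hypothetical study: only t and the two group sizes are reported
t_stat <- 2.5
n1 <- 20
n2 <- 20

# Standardized mean difference implied by an independent-samples t statistic
d <- t_stat * sqrt(1 / n1 + 1 / n2)

# Usual large-sample approximation to the sampling variance of d
v <- 1 / n1 + 1 / n2 + d^2 / (2 * (n1 + n2))

print(round(c(d = d, v = v, se = sqrt(v)), 4))
```

The pair (d, v) is what a study contributes to the dataset next to the effect sizes computed by escalc; without the group sizes neither quantity can be recovered.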




Best,
Wolfgang

--
Wolfgang Viechtbauer, Ph.D., Statistician
Department of Psychiatry and Psychology
School for Mental Health and Neuroscience
Faculty of Health, Medicine, and Life Sciences
Maastricht University, P.O. Box 616 (VIJV1)
6200 MD Maastricht, The Netherlands
+31 (43) 388-4170 | http://www.wvbauer.com

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Michael Dewey
 Sent: Friday, April 25, 2014 16:23
 To: Verena Weinbir
 Cc: r-help@r-project.org
 Subject: Re: [R] Metafor: How to integrate effectsizes?

 At 12:33 25/04/2014, you wrote:
 Thank you very much for your reply and the book recommendation, Michael.
 
 Yes, I mean Cohen's d - sorry for the typo :-)
 
 Just to make this sure for me: There is no
 possibility to integrate stated Cohens' ds in an
 R-Metaanalysis (or a MA at all), if there is no
 further information traceable regarding SE or the like?

 If there is really no other information like
 sample sizes, significance level, value of some
 significance test then you would have to impute a
 value from somewhere. That would seem a last resort.

 I have cc'ed this back to the list, please keep
 it on the list so others may benefit and contribute.


 best regards,
 
 Verena
 
 
 On Fri, Apr 25, 2014 at 1:21 PM, Michael Dewey
 i...@aghmed.fsnet.co.uk wrote:
 At 13:15 24/04/2014, Verena Weinbir wrote:
 Hello!
 
 I am using the metafor package for my master's thesis as an R-newbie.
 While calculating effectsizes from my dataset (mean values and
 standarddeviations) using escalc shouldn't be a problem (I hope ;-)),
 I wonder how I could at this point integrate additional studies, which
 only state conhens d (no information about mean value and sds available), to
 calculate an overall analysis. I would be very grateful for your
 support!
 
 
 You mean Cohen's d I think.
 
 You will need some more information to enable
 you to calculate its standard error. Have a look at Rosenthal's chapter
 in
 @book{cooper94,
    author = {Cooper, H and Hedges, L V},
    title = {A handbook of research synthesis},
    year = {1994},
    publisher = {Russell Sage},
    address = {New York},
    keywords = {meta-analysis}
 }
 (There is an updated edition)
 This gives you more information about converting
 effect sizes and extracting them from unpromising beginnings.
 
 It often requires some ingenuity to get the
 information you need so have a go and then get
 back here with more details if you run into problems
 
 
 Best regards,
 
 Verena


Michael Dewey
i...@aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] Cramer Rao upper/lower bounds- No Comments ?

2014-04-26 Thread Rui Barradas

Hello,

Is this homework? There's a no homework policy here.

On 25-04-2014 12:40, Mohammed Ouassou wrote:

Dear R users;

I have a question about Cramer Rao upper/lower bounds


Cramer Rao _lower_ bound, not upper.


Is it possible to compute Cramer Rao upper/lower bounds from residuals
and the corresponding covariance matrices?


Residuals of what?

Hope this helps,

Rui Barradas



Any suggestions will be appreciated, thanks in advance.


M.O



Re: [R] Read.table mucks up headers

2014-04-26 Thread starter
Hello Uwe & others,

Thanks for all your help. I figured out what the problem was: it wasn't
working with my old R version. Once I updated R to the latest version, it
seemed to work.

Thank you



--
View this message in context: 
http://r.789695.n4.nabble.com/Read-table-mucks-up-headers-tp4688742p4689519.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] [R-sig-DB] Update results not being written to existing data frame when using sqldf UPDATE

2014-04-26 Thread Gabor Grothendieck
On Sat, Apr 26, 2014 at 10:06 AM, Christopher Lowenkamp
clowenk...@gmail.com wrote:
 RStudio Version 0.98.501
 R version 3.1.0
 Mac OSX 10.9.2

 Packages loaded:
 sqldf
 gsubfn
 proto
 RSQLite
 DBI
 RSQLite.extfuns
 tcltk

 Good morning:

 I am trying to run an sqldf update with two tables.  Both tables contain a
 variable called ‘off_id’.  I am trying to update a variable in tablea (v2)
 with the number of times each record in tablea appears in tableb.

 ##

 tablea <- data.frame(off_id = c(12, 14, 16, 17, 18, 22, 1, 5, 7, 44, 4, 3),
 v2 = 0)

 tableb <- data.frame(off_id = c(12, 12, 14, 14, 14, 14, 16, 17, 12, 12, 1,
 18, 18, 5, 7, 3, 16, 1, 1, 3, 3, 3, 1))

 sql1 <- "UPDATE tablea SET v2 = (SELECT count(*) FROM tableb
 WHERE tableb.off_id = tablea.off_id)"

 sql2 <- "SELECT * FROM tablea"

 #The following code returns NULL

 sqldf(sql1, sql2)

 #When I run the following I do get back the data but tablea$v2 still does
 #not update

 sqldf(c(sql1, sql2), method = "raw")

 #If I run the following I get the expected results in tablec$v2, but
 #tablea$v2 does not update

 tablec <- as.data.frame(sqldf(c(sql1, sql2)))

 ##

 I am wondering what I am doing wrong.  Is there a way to get tablea$v2 to
 update?  I did check at https://code.google.com/p/sqldf/ (and have read
 through FAQ 8 a number of times) but don't see an answer to the problem I
 am having.



sqldf never modifies any object in your R workspace.  It did update the
table in the main sqlite database, but it's up to you if you want to
write it back to R.
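
A minimal sketch of what writing the result back could look like. The first part computes the same counts in base R, with no database involved. The second part runs only if sqldf is installed and assigns the value returned by sqldf() to an R object. The main.tablea qualifier reflects my reading of the sqldf FAQ and should be treated as an assumption, and the names counts, tablea_base and tablea_sql are made up for illustration.

```r
tablea <- data.frame(off_id = c(12, 14, 16, 17, 18, 22, 1, 5, 7, 44, 4, 3),
                     v2 = 0)
tableb <- data.frame(off_id = c(12, 12, 14, 14, 14, 14, 16, 17, 12, 12, 1,
                                18, 18, 5, 7, 3, 16, 1, 1, 3, 3, 3, 1))

# Base R: count how often each tablea$off_id occurs in tableb
counts <- table(factor(tableb$off_id, levels = tablea$off_id))
tablea_base <- tablea
tablea_base$v2 <- as.integer(counts)
print(tablea_base)

# sqldf: the UPDATE only changes the copy inside SQLite, so the result of the
# final SELECT has to be assigned to an R object explicitly
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  sql1 <- "UPDATE tablea SET v2 = (SELECT count(*) FROM tableb
           WHERE tableb.off_id = tablea.off_id)"
  tablea_sql <- sqldf(c(sql1, "SELECT * FROM main.tablea"))
}
```

Assigning the result to tablea itself, instead of to a new name, would overwrite the original data frame with the updated copy.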

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



[R] Faster way to transform vector [3 8 4 6 1 5] to [2 6 3 5 1 4]

2014-04-26 Thread xmliu1...@gmail.com
Hi,

could anybody help me to find a fast way to solve the following problem?

Given a vector of length N, for example bo = [3  8  4  6  1  5],
I want to derive a vector whose elements are 1, 2, ..., N and whose 
elements are in the same relative order as those in vector bo.
In this example, the result is supposed to be bt = [2  6  3  5  1  4].

I used the following code to solve this:

bo <- c(3,  8,  4,  6,  1,  5)
N <- length(bo)
bt <- rep(0, N)
M <- max(bo)
temp <- bo
for(i in 1 : N)
{
  min <- M
  i_min <- 0

  for(j in 1 : N)
  {
    if(min >= temp[j])
    {
      min <- temp[j]
      i_min <- j
    }
  }
  bt[i_min] <- i
  temp[i_min] <- M + 1
}
bt
[1] 2 6 3 5 1 4

However, the time complexity is O(N^2).
When N is larger than 100, it takes too much time.
Is there any faster way to do it?

best
Xueming



xmliu1...@gmail.com




[R] Update results not being written to existing data frame when using sqldf UPDATE

2014-04-26 Thread Christopher Lowenkamp
RStudio Version 0.98.501
R version 3.1.0
Mac OSX 10.9.2

Packages loaded:
sqldf
gsubfn
proto
RSQLite
DBI
RSQLite.extfuns
tcltk

Good morning:

I am trying to run an sqldf update with two tables.  Both tables contain a
variable called ‘off_id’.  I am trying to update a variable in tablea (v2)
with the number of times each record in tablea appears in tableb.

##

tablea <- data.frame(off_id = c(12, 14, 16, 17, 18, 22, 1, 5, 7, 44, 4, 3),
v2 = 0)

tableb <- data.frame(off_id = c(12, 12, 14, 14, 14, 14, 16, 17, 12, 12, 1,
18, 18, 5, 7, 3, 16, 1, 1, 3, 3, 3, 1))

sql1 <- "UPDATE tablea SET v2 = (SELECT count(*) FROM tableb
WHERE tableb.off_id = tablea.off_id)"

sql2 <- "SELECT * FROM tablea"

#The following code returns NULL

sqldf(sql1, sql2)

#When I run the following I do get back the data but tablea$v2 still does
#not update

sqldf(c(sql1, sql2), method = "raw")

#If I run the following I get the expected results in tablec$v2, but
#tablea$v2 does not update

tablec <- as.data.frame(sqldf(c(sql1, sql2)))

##

I am wondering what I am doing wrong.  Is there a way to get tablea$v2 to
update?  I did check at https://code.google.com/p/sqldf/ (and have read
through FAQ 8 a number of times) but don't see an answer to the problem I
am having.


Thanks for your time and consideration.

Christopher Lowenkamp
Administrative Office US Courts
University of Missiouri-Kansas City



Re: [R] Can't plot transparency images in R-3.1.0

2014-04-26 Thread Fong Chun Chan
It seems that something in my other library dependencies has broken
the plotting. I tried re-installing R-3.0.2 and I am now
getting the same error with that R version too.

I have been scouring the web for advice:

http://stackoverflow.com/questions/10777008/how-to-set-cairo-as-default-backend-for-x11-in-r
http://r.789695.n4.nabble.com/ggplot-2-semi-transparency-error-td902379.html

But these solutions seem to be about saving the plot to a file. I am
just trying to plot it in the R Console...
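
Since the warning comes from the graphics device and not from ggplot2, one way to narrow the problem down is to ask R what the build and a given device support. A minimal sketch, assuming a standard R installation; the names has_cairo and supports_alpha are made up for illustration, and a throwaway PDF device stands in for the on-screen device here.

```r
# Is cairo support compiled into this R build?
has_cairo <- unname(capabilities("cairo"))

# Open a throwaway PDF device (file = NULL writes nothing) and ask
# whether that device can draw semi-transparent colours
pdf(file = NULL)
supports_alpha <- dev.capabilities()$semiTransparency
invisible(dev.off())

print(c(cairo = has_cairo, pdf_alpha = supports_alpha))

# On X11 systems, if has_cairo is TRUE, a cairo-based screen device can be
# requested with x11(type = "cairo") before plotting.
```

Running dev.capabilities() right after opening the interactive device that shows the warning would report that device's own semiTransparency value.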



On Fri, Apr 25, 2014 at 7:06 PM, Ulises M. Alvarez u...@sophie.unam.mx wrote:

 On 04/25/2014 07:26 PM, Fong Chun Chan wrote:

 Hi all,

 I recently upgraded to R-3.1.0 from R-3.0.2. Things seem to be fine, but
 when trying to plot a ggplot image with transparency, I get the following
 issue:

 Warning message:
 In grid.Call.graphics(L_polygon, x$x, x$y, index) :
   semi-transparency is not supported on this device: reported only once per page

 This only appears to affect my R-3.1.0 installation and not my R-3.0.2, as I
 can still plot normally with it. Has anyone else experienced this problem?

 Thanks,


 Hi:

 After the upgrade to 3.1.0, did you execute?

 update.packages(checkBuilt=TRUE)

 I did it, and I have no problem running the code that you provide.

 sessionInfo()
 # R version 3.1.0 (2014-04-10)
 # Platform: x86_64-pc-linux-gnu (64-bit)
 #
 # locale:
 #  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 #  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 #  [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8
 #  [7] LC_PAPER=en_US.utf8       LC_NAME=C
 #  [9] LC_ADDRESS=C              LC_TELEPHONE=C
 # [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
 #
 # attached base packages:
 # [1] stats     graphics  grDevices utils     datasets  methods   base
 #
 # other attached packages:
 # [1] ggplot2_0.9.3.1
 --
 Ulises M. Alvarez
 http://sophie.unam.mx/




Re: [R] Faster way to transform vector [3 8 4 6 1 5] to [2 6 3 5 1 4]

2014-04-26 Thread William Dunlap
Look into the rank function.

If there are duplicated values in the input vector, its 'ties.method' argument
says how to deal with them. If there are ties, I think your algorithm
puts the last one in the first position, e.g., it maps
c(101,101,101,102,102) to c(3,2,1,5,4).  rank does not include this
option, but if that is really what you want to do you can use
   myRank <- function (x)  rev(rank(rev(x), ties = "first"))
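
A small sketch of how rank() replaces the nested loops, using the vector from the question. The names bt2 and tied are made up for illustration; order(order(x)) is shown only as an equivalent formulation for input without ties.

```r
bo <- c(3, 8, 4, 6, 1, 5)

# rank() gives each element's position in the sorted vector
bt <- rank(bo)

# With no ties, applying order() twice gives the same answer as integers
bt2 <- order(order(bo))

# Tie handling that mimics the original loop (last tied element ranked first)
myRank <- function(x) rev(rank(rev(x), ties.method = "first"))
tied <- myRank(c(101, 101, 101, 102, 102))

print(bt)
print(tied)
```

Both rank() and order() rely on sorting, so the cost grows roughly like N log N instead of N^2.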




-- 
Bill Dunlap
TIBCO Software
wdunlap tibco.com



Re: [R] Faster way to transform vector [3 8 4 6 1 5] to [2 6 3 5 1 4]

2014-04-26 Thread Jorge I Velez
Hi Xueming,

Try

(1:length(bo))[rank(bo)]

In a function the above would be

f <- function(x){
  N <- length(x)
  (1:N)[rank(x)]
}
f(bo)
# [1] 2 6 3 5 1 4
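
A brief check of the expression above, as a sketch in base R. When the input has no ties, indexing 1:N by the ranks simply returns the ranks as integers; order(order(x)) is a common alternative that also returns integers and breaks ties by position. With tied values the two can differ, because rank() then returns averaged (non-integer) ranks by default.

```r
bo <- c(3, 8, 4, 6, 1, 5)
N <- length(bo)

r_index <- (1:N)[rank(bo)]   # ranks used as an index into 1:N
r_order <- order(order(bo))  # position of each element in the sorted order

print(r_index)  # 2 6 3 5 1 4
print(r_order)  # 2 6 3 5 1 4
```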

HTH,
Jorge.-






[R] mapply echoes function call when browser() is called from within FUN

2014-04-26 Thread Rguy
When mapply is applied to a function that has a call to browser() within
it, the result can be a disastrous amount of feedback.

To clarify this situation please consider the following function,
containing a call to browser within it:

plus = function(a, b) {browser(); a + b}

A plain vanilla call to plus() yields the following:

LAPTOP_32G_01> plus(1,2)
Called from: plus(1, 2)
Browse[1]>
[1] 3

Now consider the following application of mapply to plus:

LAPTOP_32G_01> mapply(plus, 1:2, 1:2)
Called from: (function (a, b)
{
browser()
a + b
})(dots[[1L]][[1L]], dots[[2L]][[1L]])
Browse[1]>
Called from: (function (a, b)
{
browser()
a + b
})(dots[[1L]][[2L]], dots[[2L]][[2L]])
Browse[1]>
[1] 2 4

Notice that at each step, after the browser is called, mapply prints out
the function call including its arguments:

Called from: (function (a, b)
{
browser()
a + b
})(dots[[1L]][[1L]], dots[[2L]][[1L]])

etc.

In the present case this does no harm except to make things a little harder
to read. However, if one of the inputs happens to be a data frame with a
million rows, the entire million rows are printed to the screen. I have
been bitten by this, which is why I am writing this note. I have a question
and a request:

Question: Is there some way to prevent mapply (or browser) from echoing the
function call when browser is called from within FUN?

Request: If not, could the ability to turn off this echoing be provided. As
things stand, calling browser from within FUN, when FUN is a realistically
big function or has realistically big arguments, is a disaster.

Thanks.



Re: [R] mapply echoes function call when browser() is called from within FUN

2014-04-26 Thread Duncan Murdoch

On 26/04/2014, 11:42 AM, Rguy wrote:

When mapply is applied to a function that has a call to browser() within
it, the result can be a disastrous amount of feedback.

To clarify this situation please consider the following function,
containing a call to browser within it:

plus = function(a, b) {browser(); a + b}

A plain vanilla call to plus() yields the following:

LAPTOP_32G_01> plus(1,2)
Called from: plus(1, 2)
Browse[1]>
[1] 3

Now consider the following application of mapply to plus:

LAPTOP_32G_01> mapply(plus, 1:2, 1:2)
Called from: (function (a, b)
{
 browser()
 a + b
})(dots[[1L]][[1L]], dots[[2L]][[1L]])
Browse[1]>
Called from: (function (a, b)
{
 browser()
 a + b
})(dots[[1L]][[2L]], dots[[2L]][[2L]])
Browse[1]>
[1] 2 4

Notice that at each step, after the browser is called, mapply prints out
the function call including its arguments:

Called from: (function (a, b)
{
 browser()
 a + b
})(dots[[1L]][[1L]], dots[[2L]][[1L]])

etc.

In the present case this does no harm except to make things a little harder
to read. However, if one of the inputs happens to be a data frame with a
million rows, the entire million rows are printed to the screen. I have
been bitten by this, which is why I am writing this note. I have a question
and a request:


I don't see the argument values being printed in your example, and if I 
replace them with dataframes, I still don't see them.  So it's not quite 
as simple as you describe to get the voluminous output.


Reproducible examples are needed if you want something fixed.



Question: Is there some way to prevent mapply (or browser) from echoing the
function call when browser is called from within FUN?


Yes, use the skipCalls argument to browser.

Duncan Murdoch
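
A minimal sketch of the skipCalls suggestion, assuming the function from the original post. skipCalls tells browser() how many enclosing calls to skip when it reports the calling context; the value 1L here is only an illustration, and what is actually printed may depend on the R version and on how deeply FUN is nested. The expr = interactive() argument is an addition so that the example also runs unattended: the browser is only entered in an interactive session.

```r
# Same function as in the question, but the "Called from:" report skips
# one call level, and the browser is only entered interactively.
plus <- function(a, b) {
  browser(expr = interactive(), skipCalls = 1L)
  a + b
}

res <- mapply(plus, 1:2, 1:2)
print(res)  # 2 4
```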



Request: If not, could the ability to turn off this echoing be provided. As
things stand, calling browser from within FUN, when FUN is a realistically
big function or has realistically big arguments, is a disaster.

Thanks.



[R] lattice plot formatting: pch, abbreviation and labels

2014-04-26 Thread Luigi Marongiu
Dear all,
I am trying to use the lattice plot, but the syntax is quite
difficult. Specifically I have eight variables (1 to 8) each of them
further subdivided in two classes (negative=0 and positive=1). I am
using the stripplot() to represent these values. I would like to
represent the negative and positive values with black and white dots
so I have tried to use the argument pch=c(16, 1) both in the main
stripplot function and embedded in the scale argument. However the
resulting plot shows that some points are drawn with the pch=16 and
other with pch=1 irrespective of their class.
Is there a way to draw the values for the variable positivity = 0 with
pch=16 and those with positivity = 1 with pch=0?

In addition I would like to change the labels under the axis from 0,1
to N,P. However when I placed labels=c("N", "P") or labels=list("N",
"P") in the main stripplot() I did not obtain any difference and
when placed within the scale argument also the -labels were modified.
Is there a way to change the label 0 with N and the label 1 with P?

Final problem: I would like to abbreviate the "Unstimulated" box label
to "Unst.". I have tried the abbreviate=TRUE argument but again the
syntax is too complex for me and it did not work.
Is there a way to abbreviate the variable names?

Thank you very much for your help.
Best wishes,
Luigi


CODE:


### open plot library
library(lattice)

my.data <- structure(list(
   column_1 = 1:120,
   column_2 = structure(c(
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8), .Label = c("Unstimulated", "ESAT6", "CFP10",
"Rv3615c", "Rv2654", "Rv3879", "Rv3873", "PHA"), class = "factor"),
column_3 = structure(c(
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
0,0,0,0,0,0,0,0)),
 column_4 = c(
 
192.0519108,183.6403531,53.46798757,83.60638077,69.60749873,159.4706861,256.8765622,499.2899303,
 
2170.799076,1411.349719,2759.472348,2098.973397,2164.739515,1288.676574,1611.486543,6205.229575,
 
870.7424981,465.9967135,191.8962375,864.0937485,2962.693675,1289.259137,2418.651212,7345.712517,
 0,168.1198893,674.4342961,101.1575401,47.81596237,0,0,1420.793922,
 
142.6871331,5.466468742,291.9564635,80.73914133,73.02239621,64.47806871,144.3543635,3167.959757,
 
3164.748333,1092.634557,28733.20269,1207.87783,729.6090973,151.8706088,241.2466141,9600.963594,
 
1411.718287,12569.96285,1143.254476,6317.378481,16542.27718,79.68025792,1958.495138,7224.503437,
 
208.4382941,69.48609769,656.691151,0.499017582,7114.910926,187.6296174,41.73980805,8930.784541,
 
4.276752185,0.432300363,60.89228665,1.103924786,0.490686366,1.812993239,7.264531581,1518.610307,
 
2172.051528,595.8513744,17141.84336,589.6565971,1340.287628,117.350942,593.7034054,24043.61463,
 
0,81.83292179,1539.864321,36.41722958,8.385131047,161.7647376,65.21615696,7265.573875,
 
97.84753179,154.051827,0.613835842,10.06138851,45.04879285,176.8284258,18795.75462,30676.769,
 
5780.34957,944.2200834,2398.235596,1083.393165,2541.714557,1251.670895,1547.178549,1792.679176,
 
3067.988416,8117.210173,23676.02226,8251.937547,17360.80494,18563.61561,16941.865,31453.96708,
 
2767.493803,4796.33016,12292.93705,3864.657567,9380.673835,14886.44683,8457.88646,26050.47191)),
.Names = c("row", "stimulation", "positivity", "copy"), row.names =
c(NA, -120L),
 class = "data.frame")
attach(my.data)

stripplot(my.data$copy ~ factor(my.data$positivity)|factor(my.data$stimulation,
levels = c("Unstimulated", "ESAT6","CFP10","Rv3615c",
"Rv2654", "Rv3879", "Rv3873","PHA")),
my.data, hor=F, layout = c(8,1), scales = list(relation = "same"),
  jitter.data=TRUE, alpha=1, pch=c(16,1), col="black",
  ylab=expression(bold(Copy)),
xlab=expression(bold(Stimulation)), main="Plot",
  par.settings = list(strip.background=list(col="white")),
par.strip.text=list(font=2))



Re: [R] Faster way to transform vector [3 8 4 6 1 5] to [2 6 3 5 1 4]

2014-04-26 Thread arun
Hi,
Perhaps,
rank(bo)
#[1] 2 6 3 5 1 4
A.K.







[R] GLM using truncated lognormal distribution

2014-04-26 Thread Marc Girondot

Dear honorable list-members,

I know how to fit a truncated lognormal distribution (or Gaussian) 
(example here:
http://max2.ese.u-psud.fr/epc/conservation/Girondot/Publications/Blog_r/Entrees/2012/5/24_Adjust_a_truncated_lognormal_distribution.html 
) but I would like to use it in the context of glm.


Rather than using family=gaussian(), ideally I would like to have a 
family=truncated_gaussian().
Using fix(gaussian) I can see how the gaussian() function is organized. It 
is not 100% clear to me yet, but I think I could manage to change it into a 
family=truncated_gaussian().


But before doing so, I would like to know whether it already exists.

I found the package truncnorm but it does not provide this function.

Thanks a lot for any advice,

Marc
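
One possible workaround, sketched here in base R only, is to skip glm() and maximise the truncated-Gaussian likelihood directly with optim(). This is not a glm family object. The simulated data, the known left-truncation point `lower` and the names negll and fit are all assumptions made for illustration; for a truncated lognormal the same idea would be applied to log(y) with the log of the truncation point.

```r
set.seed(1)

# Simulated data (assumed for illustration): linear mean, Gaussian noise,
# and only observations above a known truncation point are recorded.
n <- 500
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)
lower <- 1.5
keep <- y > lower
x <- x[keep]
y <- y[keep]

# Negative log-likelihood of a left-truncated Gaussian regression:
# log f(y) - log P(Y > lower), with sigma kept positive via exp().
negll <- function(par) {
  mu <- par[1] + par[2] * x
  sigma <- exp(par[3])
  -sum(dnorm(y, mu, sigma, log = TRUE) -
         pnorm(lower, mu, sigma, lower.tail = FALSE, log.p = TRUE))
}

fit <- optim(c(mean(y), 0, log(sd(y))), negll, method = "BFGS")
fit$par  # intercept, slope, log(sigma)
```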



[R] writing a package with doParallel and compiled C

2014-04-26 Thread Adam Clark
Hi all,

Any tips on how to use doParallel as part of a function in a package that I
am writing? The function calls compiled C code, and (as far as I know),
doParallel requires you to load C functions for each core that you use
(either with the .packages argument to foreach(), or by explicitly loading
the object for each loop).

However - the package doesn't pass the check command when I try to build
it. I suspect that this is because the package isn't yet loaded into the
namespace while it is being checked, and doParallel can therefore not find
it.

Does anyone know whether my suspicion is wrong, or whether there is a
better way to handle doParallel in this case?

Many thanks,

-- 
Adam Clark
PhD Candidate, Dept. Ecology, Evolution, and Behavior
University of Minnesota, Twin Cities
100 Ecology Building, 1987 Upper Buford Circle, St. Paul, MN 55108
atcl...@umn.edu, (857)-544-6782, www.cbs.umn.edu/lab/tilman/adamclark



Re: [R] writing a package with doParallel and compiled C

2014-04-26 Thread Jeff Newmiller
I am going to go out on a limb and suggest that you not do this at all. 
Packages are good places for algorithms, and parallel processing is an 
infrastructure optimization that a) is not always an efficiency win, and b) can 
be quite sensitive to the actual infrastructure that is available (best 
solutions for Windows and *nix platforms often being noticeably different).
If you have considered this already and still intend to proceed, I have to 
defer to someone else for an answer to your actual question.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.



Re: [R] Problem with new(er) R version's matrix package

2014-04-26 Thread Martin Maechler
 Arne Henningsen arne.henning...@gmail.com
 on Sat, 26 Apr 2014 08:15:37 +0200 writes:

 On 25 April 2014 20:15, David Winsemius
 dwinsem...@comcast.net wrote:
 
 On Apr 25, 2014, at 9:17 AM, Werner W. wrote:
 
 Dear Rs,
 
 I am re-executing some older code. It does work in the
 ancient R 2.12.0 which I still have on my PC but with
 the new version R 3.1.0 it does not work any more (but
 some other new stuff, which won't work with 2.12).
 
 The problem arises in context with the systemfit package
 using the matrix package. In R 3.1.0 the following error
 is thrown: Error in as.matrix(solve(W, tol =
 solvetol)[1:ncol(xMat), 1:ncol(xMat)]) : error in
 evaluating the argument 'x' in selecting a method for
 function 'as.matrix': Error in .solve.sparse.dgC(as(a,
 dgCMatrix), b = b, tol = tol) : LU computationally
 singular: ratio of extreme entries in |diag(U)| =
 7.012e-39
 
 However, I have no clue what I can do about this. Was
 there some change in the defaults of the matrix package?
 I couldn't find anything apparent in the changelog. As
 the same code works in R 2.12.0, I suppose that the
 problem is not my data.
 
 You have not told us what version of the Matrix package
 you were using.  As such I would suggest that you review
 the Changelog which is a link for the CRAN page for
 pkg:Matrix and go back 4 years or so since R major
 versions change about once a year.
 
 http://cran.r-project.org/web/packages/Matrix/ChangeLog

 In addition, please provide a minimal, self-contained,
 reproducible example.

Yes, please do.   As maintainer of the Matrix package, I'm
willing to look into the situation of course.

As was mentioned, many things have changed in 4 years.
The error message above looks like you'd want to invert a
(very close to) singular matrix, and there could be quite a few
reasons why parts of the older code gave slightly different
answers.

Without a reproducible example, we can't get started though.

Best regards,
Martin Maechler, ETH Zurich



[R] For loop processing too slow - pre-format data.frame?

2014-04-26 Thread cembling
Hi,

I am bootstrapping, but my loops are taking way too long and I need to make
them faster. Looking on the R-help archive I suspect it may be due to not
specifying the size of my data.frame, mainly because I don't know in advance
how large it has to be. Can anyone help?

My data looks like this (first 5 entries of 'SpeyBay'):

  Year JulianDay Hour Day Month Quarter Season SeaState Visibility TideState
1 2005        91    6   1     4       2      2        2          2      2.18
2 2005        91    7   1     4       2      2        2          2      1.53
3 2005        91    9   1     4       2      2        2          3      0.80
4 2005        91   11   1     4       2      2        2          4      0.96
5 2005        91   14   1     4       2      2        1          6      2.25
  TideHeight CetPres Segment
1          2       0       1
2          3       0       1
3          5       0       2
4         -5       0       3
5         -2       0       4

I am bootstrapping 1000 times but re-sampling on segment (since my data is
autocorrelated), which means I am trying to reconstruct my data based on
random segments e.g. segment 3, then segment 1, each of which may include
from 1-14 data rows. So I don't know how many rows I am going to get in
advance.

When I run my for loop, I just use rbind with undefined size of the new
variable e.g. 'tempD2', and I suspect it is this that is slowing down the
whole process (probably partly due to having a for loop within a for loop).

Can anyone give me any advice on how to pre-define a data frame (if this is
what the data shown above is) that can have an undefined size, or how to
make it big enough to take all the data? I've been trying to figure this
out for ages with no luck, and I'm sure it's something simple!

Code shown below - any tips on making the code faster would be greatly
appreciated - the last run took several hours which is just not practical!

Many thanks in advance,
Clare Embling

CODE: 

SpringWatch <- 504
SummerWatch <- 704
AutumnWatch <- 392
MaxSample <- 704

signif <- 0

for(j in 1:1000){

   # resampling 2 different years (D & E) in 3 different seasons (2, 3 & 4) separately
   D2S <- sample(D2Start:D2Stop,MaxSample,replace=T)
   D3S <- sample(D3Start:D3Stop,MaxSample,replace=T)
   D4S <- sample(D4Start:D4Stop,MaxSample,replace=T)
   E2S <- sample(E2Start:E2Stop,MaxSample,replace=T)
   E3S <- sample(E3Start:E3Stop,MaxSample,replace=T)
   E4S <- sample(E4Start:E4Stop,MaxSample,replace=T)

   # Creating new data frames with the first sampled segment
   TempD2 <- SpeyBay[(Segment==D2S[1]),]
   TempD3 <- SpeyBay[(Segment==D3S[1]),]
   TempD4 <- SpeyBay[(Segment==D4S[1]),]
   TempE2 <- SpeyBay[(Segment==E2S[1]),]
   TempE3 <- SpeyBay[(Segment==E3S[1]),]
   TempE4 <- SpeyBay[(Segment==E4S[1]),]

   # loop to add together all the rows of data for each segment sampled
   for(i in 2:MaxSample) {
      TempD2 <- rbind(TempD2,SpeyBay[(Segment==D2S[i]),])
      TempD3 <- rbind(TempD3,SpeyBay[(Segment==D3S[i]),])
      TempD4 <- rbind(TempD4,SpeyBay[(Segment==D4S[i]),])
      TempE2 <- rbind(TempE2,SpeyBay[(Segment==E2S[i]),])
      TempE3 <- rbind(TempE3,SpeyBay[(Segment==E3S[i]),])
      TempE4 <- rbind(TempE4,SpeyBay[(Segment==E4S[i]),])
   }
   # But actually I only want a certain number of rows of data...
   NewD2 <- TempD2[1:SpringWatch,]
   NewD3 <- TempD3[1:SummerWatch,]
   NewD4 <- TempD4[1:AutumnWatch,]
   NewE2 <- TempE2[1:SpringWatch,]
   NewE3 <- TempE3[1:SummerWatch,]
   NewE4 <- TempE4[1:AutumnWatch,]

   # then combine together (could do this in one step!)
   NewD <- rbind(NewD2,NewD3,NewD4)
   NewE <- rbind(NewE2,NewE3,NewE4)

   CompDE <- rbind(NewD,NewE)

   # Run a GLM-GEE on the resampled distributions to see if there is a
   # statistical difference between years

   NewGLMGEE1 <-
      geeglm(CetPres~Year++SeaState,data=CompDE,family=binomial,id=Segment,corstr="ar1")
   pv <- summary(NewGLMGEE1)$coefficients[, "Pr(>|W|)"]  ## will extract them
   signif[j] <- pv[2] # only interested in the significance of Year in the model
}
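
One way to avoid growing a data frame with rbind() inside the inner loop is to collect row indices first and subset once. The sketch below uses a small made-up data frame (toy) and hypothetical helper names (rows_by_segment, resample_rows); only the Segment and CetPres column names are taken from the post. The rows of each segment are looked up once with split(), the indices for all sampled segments are concatenated, and the data frame is subset a single time.

```r
set.seed(42)

# Toy stand-in for SpeyBay: 50 segments of 1 to 14 rows each.
n_seg <- 50
seg_len <- sample(1:14, n_seg, replace = TRUE)
toy <- data.frame(Segment = rep(seq_len(n_seg), times = seg_len),
                  CetPres = rbinom(sum(seg_len), 1, 0.3))

# Row numbers belonging to each segment, computed once outside any loop.
rows_by_segment <- split(seq_len(nrow(toy)), toy$Segment)

# Row indices for a vector of sampled segment ids, cut to n_rows rows.
resample_rows <- function(seg_ids, n_rows) {
  idx <- unlist(rows_by_segment[as.character(seg_ids)], use.names = FALSE)
  idx[seq_len(min(n_rows, length(idx)))]
}

segs <- sample(seq_len(n_seg), 20, replace = TRUE)
boot <- toy[resample_rows(segs, 100), ]
nrow(boot)
```

The same helper could be called once per year and season inside the bootstrap loop, so that each iteration performs a handful of subsetting operations instead of several thousand rbind() calls.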










[R] help using extrafont package | R graphics

2014-04-26 Thread Evan Cooch
Greetings --

Submitted this a little while ago -- for some reason, still being held 
up by the moderator. Trying again...


For a host of reasons, I need to use/embed Garamond font with various R 
graphics for a particular publication. I've figured out how to more or 
less get there from here, using the following sequence:


library(extrafont)

Then, I need to import the system fonts (Windoze box, R 3.1.0).

So, I use

font_import()

But, this takes a *huge* amount of time, and often throws errors as it 
chokes on various fonts (I have probably 250+ fonts installed). I only 
want *one* font (Garamond). But, for the life of me, I can't figure out 
how to get font_import to select only the single font I want. In theory

font_import(paths = NULL, recursive = TRUE, prompt = TRUE, pattern = NULL)

as defaults, where pattern is a regex that font filenames must match.

The file name for Garamond is gara.ttf, so I tried

font_import(pattern="gara")

R responds with 'Importing fonts may take a few minutes, depending on 
the...etc, etc'.
Continue? [y/n]

Hit 'y', and am presented with

Scanning ttf files in C:\Windows\Fonts ...
Extracting .afm files from .ttf files...
Error in data.frame(fontfile = ttfiles, FontName = "", stringsAsFactors 
= FALSE) :
   arguments imply differing number of rows: 0, 1

I have no idea what to do with this.

Suggestions/pointers to the obvious welcome. And I thought futzing with 
fonts in LaTeX was fun! ;-)




[R] average and median values for each of the class

2014-04-26 Thread Nico Met
Dear all,



I have a matrix (dimension, 16 x 12) where  2nd column represents class
 (1,1,1,1,1,2,2,2, etc) information. I want to estimate average  and median
values for each of the class and add this information as a row at end of
the each classes.


for example:

dput(dat)

structure(list(class = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,

3L, 3L, 3L, 4L, 4L, 4L, 5L), name1 = c(2.554923977, 2.371586762,

2.497293431, 2.464827875, 2.981934845, 2.228995664, 2.099640729,

1.900314302, 2.630005966, 2.632590262, 2.581887814, 2.408797563,

2.098761103, 3.070460716, 1.436980716, 1.645121806), name2 = c(1.297412278,

1.104804244, 1.30621114, 1.126009533, 1.466740841, 1.012041118,

0.923466541, 0.840575023, 1.285530176, 1.041909333, 1.194917856,

1.085015826, 1.047492703, 1.587558217, 0.593340012, 0.723630088

), name3 = c(0.587160798, 0.596127884, 0.623760721, 0.549016135,

0.686642084, 0.487523394, 0.458620467, 0.397974913, 0.615928976,

0.546005649, 0.657383069, 0.546613129, 0.476503461, 0.749062102,

0.304160587, 0.29037358), name4 = c(2.833441759, 2.713374426,

2.532626548, 2.409093102, 3.014912721, 2.113507947, 2.017291324,

1.667744912, 2.602560666, 2.31649643, 2.761204809, 2.433963493,

2.229911767, 3.191646399, 1.269919241, 1.387479858), name5 = c(2.172365295,

1.955695471, 2.141072829, 1.975743278, 2.377018372, 1.791300389,

1.669079382, 1.500209628, 2.164401874, 1.830038378, 2.106750025,

1.92888294, 1.707217549, 2.585082653, 1.114841754, 1.315712452

), name6 = c(0.715129844, 0.688186262, 0.70133748, 0.709362008,

0.712145174, 0.563593885, 0.532109761, 0.472197304, 0.690165016,

0.65635473, 0.615835066, 0.64310098, 0.562974891, 0.900622255,

0.408546784, 0.416284408), name7 = c(1.995505133, 1.860095899,

1.843151597, 1.709861774, 2.155993511, 1.506409746, 1.315405587,

1.234544153, 1.96629927, 1.74879757, 1.93994009, 1.660173854,

1.556735295, 2.355723318, 0.866634243, 1.013367677), name8 = c(0.275484997,

0.233856392, 0.294021245, 0.315504347, 0.251906585, 0.250263636,

0.348599173, 0.273806933, 0.32067937, 0.278581115, 0.293726291,

0.308350808, 0.201297444, 0.351927886, 0.204230625, 0.185681471

), name9 = c(2.461066627, 2.210756164, 2.289047888, 2.253988252,

2.668184733, 1.911697836, 1.793443775, 1.560027186, 2.36941155,

1.96191, 2.391501376, 2.002215107, 1.932144233, 2.73705052,

1.15580754, 1.807697999), name10 = c(0.723025351, 0.613147422,

0.805399925, 0.65651577, 0.779389048, 0.54260459, 0.492283542,

0.507969501, 0.749700016, 0.644231327, 0.810319215, 0.620331891,

0.600240557, 0.884775748, 0.40006142, 0.391661912), name11 = c(0.308565619,

0.453808281, 0.363716904, 0.376332596, 0.324998876, 0.361013073,

0.430744786, 0.468818055, 0.166072668, 0.369262627, 0.297666411,

0.256091173, 0.123021464, 0.308188684, 0.646436241, 0.722972632

)), .Names = c("class", "name1", "name2", "name3", "name4", "name5",
"name6", "name7", "name8", "name9", "name10", "name11"), class = "data.frame",
row.names = c("ara1", "ara2", "ara3", "ara4", "ara5", "ara6", "ara7", "ara8",
"ara9", "ara10", "ara11", "ara12", "ara13", "ara14", "ara15", "ara16"
))


I wrote this:



avg <- as.data.frame(aggregate(dat[,2:dim(dat)[2]], dat["class"],
function(x) mean(x,na.rm=T)) )


med <- as.data.frame(aggregate(dat[,2:dim(dat)[2]], dat["class"], function(x)
median(x,na.rm=T)) )


# avg
#   class    name1     name2     name3    name4    name5     name6    name7
# 1     1 2.574113 1.2602356 0.6085415 2.700690 2.124379 0.7052322 1.912922
# 2     2 2.214739 1.0154032 0.4900119 2.100276 1.781248 0.5645165 1.505665
# 3     3 2.541092 1.1072810     0.589 2.503888 1.955224 0.6384303 1.782971
# 4     4 2.202068 1.0761303 0.5099087 2.230492 1.802381 0.6240480 1.593031
# 5     5 1.645122 0.7236301 0.2903736 1.387480 1.315712 0.4162844 1.013368
#       name8    name9    name10    name11
# 1 0.2741547 2.376609 0.7154955 0.3654845
# 2 0.2983373 1.908645 0.5731394 0.3566621
# 3 0.2935527 2.118543 0.6916275 0.3076734
# 4 0.2524853 1.941667 0.6283592 0.3592155
# 5 0.1856815 1.807698 0.3916619 0.7229726

# med
#   class    name1     name2     name3    name4    name5     name6    name7
# 1     1 2.497293 1.2974123 0.5961279 2.713374 2.141073 0.7093620 1.860096
# 2     2 2.164318 0.9677538 0.4730719 2.065400 1.730190 0.5478518 1.410908
# 3     3 2.581888 1.0850158 0.5466131 2.433963 1.928883 0.6431010 1.748798
# 4     4 2.098761 1.0474927 0.4765035 2.229912 1.707218 0.5629749 1.556735
# 5     5 1.645122 0.7236301 0.2903736 1.387480 1.315712 0.4162844 1.013368
#       name8    name9    name10    name11
# 1 0.2754850 2.289048 0.7230254 0.3637169
# 2 0.2972432 1.852571 0.5252870 0.3958789
# 3 0.2937263 2.002215 0.6442313 0.2976664
# 4 0.2042306 1.932144 0.6002406 0.3081887
# 5 0.1856815 1.807698 0.3916619 0.7229726




But I do not know how can I add this information in the original data?
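
One possible approach, sketched in base R: split the data by class, append a mean row and a median row to each piece, and bind the pieces back together. The small data frame and the helper name add_summary_rows are made up for illustration, as are the row names chosen for the summary rows; the layout (class in the first column, numeric columns after it) follows the data described above.

```r
# Made-up miniature of the data: a class column followed by numeric columns.
dat <- data.frame(class = c(1L, 1L, 2L, 2L, 2L),
                  name1 = c(1, 3, 2, 4, 9),
                  name2 = c(10, 20, 30, 50, 40),
                  row.names = paste0("ara", 1:5))

# Append one mean row and one median row to the rows of a single class.
add_summary_rows <- function(d) {
  vals <- d[, -1, drop = FALSE]
  avg <- data.frame(class = d$class[1], t(colMeans(vals, na.rm = TRUE)))
  med <- data.frame(class = d$class[1],
                    t(apply(vals, 2, median, na.rm = TRUE)))
  rownames(avg) <- paste0("mean_class", d$class[1])
  rownames(med) <- paste0("median_class", d$class[1])
  rbind(d, avg, med)
}

# unname() keeps rbind() from prefixing row names with the class labels.
res <- do.call(rbind, unname(lapply(split(dat, dat$class), add_summary_rows)))
res
```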


For example, for class 1, the output will look like this:

dput(res1)

structure(list(class = c(1L, 1L, 1L, 1L, 1L, 1L, 1L), name1 =
c(2.554923977,

2.371586762, 2.497293431, 2.464827875, 2.981934845, 2.574113378,

2.497293431), name2 = c(1.297412278, 

Re: [R] lattice plot formatting: pch, abbreviation and labels

2014-04-26 Thread Duncan Mackay
Hi Luigi

You are typing things unnecessarily: do not use the attach command unless
absolutely necessary - it has unfortunate consequences.
It is better to use with or within.
alpha is not available with some devices with bad consequences if used; its
default is 1 anyway.

Once you have stated a data.frame as the data object it is usually not
necessary to use the data.frame$ sign in to signify column names: use the
column names.
data = a data.frame is the equivalent to with

You are reordering the levels of stimulation and changing the name of 1 so I
thought it was easiest to make a column stim and do things there; otherwise
the relevelling could be done in the data argument using the subset
argument. Then the change to Unst could be done by using  strip=
strip.custom(factor.levels = ...),

To change for the -/+ I made a group - the easiest way.
Have a look at changing the negative symbol to 20 or something else as it is
hard to visualise. You may vary things by changing cex (remember to have 2
values, 1 for each group).

If you do str(xyplot object) you will get a big print of the object and
within that the x limits are shown as "0", "1" which means that the x values
are 1 and 2.
There is a command to get these things but I have forgotten it.

my.data$stimulation <- factor(my.data$stimulation, levels = c("Unstimulated",
"ESAT6","CFP10","Rv3615c","Rv2654", "Rv3879", "Rv3873","PHA"))
my.data$stim <- factor(my.data$stimulation, labels = c("Unst.",
"ESAT6","CFP10","Rv3615c","Rv2654", "Rv3879", "Rv3873","PHA"))
my.data$pos <- ifelse(sign(my.data$copy) > 0, 2,1)

  stripplot(copy ~ factor(positivity)|stim,my.data,
groups = pos,
hor = F,
layout = c(8,1),
scales = list(x = list(at = c(1,2),
   labels = c("N","P"))),
jitter.x = TRUE,
amount = 2,
pch = c(16,1),
col = "black",
ylab = expression(bold(Copy)),
xlab = expression(bold(Stimulation)),
main="Plot",
par.settings = list(strip.background=list(col="white")),
par.strip.text=list(font=2)
)

Regards

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Luigi Marongiu
Sent: Sunday, 27 April 2014 02:07
To: r-help@r-project.org
Subject: [R] lattice plot formatting: pch, abbreviation and labels

Dear all,
I am trying to use the lattice plot, but the syntax is quite
difficult. Specifically I have eight variables (1 to 8) each of them
further subdivided in two classes (negative=0 and positive=1). I am
using the stripplot() to represent these values. I would like to
represent the negative and positive values with black and white dots
so I have tried to use the argument pch=c(16, 1) both in the main
stripplot function and embedded in the scale argument. However the
resulting plot shows that some points are drawn with pch=16 and
others with pch=1 irrespective of their class.
Is there a way to draw the values for the variable positivity = 0 with
pch=16 and those with positivity = 1 with pch=0?

In addition I would like to change the labels under the axis from 0,1
to N,P. However, when I placed labels=c("N", "P") or labels=list("N",
"P") in the main stripplot() call I did not obtain any difference, and
when placed within the scales argument the y-labels were also modified.
Is there a way to change the label 0 to N and the label 1 to P?

Final problem: I would like to abbreviate the "Unstimulated" box label
to "Unst.". I have tried the abbreviate=TRUE argument but again the
syntax is too complex for me and it did not work.
Is there a way to abbreviate the variable names?

Thank you very much for your help.
Best wishes,
Luigi


CODE:


### open plot library
library(lattice)

my.data <- structure(list(
   column_1 = 1:120,
   column_2 = structure(c(
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8,
 1,2,3,4,5,6,7,8), .Label = c("Unstimulated", "ESAT6", "CFP10",
"Rv3615c", "Rv2654", "Rv3879", "Rv3873", "PHA"), class = "factor"),
column_3 = structure(c(
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
0,0,0,0,0,0,0,0)),
 column_4 = c(
 
 192.0519108,183.6403531,53.46798757,83.60638077,69.60749873,159.4706861,256.8765622,499.2899303,
 2170.799076,1411.349719,2759.472348,2098.973397,2164.739515,1288.676574,1611.486543,6205.229575,
 870.7424981,465.9967135,191.8962375,864.0937485,2962.693675,1289.259137,2418.651212,7345.712517,
 

Re: [R] average and median values for each of the class

2014-04-26 Thread arun


Hi,
Your dput() output shows that dat is a data.frame.
## Using the results you got (avg and med):

res2 <- do.call(rbind, lapply(unique(dat$class), function(i) {
    x1 <- rbind(dat[dat$class == i, ], avg[avg$class == i, ], med[med$class == i, ])
    rownames(x1)[!grepl("ara", rownames(x1))] <- paste0(c("Avg", "Med"), i)
    x1
}))
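
To make the idea easier to check by eye, here is the same approach as a
self-contained sketch on a tiny made-up data frame. `toy` and its values are
assumptions for illustration only; the row-name prefix "ara" mirrors the
posted data:

```r
# Tiny stand-in for dat: one grouping column and one numeric column.
toy <- data.frame(class = c(1L, 1L, 2L, 2L, 2L),
                  name1 = c(1, 3, 2, 4, 9),
                  row.names = paste0("ara", 1:5))

# Per-class mean and median, one row per class.
avg <- aggregate(toy["name1"], toy["class"], mean)
med <- aggregate(toy["name1"], toy["class"], median)

# For each class: original rows, then its mean row, then its median row.
res <- do.call(rbind, lapply(unique(toy$class), function(i) {
  x1 <- rbind(toy[toy$class == i, ],
              avg[avg$class == i, ],
              med[med$class == i, ])
  # Rows that did not come from toy get descriptive names, e.g. Avg1, Med1.
  rownames(x1)[!grepl("^ara", rownames(x1))] <- paste0(c("Avg", "Med"), i)
  x1
}))

res
```

The summary rows keep the class value in the first column, so the result
can still be subset by class afterwards.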


A.K.



On Saturday, April 26, 2014 8:39 PM, Nico Met nicome...@gmail.com wrote:
Dear all,



I have a matrix (dimension 16 x 12) where the 2nd column represents class
(1,1,1,1,1,2,2,2, etc.) information. I want to estimate the average and median
values for each class and add this information as rows at the end of
each class.


for example:

dput(dat)

structure(list(class = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,

3L, 3L, 3L, 4L, 4L, 4L, 5L), name1 = c(2.554923977, 2.371586762,

2.497293431, 2.464827875, 2.981934845, 2.228995664, 2.099640729,

1.900314302, 2.630005966, 2.632590262, 2.581887814, 2.408797563,

2.098761103, 3.070460716, 1.436980716, 1.645121806), name2 = c(1.297412278,

1.104804244, 1.30621114, 1.126009533, 1.466740841, 1.012041118,

0.923466541, 0.840575023, 1.285530176, 1.041909333, 1.194917856,

1.085015826, 1.047492703, 1.587558217, 0.593340012, 0.723630088

), name3 = c(0.587160798, 0.596127884, 0.623760721, 0.549016135,

0.686642084, 0.487523394, 0.458620467, 0.397974913, 0.615928976,

0.546005649, 0.657383069, 0.546613129, 0.476503461, 0.749062102,

0.304160587, 0.29037358), name4 = c(2.833441759, 2.713374426,

2.532626548, 2.409093102, 3.014912721, 2.113507947, 2.017291324,

1.667744912, 2.602560666, 2.31649643, 2.761204809, 2.433963493,

2.229911767, 3.191646399, 1.269919241, 1.387479858), name5 = c(2.172365295,

1.955695471, 2.141072829, 1.975743278, 2.377018372, 1.791300389,

1.669079382, 1.500209628, 2.164401874, 1.830038378, 2.106750025,

1.92888294, 1.707217549, 2.585082653, 1.114841754, 1.315712452

), name6 = c(0.715129844, 0.688186262, 0.70133748, 0.709362008,

0.712145174, 0.563593885, 0.532109761, 0.472197304, 0.690165016,

0.65635473, 0.615835066, 0.64310098, 0.562974891, 0.900622255,

0.408546784, 0.416284408), name7 = c(1.995505133, 1.860095899,

1.843151597, 1.709861774, 2.155993511, 1.506409746, 1.315405587,

1.234544153, 1.96629927, 1.74879757, 1.93994009, 1.660173854,

1.556735295, 2.355723318, 0.866634243, 1.013367677), name8 = c(0.275484997,

0.233856392, 0.294021245, 0.315504347, 0.251906585, 0.250263636,

0.348599173, 0.273806933, 0.32067937, 0.278581115, 0.293726291,

0.308350808, 0.201297444, 0.351927886, 0.204230625, 0.185681471

), name9 = c(2.461066627, 2.210756164, 2.289047888, 2.253988252,

2.668184733, 1.911697836, 1.793443775, 1.560027186, 2.36941155,

1.96191, 2.391501376, 2.002215107, 1.932144233, 2.73705052,

1.15580754, 1.807697999), name10 = c(0.723025351, 0.613147422,

0.805399925, 0.65651577, 0.779389048, 0.54260459, 0.492283542,

0.507969501, 0.749700016, 0.644231327, 0.810319215, 0.620331891,

0.600240557, 0.884775748, 0.40006142, 0.391661912), name11 = c(0.308565619,

0.453808281, 0.363716904, 0.376332596, 0.324998876, 0.361013073,

0.430744786, 0.468818055, 0.166072668, 0.369262627, 0.297666411,

0.256091173, 0.123021464, 0.308188684, 0.646436241, 0.722972632

)), .Names = c("class", "name1", "name2", "name3", "name4", "name5",

"name6", "name7", "name8", "name9", "name10", "name11"), class = "data.frame",
row.names = c("ara1",

"ara2", "ara3", "ara4", "ara5", "ara6", "ara7", "ara8", "ara9",

"ara10", "ara11", "ara12", "ara13", "ara14", "ara15", "ara16"

))


I wrote this:



avg <- as.data.frame(aggregate(dat[, 2:dim(dat)[2]], dat["class"],
function(x) mean(x, na.rm = TRUE)))


med <- as.data.frame(aggregate(dat[, 2:dim(dat)[2]], dat["class"], function(x)
median(x, na.rm = TRUE)))


# avg

#  class    name1     name2     name3    name4    name5     name6    name7
#    name8    name9    name10    name11

#1     1 2.574113 1.2602356 0.6085415 2.700690 2.124379 0.7052322 1.912922
#0.2741547 2.376609 0.7154955 0.3654845

#2     2 2.214739 1.0154032 0.4900119 2.100276 1.781248 0.5645165 1.505665
#0.2983373 1.908645 0.5731394 0.3566621

#3     3 2.541092 1.1072810 0.5833339 2.503888 1.955224 0.6384303 1.782971
#0.2935527 2.118543 0.6916275 0.3076734

#4     4 2.202068 1.0761303 0.5099087 2.230492 1.802381 0.6240480 1.593031
#0.2524853 1.941667 0.6283592 0.3592155

#5     5 1.645122 0.7236301 0.2903736 1.387480 1.315712 0.4162844 1.013368
#0.1856815 1.807698 0.3916619 0.7229726

# med

#  class    name1     name2     name3    name4    name5     name6    name7
#    name8    name9    name10    name11

#1     1 2.497293 1.2974123 0.5961279 2.713374 2.141073 0.7093620 1.860096
#0.2754850 2.289048 0.7230254 0.3637169

#2     2 2.164318 0.9677538 0.4730719 2.065400 1.730190 0.5478518 1.410908
#0.2972432 1.852571 0.5252870 0.3958789

#3     3 2.581888 1.0850158 0.5466131 2.433963 1.928883 0.6431010 1.748798
#0.2937263 2.002215 0.6442313 0.2976664

#4     4 2.098761 1.0474927 0.4765035 2.229912 1.707218 0.5629749 1.556735
#0.2042306 1.932144 0.6002406 0.3081887

#5     5 1.645122 0.7236301 0.2903736 1.387480 1.315712 0.4162844 1.013368
#0.1856815 

Re: [R] average and median values for each of the class

2014-04-26 Thread David Winsemius

On Apr 26, 2014, at 5:37 PM, Nico Met wrote:

 Dear all,
 
 
 
 I have a matrix (dimension, 16 x 12) where  2nd column represents class
 (1,1,1,1,1,2,2,2, etc) information. I want to estimate average  and median
 values for each of the class and add this information as a row at end of
 the each classes.
 
Well, it does have dimensions, but it is a data.frame, NOT a matrix.
"class" is also the name of a base function in R, so it is an ambiguous choice
of column name. What is it that you mean by that word? If it is for each
column then:

sapply( dat, function(x) c( mean(x), median(x)) )



 sapply( dat, function(x) c( mean_x = mean(x), median_x = median(x)) )
          class    name1    name2     name3    name4    name5     name6
mean_x   2.4375 2.350258 1.102291 0.5358036 2.343448 1.895963 0.6242466
median_x 2.0000 2.436813 1.094910 0.5478146 2.421528 1.942289 0.6497279
           name7     name8    name9    name10    name11
mean_x   1.67054 0.2742449 2.094122 0.6388536 0.3736069
median_x 1.72933 0.2770331 2.106486 0.6322816 0.3623650
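
If the summaries are wanted per value of the class column rather than over
whole columns, one possible variation is to split the rows first. This is
only a sketch on made-up data (`toy` is hypothetical), not a statement of
what the original poster intended:

```r
# Made-up data: a grouping column plus two numeric columns.
toy <- data.frame(class = c(1L, 1L, 2L, 2L, 2L),
                  name1 = c(1, 3, 2, 4, 9),
                  name2 = c(10, 20, 30, 40, 50))

# Helper returning a named vector, as in the sapply() call above.
mean_median <- function(x) c(mean_x = mean(x), median_x = median(x))

# Whole-column summaries (grouping column dropped).
overall <- sapply(toy[-1], mean_median)

# Per-class summaries: split the numeric columns by class,
# then apply the same column-wise summary to each piece.
per_class <- lapply(split(toy[-1], toy$class),
                    function(d) sapply(d, mean_median))

overall
per_class
```

Each element of `per_class` is a small matrix with the same layout as the
whole-column result, so the two can be compared directly.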

-- 
David.


 
 for example:
 
 dput(dat)
 
 structure(list(class = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
 
 3L, 3L, 3L, 4L, 4L, 4L, 5L), name1 = c(2.554923977, 2.371586762,
 
 2.497293431, 2.464827875, 2.981934845, 2.228995664, 2.099640729,
 
 1.900314302, 2.630005966, 2.632590262, 2.581887814, 2.408797563,
 
 2.098761103, 3.070460716, 1.436980716, 1.645121806), name2 = c(1.297412278,
 
 1.104804244, 1.30621114, 1.126009533, 1.466740841, 1.012041118,
 
 0.923466541, 0.840575023, 1.285530176, 1.041909333, 1.194917856,
 
 1.085015826, 1.047492703, 1.587558217, 0.593340012, 0.723630088
 
 ), name3 = c(0.587160798, 0.596127884, 0.623760721, 0.549016135,
 
 0.686642084, 0.487523394, 0.458620467, 0.397974913, 0.615928976,
 
 0.546005649, 0.657383069, 0.546613129, 0.476503461, 0.749062102,
 
 0.304160587, 0.29037358), name4 = c(2.833441759, 2.713374426,
 
 2.532626548, 2.409093102, 3.014912721, 2.113507947, 2.017291324,
 
 1.667744912, 2.602560666, 2.31649643, 2.761204809, 2.433963493,
 
 2.229911767, 3.191646399, 1.269919241, 1.387479858), name5 = c(2.172365295,
 
 1.955695471, 2.141072829, 1.975743278, 2.377018372, 1.791300389,
 
 1.669079382, 1.500209628, 2.164401874, 1.830038378, 2.106750025,
 
 1.92888294, 1.707217549, 2.585082653, 1.114841754, 1.315712452
 
 ), name6 = c(0.715129844, 0.688186262, 0.70133748, 0.709362008,
 
 0.712145174, 0.563593885, 0.532109761, 0.472197304, 0.690165016,
 
 0.65635473, 0.615835066, 0.64310098, 0.562974891, 0.900622255,
 
 0.408546784, 0.416284408), name7 = c(1.995505133, 1.860095899,
 
 1.843151597, 1.709861774, 2.155993511, 1.506409746, 1.315405587,
 
 1.234544153, 1.96629927, 1.74879757, 1.93994009, 1.660173854,
 
 1.556735295, 2.355723318, 0.866634243, 1.013367677), name8 = c(0.275484997,
 
 0.233856392, 0.294021245, 0.315504347, 0.251906585, 0.250263636,
 
 0.348599173, 0.273806933, 0.32067937, 0.278581115, 0.293726291,
 
 0.308350808, 0.201297444, 0.351927886, 0.204230625, 0.185681471
 
 ), name9 = c(2.461066627, 2.210756164, 2.289047888, 2.253988252,
 
 2.668184733, 1.911697836, 1.793443775, 1.560027186, 2.36941155,
 
 1.96191, 2.391501376, 2.002215107, 1.932144233, 2.73705052,
 
 1.15580754, 1.807697999), name10 = c(0.723025351, 0.613147422,
 
 0.805399925, 0.65651577, 0.779389048, 0.54260459, 0.492283542,
 
 0.507969501, 0.749700016, 0.644231327, 0.810319215, 0.620331891,
 
 0.600240557, 0.884775748, 0.40006142, 0.391661912), name11 = c(0.308565619,
 
 0.453808281, 0.363716904, 0.376332596, 0.324998876, 0.361013073,
 
 0.430744786, 0.468818055, 0.166072668, 0.369262627, 0.297666411,
 
 0.256091173, 0.123021464, 0.308188684, 0.646436241, 0.722972632
 
 )), .Names = c("class", "name1", "name2", "name3", "name4", "name5",
 
 "name6", "name7", "name8", "name9", "name10", "name11"), class = "data.frame",
 row.names = c("ara1",
 
 "ara2", "ara3", "ara4", "ara5", "ara6", "ara7", "ara8", "ara9",
 
 "ara10", "ara11", "ara12", "ara13", "ara14", "ara15", "ara16"
 
 ))
 
 
 I wrote this:
 
 
 
 avg <- as.data.frame(aggregate(dat[, 2:dim(dat)[2]], dat["class"],
 function(x) mean(x, na.rm = TRUE)))
 
 
 med <- as.data.frame(aggregate(dat[, 2:dim(dat)[2]], dat["class"], function(x)
 median(x, na.rm = TRUE)))
 
 
 # avg
 
 #  class    name1     name2     name3    name4    name5     name6    name7
 #    name8    name9    name10    name11
 
 #1 1 2.574113 1.2602356 0.6085415 2.700690 2.124379 0.7052322 1.912922
 #0.2741547 2.376609 0.7154955 0.3654845
 
 #2 2 2.214739 1.0154032 0.4900119 2.100276 1.781248 0.5645165 1.505665
 #0.2983373 1.908645 0.5731394 0.3566621
 
 #3 3 2.541092 1.1072810 0.5833339 2.503888 1.955224 0.6384303 1.782971
 #0.2935527 2.118543 0.6916275 0.3076734
 
 #4 4 2.202068 1.0761303 0.5099087 2.230492 1.802381 0.6240480 1.593031
 #0.2524853 1.941667 0.6283592 0.3592155
 
 #5 5 1.645122 0.7236301 0.2903736 1.387480 1.315712 0.4162844 1.013368
 #0.1856815 1.807698 0.3916619 0.7229726
 
 # med
 
 #  class    name1     name2     name3    name4    name5     name6    name7
 #    name8    name9    name10    name11
 
 #1 1 2.497293