[R] Extracting only multiple occurrences

2013-08-08 Thread Kevin Parent
Hoping someone here can help me with this small problem.
set.seed(2013)

x <- sort(c(letters, letters[sample(26, 10, TRUE)]))

This gives a vector of 36 letters with some multiples (in this case, 
g,m,s,t,u,v,x,y). Now what I need is to get rid of the ones that only occur 
once and keep the multiples. I need the opposite of the unique() function. I 
expect this should be pretty easy but I can't see it. Anyone know a solution? 
Thanks in advance!


_
Kevin Parent, Ph.D
Korea Maritime University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extracting only multiple occurrences

2013-08-08 Thread David Winsemius

On Aug 7, 2013, at 10:37 PM, Kevin Parent wrote:

> Hoping someone here can help me with this small problem.
> set.seed(2013)
>
> x <- sort(c(letters, letters[sample(26, 10, TRUE)]))
>
> This gives a vector of 36 letters with some multiples (in this case,
> g,m,s,t,u,v,x,y). Now what I need is to get rid of the ones that only occur
> once and keep the multiples. I need the opposite of the unique() function. I
> expect this should be pretty easy but I can't see it. Anyone know a solution?
> Thanks in advance!
 
 
?duplicated

x[ duplicated(x) ]


David Winsemius
Alameda, CA, USA



Re: [R] Extracting only multiple occurrences

2013-08-08 Thread David Winsemius

On Aug 7, 2013, at 11:03 PM, David Winsemius wrote:

 
> On Aug 7, 2013, at 10:37 PM, Kevin Parent wrote:
>
>> Hoping someone here can help me with this small problem.
>> set.seed(2013)
>>
>> x <- sort(c(letters, letters[sample(26, 10, TRUE)]))
>>
>> This gives a vector of 36 letters with some multiples (in this case,
>> g,m,s,t,u,v,x,y). Now what I need is to get rid of the ones that only occur
>> once and keep the multiples. I need the opposite of the unique() function. I
>> expect this should be pretty easy but I can't see it. Anyone know a
>> solution? Thanks in advance!
>
> ?duplicated
>
> x[ duplicated(x) ]
 

Also this may be of interest:

cran.r-project.org/doc/contrib/Short-refcard.pdf

... but I was surprised to find that both duplicated and rle are absent from 
Short's refcards.


-- 
David Winsemius
Alameda, CA, USA



Re: [R] Extracting only multiple occurrences

2013-08-08 Thread Kevin Parent
Well that almost works, and I didn't know about duplicated() so thanks for 
that. However, it only gives me the duplicated values. I need the original ones 
too. So the result I want is: [g,g,m,m,s,s,t,t,u,u,u,v,v,x,x,y,y,y]. What 
duplicated() gives me is [g,m,s,t,u,u,v,x,y,y]

 
Playing around with it, I got this but can't help thinking there must be a 
less awkward way:
set.seed(2013)
x <- sort(c(letters, letters[sample(26, 10, TRUE)]))
x <- x[duplicated(x)]
x <- sort(c(x, unique(x)))
_
Kevin Parent, Ph.D
Korea Maritime University





Re: [R] Extracting only multiple occurrences

2013-08-08 Thread David Winsemius

On Aug 7, 2013, at 11:23 PM, Kevin Parent wrote:

> Well that almost works, and I didn't know about duplicated() so thanks for
> that. However, it only gives me the duplicated values. I need the original
> ones too. So the result I want is: [g,g,m,m,s,s,t,t,u,u,u,v,v,x,x,y,y,y].
> What duplicated() gives me is [g,m,s,t,u,u,v,x,y,y]

x[ duplicated(x) | duplicated(x, fromLast=TRUE) ]

-- 
David.
  

David Winsemius
Alameda, CA, USA
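
The difference between the two answers in this thread is easiest to see on a small vector. The following is a minimal sketch; the vector `y` is made up for illustration and is not the poster's data.

```r
# Toy vector (hypothetical, not the data from the original post)
y <- c("a", "b", "b", "c", "d", "d", "d", "e")

# duplicated() flags every occurrence after the first one
later_copies <- y[duplicated(y)]

# Scanning from both ends flags every member of a repeated group
all_copies <- y[duplicated(y) | duplicated(y, fromLast = TRUE)]

print(later_copies)  # "b" "d" "d"
print(all_copies)    # "b" "b" "d" "d" "d"
```

Because the result is built by logical indexing, the original order of `y` is preserved and no re-sorting is needed.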



Re: [R] How is a file descriptor stored ?

2013-08-08 Thread mohan . radhakrishnan
Hi,

The file handling code sometimes throws this exception.


Error in UseMethod("close") :
  no applicable method for 'close' applied to an object of class
  "c('integer', 'numeric')"

Is there a sample based on my code that I can test ? I want to extract the
file descriptors from the hashmap and close them. I think that is causing
the exception. Sometimes just closing - close(fd) - is causing this too.

Thanks,
Mohan



 
 
 
RE: [R] How is a file descriptor stored ?
From: William Dunlap
To: Berend Hasselman, mohan.radhakrish...@polarisft.com
Date: 07-08-2013 08:01 PM
Cc: r-help@r-project.org


> Use
>
> assign(key, file( key, "w" ), envir=cpufile)
>
> In your assign expression you are assigning cpufile to the third formal
> argument which is pos.
> You meant the envir argument, I presume.

Or use the syntax
cpufile[[key]] <- file(key, "w")
instead of
assign(key, file( key, "w" ), envir=cpufile)
The former works for lists and environments and corresponds to your
later usage of
 listoffiles[[key]]
to retrieve the data.

From what I've seen of your example, a list might be a better way to
store your data, because of its copy-on-write semantics and because
it doesn't keep a parent environment in memory.  By using
'[[' instead of 'get' and 'assign' you minimize the number of changes
required to switch between a list and an environment.
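
As a rough illustration of the `[[` approach described above, here is a minimal sketch that keeps open connections in a plain list keyed by file name and closes them again. The file names and the temporary directory are assumptions made for the example, not taken from the original code.

```r
# Hypothetical file names in a temporary directory
paths <- file.path(tempdir(), paste0("output", 1:3, ".txt"))

# Store one open connection per file, keyed by path, in a plain list
conns <- list()
for (p in paths) {
  conns[[p]] <- file(p, "w")
}

# Retrieve each connection with the same '[[' syntax, write to it, close it
for (p in names(conns)) {
  writeLines(paste("log for", basename(p)), conns[[p]])
  close(conns[[p]])
}

# The same '[[' assignments also work if 'conns' is an environment:
# conns <- new.env(parent = emptyenv())
```

Only the line that creates `conns` would need to change to switch containers, which is the point made above about minimizing changes.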

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf
> Of Berend Hasselman
> Sent: Wednesday, August 07, 2013 4:10 AM
> To: mohan.radhakrish...@polarisft.com
> Cc: r-help@r-project.org
> Subject: Re: [R] How is a file descriptor stored ?
>
> On 07-08-2013, at 12:13, mohan.radhakrish...@polarisft.com wrote:
>
>> Hi,
>> I thought that 'R' like java will allow me to store file names
>> (keys) and file descriptors(values) in a hashmap.
>>
>> filelist.array <- function(n){
>>   sink("nmon.log")
>>   cpufile <- new.env(hash=T, parent=emptyenv())
>>   for (i in 1:n) {
>>     key <- paste("output", i, ".txt", sep = "")
>>     assign(key, file( key, "w" ), cpufile)
>>   }
>>   sink()
>>   return (cpufile)
>> }
>>
>> But when I try to test it like this there is an exception
>>
>> [1] "Exception is  Error in UseMethod(\"close\"): no applicable method for
>> 'close' applied to an object of class \"c('integer', 'numeric')\"\n"
>>
>> test.simple.filelist.array <- function() {
>>   execution <- tryCatch({
>>     sink("nmon.log")
>>     listoffiles <- filelist.array(3)
>>     for (v in ls(listoffiles)) {
>>       print(paste("Map value is [", listoffiles[[v]], "]"))
>>       fd <- listoffiles[[v]]
>>       close(fd)
>>     }
>>     sink()
>>   }, error = function(err){
>>     print(paste("Exception is ",err))
>>   })
>> }
>>
>> I think I am missing some fundamentals.
>
> Read the help page for assign more carefully.
> Use
>
> assign(key, file( key, "w" ), envir=cpufile)
>
> In your assign expression you are assigning cpufile to the third formal
> argument which is pos.
> You meant the envir argument, I presume.
>
> Berend


Re: [R] Running time complexity of Seasonal ARIMA model (forecast package)

2013-08-08 Thread Prof Brian Ripley

On 08/08/2013 05:08, Mohit Dhingra wrote:

> *Dear All,*
>
> I am using Seasonal ARIMA model for predicting cloud workloads. I want to
> know the running time complexity of building model by the algorithm
> implemented in R (I am not sure, is it Yule-Walker?). I want to know if it

It is not Yule-Walker (which is for AR models only).

> is polynomial O(n^2) etc. or exponential or linear (O(n)).  Can someone
> please help.

What is 'n' here?  Please read the references for yourself: they will 
tell you enough to deduce the answer -- or you could experiment.

> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


PLEASE do.


--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[R] diallel analysis question

2013-08-08 Thread waqas shafqat
Please solve this question and send me the commands.

this is a question for diallel analysis..


thanks


Re: [R] laf_open_fwf

2013-08-08 Thread christian.kamenik
Dear Jan

Many thanks for your help. In fact, all lines are shorter than my column 
width...

my.column.widths:   238
range(nchar(lines)):235 237

So, it seems I have an inconsistent file structure...
I guess there is no way to handle this in an automated way?

Best regards

Christian Kamenik
Project Manager

Federal Department of the Environment, Transport, Energy and Communications 
DETEC  
Federal Roads Office FEDRO
Division Road Traffic
Road Accident Statistics

Mailing Address: 3003 Bern
Location: Weltpoststrasse 5, 3015 Bern

Tel +41 31 323 14 89 
Fax +41 31 323 43 21

christian.kame...@astra.admin.ch
www.astra.admin.ch
-----Original Message-----
From: Jan van der Laan [mailto:rh...@eoos.dds.nl] 
Sent: Wednesday, 7 August 2013 20:57
To: r-help@r-project.org
Cc: Kamenik Christian ASTRA
Subject: Re: [R] laf_open_fwf

Dear Christian,

Well... it shouldn't normally do that. The only way I can currently think of 
that might cause this problem is that the file has \r\n\r\n, which would mean 
that every line is followed by an empty line.

Another cause might be (although I would not really expect the results you see) 
that the sum of your column widths is larger than the actual width of the line.

You can check your line lengths using:

lines <- readLines(my.filename)
nchar(lines)

Each line should have the same length and be equal to (or at least larger than) 
sum(my.column.widths)
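
The length check described above, together with one possible repair for lines that come up short, could look roughly like the sketch below. The record width, the sample file and its contents are all made up for illustration, and right-padding is only appropriate under the assumption that the short lines merely lack trailing blanks.

```r
# Hypothetical record width and sample file; replace with the real values
record_width <- 10
tmp <- tempfile(fileext = ".txt")
writeLines(c("0123456789", "0123456", "01234567"), tmp)

lines <- readLines(tmp)
print(table(nchar(lines)))   # how many lines have each length

# Assumption: short lines only lack trailing blanks, so right-padding is safe
padded <- formatC(lines, width = record_width, flag = "-")
writeLines(padded, tmp)
```

After padding, every record has the same width, so the rewritten file can be tried again with the fixed-width reader.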

If this is not the problem: would it be possible that you send me a small part 
of your file so that I could try to reproduce the problem? Or if you cannot 
share your data: replace the actual values with nonsense values.

Regards,
Jan

PS I read your mail by chance as I am not a regular r-help reader. When you 
have specific LaF problems it is better to also cc me directly.

On 08/06/2013 12:35 PM, christian.kame...@astra.admin.ch wrote:
 Dear all

 I was trying the (fairly new) LaF package, and came across the following 
 problem:

 I opened a connection to a fixed width ASCII file using 
 laf_open_fwf(my.filename, my.column_types, my.column_widths, 
 my.column_names)

 When looking at the data, it turned out that \n (newline) and \r (carriage 
 return) were considered as characters, thus destroying the structure in my 
 data (the second column does not include any numbers):

 my.data[1565:1575,1:3]

 MF_FARZ1  Fahrzeugarttext MF_MARKE
 1 \n043 Landwirt. Traktor2140
 2 \n043 Landwirt. Traktor6206
 3 \n001 Personenwagen2026
 4 \n001 Personenwagen2026
 5\r\n00 1Personenwagen404
 6\r\n02 0Gesellschaftswagen   710
 7\r\n00 1Personenwagen505
 8\r\n00 1Personenwagen505
 9\r\n00 1Personenwagen301
 10   \r\n00 1Personenwagen553
 11   \r\n04 3Landwirt. Traktor257

 I am working on Windows 7 32-bit.

 Any help would be highly appreciated.

 Best Regard

 Christian Kamenik
 Project Manager

 Federal Department of the Environment, Transport, Energy and 
 Communications DETEC Federal Roads Office FEDRO Division Road Traffic 
 Road Accident Statistics

 Mailing Address: 3003 Bern
 Location: Weltpoststrasse 5, 3015 Bern

 Tel +41 31 323 14 89
 Fax +41 31 323 43 21

 christian.kame...@astra.admin.ch
 www.astra.admin.ch






[R] Upgrading to R and keeping packages

2013-08-08 Thread Pancho Mulongeni
Hi! I just installed the latest R 3.0.1.
I then wanted to update my packages.
I believe the advice given is to take the library folder from the old R version 
and copy it on top of (overwrite) the library folder of the new R version, in 
my case the library of R 2.15.2 to library of R 3.0.1.

When I did this, the next time I started R 3.0.1, I had an error message
Error in .Call("R_isMethodsDispatchOn", onOff, PACKAGE = "base") :
  "R_isMethodsDispatchOn" not available for .Call() for package "base"

also the update.packages() function was 'not available'.
Where did I go wrong?
I think I will just update the packages I want rather than the whole folder and 
see if this works,

Pancho Mulongeni
Research Assistant
PharmAccess Foundation
1 Fouché Street
Windhoek West
Windhoek
Namibia
 
Tel:   +264 61 419 000
Fax:  +264 61 419 001/2
Mob: +264 81 4456 286



[R] Problem with na.action within own function

2013-08-08 Thread ivan
Dear R Community,

I am trying to build a very simple function which uses lm and coeftest to
return a coefficient matrix with heteroskedasticity robust standard errors.
The  function is the following:

library(lmtest)    # provides coeftest()
library(sandwich)  # provides vcovHC()

reg=function(formula,data,na.action){
  res=lm(formula=formula,data=data,na.action=na.action)
  hc3=coeftest(res, vcov = vcovHC(res, type = "HC3"))
  residuals=resid(res)
  return(list(coef.hc3=hc3,R2=summary(res)$r.squared,
              R2.adj=summary(res)$adj.r.squared, residuals=residuals))
}

The function works perfectly as long as the data contains no missing values.
I.e.

test1=seq(1,30,1)
test2=seq(1,30,1)
testdata=data.frame(test1,test2)
reg(formula=test1~test2, data=testdata, na.action=na.exclude)

However, as soon as I have a missing value, it does not work any more (the
error message is: Error in estfun(x)/X : non-conformable arrays):

test1=seq(1,30,1)
test2=seq(1,30,1)
test2[5]=NA
testdata=data.frame(test1,test2)
reg(formula=test1~test2, data=testdata, na.action=na.exclude)

My feeling is that it has something to do with na.exclude being a function
itself and hence R having a problem with passing a function into another
function. Does anyone have an idea?

Thanks a lot!!!
Regards



Re: [R] Extracting only multiple occurrences

2013-08-08 Thread Jim Lemon

On 08/08/2013 04:23 PM, Kevin Parent wrote:

> Well that almost works, and I didn't know about duplicated() so thanks for
> that. However, it only gives me the duplicated values. I need the original ones
> too. So the result I want is: [g,g,m,m,s,s,t,t,u,u,u,v,v,x,x,y,y,y]. What
> duplicated() gives me is [g,m,s,t,u,u,v,x,y,y]



Hi Kevin,
How about:

x[x %in% duplicated(x)]

Jim



Re: [R] Upgrading to R and keeping packages

2013-08-08 Thread Prof Brian Ripley

On 08/08/2013 09:07, Pancho Mulongeni wrote:

> Hi! I just installed the latest R 3.0.1.
> I then wanted to update my packages.
> I believe the advice given is to take the library folder from the old R version
> and copy it on top of (overwrite) the library folder of the new R version, in
> my case the library of R 2.15.2 to library of R 3.0.1.
>
> When I did this, the next time I started R 3.0.1, I had an error message
> Error in .Call("R_isMethodsDispatchOn", onOff, PACKAGE = "base") :
>   "R_isMethodsDispatchOn" not available for .Call() for package "base"
>
> also the update.packages() function was 'not available'.
> Where did I go wrong?


Copy over the base packages.

This was never the advice.  See e.g. 
http://cran.r-project.org/bin/windows/base/rw-FAQ.html#What_0027s-the-best-way-to-upgrade_003f
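
The FAQ entry linked above is the authoritative description. Purely as an illustration of how to avoid overwriting the base packages, the sketch below picks out only the contributed packages from an old library so they can be reinstalled under the new version. The helper name `user_packages` and the library path are hypothetical.

```r
# Given a matrix shaped like the result of installed.packages(), return the
# names of packages that are neither "base" nor "recommended".
user_packages <- function(ip) {
  rownames(ip)[is.na(ip[, "Priority"])]
}

# Intended use (path is hypothetical, not run here):
# old_lib <- "C:/Program Files/R/R-2.15.2/library"
# pkgs <- user_packages(installed.packages(lib.loc = old_lib))
# install.packages(pkgs)   # fresh builds for the new R version
# update.packages(checkBuilt = TRUE, ask = FALSE)
```

Base and recommended packages ship with each R release, so leaving them out means the new installation keeps its own matching copies.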







--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Extracting only multiple occurrences

2013-08-08 Thread Berend Hasselman

On 08-08-2013, at 10:27, Jim Lemon j...@bitwrit.com.au wrote:

> On 08/08/2013 04:23 PM, Kevin Parent wrote:
>> Well that almost works, and I didn't know about duplicated() so thanks for
>> that. However, it only gives me the duplicated values. I need the original
>> ones too. So the result I want is: [g,g,m,m,s,s,t,t,u,u,u,v,v,x,x,y,y,y].
>> What duplicated() gives me is [g,m,s,t,u,u,v,x,y,y]
>
> Hi Kevin,
> How about:
>
> x[x %in% duplicated(x)]

Don't you mean this

x[x %in% x[duplicated(x)]]   

Berend
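
Why the correction matters can be checked on a toy vector. This is a minimal sketch; `y` is made-up data, not the vector from the original post.

```r
y <- c("a", "b", "b", "c", "d", "d", "d", "e")  # hypothetical toy data

# duplicated(y) is a logical vector, so no letter is ever "in" it
wrong <- y[y %in% duplicated(y)]

# Matching against the duplicated *values* keeps every copy
right <- y[y %in% y[duplicated(y)]]

print(wrong)  # character(0)
print(right)  # "b" "b" "d" "d" "d"
```

In the first form, `%in%` ends up comparing letters with the strings "TRUE" and "FALSE", which is why nothing is selected.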



[R] Numercial evaluation of intgral with different bounds

2013-08-08 Thread Carlos Nasher
Hello R helpers,

I'm struggling with how to apply the integrate function to a data frame. Here is
an example of what I'm trying to do:

# Create data frame
x <- 0:4
tx <- 10:14
T <- 12:16
data <- data.frame(x=x, tx=tx, T=T)

# Parameter
alpha <- 10
beta <- 11

# Integral
integrand <- function(y){
  (y+alpha)^-(r+data$x)*(y+beta^-(s+1))
}

Now I want to apply the integrate function to evaluate the integral for
each line of the data frame with tx as the lower and T as the upper bound.
The respective values (and the values only) should be returned in a vector.

I want to avoid the use of a loop since the integral is part of a function
I want to optimize with optim and so speed is crucial. I tried to do this
by something like:

integral <- lapply(data$tx, integrate, f=integrand, upper=data$T)
integral2 <- sapply(integral, function(x){x[1]})
integral3 <- unlist(integral2, use.names=FALSE)

But this doesn't work properly. I'd be glad if you have any hints how to get
this done.


Many thanks and best regards,
Carlos
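
One possible way to get one integral per row is sketched below. It is not from the thread: the values of `r` and `s` are hypothetical because the post does not give them, and the integrand is rewritten to take the row's `x` as an argument instead of reading the whole `data$x` column. A likely reason the `lapply` attempt fails is that it hands the entire `data$T` vector to every call as the upper bound.

```r
dat <- data.frame(x = 0:4, tx = 10:14, T = 12:16)
alpha <- 10
beta  <- 11
r <- 1   # hypothetical value; not given in the post
s <- 1   # hypothetical value; not given in the post

# Integrand as posted, but taking the row's x as an explicit argument
integrand <- function(y, x) {
  (y + alpha)^-(r + x) * (y + beta^-(s + 1))
}

# One integrate() call per row; extra arguments are passed on to the integrand
values <- mapply(
  function(lower, upper, x) integrate(integrand, lower, upper, x = x)$value,
  dat$tx, dat$T, dat$x
)
print(values)
```

`mapply` is still an implicit loop, so this addresses correctness rather than speed; each row needs its own call to `integrate` because the bounds differ.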


-- 
-
Carlos Nasher
Buchenstr. 12
22299 Hamburg

tel:+49 (0)40 67952962
mobil:+49 (0)175 9386725
mail:  carlos.nas...@gmail.com



Re: [R] Extracting only multiple occurrences

2013-08-08 Thread Jim Lemon

On 08/08/2013 06:52 PM, Berend Hasselman wrote:


> On 08-08-2013, at 10:27, Jim Lemon j...@bitwrit.com.au wrote:
>
>> On 08/08/2013 04:23 PM, Kevin Parent wrote:
>>> Well that almost works, and I didn't know about duplicated() so thanks for
>>> that. However, it only gives me the duplicated values. I need the original ones
>>> too. So the result I want is: [g,g,m,m,s,s,t,t,u,u,u,v,v,x,x,y,y,y]. What
>>> duplicated() gives me is [g,m,s,t,u,u,v,x,y,y]
>>
>> Hi Kevin,
>> How about:
>>
>> x[x %in% duplicated(x)]
>
> Don't you mean this
>
> x[x %in% x[duplicated(x)]]
>
> Berend


Ah, yes, thanks.

Jim



Re: [R] Extracting only multiple occurrences

2013-08-08 Thread Rolf Turner

On 08/08/13 20:27, Jim Lemon wrote:

> On 08/08/2013 04:23 PM, Kevin Parent wrote:
>> Well that almost works, and I didn't know about duplicated() so
>> thanks for that. However, it only gives me the duplicated values. I
>> need the original ones too. So the result I want is:
>> [g,g,m,m,s,s,t,t,u,u,u,v,v,x,x,y,y,y]. What duplicated() gives me is
>> [g,m,s,t,u,u,v,x,y,y]
>
> Hi Kevin,
> How about:
>
> x[x %in% duplicated(x)]


Uh, I think you mean

x[x %in% x[duplicated(x)]]

Another idear:

tx <- table(x)
tx <- tx[tx > 1]
rep(names(tx), tx)

Well, that's three lines as opposed to one, so not as good.  But it 
perhaps demonstrates a useful tool to add to one's kit.

cheers,

Rolf



[R] gnls() and L-BFGS-B in library(nlme)

2013-08-08 Thread Philipp Grueber
Dear R Users,

I attempt to estimate a generalized nonlinear least squares model using
gnls() from the nlme library.

I wish to restrict some of my parameters using the L-BFGS-B method for the
optim optimizer. However, in contrast to nls() from the stats package, gnls()
does not accept any `lower' or `upper' argument, nor does the control
argument allow for such elements. Unfortunately, the help file does not
provide a description of how to include lower or upper bounds, nor did I
find any hint on the internet.

Example
library(nlme)
#works but no restriction included
fm1 <- gnls(weight ~ SSlogis(Time, Asym, xmid, scal), Soybean,
            start = c(Asym = 17, xmid = 50, scal = 7),
            weights = varPower(),
            control = gnlsControl(opt = "optim", optimMethod = "L-BFGS-B"))
summary(fm1)

#does not work
fm2 <- gnls(weight ~ SSlogis(Time, Asym, xmid, scal), Soybean,
            start = c(Asym = 17, xmid = 50, scal = 7),
            weights = varPower(),
            control = gnlsControl(opt = "optim", optimMethod = "L-BFGS-B"),
            lower = c(0, 0, 0), upper = c(100, 100, 100))
summary(fm2)

fm3 <- gnls(weight ~ SSlogis(Time, Asym, xmid, scal), Soybean,
            start = c(Asym = 17, xmid = 50, scal = 7),
            lower = c(0, 0, 0), upper = c(100, 100, 100),
            weights = varPower(),
            control = gnlsControl(opt = "optim", optimMethod = "L-BFGS-B"))
summary(fm3)

Do you know a way to include lower and upper bounds? And how do I restrict
only selected parameters while allowing the others to take any value?

Any hints pointing to a solution are highly appreciated. 

Regards,
Philipp Grueber 



-

EBS Universitaet fuer Wirtschaft und Recht
FARE Department
Wiesbaden/ Germany
http://www.ebs.edu/index.php?id=finacc&L=0
--
View this message in context: 
http://r.789695.n4.nabble.com/gnls-and-L-BFGS-B-in-library-nlme-tp4673341.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Running time complexity of Seasonal ARIMA model (forecast package)

2013-08-08 Thread Mohit Dhingra
*Dear Sir,*

Thanks for your response. Here, I was using 'n' to denote the input size
(no. of points in time series using which I am building a Seasonal ARIMA
model). I can check the running time myself and I have done that as well
(it takes some 1-2 minutes for 50 iterations for my input size), but I want
to know more about the asymptotic complexity of the algorithm R uses.  I
can see three methods CSS, CSS-ML and ML that it uses to optimize the
parameters.

Like bubble sort takes O(n^2) time where n is the no. of elements. Can I
define something like this to build my ARIMA model which has n points?
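
An empirical look at how fitting time grows with the series length could be set up along the following lines. This is only a sketch under stated assumptions: it uses base R's `arima()` rather than the forecast package, simulated data, an arbitrary ARIMA(1,0,0)(1,0,0)[12] specification, and made-up series lengths. Timings from a few runs say nothing definitive about asymptotic complexity.

```r
set.seed(1)
sizes <- c(200, 400, 800)   # hypothetical series lengths

elapsed <- sapply(sizes, function(n) {
  # Simulated AR(1) series treated as monthly data, used only as a stand-in
  sim <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))
  y <- ts(sim, frequency = 12)
  # Seasonal ARIMA(1,0,0)(1,0,0)[12] fitted by exact maximum likelihood
  system.time(
    arima(y, order = c(1, 0, 0),
          seasonal = list(order = c(1, 0, 0)), method = "ML")
  )[["elapsed"]]
})

print(data.frame(n = sizes, seconds = elapsed))
```

Repeating the fit with the other estimation methods mentioned above ("CSS" and "CSS-ML") and with more and larger values of `n` would give a rough picture of how each scales.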

Thanks & Regards
Mohit Dhingra
+919611190435




Re: [R] laf_open_fwf

2013-08-08 Thread Jan van der Laan


Without example data it is difficult to give suggestions on how you  
might read this file.


Are you sure your file is fixed width? Sometimes columns are neatly  
aligned using whitespace (tabs/spaces). In that case you could use  
read.table with the default settings.


Another possibility might be that the file is encoded in utf8. I  
expect that reading it in assuming another encoding (such as latin1)  
would lead to varying line sizes. Although I would expect the lengths  
to be larger than the sum of your column widths (as one symbol can be  
larger than one byte).


Jan



christian.kame...@astra.admin.ch schreef:


Dear Jan

Many thanks for your help. In fact, all lines are shorter than my  
column width...


my.column.widths:   238
range(nchar(lines)):235 237

So, it seems I have an inconsistent file structure...
I guess there is no way to handle this in an automated way?

Best Regard

Christian Kamenik
Project Manager

Federal Department of the Environment, Transport, Energy and  
Communications DETEC 

Federal Roads Office FEDRO
Division Road Traffic
Road Accident Statistics

Mailing Address: 3003 Bern
Location: Weltpoststrasse 5, 3015 Bern

Tel +41 31 323 14 89
Fax +41 31 323 43 21

christian.kame...@astra.admin.ch
www.astra.admin.ch
-Ursprüngliche Nachricht-
Von: Jan van der Laan [mailto:rh...@eoos.dds.nl]
Gesendet: Mittwoch, 7. August 2013 20:57
An: r-help@r-project.org
Cc: Kamenik Christian ASTRA
Betreff: Re: [R] laf_open_fwf

Dear Christian,

Well... it shouldn't normally do that. The only way I can currently  
think of that might cause this problem is that the file has  
\r\n\r\n, which would mean that every line is followed by an empty  
line.


Another cause might be (although I would not really expect the  
results you see) that the sum of your column widths is larger than  
the actual with of the line.


You can check your line lengths using:

lines - readLines(my.filename)
nchar(lines)

Each line should have the same length and be equal to (or at least  
larger than) sum(my.column.widths)


If this is not the problem: would it be possible that you send me a  
small part of your file so that I could try to reproduce the  
problem? Or if you cannot share your data: replace the actual values  
with nonsense values.


Regards,
Jan

PS I read your mail by chance as I am not a regular r-help reader.  
When you have specific LaF problems it is better to also cc me  
directly.


On 08/06/2013 12:35 PM, christian.kame...@astra.admin.ch wrote:

Dear all

I was trying the (fairly new) LaF package, and came across the  
following problem:


I opened a connection to a fixed width ASCII file using
laf_open_fwf(my.filename, my.column_types, my.column_widths,
my.column_names)

When looking at the data, it turned out that \n (newline) and \r  
(carriage return) were considered as characters, thus destroying  
the structure in my data (the second column does not include any  
numbers):



my.data[1565:1575,1:3]


   MF_FARZ1 Fahrzeugarttext     MF_MARKE
1  \n043    Landwirt. Traktor   2140
2  \n043    Landwirt. Traktor   6206
3  \n001    Personenwagen       2026
4  \n001    Personenwagen       2026
5  \r\n00   1Personenwagen      404
6  \r\n02   0Gesellschaftswagen 710
7  \r\n00   1Personenwagen      505
8  \r\n00   1Personenwagen      505
9  \r\n00   1Personenwagen      301
10 \r\n00   1Personenwagen      553
11 \r\n04   3Landwirt. Traktor  257

I am working on Windows 7 32-bit.

Any help would be highly appreciated.

Best Regard

Christian Kamenik
Project Manager

Federal Department of the Environment, Transport, Energy and
Communications DETEC Federal Roads Office FEDRO Division Road Traffic
Road Accident Statistics

Mailing Address: 3003 Bern
Location: Weltpoststrasse 5, 3015 Bern

Tel +41 31 323 14 89
Fax +41 31 323 43 21

christian.kame...@astra.admin.ch
www.astra.admin.ch




[R] Lack of catalog number referencing for statistical objects

2013-08-08 Thread Jose Iparraguirre
Hi,

In "Skew-t fits to mortality data - can a Gaussian-related distribution replace 
the Gompertz-Makeham as the basis for mortality studies?" (J Gerontol A Biol 
Sci Med Sci 2013 August;68(8):903-913; doi:10.1093/Gerona/gls239), Clark et al. 
compare the fit of several distributions to mortality data (using the bbmle 
package, among others).

This is not the place to comment on the paper, of course. Just wanted to share 
with you the following quote (p. 906) -almost a throwaway remark given the 
context of the paper- which may be of interest to the statisticians in the 
forum:

"[T]here was a lack of catalogue number referencing for statistical objects. 
Attempts at cataloguing are under way (eg the Digital Library of Mathematical 
Functions), but the present lack of number referencing could be contrasted 
with, for example, that for genes, genomes, single-nucleotide polymorphisms or 
proteins. This has, seemingly, led in some cases to poor characterisation: for 
example... the AIC is calculated differently in packages bbmle and gamlss." 
They then provide one suggestion for cataloguing.

An obvious question to the forum is how come the AIC is calculated differently 
depending on the package, but I wanted to highlight this need for cataloguing 
statistical objects akin to genes, proteins, etc.
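
As a small illustration of how two "AIC" values for the same fit can differ, here is a sketch using only base R and the built-in `cars` data. It says nothing about what bbmle or gamlss do internally; it only shows that `AIC()` and `extractAIC()` in the stats package already disagree for a linear model, because the latter drops additive constants and does not count the residual variance as a parameter.

```r
fit <- lm(dist ~ speed, data = cars)

# Textbook definition: -2 * log-likelihood + 2 * number of parameters
ll <- logLik(fit)
k  <- attr(ll, "df")                  # intercept, slope, residual variance
aic_manual <- -2 * as.numeric(ll) + 2 * k
aic_stats  <- AIC(fit)                # agrees with the textbook definition

# extractAIC() uses n * log(RSS / n) + 2 * edf for lm fits
n   <- nrow(cars)
rss <- sum(resid(fit)^2)
aic_extract_manual <- n * log(rss / n) + 2 * 2
aic_extract        <- extractAIC(fit)[2]

# The two conventions differ by a constant for a given data set
aic_gap <- aic_stats - aic_extract
```

Because the gap is constant for a fixed data set, model rankings within one convention are unaffected; trouble only arises when values from different conventions are compared with each other.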

Kind regards,

José

Prof. José Iparraguirre
Chief Economist
Age UK



Re: [R] Problem with na.action within own function

2013-08-08 Thread ivan
I found the solution.

coeftest does not work with na.exclude but with na.omit only, i.e. one
needs to omit missing values from the residual matrix.
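
A base-R sketch of what differs between the two settings, using the test data from the original question. It does not call coeftest or vcovHC; it only shows that with na.exclude the residual vector is padded back to the full data length while the model matrix is not, which is a plausible (not confirmed) source of the "non-conformable arrays" error.

```r
test1 <- seq(1, 30, 1)
test2 <- seq(1, 30, 1)
test2[5] <- NA
testdata <- data.frame(test1, test2)

fit_omit    <- lm(test1 ~ test2, data = testdata, na.action = na.omit)
fit_exclude <- lm(test1 ~ test2, data = testdata, na.action = na.exclude)

# na.omit: residuals only for the rows actually used in the fit
n_resid_omit <- length(resid(fit_omit))

# na.exclude: residuals padded with NA to the original number of rows
n_resid_exclude <- length(resid(fit_exclude))

# The design matrix has one row per complete case in both fits
n_design_exclude <- nrow(model.matrix(fit_exclude))
```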

Cheers!


On Thu, Aug 8, 2013 at 10:19 AM, ivan i.pet...@gmail.com wrote:

 Dear R Community,

 I am trying to build a very simple function which uses lm and coeftest to
 return a coefficient matrix with heteroskedasticity robust standard errors.
 The  function is the following:

 # needs library(lmtest) for coeftest() and library(sandwich) for vcovHC()
 reg=function(formula,data,na.action){
   res=lm(formula=formula,data=data,na.action=na.action)
   hc3=coeftest(res, vcov = vcovHC(res, type = "HC3"))
   residuals=resid(res)
   return(list(coef.hc3=hc3,R2=summary(res)$r.squared,
 R2.adj=summary(res)$adj.r.squared, residuals=residuals))
 }

 The function works perfectly as long as the data contains no missing values.
 I.e.

 test1=seq(1,30,1)
 test2=seq(1,30,1)
 testdata=data.frame(test1,test2)
 reg(formula=test1~test2, data=testdata, na.action=na.exclude)

 However, as soon as I have a missing value, it does not work any more (the
 error message is: Error in estfun(x)/X : non-conformable arrays):

 test1=seq(1,30,1)
 test2=seq(1,30,1)
 test2[5]=NA
 testdata=data.frame(test1,test2)
 reg(formula=test1~test2, data=testdata, na.action=na.exclude)

 My feeling is that it has something to do with na.exclude being a function
 itself and hence R having a problem with passing a function into another
 function. Does anyone have an idea?

 Thanks a lot!!!
 Regards






Re: [R] p values for partial correlations

2013-08-08 Thread Demetrio Luis Guadagnin
Dear Tal:
Thank you for your help.
That's what I ran:

install.packages("corpcor")

require(corpcor)

correlations=cor(mydata)

pcorrrel = cor2pcor(correlations); pcorrrel
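
One way to attach p values to such a matrix, sketched in base R only. The assumptions are loud here: the data are simulated, the partial correlations are computed from the inverse correlation matrix rather than with cor2pcor, and the test is the usual t test for a partial correlation with n - 2 - (p - 2) degrees of freedom, which assumes roughly normal data and no shrinkage.

```r
set.seed(1)
n <- 50
z <- rnorm(n)
mydata <- data.frame(x = z + rnorm(n), y = z + rnorm(n), z = z)  # simulated

# Partial correlations: negated, rescaled inverse of the correlation matrix
prec <- solve(cor(mydata))
pcor <- -cov2cor(prec)
diag(pcor) <- 1

# Each pair is conditioned on the remaining p - 2 variables
p     <- ncol(mydata)
df    <- n - 2 - (p - 2)
tstat <- pcor * sqrt(df / (1 - pcor^2))
pval  <- 2 * pt(-abs(tstat), df)
diag(pval) <- NA   # the diagonal carries no test

# Cross-check: same p value as the x coefficient in a regression of y on x and z
p_lm <- summary(lm(y ~ x + z, data = mydata))$coefficients["x", "Pr(>|t|)"]
```

The cross-check against `lm()` is a convenient sanity test: the p value of a partial correlation equals that of the matching regression coefficient when all other variables are included as covariates.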






2013/8/7 Tal Galili tal.gal...@gmail.com

 A short self contained code would help us help you.

 You can try using str on the output of the command you are using, and
 try to understand where the p.value is located.



 Tal


 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
 www.r-statistics.com (English)

 --



 On Wed, Aug 7, 2013 at 10:01 PM, Demetrio Luis Guadagnin 
 dlguadag...@gmail.com wrote:

 Dear:
 I needed to calculate partial correlations and used the package corpcor
 for that purpose.
 The output does not provide p values, and I was unable to find information
 or posts on how to get them.
 Can someone help me?
 Thanks.

 --
 Dr. Demetrio Luis Guadagnin
 Conservação e Manejo de Vida Silvestre
 Universidade Federal do Rio Grande do Sul
 Departamento de Ecologia
 Av. Bento Gonçalves 9500
 Setor 4, Prédio 43422, Sala 105
 Caixa Postal 15007 - 91501-970 Porto Alegre RS
 Fone: (51) 3308 6774
 Fax: (51) 3308 7626
 dlguadag...@gmail.com
 Skype: demetriolguadagnin






-- 
Dr. Demetrio Luis Guadagnin
Conservação e Manejo de Vida Silvestre
Universidade Federal do Rio Grande do Sul
Departamento de Ecologia
Av. Bento Gonçalves 9500
Setor 4, Prédio 43422, Sala 105
Caixa Postal 15007 - 91501-970 Porto Alegre RS
Fone: (51) 3308 6774
Fax: (51) 3308 7626
dlguadag...@gmail.com
Skype: demetriolguadagnin



[R] [R-pkgs] New package about migration indices

2013-08-08 Thread Gergely Daróczi
Dear useRs,

My colleague (cc'd) and I have recently released a new package on CRAN for
computing various migration indices, such as the Crude Migration Rate, the
Effectiveness and Connectivity Index, different Gini indices, and the
Coefficient of Variation.

I hope that some of you dealing with migration matrices in the R console
would find this small package useful:
http://cran.r-project.org/web/packages/migration.indices/

Best,
Gergely


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] Add column to dataframe based on code in other column

2013-08-08 Thread Dark
Hi all,

I have a dataframe of users which contain US-state codes. 
Now I want to add a column named REGION based on the state code. I have
already done a mapping:

NorthEast <- c(07, 20, 22, 30, 31, 33, 39, 41, 47)
MidWest <- c(14, 15, 16, 17, 23, 24, 26, 28, 35, 36, 43, 52)
South <- c(01, 04, 08, 09, 10, 11, 18, 19, 21, 25, 34, 37, 42, 44, 45, 49,
51)
West <- c(02, 03, 05, 06, 12, 13, 27, 29, 32, 38, 46, 50, 53)
Other <- c(40, 48, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 94,
98, 99)

So for example:
Name   State_Code
Tom    20
Harry  56
Ben    05
Sally  04

Should become like:
Name   State_Code  REGION
Tom    20          NorthEast
Harry  56          Other
Ben    05          West
Sally  04          South

Could anyone help me with a clever statement?



--
View this message in context: 
http://r.789695.n4.nabble.com/Add-column-to-dataframe-based-on-code-in-other-column-tp4673335.html
Sent from the R help mailing list archive at Nabble.com.



[R] Explaining variance in a univariate time series

2013-08-08 Thread john rre
Dear List,



I am looking to reveal the combination of environmental factors that best
explains the observed variance in a univariate time series of a population.

I have approached this using two methods and obtained different results.
I was therefore hoping that somebody who has done something similar, or has
knowledge of the area, could advise me on the best approach.

My first approach was to use canonical correlations, i.e. to search for
significant correlations between the explanatory variables x1, x2 and x3 and
the time series. This made sense to me and produced perfectly plausible
results (in line with a priori hypotheses). However, this approach doesn't
take into account the other variables present.

To address this, I then used a dynamic factor analysis, which explains temporal
variation in a set of n observed time series using linear combinations of a
set of m hidden random walks, where m < n. I then used an AIC framework to
arrive at the most likely model.

http://cran.r-project.org/web/packages/MARSS/vignettes/UserGuide.pdf

However, the results differed: the variables present in the “best” AIC
model were not necessarily the ones with the strongest canonical
correlation.

Why might this be the case? Is there a better way to go about this?



Thanks



[R] Varying statistical significance in estimates of linear model

2013-08-08 Thread Stathis Kamperis
Hi everyone,

I have a response variable 'y' and several predictor variables 'x_i'.
I start with a linear model:

m1 <- lm(y ~ x1); summary(m1)

and I get a statistically significant estimate for 'x1'. Then, I
modify my model as:

m2 <- lm(y ~ x1 + x2); summary(m2)

At this moment, the estimate for x1 might become non-significant while
the estimate of x2 significant.
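
The behaviour described here is easy to reproduce with simulated data. The sketch below is an assumption-laden toy, not the poster's data: `x1` is built to be correlated with `x2`, and only `x2` actually drives `y`, so `x1` looks important on its own and loses that status once `x2` enters the model.

```r
set.seed(42)
n  <- 200
x2 <- rnorm(n)
x1 <- x2 + rnorm(n)        # x1 is correlated with x2 by construction
y  <- 2 * x2 + rnorm(n)    # only x2 has a direct effect on y

m1 <- lm(y ~ x1)
m2 <- lm(y ~ x1 + x2)

# p values for x1 alone, and for x1 and x2 when both are in the model
p_x1_alone <- summary(m1)$coefficients["x1", "Pr(>|t|)"]
p_x1_joint <- summary(m2)$coefficients["x1", "Pr(>|t|)"]
p_x2_joint <- summary(m2)$coefficients["x2", "Pr(>|t|)"]

# Correlation between the predictors, the source of the instability
r_x1_x2 <- cor(x1, x2)
```

In m1 the coefficient of x1 answers "does y vary with x1?", while in m2 it answers "does y vary with x1 once x2 is held fixed?". These are different questions, so different answers are expected when predictors are correlated.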

As I add more predictor variables (or interaction terms), the
estimates for which I get a statistically significant result vary. So
sometimes x1, x2, x6 are significant, while others, x2, x4, x5 are.

It seems to me that I could tweak my model in such a way (by
adding/removing predictor variables or suitable interaction terms)
that I could prove whatever I'd like to prove.

What is the proper methodology involved here ? What do you people do
in such cases ? I can provide the data if anyone cares and would like
to have a look at them.

Best regards,
Stathis Kamperis



Re: [R] Varying statistical significance in estimates of linear model

2013-08-08 Thread ONKELINX, Thierry
Dear Stathis,

I recommend that you try to get some advice from a local statistician or read 
an introductory book on statistics. This kind of question is beyond the scope 
of a mailing list.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey




Re: [R] Varying statistical significance in estimates of linear model

2013-08-08 Thread Bert Gunter
Stathis:

1. This has nothing to do with R.  Post on a statistics list, like
stats.stackexchange.com

2. Read a basic regression/linear models text. You need to educate yourself.

-- Bert




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Add column to dataframe based on code in other column

2013-08-08 Thread Berend Hasselman

On 08-08-2013, at 11:33, Dark i...@software-solutions.nl wrote:

 Hi all,
 
 I have a dataframe of users which contain US-state codes. 
 Now I want to add a column named REGION based on the state code. I have
 already done a mapping:
 
 NorthEast <- c(07, 20, 22, 30, 31, 33, 39, 41, 47)
 MidWest <- c(14, 15, 16, 17, 23, 24, 26, 28, 35, 36, 43, 52)
 South <- c(01, 04, 08, 09, 10, 11, 18, 19, 21, 25, 34, 37, 42, 44, 45, 49,
 51)
 West <- c(02, 03, 05, 06, 12, 13, 27, 29, 32, 38, 46, 50, 53)
 Other <- c(40, 48, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 94,
 98, 99)
 
 So for example:
 Name   State_Code
 Tom    20
 Harry  56
 Ben    05
 Sally  04
 
 Should become like:
 Name   State_Code  REGION
 Tom    20          NorthEast
 Harry  56          Other
 Ben    05          West
 Sally  04          South
 

dd <- read.table(text="Name State_Code
Tom   20
Harry 56
Ben 05
Sally   04", header=TRUE, stringsAsFactors=FALSE)

# Create table for regions indexed by state_code

region.table <- rep("UNKNOWN",99)
region.table[NorthEast] <- "NorthEast"
region.table[MidWest] <- "MidWest"
region.table[South] <- "South"
region.table[West] <- "West"
region.table[Other] <- "Other"
region.table

# then this is easy

dd[,"REGION"] <- region.table[dd$State_Code]
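
For comparison, a self-contained sketch of the same lookup done with a long-format table and match(). This is a different technique from the indexed vector above, offered only as an illustration: the `regions` list repeats the mapping from the question, and the extra row "Nobody" with code 70 is a hypothetical addition to show what happens to an unmapped code.

```r
regions <- list(
  NorthEast = c(7, 20, 22, 30, 31, 33, 39, 41, 47),
  MidWest   = c(14, 15, 16, 17, 23, 24, 26, 28, 35, 36, 43, 52),
  South     = c(1, 4, 8, 9, 10, 11, 18, 19, 21, 25, 34, 37, 42, 44, 45, 49, 51),
  West      = c(2, 3, 5, 6, 12, 13, 27, 29, 32, 38, 46, 50, 53),
  Other     = c(40, 48, 54:66, 94, 98, 99)
)

# stack() turns the named list into two columns: values (code), ind (region)
lookup <- stack(regions)

dd <- data.frame(Name = c("Tom", "Harry", "Ben", "Sally", "Nobody"),
                 State_Code = c(20, 56, 5, 4, 70),   # 70 is deliberately unmapped
                 stringsAsFactors = FALSE)

# match() finds the row of each code in the lookup; unmapped codes give NA
dd$REGION <- as.character(lookup$ind[match(dd$State_Code, lookup$values)])
dd$REGION[is.na(dd$REGION)] <- "UNKNOWN"
```

The long-format table does not rely on codes being small positive integers, so it would also work if the codes were character strings such as "05".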


Berend



[R] Revolutions blog roundup: July 2013

2013-08-08 Thread David Smith
Revolution Analytics staff write about R every weekday at the Revolutions blog:
 http://blog.revolutionanalytics.com
and every month I post a summary of articles from the previous month
of particular interest to readers of r-help.

In case you missed them, here are some articles related to R from the
month of July:

A new 90-second, creative commons video helps R enthusiasts share the
history, community and applications of R: http://bit.ly/13mf0HX

Analyst group Butler Analytics reviews 10 predictive analytics
platforms, and says that real analysts use R: http://bit.ly/13mf0HY

An excellent example of Simpson's Paradox: US median wages rose
overall, but within every educational subgroup, they declined:
http://bit.ly/13meXMk

Some tips on identifying R functions that will benefit most from byte
compilation, and how to enable automatic package compilation:
http://bit.ly/13mf0HZ

Joe Rickert's poster at useR! 2013 lists the ways that Revolution
Analytics supports the R community: http://bit.ly/13meXMl

Andrie deVries describes the applications of survival analysis
techniques to solve the problem of marketing attribution:
http://bit.ly/13mf0I0

A Shiny app by Ramnath Vaidyanathan displays the real-time status of
bike-sharing programs in more than 100 cities: http://bit.ly/13mf0I1

The new (and free) O'Reilly mini-book on real-time analytics includes
a section on a big-data architecture with R: http://bit.ly/13mf0I2

Thomas Levine's "R spells" includes some useful R tricks you might not
know about: http://bit.ly/13mf0I3

A review of the Rcpp tutorial at useR! 2013, with some benchmarked
examples combining R and C++ code: http://bit.ly/13meXMo

Slides from the Hadoop Summit talk "High Performance Predictive
Analytics in R and Hadoop" by Revolution Analytics' US Chief Scientist
Mario Inchosa: http://bit.ly/13mf0I4

My two-part review of some talks from the useR! 2013 conference: part
1 http://bit.ly/13mf0I5 and part 2 http://bit.ly/13mf0I6

Joe Rickert looks at the new big-data tree algorithm behind the
rxDTree function in the RevoScaleR package: http://bit.ly/13mf3n0

Digital marketing company X+1 uses Revolution R Enterprise for
real-time marketing optimization based on statistical models in R:
http://bit.ly/13mf3mZ

Some highlights from the June 2013 issue of the R Journal: http://bit.ly/13mf0I7

Meet the members of the Revolution Analytics team in London:
http://bit.ly/13mf3n1

A map of R user groups worldwide: http://bit.ly/13mf0Yk

Highlights and photos from some recent R user group meetings around
the world: http://bit.ly/13mf3n2

Some non-R stories in the past month included: what not to do when
analyzing data (http://bit.ly/13mf0Ym), results of a survey of data
scientists (http://bit.ly/13mf0Yn), a red-hot ball of nickel meets a
block of ice (http://bit.ly/13mf0Yl), a lion reunites with his
caretakers (http://bit.ly/13mf0Yo) and one picture that looks like
four (http://bit.ly/13mf3n3).

Meeting times for local R user groups (http://bit.ly/eC5YQe) can be
found on the updated R Community Calendar at: http://bit.ly/bb3naW

If you're looking for more articles about R, you can find summaries
from previous months at http://blog.revolutionanalytics.com/roundups/.
Join the Revolution mailing list at
http://revolutionanalytics.com/newsletter to be alerted to new
articles on a monthly basis.

As always, thanks for the comments and please keep sending suggestions
to me at da...@revolutionanalytics.com . Don't forget you can also
follow the blog using an RSS reader, or by following me on Twitter
(I'm @revodavid).

Cheers,
# David

-- 
David M Smith da...@revolutionanalytics.com
VP of Marketing, Revolution Analytics  http://blog.revolutionanalytics.com
Tel: +1 (650) 646-9523 (Seattle WA, USA)
Twitter: @revodavid
We're hiring! www.revolutionanalytics.com/careers



[R] Reason for difference in singular value decomposition produced by function La.svd (via prcomp)?

2013-08-08 Thread Ulrike Grömping

Dear expeRts,

I have run some simulations under R 2.15.1 on a Mac, and I have rerun a 
sample of them under R 3.0.1 on Windows (and also, for comparison, under 
R 2.14.1 on Windows). For most cases, I get exactly the same results in 
all three runs. However, for those cases that depend on principal 
components computed with prcomp, where the particular choice of the 
orthogonalization is arbitrary because of several identical singular 
values, I get different results between the two Windows versions on the 
one hand and the Mac version on the other hand.


I did not find anything documented about the difference; maybe I didn't 
know where to look. Can someone help me understand the reason?
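
The arbitrariness mentioned above can be made concrete with a toy matrix. The sketch below is an illustration only: the matrix `X` and the rotation angle are made up, and it does not identify which library difference is responsible on the two platforms. It shows that when two singular values are equal, rotating the corresponding singular vectors gives a second, equally valid decomposition.

```r
X <- diag(c(2, 2, 1))          # toy matrix with a repeated singular value
s <- svd(X)

# Rotate the two-dimensional subspace belonging to the repeated value
theta <- pi / 6
R <- diag(3)
R[1:2, 1:2] <- c(cos(theta), sin(theta), -sin(theta), cos(theta))

U2 <- s$u %*% R
V2 <- s$v %*% R

# The rotated factors reproduce X just as well as the original ones
X_rebuilt <- U2 %*% diag(s$d) %*% t(V2)
same_matrix  <- isTRUE(all.equal(X_rebuilt, X))
same_vectors <- isTRUE(all.equal(U2, s$u))
```

Any numerical routine is free to return either set of vectors, so results that depend on individual components within such a subspace can legitimately differ between builds of R.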


Best, Ulrike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Mac Os X 10.6.8, R and edit/vi

2013-08-08 Thread David Epstein
I tried using various versions of the 'edit' command. Here is an account of
how this failed. I hope I have included all relevant information.

I haven't used R for a couple of years. Before restarting with R, I
downloaded the latest version I could find in its binary version, and
installed it without any problems.
Mac OS X Finder command "About R" responds with
R 3.0.1 GUI 1.61 Snow Leopard build (6492)

From inside R
version
 
 platform       x86_64-apple-darwin10.8.0
 arch           x86_64
 os             darwin10.8.0    # (However, my OS is in fact 10.6.8)
 system         x86_64, darwin10.8.0
 status
 major          3
 minor          0.1
 year           2013
 month          05
 day            16
 svn rev        62743
 language       R
 version.string R version 3.0.1 (2013-05-16)
 nickname       Good Sport
 
edit(file='2.9.R')
 Error in file(con, "r") : cannot open the connection
 In addition: Warning message:
 In file(con, "r") : cannot open file '2.9.R': No such file or directory
 
getOption('editor')
[1] "vi"

edit(file='2.9.R',editor='/opt/local/bin/vim')
 Error in file(con, "r") : cannot open the connection
 In addition: Warning message:
 In file(con, "r") : cannot open file '2.9.R': No such file or directory
 
vi(file='try')
 Error in file(con, "r") : cannot open the connection
 In addition: Warning message:
 In file(con, "r") : cannot open file 'try': No such file or directory
 
And here is my interaction with tcsh (my default shell)
H2:~% echo $VISUAL
/opt/local/bin/vim
H2:~% echo $EDITOR
/opt/local/bin/vim
H2:~% which vi
vi:   aliased to /opt/local/bin/vim
H2:~/4Chap2% ls -ld
drwxr-xr-x  11 dbae  dbae  374  8 Aug 10:54 ./


What am I doing wrong? 
Thanks for any help.
David
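
One thing that may be worth ruling out with errors of this kind is whether the relative file name resolves against the directory R is actually working in. The sketch below is a guess at a diagnostic, not a confirmed explanation of the problem above; the temporary directory is a hypothetical stand-in for the poster's ~/4Chap2.

```r
target   <- "2.9.R"
work_dir <- tempfile("chap")          # hypothetical stand-in for ~/4Chap2
dir.create(work_dir)
file.create(file.path(work_dir, target))

# A relative name is looked up in getwd(), which need not be the shell's cwd
current_dir <- getwd()

# After switching to the directory that holds the file, the name resolves
old <- setwd(work_dir)
exists_after <- file.exists(target)
setwd(old)
```

Comparing `getwd()` with the directory shown by the shell, and checking `file.exists()` before calling the editor, narrows the problem down to either the path or the editor setup.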


Re: [R] Problem with dea.boot under R 3.0.1

2013-08-08 Thread greatest.possible.newbie
Dear Vera,
I had a similar problem once and, as far as I can remember, the reason was
some negative inputs or outputs. 

# Check for negative values
x1a.neg <- apply(x1a, 1, function(x) any(x < 0))
y1.neg <- apply(y1, 1, function(x) any(x < 0))
exclude <- x1a.neg | y1.neg

# Exclude rows with negative values from the original matrices
x1a <- x1a[!exclude, , drop = FALSE]
y1 <- y1[!exclude, , drop = FALSE]

# Try again!
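
A tiny worked example of the filtering step, with made-up input and output matrices (three units, two inputs, one output). The matrices and the `_clean` names are hypothetical; the point is only to show which rows survive.

```r
# Rows are units: inputs (1,4), (2,5), (-3,6) and outputs 10, -1, 12
x1a <- matrix(c(1, 2, -3, 4, 5, 6), ncol = 2)
y1  <- matrix(c(10, -1, 12), ncol = 1)

# Flag a unit if any of its inputs or outputs is negative
exclude <- apply(x1a, 1, function(x) any(x < 0)) |
           apply(y1,  1, function(x) any(x < 0))

# drop = FALSE keeps the results as matrices even if one row remains
x1a_clean <- x1a[!exclude, , drop = FALSE]
y1_clean  <- y1[!exclude, , drop = FALSE]
```

Unit 2 is dropped because of its negative output and unit 3 because of its negative input, so only the first unit remains in both matrices.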

Daniel



--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-dea-boot-under-R-3-0-1-tp4669964p4673363.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Add column to dataframe based on code in other column

2013-08-08 Thread Bert Gunter
Dark:

1. In future, please use dput()  to post data to enable us to more
easily read them from your email.

2. As Berend demonstrates, using a more appropriate data structure is
what's required. Here is a slightly shorter, but perhaps trickier
alternative to his solution:

> df  ## Your example data frame
   Name State_Code
1   Tom         20
2 Harry         56
3   Ben          5
4 Sally          4

> l <- list(MidWest=MidWest,South=South,NorthEast=NorthEast,Other=Other,West=West)
> df <- within(df,regions <-
+   rep(names(l),sapply(l,length))[match(State_Code,unlist(l))])
> df
   Name State_Code   regions
1   Tom         20 NorthEast
2 Harry         56     Other
3   Ben          5      West
4 Sally          4     South

3. Need I say that there may be other alternatives that might be better.

Cheers,
Bert





-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] How is a file descriptor stored ?

2013-08-08 Thread William Dunlap
I cannot reproduce your problem.  You will have to
give more details.  (I assume you have already made
the suggested changes to your code - either label
the 3rd argument to your assign call 'envir=' or use
the syntax 'cpufile[[key]] <- value' instead of assign.)

To start debugging this, have your function print the
class of the object before you try to close it.
You can do this by adding a cat() statement or by using
options(error=recover) so you can inspect things after
an error.

Also, get rid of the tryCatch business and the calls to
sink().  The first hides where an error might be coming
from and the latter may be hiding some printed messages
that might help you track down the problem. 

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: mohan.radhakrish...@polarisft.com
 [mailto:mohan.radhakrish...@polarisft.com]
 Sent: Wednesday, August 07, 2013 11:56 PM
 To: r-help@r-project.org
 Cc: Berend Hasselman; William Dunlap
 Subject: RE: [R] How is a file descriptor stored ?
 
 Hi,
 
 The file handling code sometimes throws this exception.
 
 
 Error in UseMethod("close") :
   no applicable method for 'close' applied to an object of class
 "c('integer', 'numeric')"
 
 Is there a sample based on my code that I can test ? I want to extract the
 file descriptors from the hashmap and close them. I think that is causing
 the exception. Sometimes just closing - close(fd) - is causing this too.
 
 Thanks,
 Mohan
 
 
 
 
 
 
 RE: [R] How is a file descriptor stored ?
 William Dunlap
 to: Berend Hasselman, mohan.radhakrish...@polarisft.com
 07-08-2013 08:01 PM
 Cc: r-help@r-project.org
 
  Use
 
  assign(key, file( key, "w" ), envir=cpufile)
 
  In your assign expression you are assigning cpufile to the third formal
 argument which is
  pos.
  You meant the envir argument, I presume.
 
 Or use the syntax
 cpufile[[key]] <- file(key, "w")
 instead of
 assign(key, file( key, "w" ), envir=cpufile)
 The former works for lists and environments and corresponds to your
 later usage of
  listoffiles[[key]]
 to retrieve the data.
 
 From what I've seen of your example, a list might be a better way to
 store your data, because of its copy-on-write semantics and because
 it doesn't keep a parent environment in memory.  By using
 '[[' instead of 'get' and 'assign' you minimize the number of changes
 required to switch between a list and an environment.
 
 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com
 
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf
  Of Berend Hasselman
  Sent: Wednesday, August 07, 2013 4:10 AM
  To: mohan.radhakrish...@polarisft.com
  Cc: r-help@r-project.org
  Subject: Re: [R] How is a file descriptor stored ?
 
 
  On 07-08-2013, at 12:13, mohan.radhakrish...@polarisft.com wrote:
 
  
   Hi,
  I thought that 'R' like java will allow me to store file names
   (keys) and file descriptors(values) in a hashmap.
  
  
   filelist.array <- function(n){
     sink("nmon.log")
     cpufile <- new.env(hash=T, parent=emptyenv())
     for (i in 1:n) {
       key <- paste("output", i, ".txt", sep = "")
       assign(key, file( key, "w" ), cpufile)
     }
     sink()
     return (cpufile)
   }
  
   But when I try to test it like this there is an exception
  
   [1] "Exception is  Error in UseMethod(\"close\"): no applicable method
 for
   'close' applied to an object of class \"c('integer', 'numeric')\"\n"
  
   test.simple.filelist.array <- function() {
  
     execution <- tryCatch({
       sink("nmon.log")
       listoffiles <- filelist.array(3)
       for (v in ls(listoffiles)) {
         print(paste("Map value is [", listoffiles[[v]], "]"))
         fd <- listoffiles[[v]]
         close(fd)
       }
       sink()
     }, error = function(err){
       print(paste("Exception is ",err))
     })
   }
  
   I think I am missing some fundamentals.
  
 
  Read the help page for assign more carefully.
  Use
 
  assign(key, file( key, "w" ), envir=cpufile)
 
  In your assign expression you are assigning cpufile to the third formal
 argument which is
  pos.
  You meant the envir argument, I presume.
 
  Berend
 
 
 
 
 

Re: [R] Add column to dataframe based on code in other column

2013-08-08 Thread arun
dat1 <- read.table(text="
Name    State_Code
Tom  20
Harry    56
Ben    05
Sally  04
",sep="",header=TRUE,stringsAsFactors=FALSE)

dat2 <- do.call(cbind,list(NorthEast,MidWest,South,West,Other))
colnames(dat2) <- c("NorthEast","MidWest","South","West","Other")
dat2 <- as.data.frame(dat2)
library(reshape2)
datM <- melt(dat2)
colnames(datM) <- c("REGION","State_Code")
library(plyr)
join(dat1,datM,type="left",match="first",by="State_Code")[,c(2,1,3)]
#   Name State_Code    REGION
#1   Tom 20 NorthEast
#2 Harry 56 Other
#3   Ben  5  West
#4 Sally  4 South
A.K.
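
For comparison, a base-R sketch of the same lookup (not part of the reply above). It assumes the state codes are stored as character strings so that leading zeros such as "05" survive; the names regions, lookup and users are made up for the example.

```r
# Mapping from the question, with codes as character strings (an assumption).
regions <- list(
  NorthEast = c("07", "20", "22", "30", "31", "33", "39", "41", "47"),
  MidWest   = c("14", "15", "16", "17", "23", "24", "26", "28", "35", "36",
                "43", "52"),
  South     = c("01", "04", "08", "09", "10", "11", "18", "19", "21", "25",
                "34", "37", "42", "44", "45", "49", "51"),
  West      = c("02", "03", "05", "06", "12", "13", "27", "29", "32", "38",
                "46", "50", "53"),
  Other     = c("40", "48", "54", "55", "56", "57", "58", "59", "60", "61",
                "62", "63", "64", "65", "66", "94", "98", "99")
)

# stack() turns the named list into a two-column lookup table:
# 'values' holds the code, 'ind' holds the region name.
lookup <- stack(regions)

users <- data.frame(Name = c("Tom", "Harry", "Ben", "Sally"),
                    State_Code = c("20", "56", "05", "04"),
                    stringsAsFactors = FALSE)

# match() finds the row of each user's code in the lookup table.
users$REGION <- as.character(lookup$ind[match(users$State_Code, lookup$values)])
users
```

Codes that appear in no region would get NA in REGION, which makes gaps in the mapping easy to spot.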



- Original Message -
From: Dark i...@software-solutions.nl
To: r-help@r-project.org
Cc: 
Sent: Thursday, August 8, 2013 5:33 AM
Subject: [R] Add column to dataframe based on code in other column

Hi all,

I have a dataframe of users which contain US-state codes. 
Now I want to add a column named REGION based on the state code. I have
already done a mapping:

NorthEast <- c(07, 20, 22, 30, 31, 33, 39, 41, 47)
MidWest <- c(14, 15, 16, 17, 23, 24, 26, 28, 35, 36, 43, 52)
South <- c(01, 04, 08, 09, 10, 11, 18, 19, 21, 25, 34, 37, 42, 44, 45, 49, 51)
West <- c(02, 03, 05, 06, 12, 13, 27, 29, 32, 38, 46, 50, 53)
Other <- c(40, 48, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 94, 98, 99)

So for example:
Name    State_Code
Tom       20
Harry     56
Ben         05
Sally       04

Should become:
Name    State_Code REGION
Tom       20                   NorthEast
Harry     56                   Other
Ben         05                  West
Sally       04                   South

Could anyone help me with a clever statement?



--
View this message in context: 
http://r.789695.n4.nabble.com/Add-column-to-dataframe-based-on-code-in-other-column-tp4673335.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] cbind with headers

2013-08-08 Thread arun
Hi,

You can save it in file.  I copy and paste:

Subtype,Gender,Expression
A,m,-0.54
A,f,-0.8
B,f,-1.03
C,m,-0.41


on the gedit and save it as data1.csv.  You might be able to do the same 
with notepad.

x <- read.csv("data1.csv",header=T,sep=",")
x2 <- read.csv("data2N.csv",header=T,sep=",")
x3 <- cbind(x,x2)

 x3
#  Subtype Gender Expression Age City
#1   A  m  -0.54  32 New York
#2   A  f  -0.80  21  Houston
#3   B  f  -1.03  34  Seattle
#4   C  m  -0.41  67  Houston



#or if the dataset is small as in the example

x <- read.table(text="
Subtype,Gender,Expression
A,m,-0.54
A,f,-0.8
B,f,-1.03
C,m,-0.41
",sep=",",header=TRUE,stringsAsFactors=FALSE)

x2 <- read.table(text="
Age,City
32,New York
21,Houston
34,Seattle
67,Houston
",sep=",",header=TRUE,stringsAsFactors=FALSE)
cbind(x,x2)
#  Subtype Gender Expression Age City
#1   A  m  -0.54  32 New York
#2   A  f  -0.80  21  Houston
#3   B  f  -1.03  34  Seattle
#4   C  m  -0.41  67  Houston
A.K.




Hi, 

I can't seem to get this to work: 
http://www.endmemo.com/program/R/cbind.php

Do I save the data as data1.csv in note pad and pull in the file? Do I type
data1.csv<-Subtype,Gender,Expression,A,m,-0.54,A,f,-0.8,B,f,-1.03,C,m,-0.41??

I can do a simple matrix. But, I want to have headers and data to combine. 

Simple Matrix I combined. 
 m<-as.data.frame(matrix(c(1:6),ncol=2))
 n<-as.data.frame(matrix(c(7:12),ncol=2))
 m 
  V1 V2 
1  1  4 
2  2  5 
3  3  6 
 n 
  V1 V2 
1  7 10 
2  8 11 
3  9 12 
 Mary<-cbind(m,n)
 Mary 
  V1 V2 V1 V2 
1  1  4  7 10 
2  2  5  8 11 
3  3  6  9 12



[R] Why is mclappy slower than apply in this case?

2013-08-08 Thread Tomas Reigl


Hello,


I'm pretty confused. I want to speed up my algorithm by using mclapply
(package parallel), but when I compare time efficiency, apply still wins.

I'm smoothing log2ratio data with rq.fit.fnb (package quantreg), which is called by my
function quantsm, and I'm wrapping my data into a matrix/list for apply/lapply
(mclapply) usage.




I adjust my data like this:

q = matrix(data, ncol=N)  # wrapping into matrix (using N = 2, 4, 6 or 8)
ql = as.list(as.data.frame(q))  # making list

And time comparing:

apply=system.time(apply(q, 1, FUN=quantsm, 0.50, 2))
lapply=system.time(lapply(ql, FUN=quantsm, 0.50, 2))
mc2lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=2))
mc4lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=4))
mc6lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=6))
mc8lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=8))
timing=rbind(apply,lapply,mc2lapply,mc4lapply,

[R] mgcv predict.bam strange results

2013-08-08 Thread fwickler
Dear useR,

I don't understand the results of the predict.bam function of the mgcv package
when constructing a varying-coefficient model with bam instead of gam:

library(mgcv)
dat <- gamSim(4)
b <- gam(y ~ fac+s(x2,by=fac)+s(x0), data=dat)
predict(b, dat[1,], type = "terms")

With gam everything is fine: only s(x2):fac1 is different from zero. But using
bam:

b1 <- bam(y ~ fac+s(x2,by=fac, bs = "cc")+s(x0),data=dat)
predict(b1, dat[1,], type = "terms")

all of the terms s(x2):fac1, s(x2):fac2 and s(x2):fac3 are different from zero.

Thanks for your help
Florian





--
View this message in context: 
http://r.789695.n4.nabble.com/mgcv-predict-bam-strange-results-tp4673364.html
Sent from the R help mailing list archive at Nabble.com.



[R] cbind with headers

2013-08-08 Thread Docbanks84
Hi,

I can't seem to get this to work:
http://www.endmemo.com/program/R/cbind.php

Do I save the data as data1.csv in note pad and pull in the file? Do I type
data1.csv<-Subtype,Gender,Expression,A,m,-0.54,A,f,-0.8,B,f,-1.03,C,m,-0.41??

I can do a simple matrix. But, I want to have headers and data to combine.

Simple Matrix I combined.
 m<-as.data.frame(matrix(c(1:6),ncol=2))
 n<-as.data.frame(matrix(c(7:12),ncol=2))
 m
  V1 V2
1  1  4
2  2  5
3  3  6
 n
  V1 V2
1  7 10
2  8 11
3  9 12
 Mary<-cbind(m,n)
 Mary
  V1 V2 V1 V2
1  1  4  7 10
2  2  5  8 11
3  3  6  9 12




--
View this message in context: 
http://r.789695.n4.nabble.com/cbind-with-headers-tp4673354.html
Sent from the R help mailing list archive at Nabble.com.



[R] For loop output

2013-08-08 Thread Jenny Williams
I am having difficulty storing the output of a for loop I have generated. All I 
want to do is find all the files that I have, create a string with all of the 
names in quotes and separated by commas. This is proving more difficult than I 
initially anticipated.
I am sure it is either very simple or the construction of the for loop is not 
quite right
The result gets automatically printed after the loop but I can't seem to save 
it.
I have tried to create the element in advance but the result is the same: NULL

individual.proj = Sys.glob("Arabica/proj_current/individual_projections/*.img", 
dirmark = FALSE)
individual.proj
[1] 
Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GBM.img
 [2] 
Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GLM.img
 [3] 
Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_MARS.img
 [4] 
Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_RF.img
 [5] 
Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_RUN10_GBM.img


##generate loop to create string out of the table of projected files.
L.ip = length(individual.proj)
  for (i in 1:L.ip){
   individual.proj.i <- individual.proj[i]
   individual.proj.quote = cat(paste('"', individual.proj.i, '"', ',',sep=""))
   }
Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GBM.img,Arabica/proj_current/individual_projections/proj_current

##print output string
individual.proj.quote
NULL

#command to be applied to individual.proj.quote to remove the final comma from 
the string
substr(individual.proj.quote, 1, nchar(individual.proj.quote)-1)

Any help or pointers would be greatly appreciated, no amount of extensive 
google searches have been fruitful so far.


**
Jenny Williams
Spatial Information Scientist, GIS Unit
Herbarium, Library, Art & Archives Directorate
Royal Botanic Gardens, Kew
Richmond, TW9 3AB, UK

Tel: +44 (0)208 332 5277
email: jenny.willi...@kew.org
**

Film: The Forgotten Home of Coffee - Beyond the Gardens
http://www.youtube.com/watch?v=-uDtytKMKpAsns=tw
Stories: Coffee Expedition - Ethiopia
http://storify.com/KewGIS/coffee-expedition-ethiopia
Kew in Harapan Rainforest Sumatra
http://storify.com/KewGIS/kew-in-harapan-rainforest
Articles: Seeing the wood for the trees
http://www.kew.org/ucm/groups/public/documents/document/kppcont_060602.pdf
How Kew's GIS team and South East Asia botanists are working to help conserve 
and restore a rainforest in Sumatra. Download a pdf of this article here:
http://www.kew.org/ucm/groups/public/documents/document/kppcont_060602.pdf



The Royal Botanic Gardens, Kew is a non-departmental public body with exempt 
charitable status, whose principal place of business is at Royal Botanic 
Gardens, Kew, Richmond, Surrey TW9 3AB, United Kingdom.

The information contained in this email and any attachments is intended solely 
for the addressee(s) and may contain confidential or legally privileged 
information. If you have received this message in error, please return it 
immediately and permanently delete it. Do not use, copy or disclose the 
information contained in this email or in any attachment.

Any views expressed in this email do not necessarily reflect the opinions of 
RBG Kew.

Any files attached to this email have been inspected with virus detection 
software by RBG Kew before transmission, however you should carry out your own 
virus checks before opening any attachments. RBG Kew accepts no liability for 
any loss or damage which may be caused by software viruses.

[[alternative HTML version deleted]]



Re: [R] For loop output

2013-08-08 Thread Charles Determan Jr
Hi Jenny,

Firstly, to my knowledge you cannot assign the output of cat to an object
(i.e. it only prints it).
Second, you can just add the 'collapse' option of the paste function.

individual.proj.quote <- paste(individual.proj, collapse = ",")

if you really want the quotes
individual.proj.quote <- paste(individual.proj, collapse='","')

but you will be stuck with some backslashes I can't recall the syntax to
remove.

Hope this serves your purposes
Cheers,

Charles





-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota


Re: [R] For loop output

2013-08-08 Thread Jan Kim
On Thu, Aug 08, 2013 at 04:05:57PM +0100, Jenny Williams wrote:
 I am having difficulty storing the output of a for loop I have generated. All 
 I want to do is find all the files that I have, create a string with all of 
 the names in quotes and separated by commas. This is proving more difficult 
 than I initially anticipated.
 I am sure it is either very simple or the construction of the for loop is not 
 quite right
 The result gets automatically printed after the loop but I can't seem to save 
 it.
 I have tried to create the element in advance but the result is the same: NULL

This is a somewhat frequent confusion of the (very different!) concepts
of assignment and printing. The fact that something becomes visible
to the human user via printing has nothing to do really with the fact
that an object is generated and assigned to a variable, thereby becoming
accessible (and, in that allegorical sense, visible) to the subsequent
code.

The cat function prints values but it does not return them. It returns
NULL, which is what you get.
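
A small sketch of that difference, using stand-in file names rather than the ones from the original post: the value of cat() is NULL, whereas paste() returns a string that can be assigned and built on.

```r
files <- c("a.img", "b.img")  # stand-in file names, not from the thread

printed <- cat("a.img,\n")    # the text goes to the console; the value is NULL
is.null(printed)

built <- character(0)
for (f in files) {
  # paste() returns its result, so it can be collected in a variable
  built <- c(built, paste('"', f, '"', sep = ""))
}
result <- paste(built, collapse = ",")
cat(result, "\n")
```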

 individual.proj = 
 Sys.glob("Arabica/proj_current/individual_projections/*.img", dirmark = FALSE)
 individual.proj
 [1] 
 Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GBM.img
  [2] 
 Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GLM.img
  [3] 
 Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_MARS.img
  [4] 
 Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_RF.img
  [5] 
 Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_RUN10_GBM.img
 
 
 ##generate loop to create string out of the table of projected files.
 L.ip = length(individual.proj)
   for (i in 1:L.ip){
 individual.proj.i <- individual.proj[i]
 individual.proj.quote = cat(paste('"', individual.proj.i, '"', 
 ',',sep=""))
}
 Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GBM.img,Arabica/proj_current/individual_projections/proj_current
 
 ##print output string
 individual.proj.quote
 NULL
 
 #command to be applied to individual.proj.quote to removed the final comma 
 from the string
 substr(individual.proj.quote, 1, nchar(individual.proj.quote)-1)
 
 Any help or pointers would be greatly appreciated, no amount of extensive 
 google searches have been fruitful so far.

If you don't mind me suggesting this, reading the documentation page
of the function(s) you use is an approach that is more targeted and
therefore often quicker. (Googling cat is probably especially bad,
as there are various cat applications and functions in several languages,
not to mention several species of mammals...  ;-)  )

As a further remark, you don't need a loop, one line composed of sprintf
and paste (check the collapse parameter) should do the trick you're
after.
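
One possible reading of that hint, as a sketch with hypothetical file names: sprintf() wraps each element in double quotes, and paste() with collapse joins the quoted elements with commas, so there is no trailing comma to strip afterwards.

```r
files <- c("x_GBM.img", "x_GLM.img", "x_RF.img")  # hypothetical names

# sprintf() is vectorised: it quotes every element of 'files' in turn;
# collapse = "," then puts a comma between the quoted elements only.
quoted <- paste(sprintf('"%s"', files), collapse = ",")
cat(quoted, "\n")
```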

Best regards, Jan
 

Re: [R] Why is mclappy slower than apply in this case?

2013-08-08 Thread Bert Gunter
Tomas:

Do some reading on parallelization.

Parallelizing code incurs the overhead of setting up, keeping track
of, and synchronizing the separate threads. Whether that overhead is worth the
cost depends on the problem, its size, the algorithms, the
machines/hardware, ...
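
A toy sketch of how that overhead can be observed (not a benchmark of the quantsm code, which is not shown in the thread). The function tiny and the input size are assumptions chosen so that each task is far cheaper than the cost of forking worker processes, which is how mclapply runs its tasks.

```r
library(parallel)

tiny <- function(x) x^2   # a task too cheap to be worth forking for
input <- as.list(1:1000)

# mclapply relies on forking, so more than one core is not available on Windows
cores <- if (.Platform$OS.type == "windows") 1L else 2L

t_serial   <- system.time(res_serial   <- lapply(input, tiny))
t_parallel <- system.time(res_parallel <- mclapply(input, tiny, mc.cores = cores))

# Same results either way; only the elapsed times differ
rbind(serial = t_serial, parallel = t_parallel)
```

With tasks this small the parallel run will often be no faster, and the picture may change as the work per task grows.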

Cheers,
Bert


Re: [R] For loop output

2013-08-08 Thread Jan Kim
On Thu, Aug 08, 2013 at 11:38:33AM -0500, Charles Determan Jr wrote:
 Hi Jenny,
 
 Firstly, to my knowledge you cannot assign the output of cat to an object
 (i.e. it only prints it).
 Second, you can just add the 'collapse' option of the paste function.
 
 individual.proj.quote <- paste(individual.proj, collapse = ",")
 
 if you really want the quotes
 individual.proj.quote <- paste(individual.proj, collapse='","')

No -- the backslashes are part of the visualisation of the value, not
part of the value itself.

Please see FAQ 7.37 for details.

And also notice that the collapse parameter specifies a separator,
and the quotes are intended to enclose the individual words here,
not to separate them. In this case, the left quote of the first word
and the right quote of the last word will not be generated as a result
of this misuse of a separator...
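
A minimal sketch of both points, using two made-up file names (files, s1
and s2 are illustrative names, not taken from the original post). It
assumes the goal is a single string in which every name is enclosed in
double quotes and the names are separated by commas.

```r
files <- c("a.img", "b.img")   # hypothetical stand-ins for the real paths

# collapse is only placed BETWEEN elements, so the opening quote of the
# first name and the closing quote of the last name are missing
s1 <- paste(files, collapse = '","')

# quote each element first, then collapse with a plain comma
s2 <- paste0('"', files, '"', collapse = ",")

print(s2)        # print() escapes the embedded quotes for display only
cat(s2, "\n")    # cat() writes the characters that are actually stored
nchar(s2)        # the count includes the quotes but no backslashes
```

Comparing the print() and cat() lines shows that the backslashes belong to
the display of the value, not to the value itself.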

Best regards, Jan

 but you will be stuck with some backslashes I can't recall the syntax to
 remove.
 
 Hope this serves your purposes
 Cheers,
 
 Charles
 
 
 On Thu, Aug 8, 2013 at 10:05 AM, Jenny Williams jenny.willi...@kew.org wrote:
 
  I am having difficulty storing the output of a for loop I have generated.
  All I want to do is find all the files that I have, create a string with
  all of the names in quotes and separated by commas. This is proving more
  difficult than I initially anticipated.
  I am sure it is either very simple or the construction of the for loop is
  not quite right
  The result gets automatically printed after the loop but I can't seem to
  save it.
  I have tried to create the element in advance but the result is the same:
  NULL
 
  individual.proj =
  Sys.glob("Arabica/proj_current/individual_projections/*.img", dirmark =
  FALSE)
  individual.proj
  [1]
  Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GBM.img
   [2]
  Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GLM.img
   [3]
  Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_MARS.img
   [4]
  Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_RF.img
   [5]
  Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_RUN10_GBM.img
 
 
  ##generate loop to create string out of the table of projected files.
  L.ip = length(individual.proj)
  for (i in 1:L.ip){
    individual.proj.i <- individual.proj[i]
    individual.proj.quote = cat(paste('"', individual.proj.i, '"',
  ',',sep=""))
  }
 
  Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GBM.img,Arabica/proj_current/individual_projections/proj_current
 
  ##print output string
  individual.proj.quote
  NULL
 
  #command to be applied to individual.proj.quote to remove the final comma
  from the string
  substr(individual.proj.quote, 1, nchar(individual.proj.quote)-1)
 
  Any help or pointers would be greatly appreciated, no amount of extensive
  google searches have been fruitful so far.
 
 
  **
  Jenny Williams
  Spatial Information Scientist, GIS Unit
  Herbarium, Library, Art  Archives Directorate
  Royal Botanic Gardens, Kew
  Richmond, TW9 3AB, UK
 
  Tel: +44 (0)208 332 5277
  email: jenny.willi...@kew.orgmailto:jenny.willi...@kew.org
  **
 

Re: [R] For loop output

2013-08-08 Thread David Carlson
It's not clear how you are planning to use this within R, but
you don't need a loop.

individual.proj.quote <-
capture.output(write.table(matrix(individual.proj, 1),
quote=TRUE, sep=",", row.names=FALSE, col.names=FALSE))

This produces a single character string which consists of the
quoted file names separated by commas.

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Jenny
Williams
Sent: Thursday, August 8, 2013 10:06 AM
To: 'r-help@r-project.org'
Subject: [R] For loop output

I am having difficulty storing the output of a for loop I have
generated. All I want to do is find all the files that I have,
create a string with all of the names in quotes and separated by
commas. This is proving more difficult than I initially
anticipated.
I am sure it is either very simple or the construction of the
for loop is not quite right
The result gets automatically printed after the loop but I can't
seem to save it.
I have tried to create the element in advance but the result is
the same: NULL

individual.proj =
Sys.glob("Arabica/proj_current/individual_projections/*.img",
dirmark = FALSE)
individual.proj
[1]
Arabica/proj_current/individual_projections/proj_current_arabic
a_pa.data.tmp$pa.tab_Full_GBM.img
 [2]
Arabica/proj_current/individual_projections/proj_current_arabic
a_pa.data.tmp$pa.tab_Full_GLM.img
 [3]
Arabica/proj_current/individual_projections/proj_current_arabic
a_pa.data.tmp$pa.tab_Full_MARS.img
 [4]
Arabica/proj_current/individual_projections/proj_current_arabic
a_pa.data.tmp$pa.tab_Full_RF.img
 [5]
Arabica/proj_current/individual_projections/proj_current_arabic
a_pa.data.tmp$pa.tab_RUN10_GBM.img


##generate loop to create string out of the table of projected
files.
L.ip = length(individual.proj)
  for (i in 1:L.ip){
   individual.proj.i <- individual.proj[i]
   individual.proj.quote = cat(paste('"',
individual.proj.i, '"', ',',sep=""))
   }
Arabica/proj_current/individual_projections/proj_current_arabic
a_pa.data.tmp$pa.tab_Full_GBM.img,Arabica/proj_current/individ
ual_projections/proj_current

##print output string
individual.proj.quote
NULL

#command to be applied to individual.proj.quote to remove the
final comma from the string
substr(individual.proj.quote, 1, nchar(individual.proj.quote)-1)

Any help or pointers would be greatly appreciated, no amount of
extensive google searches have been fruitful so far.


**
Jenny Williams
Spatial Information Scientist, GIS Unit
Herbarium, Library, Art  Archives Directorate
Royal Botanic Gardens, Kew
Richmond, TW9 3AB, UK

Tel: +44 (0)208 332 5277
email: jenny.willi...@kew.orgmailto:jenny.willi...@kew.org
**


[R] [R-pkgs] WriteXLS version 3.2.1 - Bug Fix Release

2013-08-08 Thread Marc Schwartz
Hi all,

WriteXLS version 3.2.1 has been submitted to CRAN, with thanks to the CRAN 
maintainers.

This is a bug fix release with the following fixes:

1. When row.names = TRUE, the initial comments row, which contains the comments 
attributes for the data frame columns and is rbind()ed to the source data 
frame, was not properly identified by the included Perl script code, resulting 
in an error in the Excel file. Bug identified yesterday by Robert Zeigler via 
GitHub.

2. The rownames for the original data frame were not being properly preserved, 
hence were in error in the Excel file when row.names = TRUE. Picked up in the 
course of fixing the above bug.


Source tarballs of the updated package are being mirrored and binaries for 
Windows and OSX should appear in due course.

Regards,

Marc Schwartz

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] matrix with standard errors of lm model

2013-08-08 Thread iza.ch1
Hi 

Can someone give me a hint on how to create a matrix with standard errors from 
lm model? I have already managed to get the matrix with coefficients: 

coef<-as.data.frame(sapply(seq_len(ncol(es.w)),function( i) {x1<- 
summary(lm(es.w[,i]~es.median[,i]));x1$coef[,1]}))

but I can't get the one like this for standard errors. I do regression for each 
column.


Thanks a lot :)



Re: [R] Why is mclappy slower than apply in this case?

2013-08-08 Thread Steve Lianoglou
Tomas,

Also: please don't send html emails (as is specified in the posting
guide[1]). This is what your email looked like on our end of the
table:

https://stat.ethz.ch/pipermail/r-help/attachments/20130808/74a3c7c2/attachment.pl

[1] Posting Guide: http://www.r-project.org/posting-guide.html

-steve

On Thu, Aug 8, 2013 at 9:52 AM, Bert Gunter gunter.ber...@gene.com wrote:
 Tomas:

 Do some reading on parallelization.

 Parallelizing code requires the overhead of setting up, keeping track
 of, and synching the separate threads. Whether that overhead is worth the
 cost depends on the problem, the size, the algorithms, the
 machines/hardware,...
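
 A small sketch of that overhead, assuming a deliberately trivial task
 (cheap, x and cores are illustrative names, not from the original post).
 mclapply() forks worker processes, so with very little work per element
 the set-up cost can exceed the time saved. The timings vary by machine,
 so none are claimed here.

```r
library(parallel)

# mclapply() relies on forking, which is unavailable on Windows,
# so fall back to a single core there
cores <- if (.Platform$OS.type == "windows") 1L else 2L

x <- as.list(1:8)
cheap <- function(v) v^2     # trivial work: overhead dominates

serial <- lapply(x, cheap)
forked <- mclapply(x, cheap, mc.cores = cores)

# both give the same result; only the elapsed time differs
t_serial <- system.time(lapply(x, cheap))
t_forked <- system.time(mclapply(x, cheap, mc.cores = cores))
print(rbind(serial = t_serial, forked = t_forked))
```

 With a task this small the forked version is rarely faster; the balance
 changes only once each call does enough work to outweigh the set-up cost.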

 Cheers,
 Bert

 On Thu, Aug 8, 2013 at 4:00 AM, Tomas Reigl inciv...@seznam.cz wrote:


 Hello,


 I'm pretty confused. I want to speed up my algorithm by using mclapply
 (package parallel), but when I compare timings, apply still wins.

 I'm smoothing log2ratio data with rq.fit.fnb (package quantreg), which is
 called by my function quantsm, and I'm wrapping my data into a matrix/list
 for apply/lapply (mclapply) usage.




 I adjust my data like this:

 q = matrix(data, ncol=N) # wrapping into matrix (using N = 2, 4, 6 or 8)
 ql = as.list(as.data.frame(q)) # making list

 And time comparing:

 apply=system.time(apply(q, 1, FUN=quantsm, 0.50, 2))
 lapply=system.time(lapply(ql, FUN=quantsm, 0.50, 2))
 mc2lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=2))
 mc4lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=4))
 mc6lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=6))
 mc8lapply=system.time(mclapply

Re: [R] matrix with standard errors of lm model

2013-08-08 Thread Bert Gunter
Perhaps

?vcov

is what you are looking for.

-- Bert

On Thu, Aug 8, 2013 at 10:37 AM, iza.ch1 iza@op.pl wrote:
 Hi

 Can someone give me a hint on how to create a matrix with standard errors 
 from lm model? I have already managed to get the matrix with coefficients:

 coef<-as.data.frame(sapply(seq_len(ncol(es.w)),function( i) {x1<- 
 summary(lm(es.w[,i]~es.median[,i]));x1$coef[,1]}))

 but I can't get the one like this for standard errors. I do regression for 
 each column.


 Thanks a lot :)




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



[R] Creating new vectors from other dataFrames

2013-08-08 Thread Steven Ranney
I have two data frames

data1 <- as.data.frame(matrix(data=c(1:4,5:8,9:12,13:24), nrow=4, ncol=6,
byrow=F, dimnames=list(c(1:4),c("a","b","c","d","e","z"))))
data2 <- as.data.frame(matrix(data=c(1:4,5:8,9:12,37:48), nrow=4, ncol=6,
byrow=F, dimnames=list(c(1:4),c("a","b","c","f","g","z"))))

that have some common column names.

Comparing the names of the columns within each data frame to the other

setdiff(names(data1), names(data2))
setdiff(names(data2), names(data1))

provides which columns are different.

For each column that appears in data1 that DOES NOT appear in data2, I need
to create those columns and fill them with NA values.  The same is true for
the reverse.  So, I can create a vector of new column names that need to be
filled with NA values, but here is where I'm stuck.  I don't know how to
get the names from inside the vector into the respective dataFrame.

tmp1 <- as.factor(paste("data2$", setdiff(names(data1), names(data2)),
sep=""))
tmp2 <- as.factor(paste("data1$", setdiff(names(data2), names(data1)),
sep=""))

Of course, if it were as simple as only a few columns, I could do all of
this by hand, but in my original data frames, I have 60 different columns
that need to be created and filled with NA values for both data1 and data2.

Eventually, the point of this exercise is so that I can rbind(data1, data2)
and create a SQL table out of the merged dataFrames.  Unfortunately, I
can't rbind() everything until the column names are common across both
data1 and data2.

Thoughts?

Thanks -

SR



Steven H. Ranney



Re: [R] Creating new vectors from other dataFrames

2013-08-08 Thread Steven Ranney
This is exactly what I'm looking for.  Each dataFrame will have those
columns that are endemic to the other filled with NA.

Thanks.

Steven H. Ranney


On Thu, Aug 8, 2013 at 12:17 PM, arun smartpink...@yahoo.com wrote:

 HI,

 Not sure about your expected result.

 library(plyr)
 data2New<-join_all(lapply(setdiff(names(data1), names(data2)),function(x)
 {data2[,x]<-NA; data2}))

 data1New<-join_all(lapply(setdiff(names(data2),
 names(data1)),function(x){data1[,x]<-NA;data1}))
  data1New
 #  a b  c  d  e  z  f  g
 #1 1 5  9 13 17 21 NA NA
 #2 2 6 10 14 18 22 NA NA
 #3 3 7 11 15 19 23 NA NA
 #4 4 8 12 16 20 24 NA NA
 A.K.
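
 For comparison, a base R sketch of the same idea that needs no extra
 package. It is an alternative to the plyr approach above, not the method
 posted in the thread; miss1, miss2 and combined are illustrative names.
 It assumes the end goal is the rbind() described in the question.

```r
data1 <- as.data.frame(matrix(c(1:4, 5:8, 9:12, 13:24), nrow = 4, ncol = 6,
  dimnames = list(1:4, c("a", "b", "c", "d", "e", "z"))))
data2 <- as.data.frame(matrix(c(1:4, 5:8, 9:12, 37:48), nrow = 4, ncol = 6,
  dimnames = list(1:4, c("a", "b", "c", "f", "g", "z"))))

# columns each data frame lacks, computed before either is modified
miss1 <- setdiff(names(data2), names(data1))   # "f" "g"
miss2 <- setdiff(names(data1), names(data2))   # "d" "e"

# character indexing creates the missing columns, filled with NA
data1[miss1] <- NA
data2[miss2] <- NA

# put the columns in the same order, then stack the rows
combined <- rbind(data1, data2[names(data1)])
```

 Because the column names come from setdiff(), the same lines work
 unchanged when 60 columns are missing instead of two.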



 - Original Message -
 From: Steven Ranney steven.ran...@gmail.com
 To: r-help@r-project.org r-help@r-project.org
 Cc:
 Sent: Thursday, August 8, 2013 2:01 PM
 Subject: [R] Creating new vectors from other dataFrames

 I have two data frames

 data1 <- as.data.frame(matrix(data=c(1:4,5:8,9:12,13:24), nrow=4, ncol=6,
 byrow=F, dimnames=list(c(1:4),c("a","b","c","d","e","z"))))
 data2 <- as.data.frame(matrix(data=c(1:4,5:8,9:12,37:48), nrow=4, ncol=6,
 byrow=F, dimnames=list(c(1:4),c("a","b","c","f","g","z"))))

 that have some common column names.

 Comparing the names of the columns within each data frame to the other

 setdiff(names(data1), names(data2))
 setdiff(names(data2), names(data1))

 provides which columns are different.

 For each column that appears in data1 that DOES NOT appear in data2, I need
 to create those columns and fill them with NA values.  The same is true for
 the reverse.  So, I can create a vector of new column names that need to be
 filled with NA values, but here is where I'm stuck.  I don't know how to
 get the names from inside the vector into the respective dataFrame.

 tmp1 <- as.factor(paste("data2$", setdiff(names(data1), names(data2)),
 sep=""))
 tmp2 <- as.factor(paste("data1$", setdiff(names(data2), names(data1)),
 sep=""))

 Of course, if it were as simple as only a few columns, I could do all of
 this by hand, but in my original data frames, I have 60 different columns
 that need to be created and filled with NA values for both data1 and data2.

 Eventually, the point of this exercise is so that I can rbind(data1, data2)
 and create a SQL table out of the merged dataFrames.  Unfortunately, I
 can't rbind() everything until the column names are common across both
 data1 and data2.

 Thoughts?

 Thanks -

 SR



 Steven H. Ranney



Re: [R] Method dispatch in S4

2013-08-08 Thread Martin Morgan

On 08/04/2013 02:13 AM, Simon Zehnder wrote:

So, I found a solution: First in the initialize method of class C coerce
the C object into a B object. Then call the next method in the list with the
B class object. Now, in the initialize method of class B the object is a B
object and the respective generateSpec method is called. Then, in the
initialize method of C the returned object from callNextMethod has to be
written to the C class object in .Object. See the code below.

setMethod("initialize", "C", function(.Object, value) {.Object@c <- value;
object <- as(.Object, "B"); object <- callNextMethod(object, value);
as(.Object, "B") <- object; .Object <- generateSpec(.Object);
return(.Object)})

This setting works. I do not know though, if this setting is the usual way
such things are done in R OOP. Maybe the whole class design is
disadvantageous. If anyone detects a mistaken design, I am very thankful to
learn.


Hi Simon -- your 'simple' example is pretty complicated, and I didn't really 
follow it in detail! The code is not formatted for easy reading (e.g., lines 
spanning no more than 80 columns) and some of it (e.g., generateSpec) might not 
be necessary to describe the problem you're having.


A good strategy is to ensure that 'new' called with no arguments works (there 
are other solutions, but following this rule has helped me to keep my classes 
and methods simple). This is not the case for


  new("A")
  new("C")

The reason for this strategy has to do with the way inheritance is implemented, 
in particular the coercion from derived to super class. Usually it is better to 
provide default values for arguments to initialize, and to specify arguments 
after a '...'. This means that your initialize methods will respects the 
contract set out in ?initialize, in particular the handling of unnamed arguments:


 ...: data to include in the new object.  Named arguments
  correspond to slots in the class definition. Unnamed
  arguments must be objects from classes that this class
  extends.

I might have written initialize,A-method as

  setMethod("initialize", "A", function(.Object, ..., value=numeric()){
      .Object <- callNextMethod(.Object, ..., a=value)
      generateSpec(.Object)
  })

Likely in a subsequent iteration I would have ended up with (using the 
convention that function names preceded by '.' are not exported)


  .A <- setClass("A", representation(a = "numeric", specA = "numeric"))

  .generateSpecA <- function(a) {
      1 / a
  }

  A <- function(a=numeric(), ...) {
      specA <- .generateSpecA(a)
      .A(..., a=a, specA=specA)
  }

  setMethod("generateSpec", "A", function(object) {
      .generateSpecA(object@a)
  })

ensuring that A() returns a valid object and avoiding the definition of an 
initialize method entirely.
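
A self-contained sketch of the constructor pattern described above, so it
can be run on its own. The setGeneric() call is an assumption: the thread
does not show how generateSpec was declared, and the one-argument signature
used here is only a guess at a workable definition.

```r
library(methods)

# class generator, kept private by the leading '.' convention
.A <- setClass("A", representation(a = "numeric", specA = "numeric"))

.generateSpecA <- function(a) {
  1 / a
}

# user-facing constructor: computes specA so no initialize method is needed
A <- function(a = numeric(), ...) {
  specA <- .generateSpecA(a)
  .A(..., a = a, specA = specA)
}

# ASSUMED generic; not shown in the original thread
setGeneric("generateSpec", function(object) standardGeneric("generateSpec"))

setMethod("generateSpec", "A", function(object) {
  .generateSpecA(object@a)
})

empty <- A()                 # works with no arguments
obj   <- A(a = c(1, 2, 4))
obj@specA
```

Since no initialize method is defined, new("A") with no arguments keeps
working, which is the property the inheritance machinery relies on.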


Martin



Best

Simon


On Aug 3, 2013, at 9:43 PM, Simon Zehnder simon.zehn...@googlemail.com
wrote:


setMethod("initialize", "C", function(.Object, value) {.Object@c <- value;
.Object <- callNextMethod(.Object, value); .Object <-
generateSpec(.Object); return(.Object)})






--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



Re: [R] matrix with standard errors of lm model

2013-08-08 Thread David Winsemius

On Aug 8, 2013, at 10:54 AM, Bert Gunter wrote:

 Perhaps
 
 ?vcov
 
 is what you are looking for.
 
 -- Bert
 
 On Thu, Aug 8, 2013 at 10:37 AM, iza.ch1 iza@op.pl wrote:
 Hi
 
 Can someone give me a hint on how to create a matrix with standard errors 
 from lm model? I have already managed to get the matrix with coefficients:
 
 coef<-as.data.frame(sapply(seq_len(ncol(es.w)),function( i) {x1<- 
 summary(lm(es.w[,i]~es.median[,i]));x1$coef[,1]}))
 
 but I can't get the one like this for standard errors. I do regression for 
 each column.
 

It's a bit of a feature that coef(summary(lm(...))) returns a 4 column matrix 
of coefficients, standard errors, t-ratios and p-values while coef(lm(...)) 
just returns the estimated coefficients.

?lm
?summary.lm

The second column of the coef(summary()) result should be == 
diag(vcov(lm(...))).

-- 
David Winsemius
Alameda, CA, USA



Re: [R] matrix with standard errors of lm model

2013-08-08 Thread Bert Gunter
Not quite, David. ... (see inline)

On Thu, Aug 8, 2013 at 1:56 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Aug 8, 2013, at 10:54 AM, Bert Gunter wrote:

 Perhaps

 ?vcov

 is what you are looking for.

 -- Bert

 On Thu, Aug 8, 2013 at 10:37 AM, iza.ch1 iza@op.pl wrote:
 Hi

 Can someone give me a hint on how to create a matrix with standard errors 
 from lm model? I have already managed to get the matrix with coefficients:

 coef<-as.data.frame(sapply(seq_len(ncol(es.w)),function( i) {x1<- 
 summary(lm(es.w[,i]~es.median[,i]));x1$coef[,1]}))

 but I can't get the one like this for standard errors. I do regression for 
 each column.


 It's a bit of a feature that coef(summary(lm(...))) returns a 4 column 
 matrix of coefficients, standard errors, t-ratios and p-values while 
 coef(lm(...)) just returns the estimated coefficients.

 ?lm
 ?summary.lm

 The second column of the coef(summary()) result should be == 
 diag(vcov(lm(...))).

No, it's == sqrt(diag(vcov(lm(...))))
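
A minimal sketch of the per-column standard errors, assuming simulated
stand-ins for es.w and es.median (the real data are not shown in the
thread; se and se2 are illustrative names). In the four-column matrix
returned by coef(summary(fit)) the standard errors are the column labelled
"Std. Error", which is the second one.

```r
set.seed(1)
es.median <- matrix(rnorm(60), nrow = 20)                # hypothetical data
es.w <- 2 * es.median + matrix(rnorm(60), nrow = 20)     # hypothetical data

# one regression per column; keep the standard errors
se <- sapply(seq_len(ncol(es.w)), function(i) {
  fit <- lm(es.w[, i] ~ es.median[, i])
  coef(summary(fit))[, "Std. Error"]
})

# the same values via the variance-covariance matrix
se2 <- sapply(seq_len(ncol(es.w)), function(i) {
  fit <- lm(es.w[, i] ~ es.median[, i])
  sqrt(diag(vcov(fit)))
})

all.equal(se, se2)
```

Each column of se holds the intercept and slope standard errors for one
regression, mirroring the layout of the coefficient matrix in the question.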

-- Bert


 --
 David Winsemius
 Alameda, CA, USA




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Mac Os X 10.6.8, R and edit/vi

2013-08-08 Thread MacQueen, Don
This is a question for R-sig-mac.

However, try

  edit(file=file.choose())

Also, before your edit() command, try

  getwd()

Is the file in that directory??
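
A small sketch of why the working directory matters here, using a
temporary directory and a dummy file (dir_with_file, seen_from_here and
full_path are illustrative names). A relative name such as '2.9.R' is
looked up in getwd(), so it is only found when R is running in the folder
that holds the file.

```r
dir_with_file <- tempdir()
writeLines("x <- 1", file.path(dir_with_file, "2.9.R"))   # dummy file

old <- setwd(dir_with_file)             # move to the folder holding the file
seen_from_here <- file.exists("2.9.R")  # relative name resolves from getwd()
setwd(old)                              # restore the previous directory

# an absolute path works from any working directory
full_path <- normalizePath(file.path(dir_with_file, "2.9.R"))
file.exists(full_path)
# edit(file = full_path)   # interactive, so left commented out
```

Checking file.exists() before calling edit() separates a wrong path from
an editor configuration problem.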

-Don
-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 8/8/13 7:51 AM, David Epstein david.epst...@warwick.ac.uk wrote:

I tried using various versions of the 'edit' command. Here is an account
of
how this failed. I hope I have included all relevant information.

I haven't used R for a couple of years. Before restarting with R, I
downloaded the latest version I could find in its binary version, and
installed it without any problems.
Mac Os X Finder command About R responds with
R 3.0.1 GUI 1.61 Snow Leopard build (6492)

From inside R
version
  
 platform   x86_64-apple-darwin10.8.0
 arch   x86_64
 os darwin10.8.0#(However, my os is in fact 10.6.8)

 system x86_64, darwin10.8.0
 status  
 major  3
 minor  0.1
 year   2013
 month  05
 day16
 svn rev62743
 language   R
 version.string R version 3.0.1 (2013-05-16)
 nickname   Good Sport
 
> edit(file='2.9.R')
 Error in file(con, "r") : cannot open the connection
 In addition: Warning message:
 In file(con, "r") : cannot open file '2.9.R': No such file or directory
 
> getOption('editor')
[1] "vi"

> edit(file='2.9.R',editor='/opt/local/bin/vim')
 Error in file(con, "r") : cannot open the connection
 In addition: Warning message:
 In file(con, "r") : cannot open file '2.9.R': No such file or directory
 
> vi(file='try')
 Error in file(con, "r") : cannot open the connection
 In addition: Warning message:
 In file(con, "r") : cannot open file 'try': No such file or directory
 
And here is my interaction with tcsh (my default shell)
H2:~% echo $VISUAL
/opt/local/bin/vim
H2:~% echo $EDITOR
/opt/local/bin/vim
H2:~% which vi
vi:   aliased to /opt/local/bin/vim
H2:~/4Chap2% ls -ld
drwxr-xr-x  11 dbae  dbae  374  8 Aug 10:54 ./


What am I doing wrong?
Thanks for any help.
David


Re: [R] For loop output

2013-08-08 Thread MacQueen, Don
If I understand the request correctly, here is an easy to follow example:
(I'm using the first four letters as surrogates for the file names)
(and assuming we want quotes at both the beginning and the end)


> tmp <- letters[1:4]
> tmp
[1] "a" "b" "c" "d"

> foo <- paste( "'", paste(tmp,collapse="','"), "'", sep='')
> cat(foo,'\n')
'a','b','c','d'

In this case, cat() does a better job than print() of telling you exactly
what you have.
(I use this sort of thing to construct arguments for the SQL IN clause.)
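
As a rough illustration of the SQL use mentioned above, here is a sketch that builds an IN clause with paste0(); the file names, the column name `name`, and the variable names are all made up for the example:

```r
files <- c("a.img", "b.img", "c.img")   # hypothetical file names

# Wrap each element in single quotes, then join with commas (no trailing comma).
quoted <- paste0("'", files, "'", collapse = ",")

# Drop the quoted list into a hypothetical WHERE clause.
in_clause <- sprintf("WHERE name IN (%s)", quoted)
cat(in_clause, "\n")
```

Because collapse= joins the elements in one step, there is no final comma to strip afterwards. For values coming from untrusted input, parameterised queries are safer than pasted strings.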



-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 8/8/13 8:05 AM, Jenny Williams jenny.willi...@kew.org wrote:

I am having difficulty storing the output of a for loop I have generated.
All I want to do is find all the files that I have, create a string with
all of the names in quotes and separated by commas. This is proving more
difficult than I initially anticipated.
I am sure it is either very simple or the construction of the for loop is
not quite right
The result gets automatically printed after the loop but I can't seem to
save it.
I have tried to create the element in advance but the result is the same:
NULL

individual.proj =
Sys.glob("Arabica/proj_current/individual_projections/*.img", dirmark = FALSE)
individual.proj
[1] "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GBM.img"
[2] "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GLM.img"
[3] "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_MARS.img"
[4] "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_RF.img"
[5] "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_RUN10_GBM.img"


##generate loop to create string out of the table of projected files.
L.ip = length(individual.proj)
  for (i in 1:L.ip){
   individual.proj.i <- individual.proj[i]
   individual.proj.quote = cat(paste('"', individual.proj.i, '"', ',', sep=""))
   }
Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.
tmp$pa.tab_Full_GBM.img,Arabica/proj_current/individual_projections/proj
_current

##print output string
individual.proj.quote
NULL

#command to be applied to individual.proj.quote to removed the final
comma from the string
substr(individual.proj.quote, 1, nchar(individual.proj.quote)-1)

Any help or pointers would be greatly appreciated, no amount of extensive
google searches have been fruitful so far.
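
A short sketch of why the loop in this question stores NULL, and one loop-free alternative. The file names are shortened stand-ins for individual.proj, and `printed` is a name invented for the illustration:

```r
files <- c("x/GBM.img", "x/GLM.img")   # hypothetical stand-ins for individual.proj

# cat() writes to the console and returns NULL, so there is nothing to store.
printed <- cat(files[1], "\n")
is.null(printed)

# paste0() returns the string itself; collapse= joins all elements in one call,
# so no loop is needed and no trailing comma has to be removed.
individual.proj.quote <- paste0('"', files, '"', collapse = ",")
individual.proj.quote
```

Inside a loop, each assignment would also overwrite the previous value, which is a second reason the original approach cannot accumulate the full string.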


**
Jenny Williams
Spatial Information Scientist, GIS Unit
Herbarium, Library, Art  Archives Directorate
Royal Botanic Gardens, Kew
Richmond, TW9 3AB, UK

Tel: +44 (0)208 332 5277
email: jenny.willi...@kew.orgmailto:jenny.willi...@kew.org
**



Re: [R] Add column to dataframe based on code in other column

2013-08-08 Thread MacQueen, Don
Assuming your data frame of users is named 'users',
and using your mapping vectors:

 users$regions <- ''
 users$regions[ users$State_Code %in% NorthEast ] <- 'NorthEast'

and repeat for the other regions


Or, if you put your mappings in a data frame then it is
as simple as

  merge(yourdataframe, regions)

(assuming the data frame of mappings is named 'regions')


The regions data frame should have two columns and one row per
state code. The two columns contain the state codes and their
respective regions.

How you get that data frame of regions could vary;
here's an example using your vectors, but just two
of the regions:

regions <- data.frame(
  region= c( rep('NorthEast',length(NorthEast)),
             rep('MidWest',length(MidWest))
           ),
  State_Code=c(NorthEast,MidWest)
  )

Note that this is untested. For example,
I could easily have mismatched parentheses.
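
A small runnable sketch of the merge() route, using abbreviated code vectors and an invented users data frame (the names and the code 99 are illustrative only). all.x = TRUE is added here so that users with an unmapped code are kept with NA rather than dropped:

```r
NorthEast <- c(7, 20, 22)    # abbreviated numeric codes, for illustration only
MidWest   <- c(14, 15, 16)

regions <- data.frame(
  State_Code = c(NorthEast, MidWest),
  REGION     = rep(c("NorthEast", "MidWest"),
                   times = c(length(NorthEast), length(MidWest)))
)

# Hypothetical users; code 99 has no mapping.
users <- data.frame(Name = c("Tom", "Ann", "Zed"), State_Code = c(20, 15, 99))

# Left join on the shared column; unmatched users get NA in REGION.
merged <- merge(users, regions, by = "State_Code", all.x = TRUE)
merged
```

Note that merge() may reorder the rows, so do not rely on the original row order afterwards.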

The whole thing could also be done using match(),
without creating the dataframe of regions.
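
One way the match() version might look, sketched with two of the regions. Storing the codes as character strings (so that a leading zero such as "05" survives) is an assumption of this sketch, as are the names `codes` and `labels`:

```r
# Two of the mapping vectors, kept as character to preserve leading zeros.
NorthEast <- c("07", "20", "22", "30", "31", "33", "39", "41", "47")
West      <- c("02", "03", "05", "06", "12", "13", "27", "29", "32",
               "38", "46", "50", "53")

# Parallel vectors: codes[i] belongs to region labels[i].
codes  <- c(NorthEast, West)
labels <- rep(c("NorthEast", "West"),
              times = c(length(NorthEast), length(West)))

users <- data.frame(Name = c("Tom", "Ben"), State_Code = c("20", "05"),
                    stringsAsFactors = FALSE)

# match() gives the position of each user's code in 'codes'; index 'labels' with it.
users$REGION <- labels[match(users$State_Code, codes)]
users
```

Unlike merge(), this keeps the rows of users in their original order, and unmapped codes come out as NA.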


-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 8/8/13 2:33 AM, Dark i...@software-solutions.nl wrote:

Hi all,

I have a dataframe of users which contain US-state codes.
Now I want to add a column named REGION based on the state code. I have
already done a mapping:

NorthEast <- c(07, 20, 22, 30, 31, 33, 39, 41, 47)
MidWest <- c(14, 15, 16, 17, 23, 24, 26, 28, 35, 36, 43, 52)
South <- c(01, 04, 08, 09, 10, 11, 18, 19, 21, 25, 34, 37, 42, 44, 45, 49,
51)
West <- c(02, 03, 05, 06, 12, 13, 27, 29, 32, 38, 46, 50, 53)
Other <- c(40, 48, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 94,
98, 99)

So for example:
Name   State_Code
Tom    20
Harry  56
Ben    05
Sally  04

Should become like:
Name   State_Code  REGION
Tom    20          NorthEast
Harry  56          Other
Ben    05          West
Sally  04          South

Could anyone help me with a clever statement?





Re: [R] p values for partial correlations

2013-08-08 Thread Peter Langfelder
I am not an expert on shrinkage estimators of partial correlations
(such as the one in corpcor), but my sense is that it is difficult to
provide a good estimate of a p-value. You could try to email the
authors of the package and ask them, but this may be more of a
statistics question than an R question.
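
For comparison only: with the classical (non-shrinkage) sample partial correlation, a p-value follows from a t statistic with n - 2 - k degrees of freedom, where k is the number of conditioning variables. The sketch below uses simulated data and base R; it does not apply to the shrinkage estimates that corpcor produces.

```r
set.seed(1)
n <- 50
z <- rnorm(n)            # conditioning variable (simulated)
x <- z + rnorm(n)
y <- z + rnorm(n)

# Classical partial correlation of x and y given z, via regression residuals.
rx <- resid(lm(x ~ z))
ry <- resid(lm(y ~ z))
r  <- cor(rx, ry)

k     <- 1                          # number of conditioning variables
df    <- n - 2 - k
tstat <- r * sqrt(df / (1 - r^2))
p     <- 2 * pt(-abs(tstat), df)    # two-sided p-value
c(r = r, t = tstat, p = p)
```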

Peter

On Wed, Aug 7, 2013 at 12:01 PM, Demetrio Luis Guadagnin
dlguadag...@gmail.com wrote:
 Dear:
 I needed to calculate partial correlations and used the package corpcor for
 that purpose.
 The output does not provide p values and I was unable to find information or
 posts on how to get them.
 Can someone help me?
 Thanks.

 --
 Dr. Demetrio Luis Guadagnin
 Conservação e Manejo de Vida Silvestre
 Universidade Federal do Rio Grande do Sul
 Departamento de Ecologia
 Av. Bento Gonçalves 9500
 Setor 4, Prédio 43422, Sala 105
 Caixa Postal 15007 - 91501-970 Porto Alegre RS
 Fone: (51) 3308 6774
 Fax: (51) 3308 7626
 dlguadag...@gmail.com
 Skype: demetriolguadagnin



Re: [R] Extracting only multiple occurrences

2013-08-08 Thread Kevin Parent
A lot of helpful solutions that pretty much all work. Thanks, everyone!

 
_
Kevin Parent, Ph.D
Korea Maritime University


From: Rolf Turner rolf.tur...@xtra.co.nz
To: Jim Lemon j...@bitwrit.com.au
Sent: Thursday, August 8, 2013 6:26 PM
Subject: Re: [R] Extracting only multiple occurrences


On 08/08/13 20:27, Jim Lemon wrote:
 On 08/08/2013 04:23 PM, Kevin Parent wrote:
 Well that almost works, and I didn't know about duplicated() so 
 thanks for that. However, it only gives me the duplicated values. I 
 need the original ones too. So the result I want is: 
 [g,g,m,m,s,s,t,t,u,u,u,v,v,x,x,y,y,y]. What duplicated() gives me is 
 [g,m,s,t,u,u,v,x,y,y]


 Hi Kevin,
 How about:

 x[x %in% duplicated(x)]

Uh, I think you mean

     x[x %in% x[duplicated(x)]]

Another idear:

     tx <- table(x)
     tx <- tx[tx > 1]
     rep(names(tx), tx)

Well, that's three lines as opposed to one, so not as good.  But it perhaps
demonstrates a useful tool to add to one's kit.
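
A runnable sketch putting the approaches from this thread side by side on the data from the original question. The third variant, using duplicated() from both ends, is an addition that was not posted in the thread; the exact letters drawn depend on the R version's sampler, so no output is shown.

```r
set.seed(2013)
x <- sort(c(letters, letters[sample(26, 10, TRUE)]))

# 1. Keep every element whose value occurs more than once.
a <- x[x %in% x[duplicated(x)]]

# 2. Tabulate, keep counts above one, then expand back out.
tx <- table(x)
tx <- tx[tx > 1]
b  <- rep(names(tx), times = as.vector(tx))

# 3. Flag duplicates scanning from the front and from the back (not from the thread).
d <- x[duplicated(x) | duplicated(x, fromLast = TRUE)]

identical(a, b) && identical(a, d)
```

Variants 1 and 3 preserve the original order of x; variant 2 returns values in sorted order, which matches here only because x was sorted to begin with.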

     cheers,

     Rolf


[R] t.test error

2013-08-08 Thread iza.ch1
Hi 

I receive a very strange error message after trying to do t-test. When I write 
the code t.test(x) I get an error message: error in t.test(x) : function sqr 
not found

I don't understand this problem. Can someone help me how to do it right?

Thanks a lot :)



Re: [R] t.test error

2013-08-08 Thread David Winsemius

On Aug 8, 2013, at 6:09 PM, iza.ch1 wrote:

 Hi 
 
 I receive a very strange error message after trying to do t-test. When I 
 write the code t.test(x) I get an error message: error in t.test(x) : 
 function sqr not found
 
 I don't understand this problem. Can someone help me how to do it right?

Not unless you provide the code. I suspect you have written your own version of
't.test' and have overwritten the version that is from the stats package. It's
possible to mask R functions. When you type `t.test`, do you see this?

> t.test
function (x, ...)
UseMethod("t.test")
<bytecode: 0x102c124a0>
<environment: namespace:stats>
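
A sketch of how such masking can be reproduced and worked around. The user-defined t.test below is invented to mimic the reported error (it calls a non-existent sqr() instead of sqrt()); the poster's actual code is unknown.

```r
# Hypothetical user function that masks stats::t.test and calls a missing sqr().
t.test <- function(x, ...) mean(x) / sqr(var(x) / length(x))

# Where does the visible t.test come from? For a copy defined at top level in an
# interactive session this reports the global environment rather than "stats".
environmentName(environment(t.test))

x <- c(5.1, 4.9, 5.6, 5.8, 6.0)

# Bypass the mask by naming the package explicitly ...
res <- stats::t.test(x)

# ... or remove the user copy so the stats version is found again.
rm(t.test)
environmentName(environment(t.test))
```

find("t.test") is another quick check in an interactive session: it lists every place on the search path that defines the name.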


--
David Winsemius
Alameda, CA, USA



Re: [R] Creating new vectors from other dataFrames

2013-08-08 Thread arun
Hi,

Not sure about your expected result.

library(plyr)
data2New <- join_all(lapply(setdiff(names(data1), names(data2)), function(x)
{data2[,x] <- NA; data2}))

data1New <- join_all(lapply(setdiff(names(data2),
names(data1)), function(x) {data1[,x] <- NA; data1}))
data1New
#  a b  c  d  e  z  f  g
#1 1 5  9 13 17 21 NA NA
#2 2 6 10 14 18 22 NA NA
#3 3 7 11 15 19 23 NA NA
#4 4 8 12 16 20 24 NA NA
A.K.



- Original Message -
From: Steven Ranney steven.ran...@gmail.com
To: r-help@r-project.org r-help@r-project.org
Cc: 
Sent: Thursday, August 8, 2013 2:01 PM
Subject: [R] Creating new vectors from other dataFrames

I have two data frames

data1 <- as.data.frame(matrix(data=c(1:4,5:8,9:12,13:24), nrow=4, ncol=6,
byrow=F, dimnames=list(c(1:4),c("a","b","c","d","e","z"))))
data2 <- as.data.frame(matrix(data=c(1:4,5:8,9:12,37:48), nrow=4, ncol=6,
byrow=F, dimnames=list(c(1:4),c("a","b","c","f","g","z"))))

that have some common column names.

Comparing the names of the columns within each data frame to the other

setdiff(names(data1), names(data2))
setdiff(names(data2), names(data1))

provides which columns are different.

For each column that appears in data1 that DOES NOT appear in data2, I need
to create those columns and fill them with NA values.  The same is true for
the reverse.  So, I can create a vector of new column names that need to be
filled with NA values, but here is where I'm stuck.  I don't know how to
get the names from inside the vector into the respective dataFrame.

tmp1 <- as.factor(paste("data2$", setdiff(names(data1), names(data2)),
sep=""))
tmp2 <- as.factor(paste("data1$", setdiff(names(data2), names(data1)),
sep=""))

Of course, if it were as simple as only a few columns, I could do all of
this by hand, but in my original data frames, I have 60 different columns
that need to be created and filled with NA values for both data1 and data2.

Eventually, the point of this exercise is so that I can rbind(data1, data2)
and create a SQL table out of the merged dataFrames.  Unfortunately, I
can't rbind() everything until the column names are common across both
data1 and data2.
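
A base-R sketch of one way to do this without building column references as strings, shown on the two example data frames from this message. The name `combined` is made up for the illustration.

```r
data1 <- as.data.frame(matrix(1:24, nrow = 4,
           dimnames = list(NULL, c("a", "b", "c", "d", "e", "z"))))
data2 <- as.data.frame(matrix(c(1:12, 37:48), nrow = 4,
           dimnames = list(NULL, c("a", "b", "c", "f", "g", "z"))))

# Index by a character vector of the missing names; each new column is filled with NA.
data2[setdiff(names(data1), names(data2))] <- NA
data1[setdiff(names(data2), names(data1))] <- NA

# Both frames now share the same column names, so they can be stacked.
combined <- rbind(data1, data2[names(data1)])
dim(combined)
```

The same two assignment lines work unchanged when 60 columns are missing, because setdiff() returns all of the missing names at once.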

Thoughts?

Thanks -

SR



Steven H. Ranney


Re: [R] Creating new vectors from other dataFrames

2013-08-08 Thread arun


Also, a more compact solution would be:

library(plyr)
#Creating a different data frame, as data2's columns were almost the same as
#data1's
set.seed(24)
data3 <- as.data.frame(matrix(sample(1:40,6*4,replace=TRUE),ncol=6))
colnames(data3) <- colnames(data2)
join(data3,data1)
#Joining by: a, b, c, z
#   a  b  c  f  g  z  d  e
#1 12 27 33 27  8  4 NA NA
#2  9 37 11 27  2 23 NA NA
#3 29 12 25 13 21 30 NA NA
#4 21 31 15 37  6  6 NA NA
join(data1,data3)
#Joining by: a, b, c, z
#  a b  c  d  e  z  f  g
#1 1 5  9 13 17 21 NA NA
#2 2 6 10 14 18 22 NA NA
#3 3 7 11 15 19 23 NA NA
#4 4 8 12 16 20 24 NA NA

A.K.






From: Steven Ranney steven.ran...@gmail.com
To: arun smartpink...@yahoo.com 
Cc: R help r-help@r-project.org 
Sent: Thursday, August 8, 2013 2:21 PM
Subject: Re: [R] Creating new vectors from other dataFrames



This is exactly what I'm looking for.  Each dataFrame will have those columns 
that are endemic to the other filled with NA.

Thanks.


Steven H. Ranney

