date:20150616

An aside...

Just wanted to point out that:

fun <- function(x) log(x)

can be more simply replaced by:

fun <- log

Functions in R are full first-class objects and can be treated as such. In
your example, this is still silly of course, but it becomes relevant in
function calls where you can do things like

myfun <- function(X, FUN = log, ...)
{
  ## ...
  something <- FUN(X)
  ## ...
}
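
As a small illustration of passing a function as an argument, here is a sketch only; the name summarise_with and its arguments are hypothetical and do not come from the thread:

```r
# Hypothetical helper: applies whatever function is passed as FUN to x,
# then takes the mean of the transformed values.
summarise_with <- function(x, FUN = log, ...) {
  transformed <- FUN(x, ...)   # FUN is an ordinary object that holds a function
  mean(transformed)
}

x <- c(1, 10, 100)
default_result <- summarise_with(x)                        # uses log
sqrt_result    <- summarise_with(x, FUN = sqrt)            # swap in another function
base10_result  <- summarise_with(x, FUN = log, base = 10)  # extra arguments travel via ...
```

Because FUN is just a value, the caller chooses the transformation without the helper needing to know about it.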

Just in case this might be useful to you.

Cheers,
Bert

Bert Gunter

Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom.
   -- Clifford Stoll

On Mon, Jun 15, 2015 at 4:32 PM, Greg Hather ghat...@gmail.com wrote:

 Hello R users,

 I encountered a strange problem while writing a package that uses the
 nlme function.  First, I wrote some code that uses the nlme function,
 and it ran without errors.  However, when I tried to put the code into
 a package, the nlme function was unable to locate a function that was
 used in the formula.  Could it be that nlme is looking in the wrong
 environment?  I would appreciate any suggestions.  Below is a
 reproducible example with the problem.

 ### BEGIN EXAMPLE ##

 #' Fake package to show nlme error
 #' @export

 main_function <- function(x){
  library(nlme)
  result <- nlme(height ~ SSasymp(age, Asym, R0, lrc) +
                   nonlinear_function(age),
                 data = Loblolly,
                 fixed = Asym + R0 + lrc ~ 1,
                 random = Asym ~ 1,
                 start = c(Asym = 103, R0 = -8.5, lrc = -3.3))
  result
 }

 nonlinear_function <- function(x){
  log(x)
 }

 ### END EXAMPLE ##

 The above code can be installed as a package and run with the commands

 library(devtools)
 library(roxygen2)
 setwd("C:/test")  # or any preferred directory
 create("testPackage")
 setwd("./testPackage")
 document()
 setwd("..")
 install("testPackage")
 main_function()

 The output is

  > main_function()
 Error in eval(expr, envir, enclos) :
  could not find function "nonlinear_function"

  > sessionInfo()
 R version 3.1.3 (2015-03-09)
 Platform: x86_64-w64-mingw32/x64 (64-bit)
 Running under: Windows 8 x64 (build 9200)
 locale:
 [1] LC_COLLATE=English_United States.1252
 [2] LC_CTYPE=English_United States.1252
 [3] LC_MONETARY=English_United States.1252
 [4] LC_NUMERIC=C
 [5] LC_TIME=English_United States.1252
 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods
 [7] base
 other attached packages:
 [1] nlme_3.1-120           testPackage_0.0.0.9000
 [3] roxygen2_4.1.1         devtools_1.8.0
 loaded via a namespace (and not attached):
  [1] curl_0.8        digest_0.6.8    git2r_0.10.1
  [4] grid_3.1.3      lattice_0.20-31 magrittr_1.5
  [7] memoise_0.2.1   Rcpp_0.11.6     rversions_1.0.1
 [10] stringi_0.4-1   stringr_1.0.0   tools_3.1.3
 [13] xml2_0.1.1

 Note that if I simply paste main_function and nonlinear_function into
 the R console, then main_function() runs without errors.

 Greg

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] Error in local package install

2015-06-16 Thread Uwe Ligges




On 16.06.2015 16:33, Axel Urbiz wrote:

Thanks again Uwe. I haven't renamed the file, only in the text sent to
R-help. Here's the error I'm getting again. Sorry, this is a bit
frustrating...


No idea. Perhaps the download failed? Can you open the file using some
zip software and extract the DESCRIPTION file?


Best,
Uwe Ligges




Thanks,
Axel


Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
   cannot open the connection
In addition: Warning messages:
1: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file
2: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
   cannot open compressed file 'calibr/DESCRIPTION', probable reason 'No
   such file or directory'
 


[R] Question about XML package (accurately access one attribute in an multi-attribution node on the web page)

2015-06-16 Thread Humphrey Zhao

Dear Sir/Madam:

Thank you for your attention to my question. I have downloaded the source code
of some web pages with RCurl, and I am trying to extract URLs from them. In
these web pages, many nodes contain the same URL, such as the following:

<a href=\"http://cos.name/2015/05/the-data-wisdom-for-data-science/\" rel=\"bookmark\">

<a href=\"http://blog.shakirm.com/2015/03/a-statistical-view-of-deep-learning-ii-auto-encoders-and-free-energy/\" target=\"_blank\">

<a href=\"http://cos.name/2015/05/the-data-wisdom-for-data-science/#more-10947\" class=\"more-link\">

I want to select exactly the URL I need (the href in the first one). I have
tried many approaches, and the closest I have come is the following:

library(XML)

#links <- getHTMLLinks(base.html, xpQuery = "//a/@href")

links <- getHTMLLinks(base.html, xpQuery = c("//a/href[@rel='bookmark']"))

However, I still believe there is a correct way to do this, but I could not
find it. I wonder if you could give me some advice on solving this problem.
I would be most grateful if you could reply at your earliest convenience.
Looking forward to hearing from you. Thank you very much.
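
One possible way to express the selection, sketched under the assumption that the goal is the href of every <a> element whose rel attribute is "bookmark": in XPath the predicate belongs on the element, followed by a step to the attribute. The page_source string below is a hypothetical stand-in for the downloaded page.

```r
library(XML)

# Hypothetical page source standing in for the text downloaded with RCurl.
page_source <- paste0(
  '<html><body>',
  '<a href="http://cos.name/2015/05/the-data-wisdom-for-data-science/" rel="bookmark">post</a>',
  '<a href="http://blog.shakirm.com/2015/03/a-statistical-view-of-deep-learning-ii-auto-encoders-and-free-energy/" target="_blank">external</a>',
  '<a href="http://cos.name/2015/05/the-data-wisdom-for-data-science/#more-10947" class="more-link">more</a>',
  '</body></html>'
)

doc <- htmlParse(page_source, asText = TRUE)

# Filter the <a> elements by their rel attribute, then step to href.
links <- unname(xpathSApply(doc, "//a[@rel='bookmark']/@href"))
```

The same XPath string should also be usable as the xpQuery argument of getHTMLLinks().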

 Sincerely yours 

 Humphrey Zhao

Re: [R] Boxplot using a shapefile

2015-06-16 Thread Preethi Balaji

Dear all,

Thanks very much for your help! I will keep your suggestions in mind
and will get back to you if I get stuck!



On Tue, Jun 16, 2015 at 1:28 PM, Roger Bivand roger.biv...@nhh.no wrote:
 Boris Steipe boris.steipe at utoronto.ca writes:


 Your workflow in principle is:

 - read the image into an object for which you can obtain values-per-pixel
 in a 2D structure;
 - read the shapefile and convert into a polygon;
 - determine the bounding box of the polygon;
 - use the inout() function of the splancs package to get a list of
   booleans for the points in the bounding box, TRUE if they are _inside_
   the polygon;
 - subset your image points to those for which inout() returns TRUE;
 - plot as boxplot().

 The CRAN taskview http://cran.r-project.org/web/views/MedicalImaging.html
 has a section on general image processing, guiding you to helpful packages.

 Actually, this is the wrong taskview if the data are as described, as
 Spatial data are covered in the Spatial task view at:

 http://cran.r-project.org/web/views/Spatial.html

 The workflow as described is also muddled: "[T]he shapefile takes the
 pixel values from the image and shows the distribution of pixels in
 the form of a boxplot" doesn't actually mean anything without further
 assumptions.

 A shapefile is an ESRI file format for GIS vector geometries (and
 attributes) that may be polygons, lines or points, and has an associated
 coordinate reference system; it is almost never used for other kinds of data.

 The image - presumably a GIS raster data file, should have the same
 coordinate reference system, or be transformed to the same system (use
 spTransform in the rgdal package, which is also the package you should use
 for reading the input data as it correctly reads input coordinate reference
 systems if available).

 The operation then needed is called an over() method in the sp package, and
 extract() in the raster package.

 If the shapefile contains points, the over query is asking the value(s) of
 the raster cells (pixels) at those points, given the same coordinate
 reference systems - but only one boxplot. If lines, for each line you may
 get a vector of values from raster cells intersected by the lines, and could
 make a boxplot for each line; you may wish to weight each value by the
 length of line in each cell. If polygons, as lines, with weighting by
 intersection area.
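
 A rough sketch of the polygon case with the raster package, using a toy
 raster and two made-up square polygons in place of the real image and
 shapefile (all object names here are hypothetical, and no weighting by
 intersection area is attempted):

```r
library(sp)
library(raster)

# Toy 10 x 10 raster standing in for the image; cell values are arbitrary.
r <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(r) <- 1:100

# Two 5 x 5 squares standing in for the shapefile's polygons.
# Coordinates run clockwise so sp treats each ring as an outer boundary.
square <- function(x0, y0, id) {
  xy <- cbind(c(x0, x0, x0 + 5, x0 + 5, x0),
              c(y0, y0 + 5, y0 + 5, y0, y0))
  Polygons(list(Polygon(xy, hole = FALSE)), ID = id)
}
polys <- SpatialPolygons(list(square(0, 0, "A"), square(5, 5, "B")))

# extract() returns one vector of cell values per polygon.
vals <- raster::extract(r, polys)
names(vals) <- c("A", "B")

# One box per polygon.
boxplot(vals, ylab = "cell value")
```

 With real data, both layers would first need the same coordinate reference
 system, as described above.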

 The over vignette in the sp package is where you need to go to begin:

 http://cran.r-project.org/web/packages/sp/vignettes/over.pdf

 and the introduction to the raster package as a further reference:

 http://cran.r-project.org/web/packages/raster/vignettes/Raster.pdf


 Ask again if you get stuck - but(!):
 - see here for some hints on how to ask questions productively:
   http://adv-r.had.co.nz/Reproducibility.html

 http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
 - ... and please read the posting guide and don't post in HTML.


 Definitely! And note that this is a question that is better suited to the
 R-sig-geo list.

 Hope this clarifies,

 Roger

 B.

 On Jun 15, 2015, at 7:19 AM, Preethi Balaji preet.balaji20 at
 gmail.com wrote:

  Dear all,
 
  I am trying to generate boxplots by giving a shapefile and an image as
  input. The shapefile takes the pixel values from the image and shows
  the distribution of pixels in the form of a boxplot.
 
  Can somebody please tell me how I can execute this in R?
 
  Many thanks!
 
  --
 
  Regards,
  Preethi Malur Balaji | PhD Student
  University College Cork | Cork, Ireland.




-- 

Regards,
Preethi Malur Balaji | PhD Student
University College Cork | Cork, Ireland.


Re: [R] problem with nlme, environments, and packages

2015-06-16 Thread Duncan Murdoch

On 16/06/2015 10:34 AM, Greg Hather wrote:
 Hi Duncan,
 
 I checked the global environment, and it was empty, so I think that
 rules out the second possibility.  I posted a tarball at
 
 https://drive.google.com/file/d/0B8hBX90jtuLcaGtOUktqV2V4UUU/view?usp=sharing
 
 Thank you for your help!
 
 Greg
 

The problem is that nlme does a lot of evaluation of formula objects
without taking their associated environment into account.  Fixing it
doesn't look easy, because the evaluation happens in a lot of places.
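
The effect can be reproduced in base R without nlme. The sketch below is only an analogy: hidden_helper and make_formula are hypothetical names, and the local function frame plays the role of the package namespace.

```r
# Build a formula inside a local environment that also holds a helper,
# mimicking a package namespace that the global environment cannot see.
make_formula <- function() {
  hidden_helper <- function(x) log(x)  # stand-in for nonlinear_function
  y ~ hidden_helper(x)
}

f   <- make_formula()
dat <- data.frame(x = c(1, 10, 100), y = 1:3)
rhs <- f[[3]]                          # the call hidden_helper(x)

# Using the formula's own environment as the enclosure finds the helper.
ok <- eval(rhs, dat, enclos = environment(f))

# Ignoring that environment fails in the same way as the reported error.
bad <- tryCatch(eval(rhs, dat, enclos = globalenv()),
                error = function(e) conditionMessage(e))
```

Code that evaluates pieces of a formula therefore needs to pass environment(formula) along if functions defined next to the formula are to be found.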

One workaround is to put the appropriate environment(s) on the search
list before calling nlme().  This isn't perfect, because the search
order will be wrong, but it will get you something.

For example, your main_function could be

main_function <- function(x){

  library(nlme)
  attach(parent.env(env=environment()))
  result <- nlme(height ~ SSasymp(age, Asym, R0, lrc) +
                   nonlinear_function(age),
                 data = Loblolly,
                 fixed = Asym + R0 + lrc ~ 1,
                 random = Asym ~ 1,
                 start = c(Asym = 103, R0 = -8.5, lrc = -3.3))
  detach()
  result
}

Duncan


Re: [R] Error in local package install

2015-06-16 Thread Axel Urbiz

Thanks again Uwe. I haven't renamed the file, only in the text sent to
R-help. Here's the error I'm getting again. Sorry, this is a bit
frustrating...

Thanks,
Axel


Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
  cannot open the connection
In addition: Warning messages:
1: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file
2: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
  cannot open compressed file 'calibr/DESCRIPTION', probable reason 'No
  such file or directory'


On Tue, Jun 16, 2015 at 10:18 AM, Uwe Ligges 
lig...@statistik.tu-dortmund.de wrote:



 On 16.06.2015 15:16, Axel Urbiz wrote:

 Thanks Uwe. Actually, the problem persists in R-3.2.1.

 If it helps, the .zip file is here:

 http://win-builder.r-project.org/yC8eUu09w3Ui/



 Works for me, but your error message is:


 cannot open compressed file 'mypackage/DESCRIPTION'

 which suggests you renamed the file?  You must not do that, just keep the
 filename calibr_0.0.0.9000.zip.

 Best,
 Uwe Ligges


  Thank you,
 Axel.



 On Mon, Jun 15, 2015 at 5:41 PM, Uwe Ligges
 lig...@statistik.tu-dortmund.de
 mailto:lig...@statistik.tu-dortmund.de wrote:



 On 15.06.2015 22:32, Axel Urbiz wrote:

 Hello,

 I've built a Windows binary package from my Mac using the help from this
 site: http://win-builder.r-project.org

 As expected, I got back the file mypackage.zip. Also, the logs show no
 errors.


 No, you got a file packagename_version.zip.



 Now, when I try to install on Windows using the GUI "Install package(s)
 from local zip files", I get the following error:

 utils:::menuInstallLocal()

 Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
   cannot open the connection
 In addition: Warning messages:
 1: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file
 2: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
   cannot open compressed file 'mypackage/DESCRIPTION', probable reason 'No
   such file or directory'

 I've attempted to use the solutions from prior similar email threads with
 no success. Btw - I've installed all the package's dependencies prior to
 the above. I'm on R 3.2.0.


 Please try the release candidate of R-3.2.1; R-3.2.0 had a bug for
 package installation from local zip files.

 Best,
 Uwe Ligges


 Any guidance would be much appreciated.

 Thank you.

 Axel.


[R] Error working dsm: Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column

2015-06-16 Thread Milagros Antun

Hello, I'm trying to use the dsm package (library(Distance); library(dsm)),
following Miller's Appendix (
http://onlinelibrary.wiley.com/store/10./2041-210X.12105/asset/supinfo/mee312105-sup-0001-AppendixS1.pdf?v=1s=ced953b57365e5eb5753f0ad76dcc02c26918736
 ).

I work with three data frames, whose str() output is:

*1) segdata:*
'data.frame': 193 obs. of  17 variables:
 $ Sample.Lab: int  1 2 3 4 5 6 7 8 9 10 ...
 $ Transect.Label: Factor w/ 56 levels "1","100","101",..: 36 36 36 36 36 20 56 52 52 52 ...
 $ Effort: int  1800 1800 1800 1800 1800 1800 1800 1800 1800 1800 ...
 $ x : num  4443636 4437817 4442085 4440564 4439117 ...
 $ y : num  5267395 5271579 5268309 5269266 5270337 ...
 $ ID_ESTRATO: int  3 2 3 2 2 2 2 2 4 2 ...
 $ NDVI2010  : num  1813 1816 1804 1807 1816 ...
 $ NDVI2011  : num  2007 1943 1935 1894 1893 ...
 $ NDVI2012  : num  1705 1736 1686 1691 1729 ...
 $ NDVI2013  : num  2206 2305 2145 2211 2279 ...
 $ PROM_NDVI : num  2218 2313 2148 2206 2275 ...
 $ DIST_PUEST: num  959 455 2652 3194 1394 ...
 $ DIST_CUADR: num  1482.1 137.5 549.9 62.9 514.8 ...
 $ DIST_MOLIN: num  794 5022 2519 4156 5715 ...
 $ X_4326: num  -63.7 -63.8 -63.7 -63.7 -63.7 ...
 $ Y_4326: num  -42.7 -42.7 -42.7 -42.7 -42.7 ...
 $ O.KM2_2015: num  64.1 34.6 43.4 44.4 46.6 ...

*2) obsdata:*
'data.frame': 399 obs. of  6 variables:
 $ Especie.: Factor w/ 1 level "Oveja": 1 1 1 1 1 1 1 1 1 1 ...
 $ size: int  3 1 5 18 6 2 6 3 5 2 ...
 $ distance: int  210 178 65 210 250 37 72 350 380 320 ...
 $ object  : int  1 2 5 7 8 13 14 20 30 31 ...
 $ Sample.Label: int  26 26 30 30 30 29 28 27 31 31 ...
 $ Effort  : num  1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 ...

*3)disdata*
'data.frame': 399 obs. of  7 variables:
 $ x   : num  4418278 4418667 4421229 4421308 4421308 ...
 $ y   : num  5299140 5298846 5295963 5295805 5295805 ...
 $ Especie.: Factor w/ 1 level "Oveja": 1 1 1 1 1 1 1 1 1 1 ...
 $ size: int  3 1 5 18 6 2 6 3 5 2 ...
 $ distance: int  210 178 65 210 250 37 72 350 380 320 ...
 $ object  : int  1 2 5 7 8 13 14 20 30 31 ...
 $ Effort  : num  1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 ...


First, I fitted a detection function with this script:

hr.model <- ds(distdata, truncation = "10%", transect = "line",
               dht.group = FALSE, key = "hr", convert.units = 1,
               adjustment = NULL)

Then I tried to fit a very simple model with this script:

mod1 <- dsm(count ~ s(x, y, k = 6), ddf.obj = hr.model, segdata, obsdata,
            engine = "gam", convert.units = 1,
            family = quasipoisson(link = "log"), group = FALSE, gamma = 1.4,
            control = list(keepData = TRUE), availability = 1,
            segment.area = NULL, weights = NULL)

Here something goes wrong, because R shows me an error:

Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
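
For context, this message is raised by base R's merge() when the column named in by is missing from one of the two tables. Whether dsm calls merge() on a Sample.Label column internally is an assumption here, not something the thread confirms. The sketch below uses toy data frames (seg and obs are hypothetical stand-ins) to reproduce the message; note that the str() output above lists Sample.Lab in segdata but Sample.Label in obsdata.

```r
# Toy stand-ins for segdata and obsdata; only the key columns matter here.
seg <- data.frame(Sample.Lab = 1:3, Effort = 1800)   # key name is truncated
obs <- data.frame(Sample.Label = c(1, 1, 3), size = c(3, 1, 5))

# merge() cannot find "Sample.Label" in seg, so it stops with the error.
msg <- tryCatch(merge(obs, seg, by = "Sample.Label"),
                error = function(e) conditionMessage(e))

# Renaming the key so both tables agree lets the merge go through.
names(seg)[names(seg) == "Sample.Lab"] <- "Sample.Label"
merged <- merge(obs, seg, by = "Sample.Label")
```

Comparing names(segdata) and names(obsdata) against the column names the package documentation asks for is a cheap first check.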

Can anybody help me? Thanks in advance!


*Milagros*
-- 
Lic. Ma. de los Milagros Antún
Centro Nacional Patagónico-CONICET
Boulevard Brown 2915
9120 Puerto Madryn
Argentina
Tel. +54 (0) 280 4883184
Interno 1345
Fax +54 (0) 280 4883543


[R] reading daily snow depth data

2015-06-16 Thread Alemu Tadesse

Dear All,

I am trying to read daily snow depth data for each state and station/city
from the following link. I have not been able to separate a given state's
data from the rest of the file's contents, read it into a data frame, and
save it to a file.

http://www1.ncdc.noaa.gov/pub/data/snowmonitoring/fema/06-2015-dlysndpth.txt

I really appreciate your time and help, and would also appreciate any
information on an alternative source.

Best,

Alemu


Re: [R] model selection

Wrong list! This is about R. Post on a statistics list like
stats.stackexchange.com for statistics questions.

Cheers,
Bert

Bert Gunter

Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom.
   -- Clifford Stoll

On Mon, Jun 15, 2015 at 3:55 PM, bruno cid bccgu...@yahoo.com.br wrote:

 Hi friends,

 I'm trying to do model selection comparing models built with the lm
 function (package stats) and the lme function (package nlme). Do you know
 if there is a problem with comparing these models using the function
 AICtab (package bbmle)?

 Thanks!!!
 Bruno Cid Crespo Guimarães
 MSc in Ecology
 Laboratório de Ecologia e Conservação de Populações
 Universidade Federal do Rio de Janeiro

[R-es] Error using the dsm package

2015-06-16 Thread Milagros Antun

Hello, I am trying to use the dsm package (library(Distance); library(dsm)),
following Miller's Appendix (
http://onlinelibrary.wiley.com/store/10./2041-210X.12105/asset/supinfo/mee312105-sup-0001-AppendixS1.pdf?v=1s=ced953b57365e5eb5753f0ad76dcc02c26918736
).

I work with three data frames, whose str() output is detailed below:

*1) segdata:*
'data.frame': 193 obs. of  17 variables:
 $ Sample.Lab: int  1 2 3 4 5 6 7 8 9 10 ...
 $ Transect.Label: Factor w/ 56 levels "1","100","101",..: 36 36 36 36 36 20 56 52 52 52 ...
 $ Effort: int  1800 1800 1800 1800 1800 1800 1800 1800 1800 1800 ...
 $ x : num  4443636 4437817 4442085 4440564 4439117 ...
 $ y : num  5267395 5271579 5268309 5269266 5270337 ...
 $ ID_ESTRATO: int  3 2 3 2 2 2 2 2 4 2 ...
 $ NDVI2010  : num  1813 1816 1804 1807 1816 ...
 $ NDVI2011  : num  2007 1943 1935 1894 1893 ...
 $ NDVI2012  : num  1705 1736 1686 1691 1729 ...
 $ NDVI2013  : num  2206 2305 2145 2211 2279 ...
 $ PROM_NDVI : num  2218 2313 2148 2206 2275 ...
 $ DIST_PUEST: num  959 455 2652 3194 1394 ...
 $ DIST_CUADR: num  1482.1 137.5 549.9 62.9 514.8 ...
 $ DIST_MOLIN: num  794 5022 2519 4156 5715 ...
 $ X_4326: num  -63.7 -63.8 -63.7 -63.7 -63.7 ...
 $ Y_4326: num  -42.7 -42.7 -42.7 -42.7 -42.7 ...
 $ O.KM2_2015: num  64.1 34.6 43.4 44.4 46.6 ...

*2) obsdata:*
'data.frame': 399 obs. of  6 variables:
 $ Especie.: Factor w/ 1 level "Oveja": 1 1 1 1 1 1 1 1 1 1 ...
 $ size: int  3 1 5 18 6 2 6 3 5 2 ...
 $ distance: int  210 178 65 210 250 37 72 350 380 320 ...
 $ object  : int  1 2 5 7 8 13 14 20 30 31 ...
 $ Sample.Label: int  26 26 30 30 30 29 28 27 31 31 ...
 $ Effort  : num  1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 ...

*3)disdata*
'data.frame': 399 obs. of  7 variables:
 $ x   : num  4418278 4418667 4421229 4421308 4421308 ...
 $ y   : num  5299140 5298846 5295963 5295805 5295805 ...
 $ Especie.: Factor w/ 1 level "Oveja": 1 1 1 1 1 1 1 1 1 1 ...
 $ size: int  3 1 5 18 6 2 6 3 5 2 ...
 $ distance: int  210 178 65 210 250 37 72 350 380 320 ...
 $ object  : int  1 2 5 7 8 13 14 20 30 31 ...
 $ Effort  : num  1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 ...


After fitting a detection function with the script:

hr.model <- ds(distdata, truncation = "10%", transect = "line",
               dht.group = FALSE, key = "hr", convert.units = 1,
               adjustment = NULL)

I try to fit my data to a model applying GAMs through the dsm package,
running the following script:

mod1 <- dsm(count ~ s(x, y, k = 6), ddf.obj = hr.model, segdata, obsdata,
            engine = "gam", convert.units = 1,
            family = quasipoisson(link = "log"), group = FALSE, gamma = 1.4,
            control = list(keepData = TRUE), availability = 1,
            segment.area = NULL, weights = NULL)

This is where I run into trouble, since I get the error:

Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column

Could anyone help me resolve it? Many thanks in advance!

*Milagros*
-- 
Lic. Ma. de los Milagros Antún
Centro Nacional Patagónico-CONICET
Boulevard Brown 2915
9120 Puerto Madryn
Argentina
Tel. +54 (0) 280 4883184
Interno 1345
Fax +54 (0) 280 4883543


___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

Thanks, Dimitri.  Bert is the real wizard here--I'll bet he can conjure up
an elegant solution.


For me, just reaching a desired endpoint is enough.

Clint

Clint Bowman            INTERNET:   cl...@ecy.wa.gov
Air Quality Modeler     INTERNET:   cl...@math.utah.edu
Department of Ecology   VOICE:      (360) 407-6815
PO Box 47600            FAX:        (360) 407-7534
Olympia, WA 98504-7600

        USPS:           PO Box 47600, Olympia, WA 98504-7600
        Parcels:        300 Desmond Drive, Lacey, WA 98503-1274

On Tue, 16 Jun 2015, Dimitri Liakhovitski wrote:


Thank you, Clint.
That's the thing: it's relatively easy to do it in base, but the
resulting code is not THAT simple.
I thought dplyr would make it easy...

On Tue, Jun 16, 2015 at 2:06 PM, Clint Bowman cl...@ecy.wa.gov wrote:

May want to add headers but the following provides the device number with
each set of sums:

for (dev in unique(md$device))
{cat(colSums(subset(md, md$device == dev) == 5, na.rm = TRUE), dev, "\n")}


On Tue, 16 Jun 2015, Dimitri Liakhovitski wrote:


Except, of course, Bert, that you forgot that it had to be done by
device. Your solution ignores the device.

md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1),
                 c = c(1,3,4,3,5,5),
                 device = c(1,1,2,2,3,3))
myvars = c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA
md
vapply(md[myvars], function(x) sum(x == 5, na.rm = TRUE), 1L)

But the result should be by device.
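
One way to get the per-device counts for every column in myvars at once, sketched in base R rather than dplyr (an illustration only, not a solution proposed in the thread):

```r
# Same example data as in the thread.
md <- data.frame(a = c(3, 5, 4, 5, 3, 5), b = c(5, 5, 5, 4, 4, 1),
                 c = c(1, 3, 4, 3, 5, 5),
                 device = c(1, 1, 2, 2, 3, 3))
myvars <- c("a", "b", "c")
md[2, 3] <- NA
md[4, 1] <- NA

# md[myvars] == 5 is a logical matrix; summing it within each device
# counts the 5s per column, with NAs dropped by na.rm = TRUE.
counts <- aggregate(md[myvars] == 5, by = list(device = md$device),
                    FUN = sum, na.rm = TRUE)
counts
```

Because the columns are selected through myvars, the call stays the same however many variables there are.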

On Tue, Jun 16, 2015 at 1:56 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:


Thank you, Bert.
I'll be honest - I am just learning dplyr and was wondering if one
could do it in dplyr.
But of course your solution is perfect...

On Tue, Jun 16, 2015 at 1:50 PM, Bert Gunter bgunter.4...@gmail.com
wrote:


Well, dplyr seems a bit of overkill as it's so simple with plain old
vapply() in base R :



> dat <- data.frame(a = sample(1:5, 10, rep = TRUE),
+                   b = sample(3:7, 10, rep = TRUE),
+                   g = sample(7:9, 10, rep = TRUE))
> vapply(dat, function(x) sum(x == 5, na.rm = TRUE), 1L)
a b g
5 4 0



Cheers,
Bert

Bert Gunter

Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom.
   -- Clifford Stoll

On Tue, Jun 16, 2015 at 10:24 AM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:



Hello!

I have a data frame:

md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1),
                 c = c(1,3,4,3,5,5),
                 device = c(1,1,2,2,3,3))
myvars = c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA
md

I want to count the number of 5s in each column - by device. I can do it
like this:

library(dplyr)
group_by(md, device) %>%
  summarise(counts.a = sum(a == 5, na.rm = TRUE),
            counts.b = sum(b == 5, na.rm = TRUE),
            counts.c = sum(c == 5, na.rm = TRUE))

However, in real life I'll have tons of variables (the length of
'myvars' can be very large) - so I can't specify those counts.a,
counts.b, etc. manually - dozens of times.

Does dplyr allow running the count of 5s on all 'myvars' columns at
once?


--
Dimitri Liakhovitski


Re: [R] Error in local package install

2015-06-16 Thread Duncan Murdoch

On 16/06/2015 1:27 PM, Uwe Ligges wrote:
 
 
 On 16.06.2015 16:33, Axel Urbiz wrote:
 Thanks again Uwe. I haven't renamed the file, only in the text sent to
 R-help. Here's the error I'm getting again. Sorry, this is a bit
 frustrating...
 
 No idea. Perhaps the download failed? Can you open the file using some
 zip software and extract the DESCRIPTION file?

It may also be a permissions problem:  perhaps the file couldn't be
unzipped, because the user doesn't have write permission.  Are you
installing to the default library?  Perhaps you should try installing to
a personal library instead.
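
Both checks can be tried from the R console. This is a sketch only: zip_path is a hypothetical location for the downloaded file, and the install step runs only if the file exists and the default library is not writable.

```r
# Hypothetical path to the downloaded binary; adjust to the real location.
zip_path <- "calibr_0.0.0.9000.zip"

# Is the first library on the search path writable by the current user?
default_lib  <- .libPaths()[1]
lib_writable <- unname(file.access(default_lib, mode = 2) == 0)

# Personal library location R would use (the directory may not exist yet).
user_lib <- Sys.getenv("R_LIBS_USER")

if (file.exists(zip_path)) {
  # Listing the archive shows whether it is readable and holds DESCRIPTION.
  print(unzip(zip_path, list = TRUE)$Name)

  if (!lib_writable) {
    dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)
    install.packages(zip_path, repos = NULL, lib = user_lib)
  }
}
```

If the listing fails, the download is the likely culprit; if it succeeds but lib_writable is FALSE, permissions are the more probable cause.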

Duncan Murdoch
 
 Best,
 Uwe Ligges
 
 

 Thanks,
 Axel


 Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
   cannot open the connection
 In addition: Warning messages:
 1: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file
 2: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
   cannot open compressed file 'calibr/DESCRIPTION', probable reason 'No
   such file or directory'
  

 On Tue, Jun 16, 2015 at 10:18 AM, Uwe Ligges
 lig...@statistik.tu-dortmund.de
 mailto:lig...@statistik.tu-dortmund.de wrote:



 On 16.06.2015 15:16, Axel Urbiz wrote:

 Thanks Uwe. Actually, the problem persists in R-3.2.1.

 If it helps, the .zip file is here:

 http://win-builder.r-project.org/yC8eUu09w3Ui/



 Works for me, but your error message is:


 cannot open compressed file 'mypackage/DESCRIPTION'

 which suggests you renamed the file?  You must not do that, just
 keep the filename calibr_0.0.0.9000.zip.

 Best,
 Uwe Ligges


 Thank you,
 Axel.



 On Mon, Jun 15, 2015 at 5:41 PM, Uwe Ligges
 lig...@statistik.tu-dortmund.de
 mailto:lig...@statistik.tu-dortmund.de
 mailto:lig...@statistik.tu-dortmund.de
 mailto:lig...@statistik.tu-dortmund.de wrote:



  On 15.06.2015 22:32, Axel Urbiz wrote:

  Hello,

  I've built a Windows binary package from my Mac using the help
  from this site: http://win-builder.r-project.org

  As expected, I got back the file mypackage.zip. Also, the logs
  show no errors.


  No, you got a file packagename_version.zip.



  Now, when I try to install on Windows using the GUI "install
  package(s) from local zip files", I get the following error:

  utils:::menuInstallLocal()

  Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
    cannot open the connection
  In addition: Warning messages:
  1: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file
  2: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
    cannot open compressed file 'mypackage/DESCRIPTION', probable reason 'No such file or directory'

  I've attempted to use the solutions from prior similar email
  threads with no success. Btw - I've installed all the package's
  dependencies prior to the above. I'm on R 3.2.0.


  Please try the release candidate of R-3.2.1; R-3.2.0 had a bug in
  package installation from local zip files.

  Best,
  Uwe Ligges


  Any guidance would be much appreciated.

  Thank you.

  Axel.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] dplyr - counting a number of specific values in each column - for all columns at once

Hello!

I have a data frame:

md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5),
                 device = c(1,1,2,2,3,3))
myvars = c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA
md

I want to count the number of 5s in each column - by device. I can do it like this:

library(dplyr)
group_by(md, device) %>%
  summarise(counts.a = sum(a==5, na.rm = T),
            counts.b = sum(b==5, na.rm = T),
            counts.c = sum(c==5, na.rm = T))

However, in real life I'll have tons of variables (the length of
'myvars' can be very large) - so I can't specify those counts.a,
counts.b, etc. manually - dozens of times.

Does dplyr allow running the count of 5s on all 'myvars' columns at once?


-- 
Dimitri Liakhovitski


Re: [R] dplyr - counting a number of specific values in each column - for all columns at once


Any problem with

colSums(md==5, na.rm=T)

Clint Bowman            INTERNET:   cl...@ecy.wa.gov
Air Quality Modeler     INTERNET:   cl...@math.utah.edu
Department of Ecology   VOICE:      (360) 407-6815
PO Box 47600            FAX:        (360) 407-7534
Olympia, WA 98504-7600

USPS:       PO Box 47600, Olympia, WA 98504-7600
Parcels:    300 Desmond Drive, Lacey, WA 98503-1274


Re: [R] dplyr - counting a number of specific values in each column - for all columns at once


It would help if I could see beyond my allergy meds.

A start could be:

colSums(subset(md,md$device==1)==5,na.rm=T)
colSums(subset(md,md$device==2)==5,na.rm=T)
colSums(subset(md,md$device==3)==5,na.rm=T)
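
The three calls above can be folded into one, so that the device values need not be typed by hand. This is only a sketch using base R's split() and sapply() on the data from the original question; the name `counts` is introduced here for illustration.

```r
md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5),
                 device = c(1,1,2,2,3,3))
myvars <- c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA

# Split the selected columns by device, count the 5s per column in each
# piece, and transpose so that rows are devices and columns are variables.
counts <- t(sapply(split(md[myvars], md$device),
                   function(d) colSums(d == 5, na.rm = TRUE)))
counts
```

Restricting the split to md[myvars] also keeps the device column itself out of the counts.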


Clint Bowman            INTERNET:   cl...@ecy.wa.gov
Air Quality Modeler     INTERNET:   cl...@math.utah.edu
Department of Ecology   VOICE:      (360) 407-6815
PO Box 47600            FAX:        (360) 407-7534
Olympia, WA 98504-7600

USPS:       PO Box 47600, Olympia, WA 98504-7600
Parcels:    300 Desmond Drive, Lacey, WA 98503-1274


Re: [R] Question about XML package (accurately access one attribute in an multi-attribution node on the web page)

2015-06-16 Thread Boris Steipe

Humphrey -

Any correct method requires you to specify _uniquely_ what you are looking
for. If the bookmark keyword is necessary and unique, it appears you have a
working solution. Or what else were you trying to accomplish?

Cheers,
Boris
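
A small sketch of selecting by a unique attribute with the XML package. The `html` string is a hypothetical stand-in for the downloaded page source (the second URL is shortened for illustration); htmlParse(), xpathSApply() and xmlGetAttr() are functions from that package.

```r
library(XML)

# Hypothetical stand-in for the downloaded page source.
html <- paste0(
  '<html><body>',
  '<a href="http://cos.name/2015/05/the-data-wisdom-for-data-science/" rel="bookmark">post</a>',
  '<a href="http://blog.shakirm.com/" target="_blank">other</a>',
  '<a href="http://cos.name/2015/05/the-data-wisdom-for-data-science/#more-10947" class="more-link">more</a>',
  '</body></html>')

doc <- htmlParse(html, asText = TRUE)

# The predicate [@rel='bookmark'] filters the <a> elements; xmlGetAttr()
# then reads the href attribute of each matching node.
links <- xpathSApply(doc, "//a[@rel='bookmark']", xmlGetAttr, "href")
links
```

The point is that the predicate sits on the element being filtered, and the attribute is read afterwards.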

On Jun 16, 2015, at 9:01 AM, Humphrey Zhao humphrey.z...@yahoo.com wrote:

Dear Sir/Madam:

Thank you for your attention to my question. I have downloaded the source
code of some web pages with RCurl, and I am trying to extract a URL from
them. In these web pages there are many nodes that contain the same URL, such
as the following:

<a href="http://cos.name/2015/05/the-data-wisdom-for-data-science/"
rel="bookmark">

<a
href="http://blog.shakirm.com/2015/03/a-statistical-view-of-deep-learning-ii-auto-encoders-and-free-energy/"
target="_blank">

<a
href="http://cos.name/2015/05/the-data-wisdom-for-data-science/#more-10947"
class="more-link">

I want to select exactly the URL I need (the href in the first one). I
tried many ways; the most accurate so far is the following:

library(XML)

#links <- getHTMLLinks(base.html, xpQuery = "//a/@href")

links <- getHTMLLinks(base.html, xpQuery = "//a[@rel='bookmark']/@href")

However, I still believe that there is a proper method for doing this,
but I could not find it. I wonder if you could give me some advice on solving
this problem. I would be most grateful if you could reply at your
earliest convenience. Looking forward to hearing from you. Thank you very
much.

Sincerely yours

Humphrey Zhao

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

You may want to add headers, but the following prints the device number
after each set of sums:


for (dev in unique(md$device)) {
  cat(colSums(subset(md, md$device == dev) == 5, na.rm = TRUE), dev, "\n")
}


Clint Bowman            INTERNET:   cl...@ecy.wa.gov
Air Quality Modeler     INTERNET:   cl...@math.utah.edu
Department of Ecology   VOICE:      (360) 407-6815
PO Box 47600            FAX:        (360) 407-7534
Olympia, WA 98504-7600

USPS:       PO Box 47600, Olympia, WA 98504-7600
Parcels:    300 Desmond Drive, Lacey, WA 98503-1274


Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

No problem at all, Clint.
I was just trying to figure out if dplyr can do it.

On Tue, Jun 16, 2015 at 1:40 PM, Clint Bowman cl...@ecy.wa.gov wrote:
 Any problem with

 colSums(md==5, na.rm=T)


-- 
Dimitri Liakhovitski


Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

Thank you, Bert.
I'll be honest - I am just learning dplyr and was wondering if one
could do it in dplyr.
But of course your solution is perfect...


-- 
Dimitri Liakhovitski


Re: [R-es] Logistic regression

2015-06-16 Thread eric


MaLuz, could you attach the data, or a link where it can be downloaded, so
that I can run your code and reproduce the problem? I happen to be working on
some logistic regressions right now and might be able to lend you a hand.

Regards, eric.



On 16/06/15 05:01, MªLuz Morales wrote:

Thanks!

On 15 June 2015 at 16:54, Freddy Omar López Quintero 
freddy.vat...@gmail.com wrote:

 Hi.

 ran out of iterations and failed to converge


 Try increasing the number of iterations with the maxit argument:

 GLM <- bigglm(In.hospital_death ~ GCS + BUN, data = DatosGLM, family =
 binomial(logit), maxit=1000)
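
 For comparison, base glm() has a similar iteration cap, set through glm.control(maxit = ...) (default 25). The sketch below uses simulated data because the original DatosGLM is not available; the column names and coefficients are made up for illustration.

```r
set.seed(1)
n <- 500

# Hypothetical predictors standing in for GCS and BUN.
dat <- data.frame(GCS = sample(3:15, n, replace = TRUE),
                  BUN = rexp(n, rate = 1 / 20))

# Simulate a binary outcome from a known logistic model.
p <- plogis(1 - 0.3 * dat$GCS + 0.02 * dat$BUN)
dat$death <- rbinom(n, 1, p)

# Raise the cap on IRLS iterations above the default of 25.
fit <- glm(death ~ GCS + BUN, data = dat, family = binomial("logit"),
           control = glm.control(maxit = 100))

fit$converged  # whether the fit converged within the cap
fit$iter       # number of iterations actually used
```

 If a fit still fails to converge with a generous cap, the cause is often in the data (for example separation) rather than in the iteration limit.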


 Cheers.

 --
 "I am not those tutelary shades
 that I honored with verses that time does not forget."

 JL Borges



___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es



--
Forest Engineer
Master in Environmental and Natural Resource Economics
Ph.D. student in Sciences of Natural Resources at La Frontera University
Member in AguaDeTemu2030, citizen movement for Temuco with green city standards 
for living

Note: Accents have been omitted to ensure compatibility with some 
mail readers.


Re: [R-es] Problems loading Rcommander in the RStudio console

2015-06-16 Thread eric

MaLuz, as far as I understand, RStudio and R-commander are graphical 
working environments for R; R-commander is not a library 
(http://www.rcommander.com/), so it seems odd to me to invoke it from 
within R. As I see it, you should launch R-commander the same way you 
launch RStudio: as a program from the Linux console, or with a shortcut, 
or from a menu, or something like that. TclTk, on the other hand, is a 
set of graphics libraries, which is why you can load it without problems 
from within R.


Regards, eric.



On 16/06/15 05:14, MªLuz Morales wrote:

Hello,
I have R and RStudio installed on Linux in a virtual machine. In the RStudio
console I installed the Rcommander interface with the command:

install.packages("Rcmdr", dependencies=TRUE),

but when I load the package
library(Rcmdr)

I get the following error:


Loading required package: splines
Loading required package: RcmdrMisc
Loading required package: car
Loading required package: sandwich
Error : .onLoad failed in loadNamespace() for 'Rcmdr', details:
   call: structure(.External(.C_dotTclObjv, objv), class = "tclObj")
   error: [tcl] invalid command name "tk_messageBox".
In addition: Warning message:
In fun(libname, pkgname) : couldn't connect to display ":0"
Error: package or namespace load failed for ‘Rcmdr’


This loads without problems:
library(tcltk)


Thanks
Regards
MªLuz
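
The warning in the output above ("couldn't connect to display") suggests that the Tk GUI could not reach an X display from that session. A small diagnostic sketch, using only base R, that could be run before library(Rcmdr); what the values ought to be on the poster's virtual machine is not known here.

```r
# TRUE if this build of R includes Tcl/Tk support.
has_tcltk <- capabilities("tcltk")

# The X display that GUI toolkits will try to connect to ("" if unset).
display <- Sys.getenv("DISPLAY")

cat("tcltk available:", has_tcltk, "\n")
cat("DISPLAY:", if (nzchar(display)) display else "<not set>", "\n")
```

If DISPLAY is unset or points at a display the session cannot reach, Tk windows cannot be opened even though library(tcltk) itself loads.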




--
Forest Engineer
Master in Environmental and Natural Resource Economics
Ph.D. student in Sciences of Natural Resources at La Frontera University
Member in AguaDeTemu2030, citizen movement for Temuco with green city 
standards for living


Note: Accents have been omitted to ensure compatibility with some 
mail readers.



Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

Well, dplyr seems a bit of overkill as it's so simple with plain old
vapply() in base R :


> dat <- data.frame(a = sample(1:5, 10, rep = TRUE),
+                   b = sample(3:7, 10, rep = TRUE),
+                   g = sample(7:9, 10, rep = TRUE))

> vapply(dat, function(x) sum(x == 5, na.rm = TRUE), 1L)

a b g
5 4 0



Cheers,
Bert

Bert Gunter

Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom.
   -- Clifford Stoll


Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

Except, of course, Bert, that you forgot that it had to be done by
device. Your solution ignores the device.

md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5),
                 device = c(1,1,2,2,3,3))
myvars = c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA
md
vapply(md[myvars], function(x) sum(x==5,na.rm=TRUE),1L)

But the result should be by device.


-- 
Dimitri Liakhovitski


Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

Yes, indeed. Thanks, David.

But if you check, tapply(), aggregate(), by(), etc. are all basically
wrappers around lapply(). So it's all a question of which syntax one feels
most comfortable with. Note, however, that data.table, the plyr family, and
perhaps others are different in that they re-implement the underlying engines,
thereby gaining efficiencies that some folks may want, as well as new syntax.


Cheers,
Bert

Bert Gunter

Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom.
   -- Clifford Stoll

On Tue, Jun 16, 2015 at 1:22 PM, David L Carlson dcarl...@tamu.edu wrote:

 Not in base, but in stats:

 > aggregate(md[,-4]==5, list(device=md$device), sum, na.rm=TRUE)
   device a b c
 1      1 1 2 0
 2      2 0 1 0
 3      3 1 0 2

 -
 David L Carlson
 Department of Anthropology
 Texas A&M University
 College Station, TX 77840-4352

 -Original Message-
 From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bert
 Gunter
 Sent: Tuesday, June 16, 2015 3:02 PM
 To: Hadley Wickham
 Cc: r-help
 Subject: Re: [R] dplyr - counting a number of specific values in each
 column - for all columns at once

 ... my bad! -- I failed to read carefully.

 A base syntax version is:

 dat <- data.frame(a = sample(1:5, 10, rep = TRUE),
                   b = sample(3:7, 10, rep = TRUE),
                   g = sample(7:9, 10, rep = TRUE))

 dev <- sample(1:3, 10, rep = TRUE)

 sapply(dat, function(x)
   tapply(x, dev, function(x) sum(x == 5, na.rm = TRUE)))

   a b g
 1 2 0 0
 2 1 3 0
 3 2 1 0

 I think, no matter what, that there are 2 loops here: An outer one by
 column and an inner one by device within each column.

 Being both old and lazy, I have found it easier and more natural to stick
 with the basic functional syntax of the apply family of functions rather
 than to learn an alternative database type syntax (and semantics). My
 applications were never so large that the possible execution inefficiency
 mattered. However, it certainly might for others.  And of course, what is
 natural for me might not be for others.

 Cheers,
 Bert

 Bert Gunter

 Data is not information. Information is not knowledge. And knowledge is
 certainly not wisdom.
-- Clifford Stoll

 On Tue, Jun 16, 2015 at 12:47 PM, Hadley Wickham h.wick...@gmail.com
 wrote:

  md %>%
    group_by(device) %>%
    summarise_each(funs(sum(. == 5, na.rm = TRUE)))
 
  Hadley
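
  In later dplyr releases, summarise_each() and funs() were deprecated in favour of across(). A sketch of the same idea in that style, assuming dplyr 1.0.0 or later and the data from the original question; the name `res` is introduced here for illustration.

```r
library(dplyr)

md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5),
                 device = c(1,1,2,2,3,3))
myvars <- c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA

# Count the 5s in every column named in 'myvars', separately per device.
res <- md %>%
  group_by(device) %>%
  summarise(across(all_of(myvars), ~ sum(.x == 5, na.rm = TRUE)))
res
```

  Selecting with all_of(myvars) keeps the call the same length however many columns 'myvars' names.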
 
  --
  http://had.co.nz/
 

Re: [R-es] Help with boxplot in ggplot2

2015-06-16 Thread Juan Camilo Lara

Thank you very much, it worked correctly.

Regards,

Juan Camilo Lara C.

On 16 June 2015 at 15:15, pepeceb pepe...@yahoo.es wrote:

 Hi, you have to add this instruction
 + ylim (0,60)+

 Regards


 vplayout <- function(x, y) viewport(layout.pos.row = x, layout.pos.col = y)


 tor <- ggplot(parasitos, aes(x=Arrenurus, y = torax, fill= Arrenurus)) +

 geom_boxplot(binwidth = 2) +


 geom_boxplot(binwidth = 2) + ylim (0,60)+
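
 A self-contained sketch of the suggestion above. The data frame is simulated because the original 'parasitos' data is not available; its values and the level names "sp1"/"sp2" are made up. The second plot uses coord_cartesian() as an alternative: ylim() drops observations outside the limits before the boxplot statistics are computed, whereas coord_cartesian() only zooms the view.

```r
library(ggplot2)

set.seed(42)
# Hypothetical data standing in for 'parasitos'.
parasitos <- data.frame(
  Arrenurus = rep(c("sp1", "sp2"), each = 20),
  torax     = rpois(40, 30),
  abdomen   = rpois(40, 8)
)

# Same y range on both plots, set with ylim() as suggested in the thread.
tor <- ggplot(parasitos, aes(x = Arrenurus, y = torax, fill = Arrenurus)) +
  geom_boxplot() +
  ylim(0, 60)

# Alternative: zoom to the same range without discarding any observations.
abd <- ggplot(parasitos, aes(x = Arrenurus, y = abdomen, fill = Arrenurus)) +
  geom_boxplot() +
  coord_cartesian(ylim = c(0, 60))
```

 Either object can then be printed into the grid viewports exactly as in the original code.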


 On 16/6/2015 21:54:38, Juan Camilo Lara 'juanch...@gmail.com' wrote:
 Hello everyone

 I would like to know if you can help me with the following.

 I made a boxplot using ggplot2 to visualize the behavior of two
 variables. Visually the differences are not noticeable, because the plot on
 the right (parasites on the abdomen) only reaches 20 on the y axis. How can
 I make the two plots show the same scale on the y axis, that is, have
 both reach 60?

 I attach the boxplot, and below is the code I used to produce it.

 vplayout <- function(x, y) viewport(layout.pos.row = x, layout.pos.col = y)


 tor <- ggplot(parasitos, aes(x=Arrenurus, y = torax, fill= Arrenurus)) +

 geom_boxplot(binwidth = 2) +

 scale_fill_manual(values = c("lightgreen", "lightblue"))+

 ylab("Total parásitos")+

 xlab("")+

 ggtitle("Parásitos en el tórax")

 abd <- ggplot(parasitos, aes(x=Arrenurus, y = abdomen, fill= Arrenurus)) +

 geom_boxplot(binwidth = 2) +

 scale_fill_manual(values = c("lightgreen", "lightblue"))+

 ylab("Total parásitos")+

 xlab("")+

 ggtitle("Parásitos en el abdomen")


 grid.newpage()

 pushViewport(viewport(layout = grid.layout(1, 2)))

 print(tor, vp = vplayout(1, 1))

 print(abd, vp = vplayout(1, 2))



 Thank you for your help


 Regards, Juan Camilo Lara C.




Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread David Winsemius


On Jun 16, 2015, at 11:18 AM, Clint Bowman wrote:

 Thanks, Dimitri.  Bert is the real wizard here--I'll bet he can conjure up an 
 elegant solution.

This would be a base method:

> by( md[-4]==5, md[4], colSums)
device: 1
a b c 
1 2 0 
------------------------------------------------------------ 
device: 2
a b c 
1 1 0 
------------------------------------------------------------ 
device: 3
a b c 
1 0 2 

You could adapt that to use myvars:

> by(md[myvars]==5, md[!names(md) %in% myvars],colSums)
device: 1
a b c 
1 2 0 
------------------------------------------------------------ 
device: 2
a b c 
1 1 0 
------------------------------------------------------------ 
device: 3
a b c 
1 0 2 

And if you want them smushed into a matrix then use rbind:

> do.call( rbind, by(md[myvars]==5, md[!names(md) %in% myvars],colSums))
  a b c
1 1 2 0
2 1 1 0
3 1 0 2
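
The outputs above appear to come from a copy of md without the two NA assignments from the original question; with those NAs in place, colSums() needs na.rm = TRUE, which by() passes through to the function. A sketch under that assumption; the names `by_dev` and `counts` are introduced here for illustration.

```r
md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5),
                 device = c(1,1,2,2,3,3))
myvars <- c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA

# Extra arguments to by() are forwarded to colSums(), so the NA cells
# are dropped before counting.
by_dev <- by(md[myvars] == 5, md["device"], colSums, na.rm = TRUE)

# Stack the per-device vectors into a matrix (rows are devices).
counts <- do.call(rbind, by_dev)
counts
```

With the NAs handled this way, the counts agree with the aggregate() and dplyr results elsewhere in this thread.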

 
 For me, just reaching a desired endpoint is enough.
 
 Clint
 
 Clint Bowman  INTERNET:   cl...@ecy.wa.gov
 Air Quality Modeler   INTERNET:   cl...@math.utah.edu
 Department of Ecology VOICE:  (360) 407-6815
 PO Box 47600  FAX:(360) 407-7534
 Olympia, WA 98504-7600
 
USPS:   PO Box 47600, Olympia, WA 98504-7600
Parcels:300 Desmond Drive, Lacey, WA 98503-1274
 
 On Tue, 16 Jun 2015, Dimitri Liakhovitski wrote:
 
 Thank you, Clint.
 That's the thing: it's relatively easy to do it in base, but the
 resulting code is not THAT simple.
 I thought dplyr would make it easy...
 
 On Tue, Jun 16, 2015 at 2:06 PM, Clint Bowman cl...@ecy.wa.gov wrote:
 May want to add headers but the following provides the device number with
 each set fo sums:
 
 for (dev in (unique(md$device)))
 {cat(colSums(subset(md,md$device==dev)==5,na.rm=T),dev,\n)}
 
 Clint BowmanINTERNET:   cl...@ecy.wa.gov
 Air Quality Modeler INTERNET:   cl...@math.utah.edu
 Department of Ecology   VOICE:  (360) 407-6815
 PO Box 47600FAX:(360) 407-7534
 Olympia, WA 98504-7600
 
USPS:   PO Box 47600, Olympia, WA 98504-7600
Parcels:300 Desmond Drive, Lacey, WA 98503-1274
 
 On Tue, 16 Jun 2015, Dimitri Liakhovitski wrote:
 
 Except, of course, Bert, that you forgot that it had to be done by
 device. Your solution ignores the device.
 
 md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1),
                  c = c(1,3,4,3,5,5), device = c(1,1,2,2,3,3))
 myvars = c("a", "b", "c")
 md[2,3] <- NA
 md[4,1] <- NA
 md
 vapply(md[myvars], function(x) sum(x==5,na.rm=TRUE),1L)
 
 But the result should be by device.
 
 On Tue, Jun 16, 2015 at 1:56 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 
 Thank you, Bert.
 I'll be honest - I am just learning dplyr and was wondering if one
 could do it in dplyr.
 But of course your solution is perfect...
 
 On Tue, Jun 16, 2015 at 1:50 PM, Bert Gunter bgunter.4...@gmail.com
 wrote:
 
 Well, dplyr seems a bit of overkill as it's so simple with plain old
 vapply() in base R :
 
 
 > dat <- data.frame (a=sample(1:5,10,rep=TRUE),
 +                    b=sample(3:7,10,rep=TRUE),
 +                    g = sample(7:9,10,rep=TRUE))
 > vapply(dat,function(x)sum(x==5,na.rm=TRUE),1L)
 a b g
 5 4 0
 
 
 
 Cheers,
 Bert
 
 Bert Gunter
 
 Data is not information. Information is not knowledge. And knowledge is
 certainly not wisdom.
   -- Clifford Stoll
 
 On Tue, Jun 16, 2015 at 10:24 AM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 
 
 Hello!
 
 I have a data frame:
 
 md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1),
                  c = c(1,3,4,3,5,5), device = c(1,1,2,2,3,3))
 myvars = c("a", "b", "c")
 md[2,3] <- NA
 md[4,1] <- NA
 md
 
 I want to count number of 5s in each column - by device. I can do it
 like this:
 
 library(dplyr)
 group_by(md, device) %>%
 summarise(counts.a = sum(a==5, na.rm = T),
           counts.b = sum(b==5, na.rm = T),
           counts.c = sum(c==5, na.rm = T))
 
 However, in real life I'll have tons of variables (the length of
 'myvars' can be very large) - so that I can't specify those counts.a,
 counts.b, etc. manually - dozens of times.
 
 Does dplyr allow to run the count of 5s on all 'myvars' columns at
 once?
 
 
 --
 Dimitri Liakhovitski
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 
 

Re: [R] reading daily snow depth data

2015-06-16 Thread jim holtman

Here is an example of reading in the data.  After that it is a data frame
and you should be able to process it with dplyr/data.table without much
trouble:

> x <- readLines(
+ "http://www1.ncdc.noaa.gov/pub/data/snowmonitoring/fema/06-2015-dlysndpth.txt"
+ )
> writeLines(x, '/temp/snow.txt')  # save for testing
> head(x)
[1]


[2] State:
AL

[3]Lat Lon  COOP# StnID State City/Station Name
County Elev  Jun 1  Jun 2  Jun 3  Jun
4  Jun 5  Jun 6  Jun 7  Jun 8  Jun 9  Jun10
Jun11  Jun12  Jun13  Jun14  Jun15  Jun16
[4]  33.59  -85.86 010272  AL ANNISTON ARPT ASOS
CALHOUN  594  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000  -.000
[5]  33.83  -85.78 014209  AL JACKSONVILLE
CALHOUN  608  -.000  -.000  -.000
-.000  -.000  0.000  0.000  -.000  -.000
-.000  -.000  -.000  -.000  -.000  -.000  -.000
[6]  34.74  -87.60 015749  AL MUSCLE SHOALS AP
COLBERT  540  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000  -.000
> z <- grepl("(^$)|(^State)|(^   Lat)", x)  # get lines to discard
> xm <- x[!z]  # remove info lines
> head(xm)
[1]  33.59  -85.86 010272  AL ANNISTON ARPT ASOS
CALHOUN  594  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000  -.000
[2]  33.83  -85.78 014209  AL JACKSONVILLE
CALHOUN  608  -.000  -.000  -.000
-.000  -.000  0.000  0.000  -.000  -.000
-.000  -.000  -.000  -.000  -.000  -.000  -.000
[3]  34.74  -87.60 015749  AL MUSCLE SHOALS AP
COLBERT  540  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000  -.000
[4]  31.32  -85.45 012372  AL DOTHAN FAA AIRPORT
DALE 374  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000  -.000
[5]  32.70  -87.58 013511  AL GREENSBORO
HALE 220  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000  -.000
[6]  33.57  -86.74 010831  AL BIRMINGHAM AP ASOS
JEFFERSON615  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000  -.000

> # read in the data
> xf <- textConnection(xm)
> snow <- read.fwf(xf
+ , width = c(6,8,7,10,3,32,26,6,rep(11,16))
+ , comment.char = ''
+ , as.is = TRUE
+ )
> str(snow)
'data.frame':   3067 obs. of  24 variables:
 $ V1 : num  33.6 33.8 34.7 31.3 32.7 ...
 $ V2 : num  -85.9 -85.8 -87.6 -85.5 -87.6 ...
 $ V3 : int  10272 14209 15749 12372 13511 10831 11225 14064 12245 15478 ...
 $ V4 : chr  ...
 $ V5 : chr  AL  AL  AL  AL  ...
 $ V6 : chr  ANNISTON ARPT ASOS  
JACKSONVILLE MUSCLE SHOALS AP
DOTHAN FAA AIRPORT   ...
 $ V7 : chr  CALHOUNCALHOUN   
COLBERTDALE   ...
 $ V8 : int  594 608 540 374 220 615 461 624 100 215 ...
 $ V9 : num  0 - 0 0 0 ...
 $ V10: num  0 - 0 0 0 ...
 $ V11: num  0 - 0 0 0 ...
 $ V12: num  0 - 0 0 0 ...
 $ V13: num  0 - 0 0 0 ...
 $ V14: num  0 0 0 0 0 ...
 $ V15: num  0 0 0 0 0 ...
 $ V16: num  0 - 0 0 0 ...
 $ V17: num  0 - 0 0 0 ...
 $ V18: num  0 - 0 0 0 ...
 $ V19: num  0 - 0 0 0 ...
 $ V20: num  0 - 0 0 0 ...
 $ V21: num  0 - 0 0 0 ...
 $ V22: num  0 - 0 0 0 ...
 $ V23: num  0 - 0 0 0 ...
 $ V24: num  - - - - - ...
> table(snow$V5)  # tally up the states

 AK  AL  AR  AZ  CA  CO  CT  DE  FL  GA  HI  IA  ID  IL  IN  KS  KY  LA  MA  MD  ME  MI  MN  MO  MS  MT
 72  18  65  55  99 128  10   1  30  33   6 112  57 103  85  90  49  29  35  14  40  86  90 124  27 113
 NC  ND  NE  NH  NJ  NM  NV  NY  OH  OK  OR  PA  RI  SC  SD  TN  TX  UT  VA  VT  WA  WI  WV  WY
 45  19 136  22  13  53  65  76  31 106  51  84   2  30  79  64 185  68  70  18  56 103  36  84
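
The steps above (drop the blank, "State" and header lines, then parse the rest as fixed-width fields) can be tried offline on a tiny stand-in. This is only a sketch: the toy lines, the three column widths and the column names are made up for illustration and are not the layout of the real NOAA file.

```r
# Hypothetical miniature of the file: blank line, state banner, header, data
raw <- c(
  "",
  "State: AL",
  "   Lat     Lon Elev",
  " 33.59  -85.86  594",
  " 33.83  -85.78  608"
)

# Flag the lines to discard, as in the transcript above
discard <- grepl("(^$)|(^State)|(^   Lat)", raw)
keep <- raw[!discard]

# Parse the remaining lines as fixed-width fields (widths 6, 8 and 5 here)
con <- textConnection(keep)
toy <- read.fwf(con, widths = c(6, 8, 5),
                col.names = c("lat", "lon", "elev"))
close(con)  # read.fwf() leaves an already-open connection open

str(toy)
```

Naming the columns at read time with col.names avoids the anonymous V1, V2, ... names seen in the str() output above.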



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Tue, Jun 16, 2015 at 11:38 AM, Alemu Tadesse alemu.tade...@gmail.com
wrote:

 Dear All,

 I was

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread Hadley Wickham

On Tue, Jun 16, 2015 at 12:24 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Hello!

 I have a data frame:

 md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5),
                  device = c(1,1,2,2,3,3))
 myvars = c("a", "b", "c")
 md[2,3] <- NA
 md[4,1] <- NA
 md

 I want to count number of 5s in each column - by device. I can do it like
 this:

 library(dplyr)
 group_by(md, device) %>%
 summarise(counts.a = sum(a==5, na.rm = T),
           counts.b = sum(b==5, na.rm = T),
           counts.c = sum(c==5, na.rm = T))

 However, in real life I'll have tons of variables (the length of
 'myvars' can be very large) - so that I can't specify those counts.a,
 counts.b, etc. manually - dozens of times.

 Does dplyr allow to run the count of 5s on all 'myvars' columns at once?

md %>%
  group_by(device) %>%
  summarise_each(funs(sum(. == 5, na.rm = TRUE)))
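
For readers on a newer dplyr: summarise_each() and funs() were later deprecated, and across() (dplyr 1.0.0 and up) covers the same ground. The sketch below is not part of the reply above; it assumes such a dplyr version is installed and skips itself otherwise. It also restricts the summary to the columns named in myvars, which the question asked about.

```r
# Data exactly as given in the original question
md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1),
                 c = c(1,3,4,3,5,5), device = c(1,1,2,2,3,3))
md[2,3] <- NA
md[4,1] <- NA
myvars <- c("a", "b", "c")

# Assumption: dplyr >= 1.0.0 is available; otherwise the block is skipped
has_across <- requireNamespace("dplyr", quietly = TRUE) &&
  utils::packageVersion("dplyr") >= "1.0.0"

if (has_across) {
  library(dplyr)
  counts <- md %>%
    group_by(device) %>%
    # apply the same count to every column listed in myvars
    summarise(across(all_of(myvars), ~ sum(.x == 5, na.rm = TRUE)))
  print(counts)
}
```

Because the column set comes from a character vector, the call stays the same however long myvars grows.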

Hadley

-- 
http://had.co.nz/


[R] mlogit error

2015-06-16 Thread mikebarr

Hello,

I am trying to run a mixed logit model (panel form) with the mlogit package.
I am running into the following error: "Error in random.nb[, sel, drop = F]
: subscript out of bounds".

I have searched the R-help forum (and online) and found no instances of this
error. Below is the code I used, which follows "Discrete-Choice Logit Models
with R" by Philip A. Viton
(http://facweb.knowlton.ohio-state.edu/pviton/courses2/crp5700/5700-mlogit.pdf).
My dataset consists of 244 individuals, each making 8 choices among 3
alternatives.

 clogit <- read.csv("/Users/name/Desktop/DCEinR/R365.csv")
 save(clogit, file="/Users/name/Desktop/DCEinR/clogit.rdata")
 load("/Users/name/Desktop/DCEinR/clogit.rdata")
 clogit$mode.ids <- factor(rep(1:3,244))
 clogit$mode.ids <- factor(rep(1:3, 244), labels=c("c1","c2","sq"))
 clogit$indivs <- factor(rep(1:244,each=24))
 CLOGIT <- mlogit.data(clogit, shape="long",
     choice="choice", alt.var="mode.ids", id.var="indivs")
 CLOGIT.mxl <- mlogit(Choice~-1+ASC+Price+Payment+Penalty+Length+Local|0,
     CLOGIT, rpar=c(ASC='n', Price='n', Payment='n', Penalty='n', Length='n',
     Local='n'), R=100, halton=NA, print.level=0, panel=TRUE)
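
As a side sketch of the long-format layout described above (244 individuals x 8 choice situations x 3 alternatives = 5856 rows), the index columns can be built at full length and checked. This assumes, purely for illustration, that rows are sorted by individual, then choice situation, then alternative; the names idx, n_ind, n_occ, n_alt and occasion are hypothetical and do not come from the data file. It is not a diagnosis of the error.

```r
n_ind <- 244   # individuals
n_occ <- 8     # choice situations per individual
n_alt <- 3     # alternatives per choice situation

idx <- data.frame(
  # each individual owns n_occ * n_alt = 24 consecutive rows
  indivs   = factor(rep(seq_len(n_ind), each = n_occ * n_alt)),
  # choice situation within individual (assumed ordering, see lead-in)
  occasion = rep(rep(seq_len(n_occ), each = n_alt), times = n_ind),
  # alternatives cycle c1, c2, sq within every choice situation
  mode.ids = factor(rep(1:3, times = n_ind * n_occ),
                    labels = c("c1", "c2", "sq"))
)

nrow(idx)                                  # total number of rows
table(table(idx$indivs))                   # rows per individual
table(table(idx$indivs, idx$occasion))     # rows per choice situation
```

In the code above, rep(1:3, 244) has only 732 elements, so it reaches 5856 rows only through recycling inside the data frame assignment; writing the full length makes that explicit.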

Are there any suggestions on: 1) what this error means; and 2) how to fix
it?

Thanks in advance.



--
View this message in context: 
http://r.789695.n4.nabble.com/mlogit-error-tp4708706.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

2015-06-16 Thread David L Carlson

Not in base, but in stats:

> aggregate(md[,-4]==5, list(device=md$device), sum, na.rm=TRUE)
  device a b c
1      1 1 2 0
2      2 0 1 0
3      3 1 0 2

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
Sent: Tuesday, June 16, 2015 3:02 PM
To: Hadley Wickham
Cc: r-help
Subject: Re: [R] dplyr - counting a number of specific values in each column - 
for all columns at once

... my bad! -- I failed to read carefully.

A base syntax version is:

dat <- data.frame (a=sample(1:5,10,rep=TRUE),
                   b=sample(3:7,10,rep=TRUE),
                   g = sample(7:9,10,rep=TRUE))

dev <- sample(1:3,10,rep=TRUE)

sapply(dat,function(x)
  tapply(x,dev,function(x)sum(x==5,na.rm=TRUE)))

  a b g
1 2 0 0
2 1 3 0
3 2 1 0

I think, no matter what, that there are 2 loops here: An outer one by
column and an inner one by device within each column.
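
The two loops can be seen directly when the same pattern is run on the md from the original post rather than on random data. This is a sketch, not part of the message above: sapply() is the outer loop over the columns in myvars, and tapply() is the inner loop over devices within each column.

```r
# Data exactly as given in the original question
md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1),
                 c = c(1,3,4,3,5,5), device = c(1,1,2,2,3,3))
md[2,3] <- NA
md[4,1] <- NA
myvars <- c("a", "b", "c")

counts <- sapply(md[myvars], function(col)        # outer loop: columns
  tapply(col == 5, md$device, sum, na.rm = TRUE)) # inner loop: devices
print(counts)
```

The result is a matrix with one row per device and one column per variable, so it can be indexed as counts["2", "b"].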

Being both old and lazy, I have found it easier and more natural to stick
with the basic functional syntax of the apply family of functions rather
than to learn an alternative database type syntax (and semantics). My
applications were never so large that the possible execution inefficiency
mattered. However, it certainly might for others.  And of course, what is
natural for me might not be for others.

Cheers,
Bert

Bert Gunter

Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom.
   -- Clifford Stoll


Re: [R] reading daily snow depth data

2015-06-16 Thread Alemu Tadesse

Thank you, Jim and Bob. This is a really big help for me.

Jim, this is the second time you have helped me out.
Best,

Alemu


On Tue, Jun 16, 2015 at 1:50 PM, boB Rudis b...@rudis.net wrote:

 This looks similar to snow data I used last year:
 https://github.com/hrbrmstr/snowfirst/blob/master/R/snowfirst.R

 All the data worked pretty well.


[R] Polysomnographic data analysis with R?

2015-06-16 Thread Charles Novaes de Santana

Dear all,

Do you know if there is any R package or function we can use to analyze
polysomnographic data?

For example, something that can import an EDF file (or a file in a different
format) and report properties of the polysomnographic records, such as the
periods of the different sleep phases.

I searched the web and found nothing, but maybe I used the wrong keywords.

Any help will be much appreciated!

Best,

Charles
-- 
Um axé! :)

--
Charles Novaes de Santana, PhD
http://www.imedea.uib-csic.es/~charles


Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

Thank you, guys - I learned a lot: 'summarise_each' and 'funs'.


-- 
Dimitri Liakhovitski


Re: [R] dplyr - counting a number of specific values in each column - for all columns at once

Thank you, Clint.
That's the thing: it's relatively easy to do it in base, but the
resulting code is not THAT simple.
I thought dplyr would make it easy...


-- 
Dimitri Liakhovitski


[R-es] Help with a ggplot2 boxplot

2015-06-16 Thread Juan Camilo Lara

Hello everyone,

I would like to know whether you can help me with the following.

I made a boxplot with ggplot2 to visualize the behavior of two variables.
Visually the differences are not apparent, because the plot on the right
(parasites on the abdomen) only goes up to 20 on the y axis. How can I make
both plots show the same scale on the y axis, that is, have both go up
to 60?

I am attaching the boxplot, and below is the code I used to produce it.

vplayout <- function(x, y) viewport(layout.pos.row = x, layout.pos.col = y)

tor <- ggplot(parasitos, aes(x=Arrenurus, y = torax, fill= Arrenurus)) +
  geom_boxplot(binwidth = 2) +
  scale_fill_manual(values = c("lightgreen", "lightblue"))+
  ylab("Total parásitos")+
  xlab("")+
  ggtitle("Parásitos en el tórax")

abd <- ggplot(parasitos, aes(x=Arrenurus, y = abdomen, fill= Arrenurus)) +
  geom_boxplot(binwidth = 2) +
  scale_fill_manual(values = c("lightgreen", "lightblue"))+
  ylab("Total parásitos")+
  xlab("")+
  ggtitle("Parásitos en el abdomen")

grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(tor, vp = vplayout(1, 1))
print(abd, vp = vplayout(1, 2))



Thank you for your help


Sincerely: Juan Camilo Lara C.
___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es

Re: [R] dplyr - counting a number of specific values in each column - for all columns at once