[R] Need help with renaming sub-directories and their files after attribute values from a shapefile

2016-09-14 Thread P Mads
Hello,
Keep in mind I am VERY new to using R... What I am trying to do is package
hundreds of files into a new sub-directory. I was able to accomplish this
with the code below. HOWEVER, I have come to find that instead of merely
having to name the new sub-directory after the 7-digit numeric prefix in
the file names, the sub-directories AND the corresponding files all have to
be named after a certain attribute value in a shapefile ("DOQ_Index").
Basically, what I need to do is 1) match the filenames' 7-digit prefix to
the "ID" attribute field in the shapefile (e.g. 4310950 = '4310950' in the
"ID" field); then, 2) for those matches, create a sub-directory named after
a DIFFERENT attribute field value (the "CODE" field); and 3) rename the
matching files based on the "CODE" attribute field and move those files to
the new sub-directory. Whew. Does that make sense? Anyway, can someone
help me with this while keeping the basic code structure below?
Thank you! - Pmads

#read in the QuadIndex -- I recently added this step, thinking this is how
#to read the shapefile (readOGR() comes from the rgdal package)
library(rgdal)
q <- readOGR(dsn="C:/GeoHub/Test", layer="DOQ_Index")

#set the working directory
setwd("C:/GeoHub/Test/TestZip")

#get a list of directories
dlist<- list.dirs()
#remove the first element (the current directory ".")
dlist<- dlist[-1]

#If the files are all in the working directory
  vec1 <-  list.files()
  vec1

  lst1 <- split(vec1,substr(vec1,3,9))

  #Create the sub-directories, named after the 7-digit numeric prefix
  sapply(file.path(getwd(),names(lst1)),dir.create)

  #Move the files to the sub-directory
  lapply(seq_along(lst1), function(i)
    file.rename(lst1[[i]], paste(file.path(names(lst1[i])), lst1[[i]], sep="/")))
  list.files(recursive=TRUE)
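
For steps 1-3, something along these lines might work (an untested sketch;
it assumes the attribute table really has columns named "ID" and "CODE",
and that names(lst1) are the 7-digit prefixes built above):

  #Match each 7-digit prefix to the "ID" field and look up its "CODE" value
  idx  <- match(names(lst1), as.character(q$ID))
  code <- as.character(q$CODE)[idx]

  for (i in seq_along(lst1)) {
    if (is.na(code[i])) next                       #prefix not found in the shapefile
    dir.create(file.path(getwd(), code[i]), showWarnings = FALSE)
    #swap the prefix for the CODE value in each filename, then move the file
    newnames <- sub(names(lst1)[i], code[i], lst1[[i]], fixed = TRUE)
    file.rename(lst1[[i]], file.path(code[i], newnames))
  }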

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Time format lagging issue

2016-09-14 Thread Bhaskar Mitra
Thanks for all your feedbacks. This is helpful.

My apologies for any inconvenience due to asterisks.

Thanks,
Bhaskar

On Wed, Aug 31, 2016 at 6:09 PM, William Dunlap  wrote:

> That
>   tmp1 - 30*60
> can also be done as
>   tmp1 - as.difftime(30, units="mins")
> so you don't have to remember that the internal representation of POSIXct
> is seconds since the start of 1970.  You can choose from the following
> equivalent expressions.
>   tmp1 - as.difftime(0.5, units="hours")
>   tmp1 - as.difftime(1/2 * 1/24, units="days")
>   tmp1 - as.difftime(30*60, units="secs")
>   tmp1 - as.difftime(1/2 * 1/24 * 1/7, units="weeks")
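> Combined with Don's example below, the lagged column could then be written
> as (a sketch, reusing his mydf and tmp1):
>   mydf$t2 <- format(tmp1 - as.difftime(30, units="mins"), '%Y%m%d%H%M')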
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Wed, Aug 31, 2016 at 2:44 PM, MacQueen, Don  wrote:
>
>> Try following this example:
>>
>> mydf <- data.frame(t1=c('201112312230', '201112312330'))
>> tmp1 <- as.POSIXct(mydf$t1, format='%Y%m%d%H%M')
>> tmp2 <- tmp1 - 30*60
>> mydf$t2 <- format(tmp2, '%Y%m%d%H%M')
>>
>> It can be made into a single line, but I used intermediate variables tmp1
>> and tmp2 so that it would be easier to follow.
>>
>> Base R is more than adequate for this task.
>>
>> Please get rid of the asterisks in your next email. They just get in the
>> way. Learn how to send plain text email, not HTML email. Please.
>>
>>
>>
>>
>> --
>> Don MacQueen
>>
>> Lawrence Livermore National Laboratory
>> 7000 East Ave., L-627
>> Livermore, CA 94550
>> 925-423-1062
>>
>>
>>
>>
>>
>> On 8/31/16, 9:07 AM, "R-help on behalf of Bhaskar Mitra"
>> 
>> wrote:
>>
>> >Hello Everyone,
>> >
>> >I am trying to shift the time series in a dataframe (df) by 30 minutes.
>> >My current format looks something like this:
>> >
>> >df$Time1
>> >
>> >201112312230
>> >201112312300
>> >201112312330
>> >
>> >I am trying to add an additional column of time (df$Time2) next to
>> >Time1 by lagging it by 30 minutes. Something like this:
>> >
>> >df$Time1       df$Time2
>> >201112312230   201112312200
>> >201112312300   201112312230
>> >201112312330   201112312300
>> >201112312330
>> >
>> >Based on some of the suggestions available, I have tried this option:
>> >
>> >require(zoo)
>> >df1$Time2 <- lag(df1$Time1, -1, na.pad = TRUE)
>> >View(df1)
>> >
>> >This does not however give me the desired result. I would appreciate any
>> >suggestions/advice in this regard.
>> >
>> >Thanks,
>> >
>> >Bhaskar
>> >
>> >   [[alternative HTML version deleted]]
>> >
>> >__
>> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >https://stat.ethz.ch/mailman/listinfo/r-help
>> >PLEASE do read the posting guide
>> >http://www.R-project.org/posting-guide.html
>> >and provide commented, minimal, self-contained, reproducible code.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Data Visualisation, Predictive Modelling Courses: Syd/Melb/Canb/Adel in September

2016-09-14 Thread Kris Angelovski
Hi,

SolutionMetrics is presenting Data Visualisation and Data Science/Predictive 
Modelling courses in Sydney, Melbourne, Canberra and Adelaide.

Data Visualisation (1 Day)

Introduction to Data Analysis and Graphics - Histograms, Box Plots, Bar Charts, 
Scatter Plots; Changing symbols, colours, style of points, axes, range; 3D 
plots; Time-Series plots; Lattice; ggplot2; Map plots; Shiny; Producing 
publication quality reports. More Info

Data Science/Predictive Modelling (1 Day)

Introduction to Data Science/Predictive Modelling, Regression modelling, 
Linear, Non-linear, Multiple, Stepwise and Regression Trees, Classification: 
Logistic Regression and Tree based methods; Clustering. More Info

Location     Date          Course

Sydney       19 Sep 2016   Data Visualisation
Sydney       20 Sep 2016   Data Science/Predictive Modelling

Melbourne    22 Sep 2016   Data Visualisation
Melbourne    23 Sep 2016   Data Science/Predictive Modelling

Canberra     26 Sep 2016   Data Visualisation
Canberra     27 Sep 2016   Data Science/Predictive Modelling

Adelaide     29 Sep 2016   Data Visualisation
Adelaide     30 Sep 2016   Data Science/Predictive Modelling


To book, please email 
enquir...@solutionmetrics.com.au or 
call +61 2 9233 6888
Full Schedule
Regards,
Kris Angelovski | Chief Data Scientist | SolutionMetrics
T +61 2 9233 6888 | M +61 488 388 338
E 
kris.angelov...@solutionmetrics.com.au
solutionmetrics.com.au | Suite 44, Level 9, 
88 Pitt Street Sydney NSW 2000




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating dataframe with subtotals by all fields and totals of subtotals

2016-09-14 Thread David Winsemius

> On Sep 14, 2016, at 12:33 PM, Peter Lomas  wrote:
> 
> Hello R-Helpers,
> 
> I'm trying to to create a subtotal category for each column in a
> dataset, and totals of subtotals, resulting in one data frame.  I
> figure I could do this by a whack of aggregate() and rbind(), but I'm
> hoping there is a simpler way.
> 
> Below is a sample dataset. Underneath I create an "All" salesmen
> subtotal and rbind it with the original dataset.  I could do that for
> "Drink" and "Region", then also do combinations of salesmen, drink,
> and region subtotals.  However, I'm hoping somebody out there is more
> clever than I am.

I'm pretty sure that Hadley (who is rather smart) already built that into the 
plyr package, where he lets people specify marginal subtotals. I'd be slightly 
surprised if that feature wasn't carried over to dplyr, although I have not seen 
it illustrated yet. But my memory may be failing in this area; I'm not able to 
put any substance to that notion after searching.

I've answered a couple of questions over the years on StackOverflow that deal 
with marginal calculations, and you might find the addmargins function less of a 
"whack" than the route you were imagining:

http://stackoverflow.com/questions/5863456/r-calculating-margins-or-row-col-sums-for-a-data-frame
http://stackoverflow.com/questions/5982546/r-calculating-column-sums-row-sums-as-an-aggregation-from-a-dataframe/5982943#5982943
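
For example, with the dat object from your post below, a quick (untested)
sketch using addmargins() from the stats package:

tab <- xtabs(Sales ~ Salesman + Region, data = dat)
addmargins(tab)    # adds a "Sum" row and column of marginal totals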


> 
> Thanks!
> Peter
> 
> 
> dat <- structure(list(
>   Date = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
>                      1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
>                      1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L),
>                    .Label = c("2012-01", "2012-02", "2012-03"), class = "factor"),
>   Region = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L,
>                        2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
>                        3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
>                      .Label = c("Zone1", "Zone2", "Zone3"), class = "factor"),
>   Drink = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
>                       1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
>                       2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
>                     .Label = c("Cola", "Orange Juice"), class = "factor"),
>   Salesman = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>                          1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
>                          2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
>                        .Label = c("Joe", "Marty"), class = "factor"),
>   Sales = c(10L, 36L, 9L, 39L, 12L, 61L, 62L, 28L, 82L, 1L, 38L, 14L,
>             55L, 50L, 62L, 64L, 69L, 65L, 28L, 85L, 66L, 66L, 75L, 59L,
>             31L, 14L, 93L, 35L, 24L, 11L, 4L, 30L, 2L, 17L, 36L, 47L)),
>   .Names = c("Date", "Region", "Drink", "Salesman", "Sales"),
>   class = "data.frame", row.names = c(NA, -36L))
> 
> 
> 
> all.salesman <- aggregate(Sales~Date+Region+Drink, data=dat, FUN=sum)
> all.salesman$Salesman <- "All"
> dat <- rbind(dat, all.salesman)
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] NaN Log-lik value in EM algorithm (fitting Gamma mixture model)

2016-09-14 Thread Duncan Murdoch

On 14/09/2016 4:46 PM, Aanchal Sharma wrote:

Hi,

I am trying to fit a Gamma mixture model to my data (residual values obtained
after fitting a Generalized Linear Model) using gammamixEM. It is part of a
script which does this for multiple datasets in a loop. The code runs fine
for some datasets but terminates for others, giving the following error:

" iteration = 1  log-lik diff = NaN  log-lik = NaN
Error in while (diff > epsilon && iter < maxit) { :
  missing value where TRUE/FALSE needed"

It seems like EM is not able to calculate the log-lik value (NaN) at the first
iteration itself. Any idea why that can happen?
It works fine for the other genes in the loop. I tried looking for differences
in the inputs, but could not come up with anything striking.



There are lots of ways to get NaN in numerical calculations.  A common 
one, if you are using log() to calculate log likelihoods, is that rounding 
error gives you a negative likelihood, and then log(lik) comes out to NaN.


You just need to look really closely at each step of your calculations. 
Avoid using log(); use the functions that build it in (e.g. instead of 
log(dnorm(x)), use dnorm(x, log = TRUE)).
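
For instance (illustrative only): with an extreme value the density
underflows to zero, so taking log() afterwards is already too late, while
the log= argument stays finite:

log(dnorm(40))         # dnorm(40) underflows to 0, so this is -Inf
dnorm(40, log = TRUE)  # about -800.9, computed directly on the log scale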


Duncan Murdoch

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] NaN Log-lik value in EM algorithm (fitting Gamma mixture model)

2016-09-14 Thread Aanchal Sharma
Hi,

I am trying to fit a Gamma mixture model to my data (residual values obtained
after fitting a Generalized Linear Model) using gammamixEM. It is part of a
script which does this for multiple datasets in a loop. The code runs fine
for some datasets but terminates for others, giving the following error:

" iteration = 1  log-lik diff = NaN  log-lik = NaN
Error in while (diff > epsilon && iter < maxit) { :
  missing value where TRUE/FALSE needed"

It seems like EM is not able to calculate the log-lik value (NaN) at the first
iteration itself. Any idea why that can happen?
It works fine for the other genes in the loop. I tried looking for differences
in the inputs, but could not come up with anything striking.

Regards
Anchal



-- 
Anchal Sharma, PhD
Postdoctoral Fellow
195, Little Albany street,
Cancer Institute of New Jersey
Rutgers University
NJ-08901

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cross-classified multilevel binary logistic regression model with random effects at level 2

2016-09-14 Thread MACDOUGALL Margaret
Thank you for this valued advice, David, in response to which I have sent a 
very similar message to the list you recommend.  I would also welcome 
suggestions from members of the r-help@r-project list, where appropriate.

Best wishes

Margaret






-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: 14 September 2016 16:17
To: MACDOUGALL Margaret 
Cc: r-help@r-project.org
Subject: Re: [R] Cross-classified multilevel binary logistic regression model 
with random effects at level 2


> On Sep 14, 2016, at 6:05 AM, MACDOUGALL Margaret 
>  wrote:
> 
> Hello
> 
> I am not a seasoned R user and am therefore keen to identify a book chapter 
> that can provide structured advice on setting up the type of model I am 
> interested in using R. I would like to run a cross-classified multilevel 
> binary logistic regression model. The model contains two level 2 random 
> effects variables. These variables are crossed to form a cross-classified 
> design. The model has subjects at level one and these subjects are nested 
> within each of the two level two variables. I understand that the R package 
> lme4 may be suitable for running my model and that there are several 
> published books on running mixed models in R. If a list member is able to 
> kindly recommend whether one of these books is particularly helpful in 
> helping less experienced R users fully understand how to use this (or an 
> alternative) program specifically for the model I have outlined above, I 
> would be most grateful for recommendations.

You would get the widest and most knowledgeable audience for this question at 
the MixedModels mailing list.

https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-- 


David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating dataframe with subtotals by all fields and totals of subtotals

2016-09-14 Thread Peter Lomas
Hello R-Helpers,

I'm trying to to create a subtotal category for each column in a
dataset, and totals of subtotals, resulting in one data frame.  I
figure I could do this by a whack of aggregate() and rbind(), but I'm
hoping there is a simpler way.

Below is a sample dataset. Underneath I create an "All" salesmen
subtotal and rbind it with the original dataset.  I could do that for
"Drink" and "Region", then also do combinations of salesmen, drink,
and region subtotals.  However, I'm hoping somebody out there is more
clever than I am.

Thanks!
Peter


dat <- structure(list(
  Date = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
                     1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
                     1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L),
                   .Label = c("2012-01", "2012-02", "2012-03"), class = "factor"),
  Region = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L,
                       2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
                       3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
                     .Label = c("Zone1", "Zone2", "Zone3"), class = "factor"),
  Drink = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
                      1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
                      2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
                    .Label = c("Cola", "Orange Juice"), class = "factor"),
  Salesman = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
                         1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
                         2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
                       .Label = c("Joe", "Marty"), class = "factor"),
  Sales = c(10L, 36L, 9L, 39L, 12L, 61L, 62L, 28L, 82L, 1L, 38L, 14L,
            55L, 50L, 62L, 64L, 69L, 65L, 28L, 85L, 66L, 66L, 75L, 59L,
            31L, 14L, 93L, 35L, 24L, 11L, 4L, 30L, 2L, 17L, 36L, 47L)),
  .Names = c("Date", "Region", "Drink", "Salesman", "Sales"),
  class = "data.frame", row.names = c(NA, -36L))



all.salesman <- aggregate(Sales~Date+Region+Drink, data=dat, FUN=sum)
all.salesman$Salesman <- "All"
dat <- rbind(dat, all.salesman)

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread Duncan Murdoch

On 14/09/2016 2:40 PM, jeremiah rounds wrote:

"If you want to add variable to data.frame you have to use attach, detach.
Right?"

Not quite.  Use it like a list to add a variable to a data.frame

e.g.
df = list()
df$var1 = 1:10
df = as.data.frame(df)
df$var2 = 1:10
df[["var3"]] = 1:10
df
df = as.list(df)
df$var4 = 1:10
as.data.frame(df)

Ironically the primary reason to use a data.frame in my head is to signal
that you are thinking of your data as a row-oriented tabular storage.
  "Ironic" because in technical detail that is not a requirement to be a
data.frame, but when I reflect on the typical way a seasoned R programmer
approaches list and data.frames that is basically what they are
communicating.


I believe it is intended to be a requirement.  You can construct things 
with class "data.frame" that don't have that structure, but lots of 
stuff will go wrong if you do.


Duncan Murdoch


I was going to post that a reason to use data.frames is to take advantages
of optimizations and syntax sugar for data.frames, but in reality if code
does not assume a row-oriented data structure in a data.frame there is not
much I can think of that exists in the way of optimization.  For example,
we could point to "subset" and say that is a reason to use data.frames and
not list, but that only works if you use data.frame in a conventional way.

In the end, my advice to you is if it is a table make it a data.frame and
if it is not easily thought of as a table or row-oriented data structure
keep it as a list.

Thanks,
Jeremiah





On Wed, Sep 14, 2016 at 11:15 AM, Alaios via R-help 
wrote:

> thanks for all the answers. I think also ggplot2 requires data.frames.If
> you want to add variable to data.frame you have to use attach, detach.
> Right?Any more links that discuss thoe two different approaches?Alex
>
> On Wednesday, September 14, 2016 5:34 PM, Bert Gunter <
> bgunter.4...@gmail.com> wrote:
>
>
>  This is partially a matter of subjectve opinion, and so pointless; but
> I would point out that data frames are the canonical structure for a
> great many of R's modeling and graphics functions, e.g. lm, xyplot,
> etc.
>
> As for mutate() etc., that's about UI's and user friendliness, and
> imho my ho is meaningless.
>
> Best,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Wed, Sep 14, 2016 at 6:01 AM, Alaios via R-help 
> wrote:
> > Hi all,I have seen data.frames and operations from the mutate package
> getting really popular. In the last years I have been using extensively
> lists, is there any reason to not use lists and use other data types for
> data manipulation and storage?
> > Any article that describe their differences? I would like to thank you
> for your replyRegardsAlex
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Maximum # of DLLs reached, or, how to clean up after yourself?

2016-09-14 Thread Henrik Bengtsson
As Jeff says, I think the common use case is to run/rerun in fresh R sessions.

But, yes, if you'd like to have each script clean up after itself,
then you need to check with pkgs0 <- loadedNamespaces() to see what
packages are loaded when the script starts (not just attached) and
then unload the ones added at the end by pkgsDiff <-
setdiff(loadedNamespaces(), pkgs0).   However, it's not as simple as
calling unloadNamespace(pkgsDiff), because they need to be unloaded in
an order that is compatible with the package dependencies.   One way
is to too use while(length(pkgDiffs) > 0) loop over with a
try(unloadNamespace(pkg)) until all are unloaded.   At the end, run
R.utils::gcDLLs() too (now on CRAN).
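
A rough sketch of that clean-up loop (untested; "script2.r" below is just a
stand-in for whatever loads the extra packages):

pkgs0 <- loadedNamespaces()                     # snapshot before running
source("script2.r")                             # hypothetical script
pkgsDiff <- setdiff(loadedNamespaces(), pkgs0)
while (length(pkgsDiff) > 0) {
  for (pkg in pkgsDiff) try(unloadNamespace(pkg), silent = TRUE)
  remaining <- setdiff(loadedNamespaces(), pkgs0)
  if (length(remaining) == length(pkgsDiff)) break  # no progress; give up
  pkgsDiff <- remaining
}
R.utils::gcDLLs()                               # unregister leftover DLLs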

unloadNamespace("foo") should result in the same as
detach("package:foo", unload=TRUE) [anyone correct me if I'm wrong].

Hope this helps

Henrik

On Wed, Sep 14, 2016 at 6:41 AM, Jeff Newmiller
 wrote:
> I never detach packages. I rarely load more than 6 or 7 packages directly 
> before restarting R. I frequently re-run my scripts in new R sessions to 
> confirm reproducibility.
> --
> Sent from my phone. Please excuse my brevity.
>
> On September 14, 2016 1:49:55 AM PDT, Alexander Shenkin  
> wrote:
>>Hi Henrik,
>>
>>Thanks for your reply.  I didn't realize that floating DLLs were an
>>issue (good to know).  My query is actually a bit more basic.  I'm
>>actually wondering how folks manage their loading and unloading of
>>packages when calling scripts within scripts.
>>
>>Quick example:
>>Script1:
>>   library(package1)
>>   source("script2.r")
>>   # do stuff reliant on package1
>>   detach("package:package1", unload=TRUE)
>>
>>Script2:
>>   library(package1)
>>   library(package2)
>>   # do stuff reliant on package1 and package2
>>   detach("package:package1", unload=TRUE)
>>   detach("package:package2", unload=TRUE)
>>
>>Script2 breaks Script1 by unloading package1 (though unloading package2
>>
>>is ok).  I will have to test whether each package is loaded in Script2
>>before loading it, and use that list when unloading at the end of the
>>Script2.  *Unless there's a better way to do it* (which is my question
>>-
>>is there?).  I'm possibly just pushing the whole procedural scripting
>>thing too far, but I also think that this likely isn't uncommon in R.
>>
>>Any thoughts greatly appreciated!
>>
>>Thanks,
>>Allie
>>
>>On 9/13/2016 7:23 PM, Henrik Bengtsson wrote:
>>> In R.utils (>= 2.4.0), which I hope to submitted to CRAN today or
>>> tomorrow, you can simply call:
>>>
>>>R.utils::gcDLLs()
>>>
>>> It will look at base::getLoadedDLLs() and its content and compare to
>>> loadedNamespaces() and unregister any "stray" DLLs that remain after
>>> corresponding packages have been unloaded.
>>>
>>> Until the new version is on CRAN, you can install it via
>>>
>>>
>>source("http://callr.org/install#HenrikBengtsson/R.utils@develop")
>>>
>>> or alternatively just source() the source file:
>>>
>>>
>>source("https://raw.githubusercontent.com/HenrikBengtsson/R.utils/develop/R/gcDLLs.R")
>>>
>>>
>>> DISCUSSION:
>>> (this might be better suited for R-devel; feel free to move it there)
>>>
>>> As far as I understand the problem, running into this error / limit
>>is
>>> _not_ the fault of the user.  Instead, I'd argue that it is the
>>> responsibility of package developers to make sure to unregister any
>>> registered DLLs of theirs when the package is unloaded.  A developer
>>> can do this by adding the following to their package:
>>>
>>> .onUnload <- function(libpath) {
>>> library.dynam.unload(utils::packageName(), libpath)
>>>  }
>>>
>>> That should be all - then the DLL will be unloaded as soon as the
>>> package is unloaded.
>>>
>>> I would like to suggest that 'R CMD check' would include a check that
>>> asserts when a package is unloaded it does not leave any registered
>>> DLLs behind, e.g.
>>>
>>> * checking whether the namespace can be unloaded cleanly ... WARNING
>>>   Unloading the namespace does not unload DLL
>>> * checking loading without being on the library search path ... OK
>>>
>>> For further details on my thoughts on this, see
>>> https://github.com/HenrikBengtsson/Wishlist-for-R/issues/29.
>>>
>>> Hope this helps
>>>
>>> Henrik
>>>
>>> On Tue, Sep 13, 2016 at 6:05 AM, Alexander Shenkin 
>>wrote:
 Hello all,

 I have a number of analyses that call bunches of sub-scripts, and in
>>the
 end, I get the "maximal number of DLLs reached" error.  This has
>>been asked
 before (e.g.

>>http://stackoverflow.com/questions/36974206/r-maximal-number-of-dlls-reached),
 and the general answer is, "just clean up after yourself".

 Assuming there are no plans to raise this 100-DLL limit in the near
>>future,
 my question becomes, what is best practice for cleaning up
>>(detaching)
 loaded packages in scripts, when those scripts are sometimes called
>>from
 other 

Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread jeremiah rounds
There is also this syntax for adding variables
df[, "var5"] = 1:10

and the syntax sugar for row-oriented storage:
df[1:5,]

On Wed, Sep 14, 2016 at 11:40 AM, jeremiah rounds 
wrote:

> "If you want to add variable to data.frame you have to use attach, detach.
> Right?"
>
> Not quite.  Use it like a list to add a variable to a data.frame
>
> e.g.
> df = list()
> df$var1 = 1:10
> df = as.data.frame(df)
> df$var2 = 1:10
> df[["var3"]] = 1:10
> df
> df = as.list(df)
> df$var4 = 1:10
> as.data.frame(df)
>
> Ironically the primary reason to use a data.frame in my head is to signal
> that you are thinking of your data as a row-oriented tabular storage.
>  "Ironic" because in technical detail that is not a requirement to be a
> data.frame, but when I reflect on the typical way a seasoned R programmer
> approaches list and data.frames that is basically what they are
> communicating.
>
> I was going to post that a reason to use data.frames is to take advantages
> of optimizations and syntax sugar for data.frames, but in reality if code
> does not assume a row-oriented data structure in a data.frame there is not
> much I can think of that exists in the way of optimization.  For example,
> we could point to "subset" and say that is a reason to use data.frames and
> not list, but that only works if you use data.frame in a conventional way.
>
> In the end, my advice to you is if it is a table make it a data.frame and
> if it is not easily thought of as a table or row-oriented data structure
> keep it as a list.
>
> Thanks,
> Jeremiah
>
>
>
>
>
> On Wed, Sep 14, 2016 at 11:15 AM, Alaios via R-help 
> wrote:
>
>> thanks for all the answers. I think also ggplot2 requires data.frames.If
>> you want to add variable to data.frame you have to use attach, detach.
>> Right?Any more links that discuss thoe two different approaches?Alex
>>
>> On Wednesday, September 14, 2016 5:34 PM, Bert Gunter <
>> bgunter.4...@gmail.com> wrote:
>>
>>
>>  This is partially a matter of subjectve opinion, and so pointless; but
>> I would point out that data frames are the canonical structure for a
>> great many of R's modeling and graphics functions, e.g. lm, xyplot,
>> etc.
>>
>> As for mutate() etc., that's about UI's and user friendliness, and
>> imho my ho is meaningless.
>>
>> Best,
>> Bert
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Wed, Sep 14, 2016 at 6:01 AM, Alaios via R-help 
>> wrote:
>> > Hi all,I have seen data.frames and operations from the mutate package
>> getting really popular. In the last years I have been using extensively
>> lists, is there any reason to not use lists and use other data types for
>> data manipulation and storage?
>> > Any article that describe their differences? I would like to thank you
>> for your replyRegardsAlex
>> >[[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread jeremiah rounds
"If you want to add variable to data.frame you have to use attach, detach.
Right?"

Not quite.  Use it like a list to add a variable to a data.frame

e.g.
df = list()
df$var1 = 1:10
df = as.data.frame(df)
df$var2 = 1:10
df[["var3"]] = 1:10
df
df = as.list(df)
df$var4 = 1:10
as.data.frame(df)

Ironically, the primary reason to use a data.frame, in my head, is to signal
that you are thinking of your data as row-oriented tabular storage.
 "Ironic" because in technical detail that is not a requirement to be a
data.frame, but when I reflect on the typical way a seasoned R programmer
approaches lists and data.frames, that is basically what they are
communicating.

I was going to post that a reason to use data.frames is to take advantages
of optimizations and syntax sugar for data.frames, but in reality if code
does not assume a row-oriented data structure in a data.frame there is not
much I can think of that exists in the way of optimization.  For example,
we could point to "subset" and say that is a reason to use data.frames and
not list, but that only works if you use data.frame in a conventional way.

In the end, my advice to you is if it is a table make it a data.frame and
if it is not easily thought of as a table or row-oriented data structure
keep it as a list.

Thanks,
Jeremiah





On Wed, Sep 14, 2016 at 11:15 AM, Alaios via R-help 
wrote:

> thanks for all the answers. I think also ggplot2 requires data.frames.If
> you want to add variable to data.frame you have to use attach, detach.
> Right?Any more links that discuss thoe two different approaches?Alex
>
> On Wednesday, September 14, 2016 5:34 PM, Bert Gunter <
> bgunter.4...@gmail.com> wrote:
>
>
>  This is partially a matter of subjectve opinion, and so pointless; but
> I would point out that data frames are the canonical structure for a
> great many of R's modeling and graphics functions, e.g. lm, xyplot,
> etc.
>
> As for mutate() etc., that's about UI's and user friendliness, and
> imho my ho is meaningless.
>
> Best,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Wed, Sep 14, 2016 at 6:01 AM, Alaios via R-help 
> wrote:
> > Hi all,I have seen data.frames and operations from the mutate package
> getting really popular. In the last years I have been using extensively
> lists, is there any reason to not use lists and use other data types for
> data manipulation and storage?
> > Any article that describe their differences? I would like to thank you
> for your replyRegardsAlex
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread Alaios via R-help
thanks for all the answers. I think also ggplot2 requires data.frames. If you 
want to add a variable to a data.frame you have to use attach, detach. Right? 
Any more links that discuss those two different approaches? Alex 

On Wednesday, September 14, 2016 5:34 PM, Bert Gunter 
 wrote:
 

 This is partially a matter of subjective opinion, and so pointless; but
I would point out that data frames are the canonical structure for a
great many of R's modeling and graphics functions, e.g. lm, xyplot,
etc.

As for mutate() etc., that's about UI's and user friendliness, and
imho my ho is meaningless.

Best,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Sep 14, 2016 at 6:01 AM, Alaios via R-help  wrote:
> Hi all,I have seen data.frames and operations from the mutate package getting 
> really popular. In the last years I have been using extensively lists, is 
> there any reason to not use lists and use other data types for data 
> manipulation and storage?
> Any article that describe their differences? I would like to thank you for 
> your replyRegardsAlex
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


   
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] gsub: replacing slashes in a string

2016-09-14 Thread Joe Ceradini
Thanks Jim!

Joe

On Wed, Sep 14, 2016 at 11:06 AM, jim holtman  wrote:

> try this:
>
> > gsub("\\\\", "/", test)
> [1] "8/24/2016" "8/24/2016" "6/16/2016" "6/16/2016"
>
>
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Wed, Sep 14, 2016 at 12:25 PM, Joe Ceradini 
> wrote:
>
>> Hi all,
>>
>> There are many R help posts out there dealing with slashes in gsub. I
>> understand slashes are "escape characters" and thus need to be treated
>> differently, and display differently in R. However, I'm still stuck on
>> find-replace problem, and would appreciate any tips. Thanks!
>>
>> GOAL: replace all "\\" with "/", so when export file to csv all slashes
>> are
>> the same.
>>
>> (test <- c("8/24/2016", "8/24/2016", "6/16/2016", "6\\16\\2016"))
>>
>> Lengths are all the same, I think (?) because of how R displays/deals with
>> slashes. However, when I export this to a csv, e.g., there are still
>> double
>> slashes, which is a problem for me.
>> nchar(test)
>>
>> Change direction of slashes - works.
>> (test2 <- gsub("\\", "//", test, fixed = TRUE))
>>
>> Now lengths are now not the same
>> nchar(test2)
>>
>> Change from double to single - does not work. Is this because it actually
>> is a single slash but R is just displaying it as double? Regardless, when
>> I
>> export from R the double slashes do appear.
>> gsub("", "//", test2, fixed = TRUE)
>> gsub("", "//", test2)
>> gsub("", "", test2, fixed = TRUE)
>> gsub("", "", test2)
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>


-- 
Cooperative Fish and Wildlife Research Unit
Zoology and Physiology Dept.
University of Wyoming
joecerad...@gmail.com / 914.707.8506
wyocoopunit.org

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gsub: replacing slashes in a string

2016-09-14 Thread jim holtman
try this:

> gsub("\\\\", "/", test)
[1] "8/24/2016" "8/24/2016" "6/16/2016" "6/16/2016"




Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Wed, Sep 14, 2016 at 12:25 PM, Joe Ceradini 
wrote:

> Hi all,
>
> There are many R help posts out there dealing with slashes in gsub. I
> understand slashes are "escape characters" and thus need to be treated
> differently, and display differently in R. However, I'm still stuck on
> find-replace problem, and would appreciate any tips. Thanks!
>
> GOAL: replace all "\\" with "/", so when export file to csv all slashes are
> the same.
>
> (test <- c("8/24/2016", "8/24/2016", "6/16/2016", "6\\16\\2016"))
>
> Lengths are all the same, I think (?) because of how R displays/deals with
> slashes. However, when I export this to a csv, e.g., there are still double
> slashes, which is a problem for me.
> nchar(test)
>
> Change direction of slashes - works.
> (test2 <- gsub("\\", "//", test, fixed = TRUE))
>
> Now lengths are now not the same
> nchar(test2)
>
> Change from double to single - does not work. Is this because it actually
> is a single slash but R is just displaying it as double? Regardless, when I
> export from R the double slashes do appear.
> gsub("", "//", test2, fixed = TRUE)
> gsub("", "//", test2)
> gsub("", "", test2, fixed = TRUE)
> gsub("", "", test2)
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gsub: replacing slashes in a string

2016-09-14 Thread Joe Ceradini
Wow. Thanks David and Rui. I thought I needed to "escape" the replacement
slash as well, which is why I had "//" rather than "/". I swear I had tried
all the slash combos, but missed the obvious one. Much easier than I made
it out to be.
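
For the record, the two equivalent calls (regex vs. fixed) on the test
vector below behave the same way:

gsub("\\\\", "/", test)              # regex: "\\\\" matches one literal backslash
gsub("\\", "/", test, fixed = TRUE)  # fixed: the pattern is the backslash itself
# [1] "8/24/2016" "8/24/2016" "6/16/2016" "6/16/2016"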

Thanks!
Joe

On Wed, Sep 14, 2016 at 10:59 AM, David L Carlson  wrote:

> Is this what you want?
>
> > (test2 <- gsub("\\", "/", test, fixed = TRUE))
> [1] "8/24/2016" "8/24/2016" "6/16/2016" "6/16/2016"
> > nchar(test2)
> [1] 9 9 9 9
> > write.csv(test2)
> "","x"
> "1","8/24/2016"
> "2","8/24/2016"
> "3","6/16/2016"
> "4","6/16/2016"
>
> -
> David L Carlson
> Department of Anthropology
Texas A&M University
> College Station, TX 77840-4352
>
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Joe
> Ceradini
> Sent: Wednesday, September 14, 2016 11:25 AM
> To: Zilefac
> Subject: [R] gsub: replacing slashes in a string
>
> Hi all,
>
> There are many R help posts out there dealing with slashes in gsub. I
> understand slashes are "escape characters" and thus need to be treated
> differently, and display differently in R. However, I'm still stuck on
> find-replace problem, and would appreciate any tips. Thanks!
>
> GOAL: replace all "\\" with "/", so when export file to csv all slashes are
> the same.
>
> (test <- c("8/24/2016", "8/24/2016", "6/16/2016", "6\\16\\2016"))
>
> Lengths are all the same, I think (?) because of how R displays/deals with
> slashes. However, when I export this to a csv, e.g., there are still double
> slashes, which is a problem for me.
> nchar(test)
>
> Change direction of slashes - works.
> (test2 <- gsub("\\", "//", test, fixed = TRUE))
>
> Now lengths are now not the same
> nchar(test2)
>
> Change from double to single - does not work. Is this because it actually
> is a single slash but R is just displaying it as double? Regardless, when I
> export from R the double slashes do appear.
> gsub("", "//", test2, fixed = TRUE)
> gsub("", "//", test2)
> gsub("", "", test2, fixed = TRUE)
> gsub("", "", test2)
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Cooperative Fish and Wildlife Research Unit
Zoology and Physiology Dept.
University of Wyoming
joecerad...@gmail.com / 914.707.8506
wyocoopunit.org

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gsub: replacing slashes in a string

2016-09-14 Thread David L Carlson
Is this what you want?

> (test2 <- gsub("\\", "/", test, fixed = TRUE))
[1] "8/24/2016" "8/24/2016" "6/16/2016" "6/16/2016"
> nchar(test2)
[1] 9 9 9 9
> write.csv(test2)
"","x"
"1","8/24/2016"
"2","8/24/2016"
"3","6/16/2016"
"4","6/16/2016"

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Joe Ceradini
Sent: Wednesday, September 14, 2016 11:25 AM
To: Zilefac
Subject: [R] gsub: replacing slashes in a string

Hi all,

There are many R help posts out there dealing with slashes in gsub. I
understand slashes are "escape characters" and thus need to be treated
differently, and display differently in R. However, I'm still stuck on
find-replace problem, and would appreciate any tips. Thanks!

GOAL: replace all "\\" with "/", so when export file to csv all slashes are
the same.

(test <- c("8/24/2016", "8/24/2016", "6/16/2016", "6\\16\\2016"))

Lengths are all the same, I think (?) because of how R displays/deals with
slashes. However, when I export this to a csv, e.g., there are still double
slashes, which is a problem for me.
nchar(test)

Change direction of slashes - works.
(test2 <- gsub("\\", "//", test, fixed = TRUE))

Now lengths are now not the same
nchar(test2)

Change from double to single - does not work. Is this because it actually
is a single slash but R is just displaying it as double? Regardless, when I
export from R the double slashes do appear.
gsub("", "//", test2, fixed = TRUE)
gsub("", "//", test2)
gsub("", "", test2, fixed = TRUE)
gsub("", "", test2)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gsub: replacing slashes in a string

2016-09-14 Thread ruipbarradas

Hello,

I may be failing to understand the problem, but isn't the following what you want?

(test2 <- gsub("\\", "/", test, fixed = TRUE))
[1] "8/24/2016" "8/24/2016" "6/16/2016" "6/16/2016"

Hope this helps,

Rui Barradas


Citando Joe Ceradini :


Hi all,

There are many R help posts out there dealing with slashes in gsub. I
understand slashes are "escape characters" and thus need to be treated
differently, and display differently in R. However, I'm still stuck on
find-replace problem, and would appreciate any tips. Thanks!

GOAL: replace all "\\" with "/", so when export file to csv all slashes are
the same.

(test <- c("8/24/2016", "8/24/2016", "6/16/2016", "6\\16\\2016"))

Lengths are all the same, I think (?) because of how R displays/deals with
slashes. However, when I export this to a csv, e.g., there are still double
slashes, which is a problem for me.
nchar(test)

Change direction of slashes - works.
(test2 <- gsub("\\", "//", test, fixed = TRUE))

Now lengths are now not the same
nchar(test2)

Change from double to single - does not work. Is this because it actually
is a single slash but R is just displaying it as double? Regardless, when I
export from R the double slashes do appear.
gsub("", "//", test2, fixed = TRUE)
gsub("", "//", test2)
gsub("", "", test2, fixed = TRUE)
gsub("", "", test2)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R-es] from pdf to csv

2016-09-14 Thread ignacio holzinger
Greetings.
I was going to tell you the same thing as Eric. Those "badly formed" tables
where cells get merged are hard to handle on "autopilot". You almost always
have to do some manual work.
Of the solutions you have been offered, this last one is the one I usually
use.
Good luck.

On 14 Sep 2016 18:37, "eric"  wrote:

> Hi Jose, I often have to extract data from tables in PDF articles
> too. What I do is the following, which is not as automatic as one would
> like, but at least I don't have to copy the data item by item:
>
> 1. On Linux there is the pdftotext tool; when you use it with the
> -layout option it preserves, as far as possible, the original layout of
> the text. It has worked quite well for me with tables.
>
> 2. With the above you get a plain text file.
>
> 3. I open the file and delete everything except the table I need.
>
> 4. I import it into R with read.table() or a similar function.
>
>
> Now, your table is fairly complex; I mean that in order to use it as a
> data.frame you will have to do some extra work, such as including some
> of the headers in additional columns.
>
> That's it, I hope it helps.
>
>
> Regards, Eric.
>
>
>
>
>
> On 09/10/2016 07:30 PM, Dr. José A. Betancourt Bethencourt wrote:
>
>> Dear all,
>>
>> Sometimes there is epidemiological information in weekly PDF reports,
>> such as the one attached, that we would like to convert to csv or txt
>> USING R so that we can analyse it statistically. We would appreciate
>> your help if you could give us a script; the pdftable package did not
>> work for me.
>>
>> Regards
>>
>> José
>>
>>
>>
>> ___
>> R-help-es mailing list
>> R-help-es@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-help-es
>>
>>
> --
> Forest Engineer
> Master in Environmental and Natural Resource Economics
> Ph.D. student in Sciences of Natural Resources at La Frontera University
> Member in AguaDeTemu2030, citizen movement for Temuco with green city
> standards for living
>
> Note: Accents have been omitted to ensure compatibility with some mail
> readers.
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es
>

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R-es] from pdf to csv

2016-09-14 Thread eric
Hi Jose, I often have to extract data from tables in PDF articles too. What 
I do is the following, which is not as automatic as one would like, but at 
least I don't have to copy the data item by item:


1. On Linux there is the pdftotext tool; when you use it with the -layout 
option it preserves, as far as possible, the original layout of the text. 
It has worked quite well for me with tables.


2. With the above you get a plain text file.

3. I open the file and delete everything except the table I need.

4. I import it into R with read.table() or a similar function.
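
For example, a rough sketch of that workflow driven from R (the file names
here are just placeholders):

system("pdftotext -layout report.pdf report.txt")   # step 1: convert the PDF
# ... edit report.txt by hand so only the table remains (step 3) ...
tab <- read.table("report.txt", header = TRUE, stringsAsFactors = FALSE)
head(tab)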


Now, your table is fairly complex; I mean that in order to use it as a 
data.frame you will have to do some extra work, such as including some of 
the headers in additional columns.


That's it, I hope it helps.


Regards, Eric.





On 09/10/2016 07:30 PM, Dr. José A. Betancourt Bethencourt wrote:

Dear all,

Sometimes there is epidemiological information in weekly PDF reports, such
as the one attached, that we would like to convert to csv or txt USING R so
that we can analyse it statistically. We would appreciate your help if you
could give us a script; the pdftable package did not work for me.

Regards

José



___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es



--
Forest Engineer
Master in Environmental and Natural Resource Economics
Ph.D. student in Sciences of Natural Resources at La Frontera University
Member in AguaDeTemu2030, citizen movement for Temuco with green city 
standards for living


Note: Accents have been omitted to ensure compatibility with some mail 
readers.


___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


[R] gsub: replacing slashes in a string

2016-09-14 Thread Joe Ceradini
Hi all,

There are many R help posts out there dealing with slashes in gsub. I
understand slashes are "escape characters" and thus need to be treated
differently, and display differently in R. However, I'm still stuck on
find-replace problem, and would appreciate any tips. Thanks!

GOAL: replace all "\\" with "/", so that when I export the file to csv all
slashes are the same.

(test <- c("8/24/2016", "8/24/2016", "6/16/2016", "6\\16\\2016"))

Lengths are all the same, I think (?) because of how R displays/deals with
slashes. However, when I export this to a csv, e.g., there are still double
slashes, which is a problem for me.
nchar(test)

Change direction of slashes - works.
(test2 <- gsub("\\", "//", test, fixed = TRUE))

Now the lengths are not the same
nchar(test2)

Change from double to single - does not work. Is this because it actually
is a single slash but R is just displaying it as double? Regardless, when I
export from R the double slashes do appear.
gsub("", "//", test2, fixed = TRUE)
gsub("", "//", test2)
gsub("", "", test2, fixed = TRUE)
gsub("", "", test2)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread Bert Gunter
This is partially a matter of subjective opinion, and so pointless; but
I would point out that data frames are the canonical structure for a
great many of R's modeling and graphics functions, e.g. lm, xyplot,
etc.

As for mutate() etc., that's about UI's and user friendliness, and
imho my ho is meaningless.

Best,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Sep 14, 2016 at 6:01 AM, Alaios via R-help  wrote:
> Hi all,I have seen data.frames and operations from the mutate package getting 
> really popular. In the last years I have been using extensively lists, is 
> there any reason to not use lists and use other data types for data 
> manipulation and storage?
> Any article that describe their differences? I would like to thank you for 
> your replyRegardsAlex
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread Marc Schwartz

> On Sep 14, 2016, at 8:01 AM, Alaios via R-help  wrote:
> 
> Hi all,I have seen data.frames and operations from the mutate package getting 
> really popular. In the last years I have been using extensively lists, is 
> there any reason to not use lists and use other data types for data 
> manipulation and storage?
> Any article that describe their differences? I would like to thank you for 
> your replyRegardsAlex

Hi,

Presuming that you are referring to the mutate() **function**, which is in the 
dplyr package on CRAN, that package provides a variety of functions to 
manipulate data in R.

Data frames **are** lists with a data.frame class attribute, but with the 
proviso that each column in the data frame, which is a list element, has the 
same length, but like a list, may have different data types (e.g. character, 
numeric, etc.). 

Thus, a data frame is effectively a rectangular data structure, conceptually in 
the same manner as an Excel worksheet.

A list, which is a more generic data structure, can contain list elements of 
variable lengths and data types. 
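
A small sketch illustrating the point (toy objects, not from the original question):

  df <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)
  is.list(df)    # TRUE: a data frame is a list underneath
  class(df)      # "data.frame" -- the class attribute sitting on top of that list
  lengths(df)    # every column (list element) has the same length
  l <- list(id = 1:3, name = c("a", "b"))
  lengths(l)     # a plain list may hold elements of different lengths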

You might want to begin by reviewing:

  
https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Lists-and-data-frames

which is a section on lists and data frames in the Introduction To R Manual.

It would be surprising, to me at least, that you have been using R for several 
years and have not come across data frames, since they are used in many typical 
operations, including regression models and the like.

Regards,

Marc Schwartz
 
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cross-classified multilevel binary logistic regression model with random effects at level 2

2016-09-14 Thread David Winsemius

> On Sep 14, 2016, at 6:05 AM, MACDOUGALL Margaret 
>  wrote:
> 
> Hello
> 
> I am not a seasoned R user and am therefore keen to identify a book chapter 
> that can provide structured advice on setting up the type of model I am 
> interested in using R. I would like to run a cross-classified multilevel 
> binary logistic regression model. The model contains two level 2 random 
> effects variables. These variables are crossed to form a cross-classified 
> design. The model has subjects at level one and these subjects are nested 
> within each of the two level two variables. I understand that the R package 
> lme4 may be suitable for running my model and that there are several 
> published books on running mixed models in R. If a list member is able to 
> kindly recommend whether one of these books is particularly helpful in 
> helping less experienced R users fully understand how to use this (or an 
> alternative) program specifically for the model I have outlined above, I 
> would be most grateful for recommendations.

You would get the widest and most knowledgeable audience for this question at 
the MixedModels mailing list.

https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
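
For orientation only, a minimal sketch of how such a model is usually written
with lme4::glmer; every name below (y, x1, x2, f1, f2, mydata) is a hypothetical
placeholder, not taken from the original question:

  library(lme4)
  # y: 0/1 outcome at level 1; f1, f2: the two crossed level-2 grouping factors
  m <- glmer(y ~ x1 + x2 + (1 | f1) + (1 | f2),
             data = mydata, family = binomial)
  summary(m)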
-- 


David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Drill down reports in R

2016-09-14 Thread Manohar Reddy
Hi ,



  It would be great if someone could share links on how to generate
drill-down reports/tables using a combination of R and JavaScript/Ajax/some
other packages.



Note: I tried the *shinyTree* package but it did not meet my requirement.



Manu.

On Tue, Sep 13, 2016 at 11:02 PM, Greg Snow <538...@gmail.com> wrote:

> As has been mentioned, this really requires a GUI tool beyond just R.
> Luckily there are many GUI tools that have been linked to R and if
> Shiny (the shiniest of them) does not have something like this easily
> available so far then you may want to look elsewhere in the meantime.
>
> One option is the tcltk package which interfaces with the Tk GUI tools
> (and the tcl language) which does have the pieces to build in the
> expandable/drill down interface.  One implementation of this is the
> TkListView function in the TeachingDemos package which can be used to
> view list objects in this manner (it starts with the top level list
> objects, then when you click on a + symbol it will open that piece and
> show one level down).
>
> One possibility would be to create all of your results in a list, then
> view it with TkListView  (this will require all computations up front,
> whether anyone looks at them or not).
>
> Another option would be to start with TkListView and rewrite it to do
> things more dynamically and with the output that you want to show.
>
> There is also the shinyTree package on CRAN that allows a tree type
> object in shiny that may do what you want (I don't know what all
> options it allows and what it can display beyond what is in the
> Readme).
>
>
>
> On Tue, Sep 13, 2016 at 4:46 AM, Manohar Reddy 
> wrote:
> > Hi,
> >
> >
> >
> >   How to generate “Drill down reports ”  (like please refer below url)
> in R
> > using any package ? I did lot of research in google but I didn’t found
> > suitable link .
> >
> >  Can anyone help how to do that in R ?
> >
> >
> >
> > url :  http://bhushan.extreme-advice.com/drilldown-report-in-ssrs/
> >
> >
> >
> > Thanks in Advance !
> >
> > Manu.
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
>
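
As a rough sketch of the TkListView suggestion above (the 'results' list is a
made-up placeholder; assumes the TeachingDemos package is installed):

  library(TeachingDemos)
  results <- list(group_a = list(n = 10, summary = summary(rnorm(10))),
                  group_b = list(n = 20, summary = summary(rnorm(20))))
  TkListView(results)   # opens a Tk window; click the + symbols to drill down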



-- 


Thanks,
Manohar Reddy P
+91-9705302062.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] why data.frame, mutate package and not lists

2016-09-14 Thread Alaios via R-help
Hi all,
I have seen data.frames and operations from the mutate package getting
really popular. In the last years I have been using lists extensively; is there
any reason not to use lists and to use other data types instead for data
manipulation and storage?
Is there any article that describes their differences? Thank you for your reply.
Regards,
Alex
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Upgrade R 3.2 to 3.3 using tar.gz file on Ubuntu 16.04

2016-09-14 Thread Alain Guillet

Dear Luigi,

You have to modify the /etc/apt/sources.list file in order to add a new 
repository to get a new R version. Everything is explained on the page 
https://cran.r-project.org/bin/linux/ubuntu/ .


Alain


On 13/09/16 15:00, Luigi Marongiu wrote:

Dear all,
I am working on Linux Ubuntu 16.04 and I have installed R 3.2. I need
to upgrade to R 3.3 and I tried several options available online with
no success. I downloaded the tar.gz file for R 3.3 and I would like to
ask how can I use this file in order to accomplish the upgrade.
Many thanks,
Luigi

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
.



--
Alain Guillet
Statistician and Computer Scientist

SMCS - IMMAQ - Université catholique de Louvain
http://www.uclouvain.be/smcs

Bureau c.316
Voie du Roman Pays, 20 (bte L1.04.01)
B-1348 Louvain-la-Neuve
Belgium

Tel: +32 10 47 30 50

Accès: http://www.uclouvain.be/323631.html

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Arules Package: Rules subset with 'empty' left hand side (lhs)

2016-09-14 Thread Michael Hahsler

Hi all,

There is no item with the label "".

> itemLabels(rules)
[1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"

arules::subset(rules, subset=lhs %pin% "") should return an empty set or 
throw an error---I will fix that in the next release of arules.


To get the rules with 0 elements in the lhs do this:

> r0 <- rules [size(lhs(rules))==0]
> inspect(r0)

  lhs    rhs   support confidence lift
3 {}  => {1} 0.330   0.330  1
2 {}  => {3} 0.326   0.326  1
1 {}  => {2} 0.320   0.320  1

Hope this helps,
Michael

On 09/13/2016 08:30 AM, Tom D. Harray wrote:

Hello Luisfo,

thank you for the hint: Your suggestion

   arules::subset(rules, subset=lhs %pin% "")

gave 18 rules (out of 21) in my example, and not the 3 that I had expected.

Surprisingly the negation of the subset condition

   arules::subset(x = rules, subset = !(lhs %pin% ""))

returns the 3 rules with empty lhs.


Hello Martin,

I am adding you to this thread because the arules::subset() behaviour
appears to me to be a bug in arules, and I'd like to suggest adding an
explanation/example to the arules::subset() help.


Cheers,

Dirk

On 13 September 2016 at 05:10, Luisfo  wrote:

Dear Tom,

I think this is the line you need
  arules::subset(rules, subset=lhs %pin% "")
I found the solution here:
http://stackoverflow.com/questions/27926131/how-to-get-items-for-both-lhs-and-rhs-for-only-specific-columns-in-arules

One more thing. For printing the rules, I needed the inspect() command you
didn't provide.

I hope this helps.

Best,

Luisfo Chiroque
PhD Student | PhD Candidate
IMDEA Networks Institute
http://fourier.networks.imdea.org/people/~luis_nunez/

On 09/12/2016 04:39 PM, Tom D. Harray wrote:

Hello,

subsets of association rules (with respect to support, confidence, lift, or
items) can be obtained with the arules::subset() function; e.g.

  rm(list = ls(all.names = TRUE))
  library(arules)
  set.seed(42)

  x <- lapply(X = 1:500, FUN = function(i)
sample(x = 1:10, size = sample(1:5, 1), replace = FALSE)
  )
  x <- as(x, 'transactions')

  rules <- apriori(
data = x,
parameter = list(target = 'rules', minlen = 1, maxlen = 2,
  support = 0.10, confidence = 0.32)
  )
  rules <- arules::sort(x = rules, decreasing = TRUE, by ='support')

gives the rules
3  {}  => {1} 0.330   0.330  1.000
2  {}  => {3} 0.326   0.326  1.000
1  {}  => {2} 0.320   0.320  1.000
20 {3} => {1} 0.120   0.3680982  1.1154490
21 {1} => {3} 0.120   0.3636364  1.1154490
16 {4} => {3} 0.114   0.3677419  1.1280427
(...)

However, I cannot figure out (help/web) how to get the subset for the rules
with empty left hand side (lhs) like subset(rules, lhs == ''). I  could run
the
apriori() function twice and adjust the min/maxlen parameters as a band
aid fix.


So my question is: How do I subset() association rules with empty lhs?


Thanks and regards,

Dirk

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
  Michael Hahsler, Assistant Professor
  Department of Engineering Management, Information, and Systems
  Department of Computer Science and Engineering (by courtesy)
  Bobby B. Lyle School of Engineering
  Southern Methodist University, Dallas, Texas

  office: Caruth Hall, suite 337, room 311
  email:  mhahs...@lyle.smu.edu
  web:http://lyle.smu.edu/~mhahsler

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Cross-classified multilevel binary logistic regression model with random effects at level 2

2016-09-14 Thread MACDOUGALL Margaret
Hello

I am not a seasoned R user and am therefore keen to identify a book chapter 
that can provide structured advice on setting up the type of model I am 
interested in using R. I would like to run a cross-classified multilevel binary 
logistic regression model. The model contains two level 2 random effects 
variables. These variables are crossed to form a cross-classified design. The 
model has subjects at level one and these subjects are nested within each of 
the two level two variables. I understand that the R package lme4 may be 
suitable for running my model and that there are several published books on 
running mixed models in R. If a list member is able to kindly recommend whether 
one of these books is particularly helpful in helping less experienced R users 
fully understand how to use this (or an alternative) program specifically for 
the model I have outlined above, I would be most grateful for recommendations.

Thank you so much

Best wishes

Margaret

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Maximum # of DLLs reached, or, how to clean up after yourself?

2016-09-14 Thread Jeff Newmiller
I never detach packages. I rarely load more than 6 or 7 packages directly 
before restarting R. I frequently re-run my scripts in new R sessions to 
confirm reproducibility. 
-- 
Sent from my phone. Please excuse my brevity.

On September 14, 2016 1:49:55 AM PDT, Alexander Shenkin  
wrote:
>Hi Henrik,
>
>Thanks for your reply.  I didn't realize that floating DLLs were an 
>issue (good to know).  My query is actually a bit more basic.  I'm 
>actually wondering how folks manage their loading and unloading of 
>packages when calling scripts within scripts.
>
>Quick example:
>Script1:
>   library(package1)
>   source("script2.r")
>   # do stuff reliant on package1
>   detach("package:package1", unload=TRUE)
>
>Script2:
>   library(package1)
>   library(package2)
>   # do stuff reliant on package1 and package2
>   detach("package:package1", unload=TRUE)
>   detach("package:package2", unload=TRUE)
>
>Script2 breaks Script1 by unloading package1 (though unloading package2
>
>is ok).  I will have to test whether each package is loaded in Script2 
>before loading it, and use that list when unloading at the end of the 
>Script2.  *Unless there's a better way to do it* (which is my question
>- 
>is there?).  I'm possibly just pushing the whole procedural scripting 
>thing too far, but I also think that this likely isn't uncommon in R.
>
>Any thoughts greatly appreciated!
>
>Thanks,
>Allie
>
>On 9/13/2016 7:23 PM, Henrik Bengtsson wrote:
>> In R.utils (>= 2.4.0), which I hope to submitted to CRAN today or
>> tomorrow, you can simply call:
>>
>>R.utils::gcDLLs()
>>
>> It will look at base::getLoadedDLLs() and its content and compare to
>> loadedNamespaces() and unregister any "stray" DLLs that remain after
>> corresponding packages have been unloaded.
>>
>> Until the new version is on CRAN, you can install it via
>>
>>
>source("http://callr.org/install#HenrikBengtsson/R.utils@develop;)
>>
>> or alternatively just source() the source file:
>>
>>
>source("https://raw.githubusercontent.com/HenrikBengtsson/R.utils/develop/R/gcDLLs.R;)
>>
>>
>> DISCUSSION:
>> (this might be better suited for R-devel; feel free to move it there)
>>
>> As far as I understand the problem, running into this error / limit
>is
>> _not_ the fault of the user.  Instead, I'd argue that it is the
>> responsibility of package developers to make sure to unregister any
>> registered DLLs of theirs when the package is unloaded.  A developer
>> can do this by adding the following to their package:
>>
>> .onUnload <- function(libpath) {
>> library.dynam.unload(utils::packageName(), libpath)
>>  }
>>
>> That should be all - then the DLL will be unloaded as soon as the
>> package is unloaded.
>>
>> I would like to suggest that 'R CMD check' would include a check that
>> asserts when a package is unloaded it does not leave any registered
>> DLLs behind, e.g.
>>
>> * checking whether the namespace can be unloaded cleanly ... WARNING
>>   Unloading the namespace does not unload DLL
>> * checking loading without being on the library search path ... OK
>>
>> For further details on my thoughts on this, see
>> https://github.com/HenrikBengtsson/Wishlist-for-R/issues/29.
>>
>> Hope this helps
>>
>> Henrik
>>
>> On Tue, Sep 13, 2016 at 6:05 AM, Alexander Shenkin 
>wrote:
>>> Hello all,
>>>
>>> I have a number of analyses that call bunches of sub-scripts, and in
>the
>>> end, I get the "maximal number of DLLs reached" error.  This has
>been asked
>>> before (e.g.
>>>
>http://stackoverflow.com/questions/36974206/r-maximal-number-of-dlls-reached),
>>> and the general answer is, "just clean up after yourself".
>>>
>>> Assuming there are no plans to raise this 100-DLL limit in the near
>future,
>>> my question becomes, what is best practice for cleaning up
>(detaching)
>>> loaded packages in scripts, when those scripts are sometimes called
>from
>>> other scripts?  One can detach all packages at the end of a script
>that were
>>> loaded at the beginning of the script.  However, if a package is
>required in
>>> a calling script, one should really make sure it hadn't been loaded
>prior to
>>> sub-script invocation before detaching it.
>>>
>>> I could write a custom function that pushes and pops package names
>from a
>>> global list, in order to keep track, but maybe there's a better way
>out
>>> there...
>>>
>>> Thanks for any thoughts.
>>>
>>> Allie
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide

Re: [R-es] de pdf a csv

2016-09-14 Thread Mauricio Monsalvo
Hello.
This post may also be useful, at least as an example:
https://gist.github.com/sdgilley/15ebf67c5b01d12224f4b103c7065625 and it includes
the .pdf file it uses for the download, so the complete code can be followed.
It is also based on pdftools.
Regards
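
A minimal sketch of the pdftools route mentioned above (the file name is a
placeholder, and real reports usually need more clean-up of the extracted lines):

  library(pdftools)
  txt   <- pdf_text("reporte_semanal.pdf")        # one character string per page
  lines <- unlist(strsplit(txt, "\n"))            # split the pages into lines
  datos <- lines[grepl("^[0-9]", trimws(lines))]  # keep only lines that look like data rows
  writeLines(datos, "reporte_semanal.txt")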

On 12 September 2016 at 9:15, Carlos Ortega 
wrote:

> Hello,
>
> Another option mentioned "offline" has been:
>
> https://cloud.r-project.org/web/packages/pdftables/index.html
>
> which allows connecting "R" to the online service offered by
> https://pdftables.com.
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
> On 12 September 2016 at 14:12, Isidro Hidalgo Arellano <
> ihida...@jccm.es> wrote:
>
> > Let's see… I have used the "tm" package, specifically the "readPDF"
> > function.
> >
> > It is not an easy task, and not because of the package you use, but because
> > of the internal encoding of a "PDF" document itself: columns and rows will
> > jump around in the tables, so you need a lot of patience and must account
> > for every case.
> >
> > At the risk of speaking out of turn, check the data loaded from a "PDF"
> > very carefully before doing anything with it…
> >
> > Patience… and good luck!
> >
> >
> >
> >
> >
> > Isidro Hidalgo Arellano
> >
> > Observatorio del Mercado de Trabajo
> >
> > Consejería de Economía, Empresas y Empleo
> >
> >   http://www.castillalamancha.es/
> >
> >
> >
> >
> >
> >
> >
> > From: R-help-es [mailto:r-help-es-boun...@r-project.org] On behalf of Dr.
> > José
> > A. Betancourt Bethencourt
> > Sent: Sunday, 11 September 2016 0:31
> > To: r-help-es@r-project.org
> > Subject: [R-es] de pdf a csv
> >
> >
> >
> > Dear all,
> >
> >
> >
> > Sometimes there is epidemiological information in weekly pdf reports,
> > like the one attached, that we would like to convert to csv or txt USING R
> > so we can analyse it statistically. We would appreciate your help if you
> > could give us a script; the pdftable package did not work for me.
> >
> > Regards
> >
> > José
> >
> >
> > [[alternative HTML version deleted]]
> >
> >
> > ___
> > R-help-es mailing list
> > R-help-es@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-help-es
> >
>
>
>
> --
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
> [[alternative HTML version deleted]]
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es
>



-- 
Mauricio

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es

Re: [R-es] Saltar filas no numericas al importar csv

2016-09-14 Thread javier.ruben.marcuzzi
Dear Jesús,

Try:

Números <-  as.numeric(data.frame$dondeEstanLosNumeros)

Javier Rubén Marcuzzi

From: Jesús Para Fernández
[[alternative HTML version deleted]]
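
A minimal sketch along those lines (file and column names are made up): coerce
the column with as.numeric() and drop the rows that fail to convert.

  datos <- read.csv("archivo.csv", stringsAsFactors = FALSE)
  num   <- suppressWarnings(as.numeric(datos$valor))  # non-numeric entries become NA
  datos_limpios <- datos[!is.na(num), ]                # drop the rows that did not convert
  datos_limpios$valor <- num[!is.na(num)]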

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] How to read a grib2 file

2016-09-14 Thread Michael Sumner
On Wed, 14 Sep 2016 at 00:49 Debasish Pai Mazumder 
wrote:

> Hi Mike,
> Thanks again. I am using Mac OS
>
> Here is the required info
>
> > sessionInfo()
> R version 3.2.4 (2016-03-10)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.11.6 (El Capitan)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> other attached packages:
> [1] rNOMADS_2.3.0 rvest_0.3.2   xml2_1.0.0rgdal_1.1-10  raster_2.5-8
> sp_1.2-3
>
> loaded via a namespace (and not attached):
> [1] httr_1.1.0  magrittr_1.5R6_2.1.2rsconnect_0.4.3
> tools_3.2.4
> [6] Rcpp_0.12.4 grid_3.2.4  lattice_0.20-33
>
> I am trying to read " tmax.01.2011040100.daily.grb2" from
> http://nomads.ncdc.noaa.gov/modeldata/cfsv2_forecast_ts_9mon/2011/201104/20110401/2011040100/
>
>

I was a bit surprised, but this does work on Windows - so at the very least
you can run it there with the standard CRAN R and rgdal+raster. Here's the
R code I used, and the resulting session info. (It also works on Debian,
but I'll assume you don't want those details).

f <- "
http://nomads.ncdc.noaa.gov/modeldata/cfsv2_forecast_ts_9mon/2011/201104/20110401/2011040100/tmax.01.2011040100.daily.grb2
"
download.file(f, basename(f), mode = "wb")
library(raster)
raster(basename(f))
# class   : RasterLayer
# band: 1  (of  1224  bands)
# dimensions  : 190, 384, 72960  (nrow, ncol, ncell)
# resolution  : 0.9374987, 0.9473684  (x, y)
# extent  : -0.4687493, 359.5307, -90.24932, 89.75068  (xmin, xmax,
ymin, ymax)
# coord. ref. : +proj=longlat +a=6371229 +b=6371229 +no_defs
# data source : tmax.01.2011040100.daily.grb2
# names   : tmax.01.2011040100.daily

This at least gives you an easy pathway if you can get a Windows machine. I
have nearly no experience with Mac, so you'll have to pursue use of rgdal
in that OS if you really need it.

HTH

R version 3.3.1 Patched (2016-09-09 r71227)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] raster_2.5-8 sp_1.2-4

loaded via a namespace (and not attached):
[1] rgdal_1.1-10Rcpp_0.12.6 grid_3.3.1  lattice_0.20-33






> with regards
> -Deb
>
>
> On Tue, Sep 13, 2016 at 5:26 AM, Michael Sumner 
> wrote:
>
>> What is your computer system? What is the output of this?
>>
>> sessionInfo()
>>
>> If you point to a file I'll try it so I can tell you the minimum system
>> requirements.
>>
>> Cheers, Mike
>>
>> On Tue, 13 Sep 2016, 08:56 Debasish Pai Mazumder 
>> wrote:
>>
>>> Thanks for your suggestion. I have checked and I don't have JPEG2000 in
>>> ggdalDrivers()). I am pretty new in R. I don't understand how to do I
>>> implement JPEG2000 (JP2OpenJPEG driver) in gdalDrivers() so that I can
>>> read grib2 files
>>>
>>> with regards
>>> -Deb
>>>
>>> On Sat, Sep 10, 2016 at 2:33 AM, Michael Sumner 
>>> wrote:
>>>


 On Sat, 10 Sep 2016 at 07:12 Debasish Pai Mazumder 
 wrote:

> Hi
> I am trying to read a grib2 file in R.
>
> Here is my script
>
> library(rgdal)
> library(sp)
> library(rNOMADS)
> gribfile<-"tmax.01.2011040100.daily.grb2"
> grib <- readGDAL(gribfile)
>
> I am getting following error :
>
> dec_jpeg2000: Unable to open JPEG2000 image within GRIB file.
> Is the JPEG2000 driver available?tmax.01.2011040100.daily.grb2 has GDAL
> driver GRIB
> and has 190 rows and 384 columns
> dec_jpeg2000: Unable to open JPEG2000 image within GRIB file.
> Is the JPEG2000 driver available?dec_jpeg2000: Unable to open JPEG2000
> image within GRIB file.
>
>
 Hi there, please check if JPEG2000 is in the gdalDrivers() list, i.e.
  in

 rgdal::gdalDrivers()$name

 You are looking for one starting with "JP2" as per the list next to the
 "JPEG2000" rows here:

 http://gdal.org/formats_list.html
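
 For example, a quick check (assuming rgdal is installed):

 grep("^JP2", rgdal::gdalDrivers()$name, value = TRUE)
 # character(0) means this build has no JPEG2000-capable driver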

 I have  JP2OpenJPEG on one system, but not (for example) on the
 Windows CRAN binary for rgdal, which is the only readily available Windows
 build for this package.

 If you don't have it, you might try on a system that has the JP2OpenJPEG 
 driver,
 or ask someone to try on your behalf. You'd want to find out if that will
 enable this read for you before investing time in the Linux configuration.

 It's not too hard to set up a Linux system for this, but does assume a
 bit of experience on your part. Some of the docker images in the
 rockerverse have 

[R] Course: Data exploration, regression, GLM & GAM with introduction to R

2016-09-14 Thread Highland Statistics Ltd


We would like to announce the following statistics course:

Course: Data exploration, regression, GLM & GAM with introduction to R

Where:  Lisbon, Portugal

When:   13-17 February 2017

Course website: http://www.highstat.com/statscourse.htm

Course flyer: http://highstat.com/Courses/Flyers/Flyer2017_02Lisbon_RGG.pdf


Kind regards,

Alain Zuur


Other open courses in 2017:

9-13 January 2017: Data exploration, regression, GLM & GAM

20-24 February 2017: Introduction to Regression Models with Spatial and 
Temporal Correlation


9-13 October 2017: Linear Mixed Effects Models and GLMM with R. 
Frequentist and Bayesian approaches




--
Dr. Alain F. Zuur

First author of:
1. Beginner's Guide to GAMM with R (2014).
2. Beginner's Guide to GLM and GLMM with R (2013).
3. Beginner's Guide to GAM with R (2012).
4. Zero Inflated Models and GLMM with R (2012).
5. A Beginner's Guide to R (2009).
6. Mixed effects models and extensions in ecology with R (2009).
7. Analysing Ecological Data (2007).

Highland Statistics Ltd.
9 St Clair Wynd
UK - AB41 6DZ Newburgh
Tel:   0044 1358 788177
Email: highs...@highstat.com
URL:   www.highstat.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Linear Regressions with constraint coefficients

2016-09-14 Thread Aleksandrovic, Aljosa (Pfaeffikon)
Hi all,

I'm using nnls() to run multi-factor regressions with a non-negativity 
constraint on all the coefficients. It works well, but unfortunately the nnls() 
function only returns the parameter estimates, the residual sum-of-squares, the 
residuals (that is response minus fitted values) and the fitted values.

Furthermore, does somebody know how I can get the below outputs using nnls()?

- Coefficient Std. Errors
- t values
- p values

Thanks a lot for your help!
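
One pragmatic route, offered only as a sketch (nnls itself provides no
inference, and inference for estimates sitting on the non-negativity boundary
is delicate): bootstrap the fit and use the spread of the resampled
coefficients as rough standard errors. A and y below are placeholders for the
design matrix and response.

  library(nnls)
  fit <- nnls(A, y)
  fit$x                                       # the constrained parameter estimates
  B <- 500
  boot <- replicate(B, {
    idx <- sample(nrow(A), replace = TRUE)
    nnls(A[idx, , drop = FALSE], y[idx])$x
  })
  se_boot <- apply(boot, 1, sd)               # rough per-coefficient bootstrap SEs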

Kind regards,
Aljosa



Aljosa Aleksandrovic, FRM, CAIA
Senior Quantitative Analyst - Convertibles
aljosa.aleksandro...@man.com
Tel +41 55 417 76 03

Man Investments (CH) AG
Huobstrasse 3 | 8808 Pfäffikon SZ | Switzerland

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Aleksandrovic, 
Aljosa (Pfaeffikon)
Sent: Donnerstag, 28. April 2016 15:06
To: Gabor Grothendieck
Cc: r-help@r-project.org
Subject: Re: [R] Linear Regressions with constraint coefficients

Thx a lot Gabor!

Aljosa Aleksandrovic, FRM, CAIA
Quantitative Analyst - Convertibles
aljosa.aleksandro...@man.com
Tel +41 55 417 76 03

Man Investments (CH) AG
Huobstrasse 3 | 8808 Pfäffikon SZ | Switzerland


-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
Sent: Donnerstag, 28. April 2016 14:48
To: Aleksandrovic, Aljosa (Pfaeffikon)
Cc: r-help@r-project.org
Subject: Re: [R] Linear Regressions with constraint coefficients

The nls2 package can be used to get starting values.

On Thu, Apr 28, 2016 at 8:42 AM, Aleksandrovic, Aljosa (Pfaeffikon) 
 wrote:
> Hi Gabor,
>
> Thanks a lot for your help!
>
> I tried to implement your nonlinear least squares solver on my data set. I 
> was just wondering about the argument start. If I would like to force all my 
> coefficients to be inside an interval, let’s say, between 0 and 1, what kind 
> of starting values are normally recommended for the start argument (e.g. 
> Using a 4 factor model with b1, b2, b3 and b4, I tried start = list(b1 = 0.5, 
> b2 = 0.5, b3 = 0.5, b4 = 0.5))? I also tried other starting values ... Hence, 
> the outputs are very sensitive to that start argument?
>
> Thanks a lot for your answer in advance!
>
> Kind regards,
> Aljosa
>
>
>
> Aljosa Aleksandrovic, FRM, CAIA
> Quantitative Analyst - Convertibles
> aljosa.aleksandro...@man.com
> Tel +41 55 417 76 03
>
> Man Investments (CH) AG
> Huobstrasse 3 | 8808 Pfäffikon SZ | Switzerland
>
> -Original Message-
> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
> Sent: Dienstag, 26. April 2016 17:59
> To: Aleksandrovic, Aljosa (Pfaeffikon)
> Cc: r-help@r-project.org
> Subject: Re: [R] Linear Regressions with constraint coefficients
>
> This is a quadratic programming problem that you can solve using 
> either a quadratic programming solver with constraints or a general 
> nonlinear solver with constraints.  See 
> https://cran.r-project.org/web/views/Optimization.html
> for more info on what is available.
>
> Here is an example using a nonlinear least squares solver and non-negative 
> bound constraints. The constraint that the coefficients sum to 1 is implied 
> by dividing them by their sum and then dividing the coefficients found by 
> their sum at the end:
>
> # test data
> set.seed(123)
> n <- 1000
> X1 <- rnorm(n)
> X2 <- rnorm(n)
> X3 <- rnorm(n)
> Y <- .2 * X1 + .3 * X2 + .5 * X3 + rnorm(n)
>
> # fit
> library(nlmrt)
> fm <- nlxb(Y ~ (b1 * X1 + b2 * X2 + b3 * X3)/(b1 + b2 + b3),
>  data = list(Y = Y, X1 = X1, X2 = X2, X3 = X3),
>  lower = numeric(3),
>  start = list(b1 = 1, b2 = 2, b3 = 3))
>
> giving the following non-negative coefficients which sum to 1 that are 
> reasonably close to the true values of 0.2, 0.3 and 0.5:
>
>> fm$coefficients / sum(fm$coefficients)
>  b1  b2  b3
> 0.18463 0.27887 0.53650
>
>
> On Tue, Apr 26, 2016 at 8:39 AM, Aleksandrovic, Aljosa (Pfaeffikon) 
>  wrote:
>> Hi all,
>>
>> I hope you are doing well?
>>
>> I’m currently using the lm() function from the package stats to fit linear 
>> multifactor regressions.
>>
>> Unfortunately, I didn’t yet find a way to fit linear multifactor regressions 
>> with constraint coefficients? I would like the slope coefficients to be all 
>> inside an interval, let’s say, between 0 and 1. Further, if possible, the 
>> slope coefficients should add up to 1.
>>
>> Is there an elegant and not too complicated way to do such a constraint 
>> regression estimation in R?
>>
>> I would very much appreciate if you could help me with my issue?
>>
>> Thanks a lot in advance and kind regards, Aljosa Aleksandrovic
>>
>>
>>
>> Aljosa Aleksandrovic, FRM, CAIA
>> Quantitative Analyst - Convertibles
>> aljosa.aleksandro...@man.com
>> Tel +41 55 417 7603
>>
>> Man Investments (CH) AG
>> Huobstrasse 3 | 8808 Pfäffikon SZ | Switzerland
>>
>>
>> -Original Message-
>> From: Kevin E. Thorpe [mailto:kevin.tho...@utoronto.ca]
>> 

Re: [R] Maximum # of DLLs reached, or, how to clean up after yourself?

2016-09-14 Thread Alexander Shenkin

Hi Henrik,

Thanks for your reply.  I didn't realize that floating DLLs were an 
issue (good to know).  My query is actually a bit more basic.  I'm 
actually wondering how folks manage their loading and unloading of 
packages when calling scripts within scripts.


Quick example:
Script1:
library(package1)
source("script2.r")
# do stuff reliant on package1
detach("package:package1", unload=TRUE)

Script2:
library(package1)
library(package2)
# do stuff reliant on package1 and package2
detach("package:package1", unload=TRUE)
detach("package:package2", unload=TRUE)

Script2 breaks Script1 by unloading package1 (though unloading package2 
is ok).  I will have to test whether each package is loaded in Script2 
before loading it, and use that list when unloading at the end of the 
Script2.  *Unless there's a better way to do it* (which is my question - 
is there?).  I'm possibly just pushing the whole procedural scripting 
thing too far, but I also think that this likely isn't uncommon in R.


Any thoughts greatly appreciated!

Thanks,
Allie
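
One pattern that avoids a hand-maintained list, offered only as a sketch:
snapshot search() at the top of the sub-script and detach only what the
sub-script itself attached.

# --- top of script2.r ---
.attached_before <- search()
library(package1)
library(package2)
# ... do stuff reliant on package1 and package2 ...
# --- bottom of script2.r ---
.newly_attached <- setdiff(search(), .attached_before)
# detach (and try to unload) only what this script attached itself;
# unload = TRUE can still fail if another loaded package needs the namespace
for (pkg in .newly_attached) detach(pkg, character.only = TRUE, unload = TRUE)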

On 9/13/2016 7:23 PM, Henrik Bengtsson wrote:

In R.utils (>= 2.4.0), which I hope to submitted to CRAN today or
tomorrow, you can simply call:

   R.utils::gcDLLs()

It will look at base::getLoadedDLLs() and its content and compare to
loadedNamespaces() and unregister any "stray" DLLs that remain after
corresponding packages have been unloaded.

Until the new version is on CRAN, you can install it via

source("http://callr.org/install#HenrikBengtsson/R.utils@develop;)

or alternatively just source() the source file:


source("https://raw.githubusercontent.com/HenrikBengtsson/R.utils/develop/R/gcDLLs.R;)


DISCUSSION:
(this might be better suited for R-devel; feel free to move it there)

As far as I understand the problem, running into this error / limit is
_not_ the fault of the user.  Instead, I'd argue that it is the
responsibility of package developers to make sure to unregister any
registered DLLs of theirs when the package is unloaded.  A developer
can do this by adding the following to their package:

.onUnload <- function(libpath) {
library.dynam.unload(utils::packageName(), libpath)
 }

That should be all - then the DLL will be unloaded as soon as the
package is unloaded.

I would like to suggest that 'R CMD check' would include a check that
asserts when a package is unloaded it does not leave any registered
DLLs behind, e.g.

* checking whether the namespace can be unloaded cleanly ... WARNING
  Unloading the namespace does not unload DLL
* checking loading without being on the library search path ... OK

For further details on my thoughts on this, see
https://github.com/HenrikBengtsson/Wishlist-for-R/issues/29.

Hope this helps

Henrik

On Tue, Sep 13, 2016 at 6:05 AM, Alexander Shenkin  wrote:

Hello all,

I have a number of analyses that call bunches of sub-scripts, and in the
end, I get the "maximal number of DLLs reached" error.  This has been asked
before (e.g.
http://stackoverflow.com/questions/36974206/r-maximal-number-of-dlls-reached),
and the general answer is, "just clean up after yourself".

Assuming there are no plans to raise this 100-DLL limit in the near future,
my question becomes, what is best practice for cleaning up (detaching)
loaded packages in scripts, when those scripts are sometimes called from
other scripts?  One can detach all packages at the end of a script that were
loaded at the beginning of the script.  However, if a package is required in
a calling script, one should really make sure it hadn't been loaded prior to
sub-script invocation before detaching it.

I could write a custom function that pushes and pops package names from a
global list, in order to keep track, but maybe there's a better way out
there...

Thanks for any thoughts.

Allie

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R-es] t-test y distribución de variables

2016-09-14 Thread Olivier Nuñez

I understand that you want to regularize your data by aggregating them.
The Shapiro test rejects the normality of a uniform sample

> set.seed(1234)
> d=runif(100, min = 2, max = 4)
> shapiro.test(d)

Shapiro-Wilk normality test

data:  d
W = 0.94504, p-value = 0.0003966

But it accepts the normality of means of 5 uniform values:
> i=replicate(100,mean(sample(d,5)))
> shapiro.test(i)

Shapiro-Wilk normality test

data:  i
W = 0.98641, p-value = 0.3991

That said, in any case, you should test the normality of your "means" within
each group and update the degrees of freedom of your t-test based on means of
means. But my advice is to opt for a more orthodox solution, such as a trimmed
mean or a non-parametric test.
Best regards. Olivier
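
As a sketch of those alternatives (x and y standing for the measurements in the
two groups):

wilcox.test(x, y)                         # non-parametric rank-sum comparison
t.test(x, y)                              # Welch t-test (R's default)
mean(x, trim = 0.1); mean(y, trim = 0.1)  # 10% trimmed means, for description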


- Original Message -
From: "JM Arbones" 
To: r-help-es@r-project.org
Sent: Tuesday, 13 September 2016 16:13:08
Subject: [R-es] t-test y distribución de variables

Hello,
I am analysing some data for a doctoral thesis.
During the study, several clinical variables were collected for two groups 
(n=30 and n=40). I find that the comparisons of means were carried out with 
t-tests without bothering to examine the distribution of the variables. When 
checking whether the variables fit a normal distribution, although the QQ plots 
do not look "bad", the distributions of the variables depart from normality 
(Shapiro-Wilk < 0.01).
Although other methods (non-parametric, permutations, etc.) could be used for 
the comparisons, the PhD student insists on using t-tests (I imagine so as not 
to redo all the results).


I understand that the t-test is a fairly robust procedure and can tolerate 
departures from normality; I also understand that the criterion for applying 
this test is that the distribution of the means (not of the variables) be 
normal. It occurred to me that if I resample (with d being the variable under 
study)

  i <- numeric(500)
  for (n in 1:500){
   i[n] <- mean(sample(d, 20))}

and justify that the distribution of the means follows a normal distribution

shapiro.test(i)

I could say that the t-tests (using the Welch correction just in case) are done 
"by the book".

I would like those of you who know about this to give me your opinion on this 
workaround.

Best regards and many thanks

Jose Miguel

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] gene name problem in Excel, and an R analogue

2016-09-14 Thread Jeff Newmiller
What, like the colClasses argument? Darn that ellipsis and its consequent 
deferred documentation... but it _is_ mentioned in passing in ?read.fwf.
-- 
Sent from my phone. Please excuse my brevity.
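
A minimal sketch of that colClasses route, reusing the example from the
original post:

ff2 <- tempfile()
cat(file = ff2, "12345", "1E002", "1A010", sep = "\n")
xdf2 <- read.fwf(ff2, widths = 5, colClasses = "character")
str(xdf2)   # V1 stays character even when every value happens to look numeric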

On September 13, 2016 10:54:44 PM PDT, Erich Neuwirth 
 wrote:
>Since many people commenting on the gene name problem in Excel
>essentially tell us
>This could never have happened with R
>I want to show you a somewhat related issue:
>
>
>ff1 <- tempfile()
>cat(file = ff1, "12345", "1E002", sep = "\n")
>xdf1 <- read.fwf(ff1, widths = 5, stringsAsFactors=FALSE)
>
>ff2 <- tempfile()
>cat(file = ff2, "12345", "1E002","1A010", sep = "\n")
>xdf2 <- read.fwf(ff2, widths = 5, stringsAsFactors=FALSE)
>
>in xdf1, the variable is numeric, in xdf2, it is a character variable.
>Of course, in hindsight this makes sense. But the problem is similar to
>the
>Excel problem where something which could be a date is interpreted as a
>date.
>
>A possible solution with my read.fwf problem would be to have a
>parameter
>forcing variables to be read as strings.
>
>
>
>
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Upgrade R 3.2 to 3.3 using tar.gz file on Ubuntu 16.04

2016-09-14 Thread Jeff Newmiller
No, the default Ubuntu version is behind but that link tells how to properly 
install the current version using the packaging system, which makes it more 
easily maintained than a custom compile would be.  (Thanks, Dirk, Michael and 
Vincent!) Anyway, still OT here.
-- 
Sent from my phone. Please excuse my brevity.

On September 13, 2016 10:57:52 PM PDT, Loris Bennett 
 wrote:
>Jeff Newmiller  writes:
>
>> For this query I would rather recommend [1] as reference, though
>> Marc's suggestion to switch mailing lists is best.
>>
>> [1] https://cran.r-project.org/bin/linux/ubuntu/
>
>But doesn't this merely describe how to install the version of R
>packaged for his version of Ubuntu?  As I understood the OP, he has
>already installed this version, which is 3.2.3, but would like to
>install 3.3.1, which he will have to install from the sources.
>
>If this is the case, then there is no point reposting to r-sig-debian,
>since he just has to do a generic unix-like install.
>
>Cheers,
>
>Loris
>
>-- 
>Dr. Loris Bennett (Mr.)
>ZEDAT, Freie Universität Berlin Email
>loris.benn...@fu-berlin.de
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Upgrade R 3.2 to 3.3 using tar.gz file on Ubuntu 16.04

2016-09-14 Thread Loris Bennett
Jeff Newmiller  writes:

> For this query I would rather recommend [1] as reference, though
> Marc's suggestion to switch mailing lists is best.
>
> [1] https://cran.r-project.org/bin/linux/ubuntu/

But doesn't this merely describe how to install the version of R
packaged for his version of Ubuntu?  As I understood the OP, he has
already installed this version, which is 3.2.3, but would like to
install 3.3.1, which he will have to install from the sources.

If this is the case, then there is no point reposting to r-sig-debian,
since he just has to do a generic unix-like install.

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] gene name problem in Excel, and an R analogue

2016-09-14 Thread Erich Neuwirth
Since many people commenting on the gene name problem in Excel essentially tell 
us "This could never have happened with R",
I want to show you a somewhat related issue:


ff1 <- tempfile()
cat(file = ff1, "12345", "1E002", sep = "\n")
xdf1 <- read.fwf(ff1, widths = 5, stringsAsFactors=FALSE)

ff2 <- tempfile()
cat(file = ff2, "12345", "1E002","1A010", sep = "\n")
xdf2 <- read.fwf(ff2, widths = 5, stringsAsFactors=FALSE)

in xdf1, the variable is numeric, in xdf2, it is a character variable.
Of course, in hindsight this makes sense. But the problem is similar to the
Excel problem where something which could be a date is interpreted as a date.

A possible solution with my read.fwf problem would be to have a parameter
forcing variables to be read as strings.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.