date:20160418

Re: [R] Sum of Numeric Values in a DF Column

2016-04-18 Thread Burhan ul haq

Dear Gunter /  Heiberger,

Thanks for the help. This is what I was looking for:

> ... and here is a non-dplyr rsolution:
>
>> z <-gsub("[^[:digit:]]"," ",dd$Lower)
>
>> sapply(strsplit(z," +"),function(x)sum(as.numeric(x),na.rm=TRUE))
> [1] 105  67  60 100  80

That also explains why "unlist" could not be used: I did not want a grand
total, but rather a sum for each of the rows.
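
To make the per-row versus grand-total distinction concrete, here is a minimal base R sketch. It assumes ASCII hyphens in place of the en dashes in the real data, and the names lower, nums, per_row and grand are only illustrative:

```r
# Three sample strings modelled on the "Lower" column (hyphens are an assumption)
lower <- c("R 72-33", "R/Coalition 27(23 R, 4 D)-12 D, 1 Ind.", "R 36-24")

# One character vector of digit runs per row
nums <- regmatches(lower, gregexpr("[[:digit:]]+", lower))

# Sum within each list element: one total per row
per_row <- sapply(nums, function(x) sum(as.numeric(x)))

# unlist() flattens the list first, so only a single grand total is left
grand <- sum(as.numeric(unlist(nums)))

per_row  # 105 67 60
grand    # 232
```

Keeping the list structure until after the summing step is what preserves the row boundaries.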


Br /

On Mon, Apr 18, 2016 at 10:57 PM, Bert Gunter 
wrote:

> ... and a slightly more efficient non-dplyr 1-liner:
>
> > sapply(strsplit(dd$Lower,"[^[:digit:]]"),
> function(x)sum(as.numeric(x), na.rm=TRUE))
>
> [1] 105  67  60 100  80
>
> Cheers,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Apr 18, 2016 at 10:43 AM, Bert Gunter 
> wrote:
> > ... and here is a non-dplyr rsolution:
> >
> >> z <-gsub("[^[:digit:]]"," ",dd$Lower)
> >
> >> sapply(strsplit(z," +"),function(x)sum(as.numeric(x),na.rm=TRUE))
> > [1] 105  67  60 100  80
> >
> >
> > Cheers,
> > Bert
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Mon, Apr 18, 2016 at 10:07 AM, Richard M. Heiberger 
> wrote:
> >> ## Continuing with your data
> >>
> >> AA <- stringr::str_extract_all(dd[[2]],"[[:digit:]]+")
> >> BB <- lapply(AA, as.numeric)
> >> ## I think you are looking for one of the following two expressions
> >> sum(unlist(BB))
> >> sapply(BB, sum)
> >>
> >>
> >> On Mon, Apr 18, 2016 at 12:48 PM, Burhan ul haq 
> wrote:
> >>> Hi,
> >>>
> >>> I request help with the following:
> >>>
> >>> INPUT: A data frame where column "Lower" is a character containing
> numeric
> >>> values (different count or occurrences of numeric values in each row,
> >>> mostly 2)
> >>>
> >>> > dput(dd)
> >>> structure(list(State = c("Alabama", "Alaska", "Arizona", "Arkansas",
> >>> "California"), Lower = c("R 72–33",
> >>> "R/Coalition 27(23 R, 4 D)–12 D, 1 Ind.",
> >>> "R 36–24", "R 64–35, 1 Ind.", "D 52–28"), Upper = c("R 26–8, 1 Ind.",
> >>> "R/Coalition 15(14 R, 1 D)–5 D", "R 18–12", "R 24–11", "D 26–14"
> >>> )), .Names = c("State", "Lower", "Upper"), row.names = c(NA,
> >>> 5L), class = "data.frame")
> >>>
> >>> PROBLEM: Need to extract all numeric values and sum them. There are few
> >>> exceptions like row2. But these can be ignored and will be fixed
> manually
> >>>
> >>> SOLUTION SO FAR:
> >>> str_extract_all(dd[[2]],"[[:digit:]]+"), returns a list of numbers as
> >>> character. I am unable to unlist it, because it mixes them all
> together, ...
> >>>
> >>> And if I may add, is there a "dplyr" way of doing it ...
> >>>
> >>>
> >>> Thanks
> >>>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Interquartile Range

2016-04-18 Thread Jim Lemon

Hi Michael,
At a guess, try this:

iqr<-function(x) {
 return(paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-"))
}

.col3_Range=iqr(datat$tenure)
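
A minimal sketch of the same idea in base R, in case it helps to see the grouping spelled out. The data frame df, its columns and the helper name iqr_label are made up for illustration. Note that inside summarise the column is normally referred to by its bare name (tenure); writing data$tenure there would compute the quartiles of the whole column for every group.

```r
# Helper returning "Q1-Q3" as a single string (iqr_label is a hypothetical name)
iqr_label <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  paste(round(q[1]), round(q[2]), sep = "-")
}

# Toy data: two groups with five tenure values each
df <- data.frame(group  = rep(c("a", "b"), each = 5),
                 tenure = c(1:5, 10, 20, 30, 40, 50))

# Apply the helper within each group
res <- tapply(df$tenure, df$group, iqr_label)
res[["a"]]  # "2-4"
res[["b"]]  # "20-40"
```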

Jim



On Tue, Apr 19, 2016 at 11:15 AM, Michael Artz  wrote:
> Hi,
>   I am trying to show an interquartile range while grouping values using
> the function ddply().  So my function call now is like
>
> groupedAll <- ddply(data
>  ,~groupColumn
>  ,summarise
>  ,col1_mean=mean(col1)
>  ,col2_mode=Mode(col2) #Function I wrote for getting the mode, shown below
>  ,col3_Range=paste(as.character(round(quantile(datat$tenure,c(.25)))),
>    as.character(round(quantile(data$tenure,c(.75)))), sep = "-")
>  )
>
> #custom Mode function
> Mode <- function(x) {
>   ux <- unique(x)
>   ux[which.max(tabulate(match(x, ux)))]
> }
>
> I am not sure what is going wrong with my interquartile range function; it
> works on its own outside of ddply()
>

Re: [R] problem on simulation code (the loop unable to function effectively)

2016-04-18 Thread Jim Lemon

Hi Si Jie,
Again, please send questions to the list, not me.

Okay, I may have worked out what you are doing. The program runs and
produces what I would expect in the rightmost columns of the result
"g".

You are storing the number of each test for which the p value is less
than 0.05. It looks to me as though the objects storing the results
should be vectors as you are only storing 100 p values at a time.
Using matrices in which only part of the values are used is a bit
confusing. Perhaps this resulted from my initial suggestion of using
matrices and storing all of the results before calculating the number
of p values less than 0.05, which you didn't do.

If I rewrite your code with vectors for storing the p values, I get
the same results. Essentially, with the scale parameter set the same
for both distributions, you get about five false positives (Type I
errors) per 100 simulations. This is expected. As the scale
parameters, and thus the means, of the groups diverge, you first get a
large number of results less than 0.05. With large differences between
the means, all results are less than 0.05, which seems correct to me.
You have probably been asked to interpret the effect of group size and
difference in means on the number of p values less than 0.05. If the
difference in means is large enough, you are almost guaranteed this.
In the cases where this is not occurring (i.e. the first difference in
scale parameters) you can see the difference in outcomes between the
three significance tests.
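
For what it is worth, the vector-based version I have in mind looks roughly like the sketch below. It is not your program: the function name count_rejections, the sample sizes and the two scenarios are only illustrative. It also passes scale= by name, because the third positional argument of rgamma() is the rate, not the scale.

```r
set.seed(3)
n_sims <- 100      # simulations per scenario
alpha  <- 0.05
shape  <- 16 / 9   # common shape parameter

# Count how many of the n_sims p values fall below alpha for each test
count_rejections <- function(m, n, scale1, scale2 = 1) {
  # plain vectors: one slot per simulation, overwritten for each scenario
  p_equal <- p_welch <- p_mann <- numeric(n_sims)
  for (sim in seq_len(n_sims)) {
    g1 <- rgamma(m, shape = shape, scale = scale1)
    g2 <- rgamma(n, shape = shape, scale = scale2)
    p_equal[sim] <- t.test(g1, g2, var.equal = TRUE)$p.value
    p_welch[sim] <- t.test(g1, g2, var.equal = FALSE)$p.value
    p_mann[sim]  <- wilcox.test(g1, g2)$p.value
  }
  c(equal = sum(p_equal < alpha),
    welch = sum(p_welch < alpha),
    mann  = sum(p_mann < alpha))
}

same_scale <- count_rejections(25, 25, scale1 = 1)  # null case: few rejections expected
far_apart  <- count_rejections(25, 25, scale1 = 3)  # means far apart: many rejections expected
```

Comparing same_scale with far_apart shows the pattern described above: a handful of false positives when the scales match, and nearly all p values below 0.05 once the means are far apart.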

Jim

On Tue, Apr 19, 2016 at 10:35 AM,   wrote:
> Hi, I am sorry, but that part is not part of the code.
> I was just listing my simulation variables at the beginning to make them
> easier to understand.
> I spent the whole night trying to find out where I went wrong, but I just
> keep producing the wrong output ...
>
>
>
>
>
>
>> Hi Jeem,
>> First, please send questions like this to the help list, not me.
>>
>> I assume that you are in a similar position to sjtan who has been
>> sending almost exactly the same questions.
>>
>> The problem is not in the loops (which look rather familiar to me) but
>> in your initial assignments at the top. For instance:
>>
>> scale parameter=(1,1.5,2,2.5,3)
>>
>> produces an error which has nothing to do with the loops. This is a
>> very basic mistake, for:
>>
>> scale_parameter<-c(1,1.5,2,2.5,3)
>>
>> fixes it. I think if you learn a bit about basic R coding you will be
>> able to fix these problems yourself.
>>
>> Jim
>>
>>
>> On Tue, Apr 19, 2016 at 4:03 AM,   wrote:
>>> Greetings Dr. Jim,
>>>
>>> I am a student from Malaysia doing an R simulation study. For your
>>> information, I have written code relating to two gamma distributions
>>> with equal skewness.
>>> skewness=1.0
>>> shape parameter=16/9
>>> scale parameter=(1,1.5,2,2.5,3)
>>>
>>> My code is below. It has some error, and after trying this and that
>>> for a whole day I could not spot the mistake. The output should be
>>> close to 0.05, slightly above or below, without deviating too much.
>>> But the for loop only works for scale parameter 1. I tried adding
>>> another for loop over the scale parameter, but the output got worse:
>>> the simulation even hangs, and all of the output is 100.
>>> Could you please give me some advice?
>>>
>>> #For gamma distribution with equal skewness 1.5
>>> rm(list=ls())
>>> nSims<-100
>>> alpha<-0.05
>>>
>>> #here we declare the random seed generator
>>> set.seed(3)
>>>
>>> ## Put the samples sizes into matrix then use a loop for sample sizes
>>> sample_sizes<-matrix(c(10,10,10,25,25,25,25,50,25,100,50,25,50,100,100,25,100,100),nrow=2)
>>>
>>> #shape parameter for both gamma distribution for equal skewness
>>> shp<-rep(16/9,each=45)
>>>
>>> #scale parameter for sample 1
>>> #scale parameter for sample 2 set as constant 1
>>> d1<-matrix(c(1,1.5,2,2.5,3),ncol=1)
>>> scp<-rep(d1,9)
>>>
>>> #create a matrix combining the forty five cases of combination of sample
>>> #sizes,shape and scale parameter
>>> all<- cbind(rep(sample_sizes[1,],5),rep(sample_sizes[2,],5),scp)
>>>
>>> # name the column samples 1 and 2 and standard deviation
>>> colnames(all) <- c("m", "n","scp")
>>>
>>>
>>> #set empty vector of length to store p-value
>>> equal3<-rep(0,nrow(all))
>>> unequal4<-rep(0,nrow(all))
>>> mann5<-rep(0,nrow(all))
>>>
>>> #set nrow =nSims because we want to store every p-value simulated
>>> #for gamma distribution with equal skewness
>>> matrix3_equal  <-matrix(0,nrow=nSims,ncol=3)
>>> matrix4_unequal<-matrix(0,nrow=nSims,ncol=3)
>>> matrix5_mann   <-matrix(0,nrow=nSims,ncol=3)
>>>
>>>
>>> # this loop steps through the all_combine matrix
>>>  for(ss in 1:nrow(all))
>>>
>>>   {  #generate samples from the first column and second column
>>>  m<-all[ss,1]
>>>  n<-all[ss,2]
>>>
>>>   for ( sim in 1:nSims)
>>>  {
>>> #generate 2 random samples from gamma distribution

[R] Interquartile Range

2016-04-18 Thread Michael Artz

Hi,
  I am trying to show an interquartile range while grouping values using
the function ddply().  So my function call now is like

groupedAll <- ddply(data
 ,~groupColumn
 ,summarise
 ,col1_mean=mean(col1)
 ,col2_mode=Mode(col2) #Function I wrote for getting the mode, shown below
 ,col3_Range=paste(as.character(round(quantile(datat$tenure,c(.25)))),
   as.character(round(quantile(data$tenure,c(.75)))), sep = "-")
 )

#custom Mode function
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

I am not sure what is going wrong with my interquartile range function; it
works on its own outside of ddply()


Re: [R] problem on simulation code (the loop unable to function effectively)

2016-04-18 Thread Jim Lemon

Hi Jeem,
First, please send questions like this to the help list, not me.

I assume that you are in a similar position to sjtan who has been
sending almost exactly the same questions.

The problem is not in the loops (which look rather familiar to me) but
in your initial assignments at the top. For instance:

scale parameter=(1,1.5,2,2.5,3)

produces an error which has nothing to do with the loops. This is a
very basic mistake, for:

scale_parameter<-c(1,1.5,2,2.5,3)

fixes it. I think if you learn a bit about basic R coding you will be
able to fix these problems yourself.
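
A small sketch of the difference, in case it is useful. The object name bad is only illustrative, and parse() is used here just to show the syntax error without stopping the script:

```r
# A name cannot contain a space, and a vector is built with c()
scale_parameter <- c(1, 1.5, 2, 2.5, 3)

# The original line is not valid R syntax, so parsing it fails
bad <- try(parse(text = "scale parameter=(1,1.5,2,2.5,3)"), silent = TRUE)
inherits(bad, "try-error")  # TRUE

length(scale_parameter)     # 5
```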

Jim


On Tue, Apr 19, 2016 at 4:03 AM,   wrote:
> Greetings Dr. Jim,
>
> I am a student from Malaysia doing an R simulation study. For your
> information, I have written code relating to two gamma distributions
> with equal skewness.
> skewness=1.0
> shape parameter=16/9
> scale parameter=(1,1.5,2,2.5,3)
>
> My code is below. It has some error, and after trying this and that
> for a whole day I could not spot the mistake. The output should be
> close to 0.05, slightly above or below, without deviating too much.
> But the for loop only works for scale parameter 1. I tried adding
> another for loop over the scale parameter, but the output got worse:
> the simulation even hangs, and all of the output is 100.
> Could you please give me some advice?
>
> #For gamma distribution with equal skewness 1.5
> rm(list=ls())
> nSims<-100
> alpha<-0.05
>
> #here we declare the random seed generator
> set.seed(3)
>
> ## Put the samples sizes into matrix then use a loop for sample sizes
> sample_sizes<-matrix(c(10,10,10,25,25,25,25,50,25,100,50,25,50,100,100,25,100,100),nrow=2)
>
> #shape parameter for both gamma distribution for equal skewness
> shp<-rep(16/9,each=45)
>
> #scale parameter for sample 1
> #scale parameter for sample 2 set as constant 1
> d1<-matrix(c(1,1.5,2,2.5,3),ncol=1)
> scp<-rep(d1,9)
>
> #create a matrix combining the forty five cases of combination of sample
> #sizes,shape and scale parameter
> all<- cbind(rep(sample_sizes[1,],5),rep(sample_sizes[2,],5),scp)
>
> # name the column samples 1 and 2 and standard deviation
> colnames(all) <- c("m", "n","scp")
>
>
> #set empty vector of length to store p-value
> equal3<-rep(0,nrow(all))
> unequal4<-rep(0,nrow(all))
> mann5<-rep(0,nrow(all))
>
> #set nrow =nSims because we want to store every p-value simulated
> #for gamma distribution with equal skewness
> matrix3_equal  <-matrix(0,nrow=nSims,ncol=3)
> matrix4_unequal<-matrix(0,nrow=nSims,ncol=3)
> matrix5_mann   <-matrix(0,nrow=nSims,ncol=3)
>
>
> # this loop steps through the all_combine matrix
>  for(ss in 1:nrow(all))
>
>   {  #generate samples from the first column and second column
>  m<-all[ss,1]
>  n<-all[ss,2]
>
>   for ( sim in 1:nSims)
>  {
> #generate 2 random samples from gamma distribution with equal skewness
> gamma1<-rgamma(m,16/9,all[ss,3])
> gamma2<-rgamma(n,16/9,1)
> #minus population mean from each sample to maintain the equality of null
> #hypotheses (population mean =scale parameter *shape parameter)
> gamma1<-gamma1-16/9*all[ss,3]
> gamma2<-gamma2-16/9
>
>  matrix3_equal[sim,1]<-t.test(gamma1,gamma2,var.equal=TRUE)$p.value
>  matrix4_unequal[sim,2]<-t.test(gamma1,gamma2,var.equal=FALSE)$p.value
>  matrix5_mann[sim,3] <-wilcox.test(gamma1,gamma2)$p.value
>
>   }
> ##store the result
> equal3[ss]<- sum(matrix3_equal[,1]<alpha)
> unequal4[ss]<-sum(matrix4_unequal[,2]<alpha)
> mann5[ss]<- sum(matrix5_mann[,3]<alpha)
> }
> g<-cbind(all, equal3, unequal4, mann5)
>
> I would really appreciate any response. Thanks.
>
> Yours sincerely,
> Jeem
>
>
>
>
> Computational Mathematics
> School of Informatics and Applied Mathematics
> Universiti Malaysia Terengganu
> 21030 Kuala Terengganu, Terengganu, Malaysia
> Phone : 017-7799039
> Email : uk31...@student.umt.edu.my
>


Re: [R] Indicator Species analysis; trouble with multipatt

2016-04-18 Thread Jim Lemon

Hi Ansley,
Without your data file (or a meaningful subset) we can only guess, but
you may be trying to define groups on the columns rather than the rows
of the data set. Usually rows represent cases and each case must have
a value for the grouping variable.
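
As a guess at what may be happening, here is a small sketch with made-up data (the object abund and its 30 x 90 size are purely hypothetical). Indexing a data frame with a single vector selects columns, so data[c(1:24, 1:81)] does not give 24 rows by 81 columns:

```r
set.seed(1)
# Hypothetical abundance table: 30 sites (rows) by 90 species (columns)
abund <- as.data.frame(matrix(rpois(30 * 90, 2), nrow = 30, ncol = 90))

cols_only <- abund[c(1:24, 1:81)]  # single index: picks 105 columns, keeps all 30 rows
rows_cols <- abund[1:24, 1:81]     # row index, column index: 24 rows by 81 columns

groups <- rep(1:4, 6)              # 24 group labels, one per site

dim(cols_only)                     # 30 105
dim(rows_cols)                     # 24  81
nrow(rows_cols) == length(groups)  # TRUE: one group label per row
```

If the number of rows passed to multipatt differs from the length of the grouping vector, a non-conformable error of this kind would not be surprising.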

Jim

On Tue, Apr 19, 2016 at 6:33 AM, Ansley Silva  wrote:
> Hello,
>
> *Error in tx  %*% comb : non-conformable arguments*
>
> Suggestions greatly appreciated.  I am a beginner and this is my first time
> posting.
>
> I would like to get the summary for indicator species analysis, using
> package indicspecies with multipatt.  I am getting errors, I believe, due to
> my data organization.  After reorganizing and reorganizing, nothing has
> helped.
>
>> data<- read.csv(file="Data1.csv", header=TRUE, row.names=1, sep=",")
>> ap<-data[c(1:24, 1:81)]
>> groups<-c(rep(1:4,6))
>> indval<- multipatt(ap, groups, control = how(nperm=999))
> *Error in tx  %*% comb : non-conformable arguments*
>
>
>
> --
> Ansley Silva
>
>
> *"The clearest way into the Universe is through a forest wilderness." John
> Muir*
>
>
> *Graduate Research Assistant*

Re: [R] as.Date: fixed

2016-04-18 Thread Ogbos Okike

Dear All,
Many thanks for bailing me out.
Ogbos
On Apr 18, 2016 9:07 PM, "David Winsemius"  wrote:

>
> > On Apr 18, 2016, at 10:44 AM, Ogbos Okike 
> wrote:
> >
> > Dear ALL,
> > Thank you so much for your contributions.
> > I have made some progress. Below is a simple script I gleaned from
> > your kind responses:
> > Sys.setenv(TZ="Etc/GMT")
> > dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
> > times <- c("23:0:0", "22:0:0", "01:00:00", "18:0:0", "16:0:0")
> > x <- paste(dates, times)
> > aa<-strptime(x, "%m/%d/%y %H:%M:%S")
> > bb<-1:5
> > plot(aa, bb)
> >
> > I tried plotting my result and I got what I am looking for. I think I
> > am almost there.
> >
> > I am, however, stuck here. My data is a large file and the form
> > differs a little from the example I used. The quotation marks in both
> > date and time is my headache now. Such inverted commas are not in my
> > data. I can with awk transform my data to get exactly something like
> > dd/mm/yy. But I wont know how to make the data appear in quotation
> > mark in R.
>
> There are not any quotation marks in an R object that is displayed as
> "02/27/92". The quotation marks are just added by the print function to
> make it clear to the user that it is a character value.
>
> If you read such values in with read.table they would automatically be
> interpreted as character values and then converted to factor class (which
> you do not want). Read up on the colClasses and stringsAsFactors arguments
> of the read.* functions to safely input character values.
> --
> David.
>
> > I will once more be glad for any more help.
> > Ogbos
> >
> > PS: I am still afraid of this forum. Please direct me to the right
> > forum if this is not ok. Thanks again.
> >
> >
> > On 4/18/16, peter dalgaard  wrote:
> >> The most important thing is that Date objects by definition do not include
> >> time of day. You want to look at ISOdatetime() and as.POSIXct() instead. And
> >> beware daylight savings time issues.
> >>
> >> -pd
> >>
> >> On 18 Apr 2016, at 15:09 , Ogbos Okike 
> wrote:
> >>
> >>> Dear All,
> >>>
> >>> I have a data set containing year, month, day and counts as shown
> below:
> >>> data <- read.table("data.txt", col.names = c("year", "month", "day",
> >>> "counts"))
> >>> Using the formula below, I converted the data to as date and plotted.
> >>>
> >>> new.century <- data$year < 70
> >>>
> >>> data$year <- ifelse(new.century, data$year + 2000, data$year + 1900)
> >>>
> >>> data$date <- as.Date(ISOdate(data$year, data$month, data$day))
> >>>
> >>> The form of the data is:
> >>> 16 1 19 9078
> >>> 16 1 20 9060
> >>> 16 1 21 9090
> >>> 16 1 22 9080
> >>> 16 1 23 9121
> >>> 16 1 24 9199
> >>> 16 1 25 9289
> >>> 16 1 26 9285
> >>> 16 1 27 9245
> >>> 16 1 28 9223
> >>> 16 1 29 9298
> >>> 16 1 30 9327
> >>> 16 1 31 9365
> >>>
> >>> Now, I wish to include time (hour) in my data. The new data is of the
> >>> form:
> >>> 05 01 06 14 3849
> >>> 05 01 06 15 3845
> >>> 05 01 06 16 3836
> >>> 05 01 06 17 3847
> >>> 05 01 06 18 3850
> >>> 05 01 06 19 3872
> >>> 05 01 06 20 3849
> >>> 05 01 06 21 3860
> >>> 05 01 06 22 3868
> >>> 05 01 06 23 3853
> >>> 05 01 07 00 3839
> >>> 05 01 07 01 3842
> >>> 05 01 07 02 3843
> >>> 05 01 07 03 3865
> >>> 05 01 07 04 3879
> >>> 05 01 07 05 3876
> >>> 05 01 07 06 3867
> >>> 05 01 07 07 3887
> >>>
> >>> I now read the data as:
> >>> data <- read.table("data.txt", col.names = c("year", "month", "day",
> >>> "counts", "hour")) and also included hour in data$date <-
> >>> as.Date(ISOdate(data$year, data$month, data$day))
> >>> i.e data$date <- as.Date(ISOdate(data$year, data$month, data$day,
> >>> data$hour)).
> >>>
> >>> However, these did not work.
> >>>
> >>> Can you please assist be on how to get this date and time in the right
> >>> format. The right format I got without hour looks like : 2005-12-29"
> >>> "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
> >>> [8696] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
> >>> [8701] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
> >>> [8706] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
> >>>
> >>> I used this in my plot. Please I want this format to include hour.
> >>>
> >>> Many thanks for your help. I am just a newbe. I am not sure if this
> >>> forum is the right one. After registration, I tried to post to Nabble
> >>> forum where I registered but could not succeed.
> >>>
> >>> If there is a mistake, please help/direct me to the right forum.
> >>>
> >>> Best regards
> >>> Ogbos
> >>>

[R] Indicator Species analysis; trouble with multipatt

2016-04-18 Thread Ansley Silva

Hello,

*Error in tx  %*% comb : non-conformable arguments*

Suggestions greatly appreciated.  I am a beginner and this is my first time
posting.

I would like to get the summary for indicator species analysis, using
package indicspecies with multipatt.  I am getting errors, I believe, due to
my data organization.  After reorganizing and reorganizing, nothing has
helped.

> data<- read.csv(file="Data1.csv", header=TRUE, row.names=1, sep=",")
> ap<-data[c(1:24, 1:81)]
> groups<-c(rep(1:4,6))
> indval<- multipatt(ap, groups, control = how(nperm=999))
*Error in tx  %*% comb : non-conformable arguments*



-- 
Ansley Silva


*"The clearest way into the Universe is through a forest wilderness." John
Muir*


*Graduate Research Assistant*

Re: [R] lists and rownames

2016-04-18 Thread jim holtman

You can always add those names to the list:  is this what you are after?

> example.names <- c("con1-1-masked-bottom-green.tsv", "con1-1-masked-bottom-red.tsv"
+ , "con1-1-masked-top-green.tsv","con1-1-masked-top-red.tsv")
> example.list <- strsplit(example.names, "-")
> names(example.list) <- example.names
> example.df <- as.data.frame(example.list)
>
> example.df
  con1.1.masked.bottom.green.tsv con1.1.masked.bottom.red.tsv
1                           con1                         con1
2                              1                            1
3                         masked                       masked
4                         bottom                       bottom
5                      green.tsv                      red.tsv
  con1.1.masked.top.green.tsv con1.1.masked.top.red.tsv
1                        con1                      con1
2                           1                         1
3                      masked                    masked
4                         top                       top
5                   green.tsv                   red.tsv
> str(example.df)
'data.frame':   5 obs. of  4 variables:
 $ con1.1.masked.bottom.green.tsv: Factor w/ 5 levels "1","bottom","con1",..: 3 1 5 2 4
 $ con1.1.masked.bottom.red.tsv  : Factor w/ 5 levels "1","bottom","con1",..: 3 1 4 2 5
 $ con1.1.masked.top.green.tsv   : Factor w/ 5 levels "1","con1","green.tsv",..: 2 1 4 5 3
 $ con1.1.masked.top.red.tsv     : Factor w/ 5 levels "1","con1","masked",..: 2 1 3 5 4



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Mon, Apr 18, 2016 at 4:21 PM, Ed Siefker  wrote:

> I'm doing some string manipulation on a vector of file names, and noticed
> something curious.  When I strsplit the vector, I get a list of
> character vectors.
> The list is numbered, as lists are.  When I cast that list as a data
> frame with 'as.data.frame()', the resulting columns have names derived
> from the original filenames.
>
> Example code is below.  My question is, where are these names stored
> in the list?  Are there methods that can access this from the list?
> Is there a way to preserve them verbatim?  Thanks
> -Ed
>
> > example.names
> [1] "con1-1-masked-bottom-green.tsv" "con1-1-masked-bottom-red.tsv"
> [3] "con1-1-masked-top-green.tsv"    "con1-1-masked-top-red.tsv"
> > example.list <- strsplit(example.names, "-")
> > example.list
> [[1]]
> [1] "con1"      "1"         "masked"    "bottom"    "green.tsv"
>
> [[2]]
> [1] "con1"    "1"       "masked"  "bottom"  "red.tsv"
>
> [[3]]
> [1] "con1"      "1"         "masked"    "top"       "green.tsv"
>
> [[4]]
> [1] "con1"    "1"       "masked"  "top"     "red.tsv"
>
> > example.df <- as.data.frame(example.list)
> > example.df
>   c..con11maskedbottomgreen.tsv..
> 1                            con1
> 2                               1
> 3                          masked
> 4                          bottom
> 5                       green.tsv
>   c..con11maskedbottomred.tsv..
> 1                          con1
> 2                             1
> 3                        masked
> 4                        bottom
> 5                       red.tsv
>   c..con11maskedtopgreen.tsv..
> 1                         con1
> 2                            1
> 3                       masked
> 4                          top
> 5                    green.tsv
>   c..con11maskedtopred.tsv..
> 1                       con1
> 2                          1
> 3                     masked
> 4                        top
> 5                    red.tsv
>

Re: [R] lists and rownames

2016-04-18 Thread Sarah Goslee

They aren't being stored, they are being generated on the fly. You can
create the same names using make.names()

example.names <- c("con1-1-masked-bottom-green.tsv",
"con1-1-masked-bottom-red.tsv", "con1-1-masked-top-green.tsv",
"con1-1-masked-top-red.tsv")

example.list <- strsplit(example.names, "-")

as.data.frame(example.list)

> make.names(example.list)
[1] "c..con11maskedbottomgreen.tsv.." "c..con11maskedbottomred.tsv.."
[3] "c..con11maskedtopgreen.tsv.."    "c..con11maskedtopred.tsv.."


But you'll probably get a more usable result if you set names
explicitly, for instance:

names(example.list) <- example.names
as.data.frame(example.list)

Note that the characters that are not legal in column names are
changed for you. You can disable that behavior with check.names=FALSE
if you use data.frame() rather than as.data.frame().
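
A short sketch of the two behaviours, using two of the file names from the example (the object names mangled and verbatim are only illustrative):

```r
example.names <- c("con1-1-masked-bottom-green.tsv",
                   "con1-1-masked-top-red.tsv")
example.list <- strsplit(example.names, "-")
names(example.list) <- example.names

# Default: illegal characters in the names are replaced, as make.names() would
mangled <- as.data.frame(example.list)

# data.frame() with check.names = FALSE keeps the names exactly as given
verbatim <- data.frame(example.list, check.names = FALSE)

names(mangled)   # "con1.1.masked.bottom.green.tsv" "con1.1.masked.top.red.tsv"
names(verbatim)  # "con1-1-masked-bottom-green.tsv" "con1-1-masked-top-red.tsv"
```

Columns with such names need backticks or [[ ]] to be accessed, for example verbatim[["con1-1-masked-top-red.tsv"]].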

Sarah



On Mon, Apr 18, 2016 at 4:21 PM, Ed Siefker  wrote:
> I'm doing some string manipulation on a vector of file names, and noticed
> something curious.  When I strsplit the vector, I get a list of
> character vectors.
> The list is numbered, as lists are.  When I cast that list as a data
> frame with 'as.data.frame()', the resulting columns have names derived
> from the original filenames.
>
> Example code is below.  My question is, where are these names stored
> in the list?  Are there methods that can access this from the list?
> Is there a way to preserve them verbatim?  Thanks
> -Ed
>
>> example.names
> [1] "con1-1-masked-bottom-green.tsv" "con1-1-masked-bottom-red.tsv"
> [3] "con1-1-masked-top-green.tsv"    "con1-1-masked-top-red.tsv"
>> example.list <- strsplit(example.names, "-")
>> example.list
> [[1]]
> [1] "con1"      "1"         "masked"    "bottom"    "green.tsv"
>
> [[2]]
> [1] "con1"    "1"       "masked"  "bottom"  "red.tsv"
>
> [[3]]
> [1] "con1"      "1"         "masked"    "top"       "green.tsv"
>
> [[4]]
> [1] "con1"    "1"       "masked"  "top"     "red.tsv"
>
>> example.df <- as.data.frame(example.list)
>> example.df
>   c..con11maskedbottomgreen.tsv..
> 1                            con1
> 2                               1
> 3                          masked
> 4                          bottom
> 5                       green.tsv
>   c..con11maskedbottomred.tsv..
> 1                          con1
> 2                             1
> 3                        masked
> 4                        bottom
> 5                       red.tsv
>   c..con11maskedtopgreen.tsv..
> 1                         con1
> 2                            1
> 3                       masked
> 4                          top
> 5                    green.tsv
>   c..con11maskedtopred.tsv..
> 1                       con1
> 2                          1
> 3                     masked
> 4                        top
> 5                    red.tsv
>


[R] Add a vertical arrow to a time series graph using ggplot and xts

2016-04-18 Thread jpm miao

Hi,

   I am trying to add a vertical arrow (from top to bottom or from bottom
to top) to a time series plot using ggplot2 and xts. It seems that the
vertical line command "geom_vline" does not work for this purpose (Correct
me if I am wrong). I try the command "geom_segment" as follows, but I got
an error message at the last line "Error: Invalid input: date_trans works
with objects of class Date only".

Sometimes the error message occurs when I run the program, sometimes it
does not occur until I call the plot "p1". How could I add a vertical line
to the plot? Thanks!

Miao


##
library(xts)  # primary
#library(tseries)   # Unit root tests
library(ggplot2)
library(vars)
library(grid)
dt_xts<-xts(x = 1:10, order.by = seq(as.Date("2016-01-01"),
as.Date("2016-01-10"), by = "1 day"))
colnames(dt_xts)<-"gdp"
xmin<-min(index(dt_xts))
xmax<-max(index(dt_xts))
df1<-data.frame(x = index(dt_xts), coredata(dt_xts))
p<-ggplot(data = df1, mapping= aes(x=x, y=gdp))+geom_line()
rg<-ggplot_build(p)$panel$ranges[[1]]$y.range
y1<-rg[1]
y2<-rg[2]
df2<-data.frame(x = "2016-01-05", y1=y1, y2=y2 )
p1<-p+geom_segment(mapping=aes(x=x, y=y1, xend=x, yend=y2), data=df2,
arrow=arrow())
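
One possible reading of the error message is that the x value of the segment is a character string while the x axis of the line plot is built from Date values. A minimal sketch under that assumption follows. It uses a plain data frame instead of xts and takes the y range from the data rather than from ggplot_build(); the plotting part only runs if ggplot2 is installed.

```r
# Same toy series as above, held in a plain data frame with a Date column
df1 <- data.frame(x   = seq(as.Date("2016-01-01"), as.Date("2016-01-10"),
                            by = "1 day"),
                  gdp = 1:10)

# The arrow position is stored as a Date, not as a character string
df2 <- data.frame(x  = as.Date("2016-01-05"),
                  y1 = min(df1$gdp),
                  y2 = max(df1$gdp))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p1 <- ggplot(df1, aes(x = x, y = gdp)) +
    geom_line() +
    geom_segment(aes(x = x, y = y1, xend = x, yend = y2),
                 data = df2, arrow = grid::arrow())
  # print(p1) draws the line with a vertical arrow at 2016-01-05
}
```

Swapping y1 and y2 in the aes() call reverses the direction of the arrow.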


Re: [R] ZINB multi-level model using MCMCglmm

2016-04-18 Thread Thierry Onkelinx

Please don't crosspost. You already posted this question to
r-sig-mixedmodels which is the appropriate list for your question.

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-04-18 21:25 GMT+02:00 MARIA OLGA VIEDMA SILLERO :

> Hi,
>
>
>
> I am Olga Viedma. I am running a Zero-inflated negative binomial (ZINB)
> multi-level model using MCMCglmm package. I have a doubt. Can I use the
> "Liab" outputs as fitted data, instead of the predicted values from
> "predict"? The liab outputs fit very well with the observed data, whereas
> the predicted values are so bad.
>
>
>
> Thanks in advance,
>
>
>
> Olga Viedma
>
>
>
> Dª. Olga Viedma Sillero
> Profesora Ordenación del Territorio
> Facultad de Ciencias del Medio Ambiente y Bioquímica
> Universidad de Castilla-La Mancha
> Avd/ Carlos III, s/n. 45071 Toledo
> Tel: 925 268800 (ext. 5780)
> Email: olga.vie...@uclm.es
> http://blog.uclm.es/grupofuego
>
>


[R] lists and rownames

2016-04-18 Thread Ed Siefker

I'm doing some string manipulation on a vector of file names, and noticed
something curious.  When I strsplit the vector, I get a list of
character vectors.
The list is numbered, as lists are.  When I cast that list as a data
frame with 'as.data.frame()', the resulting columns have names derived
from the original filenames.

Example code is below.  My question is, where are these names stored
in the list?  Are there methods that can access this from the list?
Is there a way to preserve them verbatim?  Thanks
-Ed

> example.names
[1] "con1-1-masked-bottom-green.tsv" "con1-1-masked-bottom-red.tsv"
[3] "con1-1-masked-top-green.tsv"    "con1-1-masked-top-red.tsv"
> example.list <- strsplit(example.names, "-")
> example.list
[[1]]
[1] "con1"      "1"         "masked"    "bottom"    "green.tsv"

[[2]]
[1] "con1"    "1"       "masked"  "bottom"  "red.tsv"

[[3]]
[1] "con1"      "1"         "masked"    "top"       "green.tsv"

[[4]]
[1] "con1"    "1"       "masked"  "top"     "red.tsv"

> example.df <- as.data.frame(example.list)
> example.df
  c..con11maskedbottomgreen.tsv..
1                            con1
2                               1
3                          masked
4                          bottom
5                       green.tsv
  c..con11maskedbottomred.tsv..
1                          con1
2                             1
3                        masked
4                        bottom
5                       red.tsv
  c..con11maskedtopgreen.tsv..
1                         con1
2                            1
3                       masked
4                          top
5                    green.tsv
  c..con11maskedtopred.tsv..
1                       con1
2                          1
3                     masked
4                        top
5                    red.tsv
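
One way to look at the question above, offered as a hedged sketch rather than a definitive answer: the list returned by strsplit() carries no names at all (the [[1]], [[2]] labels are only positions), so the column names appear to be generated when the data frame is built, from the contents of each list element. Assuming that is what happens here, the original file names can be attached explicitly after the conversion. The four file names are copied from the example; the rest is illustrative.

```r
example.names <- c("con1-1-masked-bottom-green.tsv",
                   "con1-1-masked-bottom-red.tsv",
                   "con1-1-masked-top-green.tsv",
                   "con1-1-masked-top-red.tsv")
example.list <- strsplit(example.names, "-")

# The list itself has no names; [[1]], [[2]], ... are positions only
names(example.list)   # NULL

# Build the data frame, then overwrite the generated column names.
# Assigning with names<- stores the strings verbatim (hyphens included).
example.df <- as.data.frame(example.list, stringsAsFactors = FALSE)
names(example.df) <- example.names
```

With names containing hyphens, a column is most easily reached with example.df[["con1-1-masked-top-red.tsv"]]; the $ form needs backticks around the name.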


[R] ZINB multi-level model using MCMCglmm

2016-04-18 Thread MARIA OLGA VIEDMA SILLERO

Hi,



I am Olga Viedma. I am running a zero-inflated negative binomial (ZINB) 
multi-level model using the MCMCglmm package, and I have a question. Can I use 
the "Liab" outputs as fitted data, instead of the predicted values from 
"predict"? The Liab outputs fit the observed data very well, whereas the 
predicted values fit poorly.



Thanks in advance,



Olga Viedma



Dª. Olga Viedma Sillero
Profesora Ordenación del Territorio
Facultad de Ciencias del Medio Ambiente y Bioquímica
Universidad de Castilla-La Mancha
Avd/ Carlos III, s/n. 45071 Toledo
Tel: 925 268800 (ext. 5780)
Email: olga.vie...@uclm.es
http://blog.uclm.es/grupofuego



Re: [R] as.Date

2016-04-18 Thread David Winsemius


> On Apr 18, 2016, at 10:44 AM, Ogbos Okike  wrote:
> 
> Dear ALL,
> Thank you so much for your contributions.
> I have made some progress. Below is a simple script I gleaned from
> your kind responses:
> Sys.setenv(TZ="Etc/GMT")
> dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
> times <- c("23:0:0", "22:0:0", "01:00:00", "18:0:0", "16:0:0")
> x <- paste(dates, times)
> aa<-strptime(x, "%m/%d/%y %H:%M:%S")
> bb<-1:5
> plot(aa, bb)
> 
> I tried plotting my result and I got what I am looking for. I think I
> am almost there.
> 
> I am, however, stuck here. My data is a large file and the form
> differs a little from the example I used. The quotation marks in both
> date and time is my headache now. Such inverted commas are not in my
> data. I can with awk transform my data to get exactly something like
> dd/mm/yy. But I wont know how to make the data appear in quotation
> mark in R.

There are no quotation marks in an R object that is displayed as 
"02/27/92". The quotation marks are just added by the print function to make it 
clear to the user that it is a character value. 

If you read such values in with read.table they would automatically be 
interpreted as character values and then converted to factor class (which you 
do not want). Read up on the colClasses and stringsAsFactors arguments of the 
read.* functions to safely input character values.
-- 
David.
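
The point about read.table can be sketched as follows. This is a minimal illustration, not the poster's actual file: the inline text, the two-column layout and the column names "date" and "time" are all assumptions made for the example.

```r
# Hypothetical two-column text standing in for the large data file
txt <- "02/27/92 23:0:0
02/27/92 22:0:0
01/14/92 01:00:00"

# colClasses keeps both columns as plain character vectors (no factors);
# no quotation marks are needed in the file itself
d <- read.table(text = txt, col.names = c("date", "time"),
                colClasses = "character", stringsAsFactors = FALSE)

# Combine the two columns and convert to a date-time object
stamp <- as.POSIXct(paste(d$date, d$time),
                    format = "%m/%d/%y %H:%M:%S", tz = "GMT")
```

Printing d$date shows the values in quotation marks even though the file contains none, which is the behaviour described above.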

> I will once more be glad for any more help.
> Ogbos
> 
> PS: I am still afraid of this forum. Please direct me to the right
> forum if this is not ok. Thanks again.
> 
> 
> On 4/18/16, peter dalgaard  wrote:
>> The most important thing is that Date objects by definition do not include
>> time of day. You want to look at ISOdatetime() and as.POSIXct() instead. And
>> beware daylight savings time issues.
>> 
>> -pd
>> 
>> On 18 Apr 2016, at 15:09 , Ogbos Okike  wrote:
>> 
>>> Dear All,
>>> 
>>> I have a data set containing year, month, day and counts as shown below:
>>> data <- read.table("data.txt", col.names = c("year", "month", "day",
>>> "counts"))
>>> Using the formula below, I converted the data to as date and plotted.
>>> 
>>> new.century <- data$year < 70
>>> 
>>> data$year <- ifelse(new.century, data$year + 2000, data$year + 1900)
>>> 
>>> data$date <- as.Date(ISOdate(data$year, data$month, data$day))
>>> 
>>> The form of the data is:
>>> 16 1 19 9078
>>> 16 1 20 9060
>>> 16 1 21 9090
>>> 16 1 22 9080
>>> 16 1 23 9121
>>> 16 1 24 9199
>>> 16 1 25 9289
>>> 16 1 26 9285
>>> 16 1 27 9245
>>> 16 1 28 9223
>>> 16 1 29 9298
>>> 16 1 30 9327
>>> 16 1 31 9365
>>> 
>>> Now, I wish to include time (hour) in my data. The new data is of the
>>> form:
>>> 05 01 06 143849
>>> 05 01 06 153845
>>> 05 01 06 163836
>>> 05 01 06 173847
>>> 05 01 06 183850
>>> 05 01 06 193872
>>> 05 01 06 203849
>>> 05 01 06 213860
>>> 05 01 06 223868
>>> 05 01 06 233853
>>> 05 01 07 003839
>>> 05 01 07 013842
>>> 05 01 07 023843
>>> 05 01 07 033865
>>> 05 01 07 043879
>>> 05 01 07 053876
>>> 05 01 07 063867
>>> 05 01 07 073887
>>> 
>>> I now read the data as:
>>> data <- read.table("data.txt", col.names = c("year", "month", "day",
>>> "counts", "hour")) and also included hour in data$date <-
>>> as.Date(ISOdate(data$year, data$month, data$day))
>>> i.e data$date <- as.Date(ISOdate(data$year, data$month, data$day,
>>> data$hour)).
>>> 
>>> However, these did not work.
>>> 
>>> Can you please assist be on how to get this date and time in the right
>>> format. The right format I got without hour looks like : 2005-12-29"
>>> "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
>>> [8696] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
>>> [8701] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
>>> [8706] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
>>> 
>>> I used this in my plot. Please I want this format to include hour.
>>> 
>>> Many thanks for your help. I am just a newbe. I am not sure if this
>>> forum is the right one. After registration, I tried to post to Nabble
>>> forum where I registered but could not succeed.
>>> 
>>> If there is a mistake, please help/direct me to the right forum.
>>> 
>>> Best regards
>>> Ogbos
>>> 
>> 
>> --
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd@cbs.dk  Priv: pda...@gmail.com
>> 
>> 
> 

Re: [R] as.Date

2016-04-18 Thread Ogbos Okike

Dear ALL,
Thank you so much for your contributions.
I have made some progress. Below is a simple script I gleaned from
your kind responses:
Sys.setenv(TZ="Etc/GMT")
dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
 times <- c("23:0:0", "22:0:0", "01:00:00", "18:0:0", "16:0:0")
 x <- paste(dates, times)
 aa<-strptime(x, "%m/%d/%y %H:%M:%S")
bb<-1:5
plot(aa, bb)

I tried plotting my result and I got what I am looking for. I think I
am almost there.

I am, however, stuck here. My data is a large file and its form
differs a little from the example I used. The quotation marks in both
date and time are my headache now. Such inverted commas are not in my
data. With awk I can transform my data to get exactly something like
dd/mm/yy, but I won't know how to make the data appear in quotation
marks in R. I will once more be glad for any more help.
Ogbos

PS: I am still afraid of this forum. Please direct me to the right
forum if this is not ok. Thanks again.


On 4/18/16, peter dalgaard  wrote:
> The most important thing is that Date objects by definition do not include
> time of day. You want to look at ISOdatetime() and as.POSIXct() instead. And
> beware daylight savings time issues.
>
> -pd
>
> On 18 Apr 2016, at 15:09 , Ogbos Okike  wrote:
>
>> Dear All,
>>
>> I have a data set containing year, month, day and counts as shown below:
>> data <- read.table("data.txt", col.names = c("year", "month", "day",
>> "counts"))
>> Using the formula below, I converted the data to as date and plotted.
>>
>> new.century <- data$year < 70
>>
>> data$year <- ifelse(new.century, data$year + 2000, data$year + 1900)
>>
>> data$date <- as.Date(ISOdate(data$year, data$month, data$day))
>>
>> The form of the data is:
>> 16 1 19 9078
>> 16 1 20 9060
>> 16 1 21 9090
>> 16 1 22 9080
>> 16 1 23 9121
>> 16 1 24 9199
>> 16 1 25 9289
>> 16 1 26 9285
>> 16 1 27 9245
>> 16 1 28 9223
>> 16 1 29 9298
>> 16 1 30 9327
>> 16 1 31 9365
>>
>> Now, I wish to include time (hour) in my data. The new data is of the
>> form:
>> 05 01 06 143849
>> 05 01 06 153845
>> 05 01 06 163836
>> 05 01 06 173847
>> 05 01 06 183850
>> 05 01 06 193872
>> 05 01 06 203849
>> 05 01 06 213860
>> 05 01 06 223868
>> 05 01 06 233853
>> 05 01 07 003839
>> 05 01 07 013842
>> 05 01 07 023843
>> 05 01 07 033865
>> 05 01 07 043879
>> 05 01 07 053876
>> 05 01 07 063867
>> 05 01 07 073887
>>
>> I now read the data as:
>> data <- read.table("data.txt", col.names = c("year", "month", "day",
>> "counts", "hour")) and also included hour in data$date <-
>> as.Date(ISOdate(data$year, data$month, data$day))
>> i.e data$date <- as.Date(ISOdate(data$year, data$month, data$day,
>> data$hour)).
>>
>> However, these did not work.
>>
>> Can you please assist be on how to get this date and time in the right
>> format. The right format I got without hour looks like : 2005-12-29"
>> "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
>> [8696] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
>> [8701] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
>> [8706] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
>>
>> I used this in my plot. Please I want this format to include hour.
>>
>> Many thanks for your help. I am just a newbe. I am not sure if this
>> forum is the right one. After registration, I tried to post to Nabble
>> forum where I registered but could not succeed.
>>
>> If there is a mistake, please help/direct me to the right forum.
>>
>> Best regards
>> Ogbos
>>
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
>


[R] project test data into principal components of training dataset

2016-04-18 Thread olsen

Hi there,

I have a training dataset and a test dataset. My aim is to visually
locate the test data within the calibrated space spanned by the
PCs of the training data set, and furthermore to keep the training data set
coordinates fixed, so they can serve as a ruler for measuring
additional test datasets as they come up.

Please find a minimum working example using the wine dataset below.
Ideally I would like to use ggbiplot as it comes with elegant
features, but it only accepts objects of class prcomp, princomp, PCA, or
lda, a requirement the predicted test data do not fulfil.

I'm still slightly wet behind my R ears and the only solution I can
think of is to plot the calibrated space in ggbiplot and the test
data in ggplot and then join them, in the worst case by exporting them
as svg and importing them into Inkscape. That is slightly complicated,
plus the scaling is different.

Any indication of how this mission can be accomplished is very welcome!

Thanks and greets
Olsen

I started a thread on Stack Overflow on that issue but have had no relevant
pointers so far.
http://stackoverflow.com/questions/36603268/how-to-plot-training-and-test-validation-data-in-r-using-ggbiplot

##MWE
library(ggbiplot)
data(wine)

##pca on the wine dataset used as training data
wine.pca <- prcomp(wine, center = TRUE, scale. = TRUE)

wine$class <- wine.class

##simulate test data by generating three new wine classes
wine.new.1 <- wine[c(sample(1:nrow(wine), 25)),]
wine.new.2 <- wine[c(sample(1:nrow(wine), 43)),]
wine.new.3 <- wine[c(sample(1:nrow(wine), 36)),]

##Predict PCs for the new classes by transforming
#them using the predict.prcomp function
pred.new.1 <- predict(wine.pca, newdata = wine.new.1)
pred.new.2 <- predict(wine.pca, newdata = wine.new.2)
pred.new.3 <- predict(wine.pca, newdata = wine.new.3)

#simulate the classes for the new sorts
wine.new.1$class <- rep("new.wine.1", nrow(wine.new.1))
wine.new.2$class <- rep("new.wine.2", nrow(wine.new.2))
wine.new.3$class <- rep("new.wine.3", nrow(wine.new.3))
wine.new.bind <- rbind(wine.new.1, wine.new.2, wine.new.3)

##compose the plot by joining the PCA ggbiplot training data with the
##testing data from ggplot
#plot the calibrated space resulting from the training data
g.train <- ggbiplot(wine.pca, obs.scale = 1, var.scale = 1, groups =
wine$class, ellipse = TRUE, circle = TRUE)
g.train
#plot the test data resulting from the prediction
#(use the predicted scores, not the raw columns of wine.new.bind)
pred.new.bind <- rbind(pred.new.1, pred.new.2, pred.new.3)
df.pred = data.frame(PC1 = pred.new.bind[,1], PC2 = pred.new.bind[,2],
PC3 = pred.new.bind[,3], PC4 = pred.new.bind[,4],
classes = wine.new.bind$class)
g.test <- ggplot(df.pred, aes(PC1, PC2, color = classes, shape =
classes)) +  geom_point() +  stat_ellipse()
g.test
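
The projection step itself can be shown without ggbiplot. The following is a minimal base-R sketch under stated assumptions: the built-in iris measurements stand in for the wine data (which ships with ggbiplot), the 100/50 split is arbitrary, and the object names are made up for the illustration. It only shows that predict() places new observations in the coordinate system fixed by the training PCA; it does not reproduce ggbiplot's own scaling of the scores.

```r
# Stand-in data: iris measurements replace the wine data
set.seed(1)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, 1:4]
test  <- iris[-idx, 1:4]

# PCA on the training data only; centering and scaling are stored in the object
pca <- prcomp(train, center = TRUE, scale. = TRUE)

# Project the test data with the training centre, scale and rotation
test.scores <- predict(pca, newdata = test)

# One data frame in the same coordinates, ready for a single plot
scores <- rbind(
  data.frame(pca$x[, 1:2], set = "train"),
  data.frame(test.scores[, 1:2], set = "test")
)
```

Because both sets of scores share the training axes, a single call such as plot(scores$PC1, scores$PC2, col = ifelse(scores$set == "train", "grey", "red")) shows the test points against the fixed training cloud.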





-- 
Our solar system is the cream of the crop
http://hasa-labs.org


Re: [R] Sum of Numeric Values in a DF Column

2016-04-18 Thread Bert Gunter

... and a slightly more efficient non-dplyr 1-liner:

> sapply(strsplit(dd$Lower,"[^[:digit:]]"),
function(x)sum(as.numeric(x), na.rm=TRUE))

[1] 105  67  60 100  80
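
Why na.rm=TRUE is needed in the one-liner can be seen by looking at the intermediate pieces for a single string. This is an illustrative sketch only: the input is hypothetical and uses an ASCII hyphen in place of the en dash in the real data.

```r
# Hypothetical input with an ASCII hyphen standing in for the en dash
x <- "R 72-33, 1 Ind."

# Splitting on every single non-digit character leaves empty strings
# wherever two non-digits are adjacent or the string starts with one
pieces <- strsplit(x, "[^[:digit:]]")[[1]]

# Empty strings convert to NA, so the sum has to skip them
nums  <- as.numeric(pieces)
total <- sum(nums, na.rm = TRUE)
```

Here total is 72 + 33 + 1. The earlier gsub() version collapses the non-digits to spaces and splits on " +", which serves the same purpose with fewer empty pieces.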

Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Apr 18, 2016 at 10:43 AM, Bert Gunter  wrote:
> ... and here is a non-dplyr rsolution:
>
>> z <-gsub("[^[:digit:]]"," ",dd$Lower)
>
>> sapply(strsplit(z," +"),function(x)sum(as.numeric(x),na.rm=TRUE))
> [1] 105  67  60 100  80
>
>
> Cheers,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Apr 18, 2016 at 10:07 AM, Richard M. Heiberger  
> wrote:
>> ## Continuing with your data
>>
>> AA <- stringr::str_extract_all(dd[[2]],"[[:digit:]]+")
>> BB <- lapply(AA, as.numeric)
>> ## I think you are looking for one of the following two expressions
>> sum(unlist(BB))
>> sapply(BB, sum)
>>
>>
>> On Mon, Apr 18, 2016 at 12:48 PM, Burhan ul haq  wrote:
>>> Hi,
>>>
>>> I request help with the following:
>>>
>>> INPUT: A data frame where column "Lower" is a character containing numeric
>>> values (different count or occurrences of numeric values in each row,
>>> mostly 2)
>>>
>>>> dput(dd)
>>> structure(list(State = c("Alabama", "Alaska", "Arizona", "Arkansas",
>>> "California"), Lower = c("R 72–33", "R/Coalition 27(23 R, 4 D)–12 D, 1
>>> Ind.",
>>> "R 36–24", "R 64–35, 1 Ind.", "D 52–28"), Upper = c("R 26–8, 1 Ind.",
>>> "R/Coalition 15(14 R, 1 D)–5 D", "R 18–12", "R 24–11", "D 26–14"
>>> )), .Names = c("State", "Lower", "Upper"), row.names = c(NA,
>>> 5L), class = "data.frame")
>>>
>>> PROBLEM: Need to extract all numeric values and sum them. There are few
>>> exceptions like row2. But these can be ignored and will be fixed manually
>>>
>>> SOLUTION SO FAR:
>>> str_extract_all(dd[[2]],"[[:digit:]]+"), returns a list of numbers as
>>> character. I am unable to unlist it, because it mixes them all together, ...
>>>
>>> And if I may add, is there a "dplyr" way of doing it ...
>>>
>>>
>>> Thanks
>>>


Re: [R] 'nlme' package not compiling

2016-04-18 Thread David Winsemius

> On Apr 18, 2016, at 12:48 AM, Angelo Varlotta  
> wrote:
> 
> Hi,
> I'm trying to install from source code the 'nlme' package in 
> RStudio. When I try, I get the following error message:
> 
> ld: warning: directory not found for option 
> '-L/usr/local/lib/gcc/x86_64-apple-darwin13.0.0/4.8.2'
> ld: library not found for -lgfortran
> clang: error: linker command failed with exit code 1 (use -v to 
> see invocation)
> make: *** [nlme.so] Error 1
> ERROR: compilation failed for package ‘nlme’
> * removing 
> ‘/Library/Frameworks/R.framework/Versions/3.2/Resources/library/nlme’
> * restoring previous 
> ‘/Library/Frameworks/R.framework/Versions/3.2/Resources/library/nlme’
> Warning in install.packages :
>   installation of package ‘nlme’ had non-zero exit status
> 
> I'm using gfortran 4.8 from Macports and running OS X 10.11.4 
> with RStudio Version 0.99.893. I've tried to use the FLIBS 
> command in R:
> 
> FLIBS="-L/opt/local/lib/gcc48/gcc/x86_64-apple-darwin15/4.8.5/"
> 
> so that it knows where the Fortran libraries are at and compile 
> again, but it still searches regardless for the directory:
> 
> /usr/local/lib/gcc/x86_64-apple-darwin13.0.0/4.8.2.

This is the fortran against which the Mac R is compiled, provided as a disk 
image:

gfortran-4.2.3.dmg:
http://r.research.att.com/gfortran-4.2.3.dmg

If you had been reading the R-SIG-Mac forum (or searching its archives), you 
would have seen repeated warnings against MacPorts versions.

The proper list for Mac questions regarding package compilation is 
R-SIG-Mac:
https://stat.ethz.ch/mailman/listinfo/r-sig-mac

I was using Version 0.99.491 but updated to the same version as you have. The 
install packages dialog doesn't offer compiling from source as an option, so 
the first test was to see whether the binary could be installed easily on 
version 3.3.0 of R. It did.

I then tried compiling from source and do get the same error as did you. I'm  
not a very capable Unix programmer and am unable to tell why this is happening. 
Perhaps something RStudio is doing? So I tried in the R.app gui with : 

install.packages("~/Downloads/nlme_3.1-127.tar.gz", repo=NULL, type="source")

# result===
* installing *source* package ‘nlme’ ...
** package ‘nlme’ successfully unpacked and MD5 sums checked
** libs
gfortran-4.8   -fPIC  -g -O2  -c chol.f -o chol.o
make: gfortran-4.8: No such file or directory
make: *** [chol.o] Error 1
ERROR: compilation failed for package ‘nlme’
Warning message:
In install.packages("~/Downloads/nlme_3.1-127.tar.gz", repo = NULL,  :
  installation of package ‘/Users/davidwinsemius/Downloads/nlme_3.1-127.tar.gz’ 
had non-zero exit status
* removing ‘/Library/Frameworks/R.framework/Versions/3.3/Resources/library/nlme’
* restoring previous 
‘/Library/Frameworks/R.framework/Versions/3.3/Resources/library/nlme’

Also tried from Terminal.app session with similar error report.

> which of course doesn't exist. Any suggestions?

A) Install from binary. That certainly seems easiest.

Or B) install gfortran 4.8 in the directory where your installation of R 
expects to find it. 
Perhaps:

http://r.research.att.com/libs/gfortran-4.8.2-darwin13.tar.bz2

Or at the Unix command line:

curl -s http://r.research.att.com/libs/gfortran-4.8.2-darwin13.tar.bz2 | sudo tar fxj - -C /

(As was suggested by Simon Urbanek two years ago and  was reported successful 
by Jason Eyerly on R-SIG-Mac.)

But in any case this thread belongs on R-SIG-MAC so copying there, and if any 
response is needed then when responding you should remove r-help from future 
replies.
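
To see which Fortran compiler and library flags a given R build was configured with (and hence where it will look), the build configuration can be queried from within R. This is a small sketch, not part of the advice above: it assumes a Unix-alike system where "R CMD config" is available, and the variable names r.bin, fc and flibs are made up for the example.

```r
# Path to the R front-end script of the running installation
r.bin <- file.path(R.home("bin"), "R")

# Ask the build configuration for the Fortran compiler and Fortran libraries
fc    <- system2(r.bin, c("CMD", "config", "FC"),    stdout = TRUE)
flibs <- system2(r.bin, c("CMD", "config", "FLIBS"), stdout = TRUE)

# Show what the installation expects
print(fc)
print(flibs)
```

If the directory named after -L in the FLIBS value does not exist on the machine, the linker error quoted at the top of this thread is the likely result.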

-- 

David Winsemius

> 
> Cheers,
> Angelo
> 
> 
> 

David Winsemius
Alameda, CA, USA


Re: [R] Sum of Numeric Values in a DF Column

2016-04-18 Thread Bert Gunter

... and here is a non-dplyr rsolution:

> z <-gsub("[^[:digit:]]"," ",dd$Lower)

> sapply(strsplit(z," +"),function(x)sum(as.numeric(x),na.rm=TRUE))
[1] 105  67  60 100  80


Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Apr 18, 2016 at 10:07 AM, Richard M. Heiberger  wrote:
> ## Continuing with your data
>
> AA <- stringr::str_extract_all(dd[[2]],"[[:digit:]]+")
> BB <- lapply(AA, as.numeric)
> ## I think you are looking for one of the following two expressions
> sum(unlist(BB))
> sapply(BB, sum)
>
>
> On Mon, Apr 18, 2016 at 12:48 PM, Burhan ul haq  wrote:
>> Hi,
>>
>> I request help with the following:
>>
>> INPUT: A data frame where column "Lower" is a character containing numeric
>> values (different count or occurrences of numeric values in each row,
>> mostly 2)
>>
>>> dput(dd)
>> structure(list(State = c("Alabama", "Alaska", "Arizona", "Arkansas",
>> "California"), Lower = c("R 72–33", "R/Coalition 27(23 R, 4 D)–12 D, 1
>> Ind.",
>> "R 36–24", "R 64–35, 1 Ind.", "D 52–28"), Upper = c("R 26–8, 1 Ind.",
>> "R/Coalition 15(14 R, 1 D)–5 D", "R 18–12", "R 24–11", "D 26–14"
>> )), .Names = c("State", "Lower", "Upper"), row.names = c(NA,
>> 5L), class = "data.frame")
>>
>> PROBLEM: Need to extract all numeric values and sum them. There are few
>> exceptions like row2. But these can be ignored and will be fixed manually
>>
>> SOLUTION SO FAR:
>> str_extract_all(dd[[2]],"[[:digit:]]+"), returns a list of numbers as
>> character. I am unable to unlist it, because it mixes them all together, ...
>>
>> And if I may add, is there a "dplyr" way of doing it ...
>>
>>
>> Thanks
>>


Re: [R] Sum of Numeric Values in a DF Column

2016-04-18 Thread David Winsemius


> On Apr 18, 2016, at 9:48 AM, Burhan ul haq  wrote:
> 
> Hi,
> 
> I request help with the following:
> 
> INPUT: A data frame where column "Lower" is a character containing numeric
> values (different count or occurrences of numeric values in each row,
> mostly 2)
> 
>> dput(dd)
> structure(list(State = c("Alabama", "Alaska", "Arizona", "Arkansas",
> "California"), Lower = c("R 72–33", "R/Coalition 27(23 R, 4 D)–12 D, 1
> Ind.",
> "R 36–24", "R 64–35, 1 Ind.", "D 52–28"), Upper = c("R 26–8, 1 Ind.",
> "R/Coalition 15(14 R, 1 D)–5 D", "R 18–12", "R 24–11", "D 26–14"
> )), .Names = c("State", "Lower", "Upper"), row.names = c(NA,
> 5L), class = "data.frame")
> 
> PROBLEM: Need to extract all numeric values and sum them. There are few
> exceptions like row2. But these can be ignored and will be fixed manually
> 
> SOLUTION SO FAR:
> str_extract_all(dd[[2]],"[[:digit:]]+"), returns a list of numbers as
> character. I am unable to unlist it, because it mixes them all together, ...
> 
> And if I may add, is there a "dplyr" way of doing it ...

I don't understand what is mean by "it mixes them all together". This runs 
without error and appears to deliver what was requested in your natural 
language description:

> sum( as.numeric( unlist(str_extract_all(dd[[2]],"[[:digit:]]+") )))
[1] 412


> 
> 
> Thanks
> 

David Winsemius
Alameda, CA, USA


Re: [R] Sum of Numeric Values in a DF Column

2016-04-18 Thread Richard M. Heiberger

## Continuing with your data

AA <- stringr::str_extract_all(dd[[2]],"[[:digit:]]+")
BB <- lapply(AA, as.numeric)
## I think you are looking for one of the following two expressions
sum(unlist(BB))
sapply(BB, sum)
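
The difference between the two expressions can be checked without stringr. The sketch below swaps in base R's regmatches() and gregexpr() for str_extract_all(), which is a substitution made here for self-containedness, and rebuilds only the Lower column from the posted dput output.

```r
# The Lower column from the posted data
Lower <- c("R 72–33", "R/Coalition 27(23 R, 4 D)–12 D, 1 Ind.",
           "R 36–24", "R 64–35, 1 Ind.", "D 52–28")

# One character vector of digit runs per row, kept as a list
AA <- regmatches(Lower, gregexpr("[[:digit:]]+", Lower))
BB <- lapply(AA, as.numeric)

# unlist() flattens the rows together, giving one grand total
grand.total <- sum(unlist(BB))

# sapply() keeps the rows apart, giving one sum per row
per.row <- sapply(BB, sum)
```

On this data grand.total is 412 and per.row is 105, 67, 60, 100, 80, matching the figures reported elsewhere in the thread.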


On Mon, Apr 18, 2016 at 12:48 PM, Burhan ul haq  wrote:
> Hi,
>
> I request help with the following:
>
> INPUT: A data frame where column "Lower" is a character containing numeric
> values (different count or occurrences of numeric values in each row,
> mostly 2)
>
>> dput(dd)
> structure(list(State = c("Alabama", "Alaska", "Arizona", "Arkansas",
> "California"), Lower = c("R 72–33", "R/Coalition 27(23 R, 4 D)–12 D, 1
> Ind.",
> "R 36–24", "R 64–35, 1 Ind.", "D 52–28"), Upper = c("R 26–8, 1 Ind.",
> "R/Coalition 15(14 R, 1 D)–5 D", "R 18–12", "R 24–11", "D 26–14"
> )), .Names = c("State", "Lower", "Upper"), row.names = c(NA,
> 5L), class = "data.frame")
>
> PROBLEM: Need to extract all numeric values and sum them. There are few
> exceptions like row2. But these can be ignored and will be fixed manually
>
> SOLUTION SO FAR:
> str_extract_all(dd[[2]],"[[:digit:]]+"), returns a list of numbers as
> character. I am unable to unlist it, because it mixes them all together, ...
>
> And if I may add, is there a "dplyr" way of doing it ...
>
>
> Thanks
>


[R] Sum of Numeric Values in a DF Column

2016-04-18 Thread Burhan ul haq

Hi,

I request help with the following:

INPUT: A data frame where column "Lower" is a character containing numeric
values (different count or occurrences of numeric values in each row,
mostly 2)

> dput(dd)
structure(list(State = c("Alabama", "Alaska", "Arizona", "Arkansas",
"California"), Lower = c("R 72–33",
"R/Coalition 27(23 R, 4 D)–12 D, 1 Ind.",
"R 36–24", "R 64–35, 1 Ind.", "D 52–28"), Upper = c("R 26–8, 1 Ind.",
"R/Coalition 15(14 R, 1 D)–5 D", "R 18–12", "R 24–11", "D 26–14"
)), .Names = c("State", "Lower", "Upper"), row.names = c(NA,
5L), class = "data.frame")

PROBLEM: Need to extract all numeric values and sum them. There are few
exceptions like row2. But these can be ignored and will be fixed manually

SOLUTION SO FAR:
str_extract_all(dd[[2]],"[[:digit:]]+"), returns a list of numbers as
character. I am unable to unlist it, because it mixes them all together, ...

And if I may add, is there a "dplyr" way of doing it ...


Thanks


Re: [R] as.Date

2016-04-18 Thread Jeff Newmiller

Date objects cannot represent hours. You need to use POSIXct, or perhaps the
chron class from the chron package.

To use POSIXct, call ISOdatetime instead of ISOdate. Also be careful which
timezone is set as the default when you invoke ISOdatetime (on most operating
systems, calling Sys.setenv(TZ="Etc/GMT") or similar will get you started),
since daylight saving time can complicate things. Of course, if daylight saving
time is already built into your data, you are better off choosing a timezone
that understands it. See ?DateTimeClasses.
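
A small sketch of the ISOdatetime route, assuming the columns are in the order
year, month, day, hour, counts and that the timestamps are meant as UTC. The
three rows and the names obs and datetime are illustrative only.

```r
# Hypothetical rows: year, month, day, hour, counts
obs <- read.table(
  text = c("05 01 06 14 3849",
           "05 01 06 15 3845",
           "05 01 07 00 3839"),
  col.names = c("year", "month", "day", "hour", "counts")
)

# Same two-digit-year rule as in the question
obs$year <- ifelse(obs$year < 70, obs$year + 2000, obs$year + 1900)

# ISOdatetime returns POSIXct, which keeps the hour; the timezone is explicit
obs$datetime <- ISOdatetime(obs$year, obs$month, obs$day,
                            obs$hour, 0, 0, tz = "UTC")
format(obs$datetime, "%Y-%m-%d %H:%M")
```

Passing tz explicitly avoids depending on whatever default timezone the
session happens to have.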
-- 
Sent from my phone. Please excuse my brevity.


Re: [R] as.Date

2016-04-18 Thread PIKAL Petr

Hi

AFAIK as.Date does not accept hours. Although the help page does not say so
explicitly, the name as.Date makes it clear enough to me that it works only
with dates.

If you want to use hours, minutes, etc., you should use strptime to convert
your values to a valid date-time object.

You should also use the ISOdatetime conversion function instead of ISOdate to
include hours etc. in your commands.
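
A short sketch of the strptime route, assuming the year, month, day and hour
have been pasted into one string per observation and that UTC is acceptable.
The vector stamp is made-up example input.

```r
# Hypothetical strings in the form "yy mm dd HH"
stamp <- c("05 01 06 14", "05 01 07 00")

# strptime parses to POSIXlt; %y is the two-digit year, %H the hour
lt <- strptime(stamp, format = "%y %m %d %H", tz = "UTC")

# POSIXct is usually the more convenient class for data frames and plots
ct <- as.POSIXct(lt)
format(ct, "%Y-%m-%d %H:%M")
```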

Cheers
Petr



Re: [R] as.Date

2016-04-18 Thread peter dalgaard

The most important thing is that Date objects by definition do not include time 
of day. You want to look at ISOdatetime() and as.POSIXct() instead. And beware 
daylight savings time issues.
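
To illustrate the daylight saving caveat, here is a hedged sketch that measures
the length of one calendar day in two timezones. The date and the zone
"Europe/Copenhagen" are chosen only as an example, and the result relies on
the system's timezone database knowing that zone.

```r
# Hours between midnight and the following midnight in a given timezone
day_length <- function(tz) {
  start <- as.POSIXct("2016-03-27 00:00:00", tz = tz)
  end   <- as.POSIXct("2016-03-28 00:00:00", tz = tz)
  as.numeric(difftime(end, start, units = "hours"))
}

day_length("UTC")                # 24
day_length("Europe/Copenhagen")  # 23, clocks moved forward during that night
```

Hourly data recorded in local time can therefore show a missing or repeated
hour, which is one reason to store such series in UTC.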

-pd


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com


[R] as.Date

2016-04-18 Thread Ogbos Okike

Dear All,

I have a data set containing year, month, day and counts, which I read as shown below:
data <- read.table("data.txt", col.names = c("year", "month", "day", "counts"))
Using the code below, I converted the data to dates and plotted them.

new.century <- data$year < 70

data$year <- ifelse(new.century, data$year + 2000, data$year + 1900)

data$date <- as.Date(ISOdate(data$year, data$month, data$day))

The form of the data is:
16 1 19 9078
16 1 20 9060
16 1 21 9090
16 1 22 9080
16 1 23 9121
16 1 24 9199
16 1 25 9289
16 1 26 9285
16 1 27 9245
16 1 28 9223
16 1 29 9298
16 1 30 9327
16 1 31 9365

Now, I wish to include time (hour) in my data. The new data is of the form:
05 01 06 14 3849
05 01 06 15 3845
05 01 06 16 3836
05 01 06 17 3847
05 01 06 18 3850
05 01 06 19 3872
05 01 06 20 3849
05 01 06 21 3860
05 01 06 22 3868
05 01 06 23 3853
05 01 07 00 3839
05 01 07 01 3842
05 01 07 02 3843
05 01 07 03 3865
05 01 07 04 3879
05 01 07 05 3876
05 01 07 06 3867
05 01 07 07 3887

I now read the data as:
data <- read.table("data.txt", col.names = c("year", "month", "day",
"counts", "hour"))
and also included hour in the date conversion, i.e. instead of
data$date <- as.Date(ISOdate(data$year, data$month, data$day))
I used
data$date <- as.Date(ISOdate(data$year, data$month, data$day, data$hour))

However, this did not work.

Can you please assist me in getting this date and time into the right
format? The format I got without the hour looks like:
"2005-12-29"
"2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
[8696] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
[8701] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"
[8706] "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29" "2005-12-29"

I used this in my plot. I would like this format to include the hour.

Many thanks for your help. I am just a newbie, and I am not sure if this
forum is the right one. After registration, I tried to post to the Nabble
forum where I registered, but did not succeed.

If this is the wrong place, please direct me to the right forum.

Best regards
Ogbos


Re: [R] Random Forest classification

2016-04-18 Thread Liaw, Andy

This is explained in the "Details" section of the help page for partialPlot.

Best
Andy
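
For readers without the help page at hand: as far as I recall the Details
section of ?partialPlot in the randomForest package, for classification the
plotted quantity is a centered log of the class vote proportions,
log p_k - (1/K) * sum_j log p_j, averaged over the data, not a probability. The
base R sketch below only evaluates that expression for a single made-up set of
vote proportions; the function name centered_logit is hypothetical and is not
part of the package.

```r
# p: vote proportions for all K classes at one point (must be > 0)
# k: index of the class of interest (the role played by which.class)
centered_logit <- function(p, k) {
  log(p[k]) - mean(log(p))
}

centered_logit(c(OK = 0.5, NOK = 0.5), 1)  # 0: neither class favoured
centered_logit(c(OK = 0.8, NOK = 0.2), 1)  # about 0.69: OK favoured
centered_logit(c(OK = 0.2, NOK = 0.8), 1)  # about -0.69: NOK favoured
```

Under that reading, values near 1 or -1 on the y axis mean that the class of
interest is clearly favoured or disfavoured at that level of the predictor.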

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jesús Para
> Fernández
> Sent: Tuesday, April 12, 2016 1:17 AM
> To: r-help@r-project.org
> Subject: [R] Random Forest classification
> 
> Hi,
> 
> To evaluate the partial influence of a factor with a random forest whose
> response is OK/NOK, I'm using partialPlot. The x axis is the factor, and
> the y axis runs between -1 and 1. What do these -1 and 1 mean?
> 
> An example:
> 
> https://www.dropbox.com/s/4b92lqxi3592r0d/Captura.JPG?dl=0
> 
> 
> Thanks for all!!!
>   [[alternative HTML version deleted]]


Re: [R] R [coding : do not run for every row ]

2016-04-18 Thread tan sj

Yes, I think there must be some mistake. I just noticed that it runs for the
nine sample sizes, with the column filled in with "1" in the result.
I am still trying to figure out what is happening.



[R] A Neural Network question

2016-04-18 Thread Philip Rhoades


People,

I thought I needed to have some familiarity with NNs for some of my
current (non-profit, brain-related) projects, so I started looking at
various programming environments including R, and I got this working:

  http://gekkoquant.com/2012/05/26/neural-networks-with-r-simple-example

However, I needed pictures to help understand what was going on, and then
I found this:

  https://jamesmccaffrey.files.wordpress.com/2012/11/backpropagationcalculations.jpg

which I thought was almost intelligible, so I had an idea which I thought
would help the learning process:

- Create a very simple NN implemented as a spreadsheet where each sheet
would correspond to an iteration

I started doing this in LibreOffice:

- I think I am already starting to get a better idea of how NNs work just
from the stuff I have done on the spreadsheet so far

- I have now transferred my LibreOffice SpreadSheet (SS) to a shared
Google Docs Calc file and can share it for editing with others

  https://docs.google.com/spreadsheets/d/1eSCgGU5qeI3_PmQhwZn4RH0NznUekVP5BP7w4MpKSUc/pub?output=pdf

- I think I have the SS calculations correct so far, except for the stuff
in the dashed purple box in the diagram

- I am not sure how to implement the purple box, so I thought I would
ask for help on this mailing list

If someone can help me with the last bit of the SS, I think I can then
repeat the FR and BP sheets and see how the Diffs evolve.

Is anyone interested in helping to get this last bit of the spreadsheet
working, so I can move on to doing actual work with the R packages with
better understanding?
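
For comparison with the spreadsheet, here is a minimal R sketch of repeated
forward and backpropagation steps, where each loop iteration plays the role of
one pair of sheets. Everything in it is an assumption for illustration: a
2-input, 2-hidden, 1-output network with sigmoid units, a single training
example, fixed starting weights and a learning rate of 0.5. It does not
reproduce the numbers in the linked diagram.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

x      <- c(0.5, -0.2)   # hypothetical input
target <- 0.8            # hypothetical desired output
W1 <- matrix(c(0.1, -0.3, 0.4, 0.2), nrow = 2)  # hidden x input weights
b1 <- c(0, 0)
W2 <- c(0.3, -0.1)       # output weights
b2 <- 0
lr <- 0.5                # learning rate

errors <- numeric(50)
for (i in seq_along(errors)) {
  # Forward pass
  h   <- sigmoid(as.vector(W1 %*% x) + b1)
  out <- sigmoid(sum(W2 * h) + b2)
  errors[i] <- 0.5 * (target - out)^2

  # Backward pass: gradients of the squared error via the chain rule
  delta_out <- (out - target) * out * (1 - out)
  delta_h   <- delta_out * W2 * h * (1 - h)

  # Gradient descent updates
  W2 <- W2 - lr * delta_out * h
  b2 <- b2 - lr * delta_out
  W1 <- W1 - lr * outer(delta_h, x)
  b1 <- b1 - lr * delta_h
}

range(errors)  # the error should shrink as the iterations proceed
```

Printing h, out, delta_out and delta_h inside the loop gives values that can be
checked cell by cell against a spreadsheet built with the same assumptions.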


Thanks,

Phil.
--
Philip Rhoades

PO Box 896
Cowra  NSW  2794
Australia
E-mail:  p...@pricom.com.au


Re: [R] R [coding : do not run for every row ]

2016-04-18 Thread Thierry Onkelinx

Always keep the mailing list in cc.

The code runs for each row in the data. However, I get the feeling that
there is a mismatch between what you think is in the data and the
actual data.
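
One thing that may be worth checking in this context, offered as a guess and
not as a diagnosis confirmed anywhere in this thread: the third positional
argument of rgamma is the rate, not the scale. The sketch below uses made-up
values to show how far the sample means differ between the two
parameterisations.

```r
set.seed(1)
shape <- 16 / 9

# Third positional argument is 'rate', so the mean is shape / 2
by_rate  <- rgamma(1e5, shape, 2)

# Naming the argument makes it a scale, so the mean is shape * 2
by_scale <- rgamma(1e5, shape, scale = 2)

c(rate = mean(by_rate), scale = mean(by_scale))
c(rate = shape / 2,     scale = shape * 2)   # theoretical means
```

If a simulation subtracts shape * scale from samples that were actually drawn
with rate = scale, the centred samples no longer have mean zero, and a test of
equal means will reject far more often than the nominal level.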
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey


2016-04-18 10:35 GMT+02:00 tan sj :
> Thanks, but it seems the problem of looping through the data is still the
> same. I am really wondering where the mistake is.
>
> 
> From: Thierry Onkelinx 
> Sent: Monday, April 18, 2016 7:21 AM
> To: tan sj
> Cc: r-help
> Subject: Re: [R] R [coding : do not run for every row ]
>
> You can make this much more readable with apply functions.
>
> result <- apply(
>   all_combine1,
>   1,
>   function(x){
>     p.value <- sapply(
>       seq_len(nSims),
>       function(sim){
>         gamma1 <- rgamma(x["m"], x["sp(skewness1.5)"], x["scp1"])
>         gamma2 <- rgamma(x["n"], x["scp1"], 1)
>         gamma1 <- gamma1 - x["sp(skewness1.5)"] * x["scp1"]
>         gamma2 <- gamma2 - x["sp(skewness1.5)"]
>         c(
>           equal = t.test(gamma1, gamma2, var.equal=TRUE)$p.value,
>           unequal = t.test(gamma1,gamma2,var.equal=FALSE)$p.value,
>           mann = wilcox.test(gamma1,gamma2)$p.value
>         )
>       }
>     )
>     rowMeans(p.value <= alpha)
>   }
> )
> cbind(all_combine1, t(result))
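
The apply pattern above can be tried on a small, self-contained example. The
sketch below is only that: the grid of sample sizes and scales is made up, both
samples are assumed to share the shape 16/9, and the scale is passed by name so
that the subtracted mean shape * scale matches how the samples were drawn.
Whether this matches the original poster's intent is an assumption.

```r
set.seed(1)
nSims <- 200
alpha <- 0.05
shape <- 16 / 9

# Hypothetical design: every combination of two sample sizes and two scales
grid <- expand.grid(m = c(10, 25), n = c(10, 25), scale1 = c(1, 2))

reject_rate <- apply(grid, 1, function(x) {
  p <- sapply(seq_len(nSims), function(sim) {
    # Centre each sample at its own population mean, shape * scale
    g1 <- rgamma(x["m"], shape = shape, scale = x["scale1"]) - shape * x["scale1"]
    g2 <- rgamma(x["n"], shape = shape, scale = 1) - shape
    c(
      equal   = t.test(g1, g2, var.equal = TRUE)$p.value,
      unequal = t.test(g1, g2, var.equal = FALSE)$p.value,
      mann    = wilcox.test(g1, g2)$p.value
    )
  })
  rowMeans(p <= alpha)   # share of simulations that reject at level alpha
})

result <- cbind(grid, t(reject_rate))
result
```

Each row of result is one design row with its three rejection rates, so a row
that was never simulated would show up as missing, not as a repeated value.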

[R] Help using R Sensitivity

2016-04-18 Thread jody.kelly

Hello,

I am currently using the standardized regression coefficients (src) from the
sensitivity package to rank variable importance in a model. I am new to R, so
there may be some obvious things I am unaware of; apologies in advance, as I am
still learning.

I am using the following, which I have taken straight from the help guide.

# a 100-sample with X1 ~ U(0.5, 1.5)
#                   X2 ~ U(1.5, 4.5)
#                   X3 ~ U(4.5, 13.5)

library(sensitivity)  # provides src()
library(boot)
n <- 100
X <- data.frame(X1 = runif(n, 0.5, 1.5),
                X2 = runif(n, 1.5, 4.5),
                X3 = runif(n, 4.5, 13.5))

# linear model : Y = X1 + X2 + X3

y <- with(X, X1 + X2 + X3)

# sensitivity analysis

x <- src(X, y, nboot = 100)

plot(x)
print(x)

This gives me a ranking of the variables I have defined, on a scale from -1 to
1. However, I am unsure how to apply this to my own model.
I hope someone can advise me on how to do this for my own model, which is as
follows:



Model type: building energy consumption model.
Model Input variables (X): parameters relating to the building (X1 = 1.5-3.5, 
X2 = 7-12, X3 = 0.5 - 3, X4 = 10-15)
Model output variables (Y): Monthly Gas and electricity energy consumption

The spreadsheet is as follows: there are 40 simulations (numbered 1-40). Each
simulation uses a new combination of model inputs (X), so each simulation
output (Y) will be different.

The aim of this analysis, based on the 40 simulations, is to rank the input
variables (X1-X4) by importance from 1 to 4, with 1 being the most influential
parameter and 4 the least. The variables are ranked on their effect on the
output variable (Y), which is energy consumption. Two variables will primarily
affect gas usage, and two will primarily affect electricity usage. The aim is
to produce a graph with the left Y axis showing rank importance 1-4, the X axis
showing the months January to December, and the right Y axis showing the input
variables, with a point at each month showing each variable's rank.

The spreadsheet columns are set up as below. Below each X heading (X1-X4) is
the input parameter value for that simulation. Each simulation's Y values also
change with the variable combination.

                 Variable combinations (X)   Y
Simulation No.   X1  X2  X3  X4              Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
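
A hedged base R sketch of the month-by-month ranking described above. It does
not use the sensitivity package; it computes standardized regression
coefficients directly as beta_j * sd(X_j) / sd(Y) from a linear fit. The 40
rows of inputs and the monthly outputs are simulated stand-ins for the
spreadsheet, and the names inputs, src_one, src_by_month and rank_by_month are
made up for illustration.

```r
set.seed(42)
n_sim <- 40

# Stand-in for the spreadsheet inputs, using the ranges given in the post
inputs <- data.frame(
  X1 = runif(n_sim, 1.5, 3.5),
  X2 = runif(n_sim, 7, 12),
  X3 = runif(n_sim, 0.5, 3),
  X4 = runif(n_sim, 10, 15)
)

# Stand-in for the 12 monthly outputs (hypothetical coefficients plus noise)
Y <- sapply(month.abb, function(mo) {
  with(inputs, 5 * X1 + 1 * X2 + 0.5 * X3 + 0.1 * X4) + rnorm(n_sim, sd = 0.1)
})

# Standardized regression coefficients for one output column
src_one <- function(y, inputs) {
  fit <- lm(y ~ ., data = data.frame(y = y, inputs))
  coef(fit)[-1] * sapply(inputs, sd) / sd(y)
}

src_by_month  <- apply(Y, 2, src_one, inputs = inputs)  # 4 inputs x 12 months
rank_by_month <- apply(-abs(src_by_month), 2, rank)     # 1 = most influential
rank_by_month
```

With real data, inputs and Y would come from the spreadsheet columns, and
matplot(t(rank_by_month), type = "b") is one way to draw rank against month.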


Thanks for any help in advance, much appreciated.

Jody



Re: [R] R [coding : do not run for every row ]

2016-04-18 Thread tan sj

Hi, I am sorry. The output should be values between 0 and 0.1 and is not
supposed to be 1.00, because these are type 1 error rates. Now I get an output
of 1.00 for several samples, which is not correct. The loop does not run for
every row, and I do not know where my mistake is. When I use the same concept
in the normal distribution setup, I get the expected result.

Sent from my phone

On Apr 18, 2016 2:55 PM, Thierry Onkelinx wrote:
Dear anonymous,

The big mistake in the output might be obvious to you but not to
others. Please make clear what the correct output should be or at
least what is wrong with the current output.

And please DO read the posting guide which asks you not to post in HTML.


2016-04-17 19:59 GMT+02:00 tan sj :
> I have combined all the variables in a matrix, and I wish to conduct a
> simulation row by row.
>
> But I found that the code only works for the first row of every cycle of
> nine samples.
>
> I have checked the code, but I don't know where my mistake is.
>
> Can anyone please help?
>
>
> #For gamma distribution with equal skewness 1.5
>
> #to evaluate the same R function on many different sets of data
> library(parallel)
>
> nSims<-100
> alpha<-0.05
>
> #set nrow = nSims because we want to store every simulated p-value
> #for gamma distribution with equal skewness
> matrix2_equal  <-matrix(0,nrow=nSims,ncol=3)
> matrix5_unequal<-matrix(0,nrow=nSims,ncol=3)
> matrix8_mann   <-matrix(0,nrow=nSims,ncol=3)
>
> # to ensure the reproducity of the result
> #here we declare the random seed generator
> set.seed(1)
>
> ## Put the samples sizes into matrix then use a loop for sample sizes
> sample_sizes<-matrix(c(10,10,10,25,25,25,25,50,25,100,50,25,50,100,100,25,100,100),
> nrow=2)
>
> #shape parameter for both gamma distribution for equal skewness
> #forty five cases for each skewness!!
> shp<-rep(16/9,each=5)
>
> #scale parameter for sample 1
> #scale paramter for sample 2 set as constant 1
> scp1<-c(1,1.5,2,2.5,3)
>
> #get all combinations with one row of the sample_sizes matrix
> ##(use expand.grid)to create a data frame from combination of data
>
> ss_sd1<- expand.grid(sample_sizes[2,],shp)
> scp1<-rep(scp1,9)
>
> std2<-rep(sd2,9)
>
> #create a matrix combining the forty five cases of combination of sample 
> sizes,shape and scale parameter
> all_combine1 <- cbind(rep(sample_sizes[1,], 5),ss_sd1,scp1)
>
> # name the column samples 1 and 2 and standard deviation
> colnames(all_combine1) <- c("m", "n","sp(skewness1.5)","scp1")
>
> ##for the samples sizes into matrix then use a loop for sample sizes
>  # this loop steps through the all_combine matrix
>   for(ss in 1:nrow(all_combine1))
>   {
> #generate samples from the first column and second column
>  m<-all_combine1[ss,1]
>  n<-all_combine1[ss,2]
>
>for (sim in 1:nSims)
>{
> #generate 2 random samples from gamma distribution with equal skewness
> gamma1<-rgamma(m,all_combine1[ss,3],all_combine1[ss,4])
> gamma2<-rgamma(n,all_combine1[ss,4],1)
>
> # minus the population mean to ensure that there is no lose of 
> equality of mean
> gamma1<-gamma1-all_combine1[ss,3]*all_combine1[ss,4]
> gamma2<-gamma2-all_combine1[ss,3]
>
> #extract p-value out and store every p-value into matrix
> matrix2_equal[sim,1]<-t.test(gamma1,gamma2,var.equal=TRUE)$p.value
> matrix5_unequal[sim,2]<-t.test(gamma1,gamma2,var.equal=FALSE)$p.value
> matrix8_mann[sim,3] <-wilcox.test(gamma1,gamma2)$p.value
> }
>##store the result
>   equal[ss]<- mean(matrix2_equal[,1]<=alpha)
>   unequal[ss]<-mean(matrix5_unequal[,2]<=alpha)
>   mann[ss]<- mean(matrix8_mann[,3]<=alpha)
>   }
>
> g_equal<-cbind(all_combine1, equal, unequal, mann)
>
> It is my result but it show a very big mistake TT
>  m   n sp(skewness1.5) scp1 equal unequal mann
> 1   10  101.78  1.0  0.360.34 0.34
> 2   10  251.78  1.5  0.840.87 0.90
> 3   25  251.78  2.0  1.001.00 1.00
> 4   25  501.78  2.5  1.001.00 1.00
> 5   25 1001.78  3.0  1.001.00 1.00
> 6   50  251.78  1.0  0.770.77 0.84
> 7   50 1001.78  1.5  1.001.00 1.00
> 8  100  251.78  2.0  1.001.00 1.00
> 9  100 100

[R] 'nlme' package not compiling

2016-04-18 Thread Angelo Varlotta

Hi,
I'm trying to install the 'nlme' package from source in
RStudio. When I try, I get the following error message:

ld: warning: directory not found for option '-L/usr/local/lib/gcc/x86_64-apple-darwin13.0.0/4.8.2'
ld: library not found for -lgfortran
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [nlme.so] Error 1
ERROR: compilation failed for package ‘nlme’
* removing ‘/Library/Frameworks/R.framework/Versions/3.2/Resources/library/nlme’
* restoring previous ‘/Library/Frameworks/R.framework/Versions/3.2/Resources/library/nlme’
Warning in install.packages :
   installation of package ‘nlme’ had non-zero exit status

I'm using gfortran 4.8 from Macports and running OS X 10.11.4 
with RStudio Version 0.99.893. I've tried to use the FLIBS 
command in R:

FLIBS="-L/opt/local/lib/gcc48/gcc/x86_64-apple-darwin15/4.8.5/"

so that it knows where the Fortran libraries are, and compiled
again, but it still searches for the directory:

/usr/local/lib/gcc/x86_64-apple-darwin13.0.0/4.8.2.

which of course doesn't exist. Any suggestions?

Cheers,
Angelo
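
FLIBS is a make variable that R reads when it compiles packages, not an R object, so assigning it at the R prompt would not be expected to change the linker flags. One commonly used place for such an override is a user-level ~/.R/Makevars file. The sketch below only illustrates writing such a line; the helper name add_flibs is hypothetical, the library list (-lgfortran alone) is an assumption that may need extending, and the example writes to a temporary file rather than the real Makevars.

```r
# Hypothetical helper: append an FLIBS line to a Makevars-style file.
# `lib_dir` is the directory that holds the Fortran runtime library.
# `makevars` defaults to a temporary file so this example does not
# touch the real ~/.R/Makevars.
add_flibs <- function(lib_dir, makevars = tempfile("Makevars")) {
  line <- sprintf("FLIBS=-L%s -lgfortran", lib_dir)
  cat(line, "\n", file = makevars, sep = "", append = TRUE)
  invisible(makevars)
}

# Directory taken from the message above (MacPorts gfortran 4.8).
path <- add_flibs("/opt/local/lib/gcc48/gcc/x86_64-apple-darwin15/4.8.5")
readLines(path)
```

To apply it for real, the same call could be pointed at file.path(Sys.getenv("HOME"), ".R", "Makevars") after creating the ~/.R directory, and the package reinstalled from a fresh R session.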



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Social Network Simulation

2016-04-18 Thread Suzen, Mehmet

Dear Professor Haenlein,
Have you solved this issue yet? I found this a really interesting problem.
I was wondering whether it is possible to wrap an "objective function"
around igraph's 'sample_pa' and 'sample_smallworld'. If you have an example
data set, I can have a look at this.
Viele Gruesse aus London
Mehmet
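
As a starting point for such an objective function, the degree-polarization measure from the original question (average degree of the top 10% highest-degree nodes divided by the overall average degree) can be sketched in base R. The function name degree_polarization and the argument top_share are made up for illustration; the degree vector would typically come from igraph::degree(), and the clustering measure from igraph::transitivity().

```r
# Degree polarization as described in the thread: mean degree of the
# top share of highest-degree nodes divided by the overall mean degree.
# `deg` is a numeric vector of node degrees.
degree_polarization <- function(deg, top_share = 0.10) {
  stopifnot(is.numeric(deg), length(deg) > 0, top_share > 0, top_share <= 1)
  n_top <- max(1L, ceiling(top_share * length(deg)))
  top <- sort(deg, decreasing = TRUE)[seq_len(n_top)]
  mean(top) / mean(deg)
}

deg_regular <- rep(4, 100)              # every node has the same degree
deg_hub <- c(rep(50, 10), rep(2, 90))   # ten hubs, ninety low-degree nodes

pol_regular <- degree_polarization(deg_regular)  # no polarization
pol_hub     <- degree_polarization(deg_hub)      # strongly polarized
```

A search over the parameters of a network generator could then score each simulated graph by how far this value and the transitivity are from their targets.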

On 16 April 2016 at 14:16, Michael Haenlein  wrote:
> Dear all,
>
> I am trying to simulate a series of networks that have characteristics
> similar to real life social networks. Specifically I am interested in
> networks that have (a) a reasonable degree of clustering (as measured by
> the transitivity function in igraph) and (b) a reasonable degree of degree
> polarization (as measured by the average degree of the top 10% nodes with
> highest degree divided by the overall average degree).
>
> Right now I am using two functions from igraph (sample_pa and
> sample_smallworld) but these are not ideal since they only allow me to vary
> one of the two characteristics. Either the network has good clustering but
> not enough polarization or the other way round.
>
> I looked around and I found some network algorithms that solve the problem
> (E.g., Jackson and Rogers, Meeting Strangers and Friends of Friends), but I
> did not find them implemented in an R package. I also found the R package
> NetSim which seems to be in this spirit, but I cannot get it to work.
>
> Could anyone point me to an R library that I could check out? I do not care
> much about the specific algorithm used as long as it allows me to vary
> clustering and degree polarization in certain ranges.
>
> Thanks,
>
> Michael
>
>
> Michael Haenlein
> Professor of Marketing
> ESCP Europe, Paris

Re: [R] R [coding : do not run for every row ]

2016-04-18 Thread Thierry Onkelinx

You can make this much more readable with apply functions.

result <- apply(
  all_combine1,
  1,
  function(x){
    p.value <- sapply(
      seq_len(nSims),
      function(sim){
        gamma1 <- rgamma(x["m"], x["sp(skewness1.5)"], x["scp1"])
        gamma2 <- rgamma(x["n"], x["scp1"], 1)
        gamma1 <- gamma1 - x["sp(skewness1.5)"] * x["scp1"]
        gamma2 <- gamma2 - x["sp(skewness1.5)"]
        c(
          equal = t.test(gamma1, gamma2, var.equal=TRUE)$p.value,
          unequal = t.test(gamma1,gamma2,var.equal=FALSE)$p.value,
          mann = wilcox.test(gamma1,gamma2)$p.value
        )
      }
    )
    rowMeans(p.value <= alpha)
  }
)
cbind(all_combine1, t(result))
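
The restructured code keeps the original parameterisation, in which the second sample takes "scp1" as its shape and the third positional argument of rgamma() is read as a rate. A minimal sketch of one setting with named arguments follows, assuming the intent was sample 1 ~ Gamma(shape, scale = scale1) and sample 2 ~ Gamma(shape, scale = 1), each centred on its population mean shape * scale. The helper name rejection_rates and its arguments are hypothetical.

```r
# Hypothetical helper: estimated rejection rates for one parameter setting.
# Assumes both samples share `shape`, sample 1 has scale `scale1`,
# sample 2 has scale 1, and each is centred on its mean shape * scale.
rejection_rates <- function(m, n, shape, scale1, n_sims = 100, alpha = 0.05) {
  p_values <- replicate(n_sims, {
    g1 <- rgamma(m, shape = shape, scale = scale1) - shape * scale1
    g2 <- rgamma(n, shape = shape, scale = 1) - shape
    c(
      equal   = t.test(g1, g2, var.equal = TRUE)$p.value,
      unequal = t.test(g1, g2, var.equal = FALSE)$p.value,
      mann    = wilcox.test(g1, g2)$p.value
    )
  })
  # p_values is a 3 x n_sims matrix; average the rejections per test.
  rowMeans(p_values <= alpha)
}

set.seed(1)
# Identical distributions (scale1 = 1), so all rates should sit near alpha.
rates <- rejection_rates(m = 25, n = 25, shape = 16 / 9, scale1 = 1, n_sims = 400)
rates
```

With scale1 different from 1 the centred distributions still have equal means but differ in spread and median, so rates above alpha may remain, particularly for the rank test.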
ir. Thierry Onkelinx



Re: [R] R [coding : do not run for every row ]

2016-04-18 Thread Thierry Onkelinx

Dear anonymous,

The big mistake in the output might be obvious to you but not to
others. Please make clear what the correct output should be or at
least what is wrong with the current output.

And please DO read the posting guide which asks you not to post in HTML.
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey


2016-04-17 19:59 GMT+02:00 tan sj :
> I have combined all the variables in a matrix, and I wish to conduct a
> simulation row by row.
>
> But I found that the code only works for the first row of every cycle of
> nine samples.
>
> After checking the code, I still do not know where my mistake is.
>
> Can anyone please help?
>
>
> # For gamma distribution with equal skewness 1.5
>
> # to evaluate the same R function on many different sets of data
> library(parallel)
>
> nSims<-100
> alpha<-0.05
>
> # set nrow = nSims because we want to store every simulated p-value
> # for gamma distribution with equal skewness
> matrix2_equal  <-matrix(0,nrow=nSims,ncol=3)
> matrix5_unequal<-matrix(0,nrow=nSims,ncol=3)
> matrix8_mann   <-matrix(0,nrow=nSims,ncol=3)
>
> # to ensure the reproducibility of the result,
> # here we set the random seed
> set.seed(1)
>
> ## Put the sample sizes into a matrix, then use a loop over the sample sizes
> sample_sizes<-matrix(c(10,10,10,25,25,25,25,50,25,100,50,25,50,100,100,25,100,100),
>                      nrow=2)
>
> # shape parameter for both gamma distributions for equal skewness
> # forty-five cases for each skewness
> shp<-rep(16/9,each=5)
>
> # scale parameter for sample 1
> # scale parameter for sample 2 is set as constant 1
> scp1<-c(1,1.5,2,2.5,3)
>
> # get all combinations with one row of the sample_sizes matrix
> ## (use expand.grid) to create a data frame from combinations of data
>
> ss_sd1<- expand.grid(sample_sizes[2,],shp)
> scp1<-rep(scp1,9)
>
> std2<-rep(sd2,9)
>
> # create a matrix combining the forty-five combinations of sample sizes,
> # shape and scale parameter
> all_combine1 <- cbind(rep(sample_sizes[1,], 5),ss_sd1,scp1)
>
> # name the columns: sample sizes m and n, shape, and scale of sample 1
> colnames(all_combine1) <- c("m", "n","sp(skewness1.5)","scp1")
>
> ## this loop steps through the rows of the all_combine1 matrix
> for(ss in 1:nrow(all_combine1))
> {
>   # take the sample sizes from the first and second columns
>   m<-all_combine1[ss,1]
>   n<-all_combine1[ss,2]
>
>   for (sim in 1:nSims)
>   {
>     # generate 2 random samples from gamma distributions with equal skewness
>     gamma1<-rgamma(m,all_combine1[ss,3],all_combine1[ss,4])
>     gamma2<-rgamma(n,all_combine1[ss,4],1)
>
>     # subtract the population mean so that the equality of means is not lost
>     gamma1<-gamma1-all_combine1[ss,3]*all_combine1[ss,4]
>     gamma2<-gamma2-all_combine1[ss,3]
>
>     # extract the p-values and store every p-value in a matrix
>     matrix2_equal[sim,1]<-t.test(gamma1,gamma2,var.equal=TRUE)$p.value
>     matrix5_unequal[sim,2]<-t.test(gamma1,gamma2,var.equal=FALSE)$p.value
>     matrix8_mann[sim,3] <-wilcox.test(gamma1,gamma2)$p.value
>   }
>   ## store the result
>   equal[ss]<- mean(matrix2_equal[,1]<=alpha)
>   unequal[ss]<-mean(matrix5_unequal[,2]<=alpha)
>   mann[ss]<- mean(matrix8_mann[,3]<=alpha)
> }
>
> g_equal<-cbind(all_combine1, equal, unequal, mann)
>
> This is my result, but it shows a very big mistake:
>      m   n sp(skewness1.5) scp1 equal unequal mann
> 1   10  10            1.78  1.0  0.36    0.34 0.34
> 2   10  25            1.78  1.5  0.84    0.87 0.90
> 3   25  25            1.78  2.0  1.00    1.00 1.00
> 4   25  50            1.78  2.5  1.00    1.00 1.00
> 5   25 100            1.78  3.0  1.00    1.00 1.00
> 6   50  25            1.78  1.0  0.77    0.77 0.84
> 7   50 100            1.78  1.5  1.00    1.00 1.00
> 8  100  25            1.78  2.0  1.00    1.00 1.00
> 9  100 100            1.78  2.5  1.00    1.00 1.00
> 10  10  10            1.78  3.0  1.00    1.00 1.00
> 11  10  25            1.78  1.0  0.48    0.30 0.55
> 12  25  25            1.78  1.5  0.99    0.99 1.00
> 13  25  50            1.78  2.0  1.00    1.00 1.00
> 14  25 100            1.78  2.5  1.00    1.00 1.00
> 15  50  25            1.78  3.0  1.00    1.00 1.00
> 16  50 100            1.78  1.0  0.97    0.97 1.00
> 17 100  25            1.78  1.5  1.00
