Re: [R] Problem with data distribution

2022-02-17 Thread Ebert,Timothy Aaron
Maybe what you want is to recode your data into two data sets.
One data set records bug versus no bug for each class: what is the probability 
of a class having one or more bugs?
The other data set contains only the classes with bugs: given that a class has 
bugs, how many will it have?
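
A minimal sketch of that recoding in base R. The vector is rebuilt from the table(bug) counts posted elsewhere in this thread, so the original ordering is lost; the names has_bug, p_bug and bug_pos are only illustrative.

```r
# Counts taken from the table(bug) output posted in this thread;
# the order of the original vector is not preserved.
bug <- rep(c(0, 1, 2, 3, 4, 5, 7), times = c(162, 40, 9, 7, 2, 1, 1))

# Data set 1: presence/absence of bugs for each class.
has_bug <- bug > 0
p_bug <- mean(has_bug)   # observed proportion of classes with one or more bugs

# Data set 2: bug counts for the classes that have bugs.
bug_pos <- bug[has_bug]

table(has_bug)
p_bug
summary(bug_pos)
```

The first part answers "does a class have bugs at all?", the second "how many bugs, given that it has some?".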

Tim


Re: [R] Problem with data distribution

2022-02-17 Thread Neha gupta
:) :)

On Thu, Feb 17, 2022 at 10:37 PM Bert Gunter  wrote:

> imo, with such simple data, a plot is mere chartjunk. A simple table(=
> the distribution) would suffice and be more informative:
>
> > table(bug) ## bug is a vector. No data frame is needed
>
>   0   1   2   3   4   5   7   ## bug count
> 162  40   9   7   2   1   1   ## number of cases with the given count
>
> You or others may disagree, of course.
>
> Bert Gunter
>
>
>

Re: [R] Problem with data distribution

2022-02-17 Thread Neha gupta
Ebert and Rui, thank you for providing the tips (in fact, for providing the
answer I needed).

Yes, you are right that a boxplot of all-zero values will not make sense.
Maybe a histogram will work.
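
For a count variable like this, one possible histogram-style display is a bar plot of the frequency table. This is only a sketch: the vector is rebuilt from the table(bug) counts posted in this thread, and the names counts and groups are illustrative.

```r
# Vector rebuilt from the table(bug) counts posted in this thread.
bug <- rep(c(0, 1, 2, 3, 4, 5, 7), times = c(162, 40, 9, 7, 2, 1, 1))

# Frequency of each bug count, drawn as one bar per distinct value.
counts <- table(bug)
barplot(counts,
        xlab = "Bugs per class",
        ylab = "Number of classes",
        main = "Distribution of bug counts")

# The two-group summary: classes without bugs versus classes with bugs.
groups <- table(ifelse(bug == 0, "bug = 0", "bug > 0"))
groups
```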

I am providing a few details of my data here and the context of the
question I asked.

My data is about bugs/defects in different classes of a large software
system. I have to predict which classes will contain bugs and which will be
free of bugs (bug = 0). I trained ML models and made predictions, but my
advisor asked me to first provide the data distribution of bugs, e.g. how
many classes have bugs (bug > 0) and how many are free of bugs (bug = 0).

That is why I need to provide the data distribution of both types of values
(i.e. bug = 0 and bug > 0).

Thank you again.


Re: [R] Problem with data distribution

2022-02-17 Thread Neha gupta
Dear John, thanks a lot for the detailed answer.

Yes, I am not an expert in the R language, and when a problem comes up, I
google it or post it on these forums. (I have just a little experience with
ML in R.)



On Thu, Feb 17, 2022 at 8:21 PM John Fox  wrote:

> Dear Neha gupta,
>
> On 2022-02-17 1:54 p.m., Neha gupta wrote:
> > Hello everyone
> >
> > I have a dataset with output variable "bug" having the following values
> (at
> > the bottom of this email). My advisor asked me to provide data
> distribution
> > of bugs with 0 values and bugs with more than 0 values.
> >
> > data = readARFF("synapse.arff")
> > data2 = readARFF("synapse.arff")
> > data$bug
> > library(tidyverse)
> > data %>%
> >filter(bug == 0)
> > data2 %>%
> >filter(bug >= 1)
> > boxplot(data2$bug, data$bug, range=0)
> >
> > But both the graphs are exactly the same, how is it possible? Where I am
> > doing wrong?
>
> As it turns out, you're doing several things wrong.
>
> First, you're not using pipes and filter() correctly. That is, you don't
> do anything with the filtered versions of the data sets. You're
> apparently under the incorrect impression that filtering modifies the
> original data set.
>
> Second, you're greatly complicating a simple problem. You don't need to
> read the data twice and keep two versions of the data set. As well,
> processing the data with pipes and filter() is entirely unnecessary. The
> following code works:
>
> with(data, boxplot(bug[bug == 0], bug[bug >= 1], range=0))
>
> Third, and most fundamentally, the parallel boxplots you're apparently
> trying to construct don't really make sense. The first "boxplot" is just
> a horizontal line at 0 and so conveys no information. Why not just plot
> the nonzero values if that's what you're interested in?
>
> Fourth, you didn't share your data in a convenient form. I was able to
> reconstruct them via
>
>bug <- scan()
>0 1 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0 0 0 1 0 0 0 0 0
>0 4 1 0
>0 1 0 0 0 0 0 0 1 0 3 2 0 0 0 0 3 0 0 0 0 2 0 0 0 1 0 0 0 0 1 1 1 0 0
>0 0 0 0
>1 1 2 1 0 1 0 0 0 2 2 1 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 5 0 0 0 0 0 0
>7 0 0 1
>0 1 1 0 2 0 3 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 3 2 1 1 0 0 0 0 0 0
>0 1 0 0
>0 0 0 0 0 0 0 0 0 1 0 1 0 0 3 0 0 1 0 1 3 0 0 0 0 0 0 0 0 1 0 4 1 1 0
>0 0 0 1
>0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 1 0 0 0 0 0
>
>data <- data.frame(bug)
>
> Finally, it's better not to post to the list in plain-text email, rather
> than html (as the posting guide suggests).
>
> I hope this helps,
>   John
>
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with data distribution

2022-02-17 Thread John Fox

Dear Neha gupta,

In the last point, I meant to say, "Finally, it's better to post to the 
list in plain-text email, rather than html (as the posting guide 
suggests)." (I accidentally inserted a "not" in this sentence.)


Sorry,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/



Re: [R] Problem with data distribution

2022-02-17 Thread John Fox

Dear Neha gupta,

I hope that I'm not overstepping my role when I say that googling 
solutions to specific problems isn't an efficient way to learn a 
programming language, and will probably waste your time in the long run. 
There are many good introductions to R.


Best,
 John


--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/



Re: [R] Problem with data distribution

2022-02-17 Thread Rui Barradas

Hello,

In your original post you read the same file "synapse.arff" twice, 
apparently to filter each copy by its own criterion. You don't need 
to do that; read the file once and filter that one data set by different criteria.
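
A small sketch of "one data set, several criteria" in base R. Here the data frame is a stand-in rebuilt from the table(bug) counts posted in this thread; with the real file, data would come from a single read of "synapse.arff". The name groups is illustrative.

```r
# Stand-in for the single data set read from the file.
data <- data.frame(
  bug = rep(c(0, 1, 2, 3, 4, 5, 7), times = c(162, 40, 9, 7, 2, 1, 1))
)

# Split the one column into two groups according to the criterion.
groups <- split(data$bug, ifelse(data$bug > 0, "bug > 0", "bug = 0"))

lengths(groups)          # number of classes in each group
sapply(groups, summary)  # five-number summary and mean per group
```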


As for the data as posted, I have read it in with the following code:


x <- "
0 1 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0 0 0 1 0 0 0 0 0 0 
4 1 0
0 1 0 0 0 0 0 0 1 0 3 2 0 0 0 0 3 0 0 0 0 2 0 0 0 1 0 0 0 0 1 1 1 0 0 0 
0 0 0
1 1 2 1 0 1 0 0 0 2 2 1 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 5 0 0 0 0 0 0 7 
0 0 1
0 1 1 0 2 0 3 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 3 2 1 1 0 0 0 0 0 0 0 
1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 3 0 0 1 0 1 3 0 0 0 0 0 0 0 0 1 0 4 1 1 0 0 
0 0 1

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 1 0 0 0 0 0
"
bug <- scan(text = x)
data <- data.frame(bug)


This is not the right way to post data; the posting guide asks you to post 
the output of



dput(data)
structure(list(bug = c(0, 1, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 4, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 3, 2, 0, 0, 0, 0,
3, 0, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0,
0, 0, 1, 1, 2, 1, 0, 1, 0, 0, 0, 2, 2, 1, 1, 0, 0, 0, 0, 0, 0,
1, 0, 0, 1, 0, 0, 1, 0, 0, 5, 0, 0, 0, 0, 0, 0, 7, 0, 0, 1, 0,
1, 1, 0, 2, 0, 3, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,
0, 1, 0, 3, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 3, 0, 0, 1, 0, 1, 3, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 4, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 1, 0, 0, 0, 0, 0)),
class = "data.frame", row.names = c(NA, -222L))



This can be copied into an R session and the data set recreated with

data <- structure(etc)
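
As a small illustration of that round trip, the sketch below uses a hypothetical four-row data frame named small (not the poster's data): it captures the text that dput() prints and evaluates it again to rebuild the object.

```r
# Hypothetical toy data frame, used only for illustration.
small <- data.frame(bug = c(0, 1, 0, 2))

# dput() prints R code that recreates the object; capture that text ...
txt <- capture.output(dput(small))
cat(txt, sep = "\n")

# ... and evaluate it, as a reader would after pasting it into a session.
restored <- eval(parse(text = txt))
identical(small, restored)
```

This is why dput() output is preferred on the list: the recipient gets the same object, with the same column types, without retyping anything.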


Now the boxplots.

(Why would you want to plot a vector of all zeros, btw?)



library(dplyr)

boxplot(filter(data, bug == 0))# nonsense
boxplot(filter(data, bug > 0), range = 0)

# Another way
data %>%
  filter(bug > 0) %>%
  boxplot(range = 0)


Hope this helps,

Rui Barradas




Re: [R] Problem with data distribution

2022-02-17 Thread Neha gupta
That is all the code I have. How can I provide reproducible code?

How can I save this result?

On Thu, Feb 17, 2022 at 8:00 PM Ebert,Timothy Aaron  wrote:

> You pipe the filter but do not save the result. A reproducible example
> might help.
> Tim
>


Re: [R] Problem with data distribution

2022-02-17 Thread Ebert,Timothy Aaron
You pipe the filter but do not save the result. A reproducible example might 
help.
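
A minimal sketch of what saving the result looks like. Base R's subset() is used here so that the example needs no packages; the data frame is a stand-in rebuilt from the table(bug) counts posted in this thread, and the names zero and nonzero are illustrative.

```r
# Stand-in for the data read from the file.
data <- data.frame(
  bug = rep(c(0, 1, 2, 3, 4, 5, 7), times = c(162, 40, 9, 7, 2, 1, 1))
)

# Filtering returns a new data frame; `data` itself is left unchanged,
# so the result has to be assigned to a name to be used later.
zero    <- subset(data, bug == 0)
nonzero <- subset(data, bug >= 1)
# With dplyr loaded, the equivalent would be:
#   nonzero <- data %>% filter(bug >= 1)

nrow(data)      # still all rows
nrow(zero)
nrow(nonzero)

# Plot the saved subset rather than the unfiltered column.
boxplot(nonzero$bug, range = 0)
```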
Tim

-Original Message-
From: R-help  On Behalf Of Neha gupta
Sent: Thursday, February 17, 2022 1:55 PM
To: r-help mailing list 
Subject: [R] Problem with data distribution


Hello everyone

I have a dataset with output variable "bug" having the following values (at the 
bottom of this email). My advisor asked me to provide data distribution of bugs 
with 0 values and bugs with more than 0 values.

data = readARFF("synapse.arff")
data2 = readARFF("synapse.arff")
data$bug
library(tidyverse)
data %>%
  filter(bug == 0)
data2 %>%
  filter(bug >= 1)
boxplot(data2$bug, data$bug, range=0)

But both graphs are exactly the same. How is that possible? What am I doing 
wrong?


data$bug
  [1] 0 1 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0 0 0 1 0 0 0 0 0
0 4 1 0
 [40] 0 1 0 0 0 0 0 0 1 0 3 2 0 0 0 0 3 0 0 0 0 2 0 0 0 1 0 0 0 0 1 1 1 0 0
0 0 0 0
 [79] 1 1 2 1 0 1 0 0 0 2 2 1 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 5 0 0 0 0 0 0
7 0 0 1
[118] 0 1 1 0 2 0 3 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 3 2 1 1 0 0 0 0 0 0
0 1 0 0
[157] 0 0 0 0 0 0 0 0 0 1 0 1 0 0 3 0 0 1 0 1 3 0 0 0 0 0 0 0 0 1 0 4 1 1 0
0 0 0 1
[196] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 1 0 0 0 0 0
