Re: [R] Plotting graph for Missing values

2009-01-26 Thread Petr PIKAL
Hi Jim

r-help-boun...@r-project.org wrote on 26.01.2009 15:44:32:

> From your original posting:
> 
> > I tried the code which u provided.
> > In place of "dos" in command "pat1 <- rbinom(length(dos), 1, .5)  # generate
> > some data"
> > I added "patientinformation1" variable and then I gave the command for
> > "tapply" but its giving me the following error:
> >
> > Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
> >   arguments must have same length
> 
> I would say that "pat1" and "dos" were not of the same length.  Check
> your code and objects to verify this; that is what the error message
> is saying.  You said you added the "patientinformation1" variable, but
> it does not seem to appear in the error message.

You are really patient. I presume Shreyasee does not know much about data 
structures and function use in R. It would probably help a lot if s/he 
looked into some basic documents such as the R intro.

If I understand correctly, what was done is

pat1 <- rbinom(length(patientinformation1), 1, .5)

which does not make much sense, as it just generates artificial data again. 
Most probably there is still a "dos" object in memory, constructed while 
testing your code, which has length 335. This would result in the 
mentioned error

> > Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
> >   arguments must have same length

Then note

> > ds <- read.csv(file="D:/Shreyasee laptop data/ASC Dataset/Subset of the ASC
> > Dataset.csv", header=TRUE)
> >> attach(ds)
> >> str(dos)


If str(ds) were issued, it would reveal what kind of data s/he has. 
Also, format(dos, ...) would not work, as dos is a factor, not a Date:

> >> str(dos)
> >
> > I am getting the following message:
> >
> >  Factor w/ 12 levels "-00-00","6-Aug",..: 6 6 6 6 6 6 6 6 6 6 ...
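
A minimal sketch of one way to turn such month labels into Dates. It assumes the labels look like "May-06" (the form used in the original loop), that all years are in the 2000s, and that the month abbreviations are English; the helper name to_month_start is made up for this illustration. Labels that do not parse, such as "-00-00", become NA and would need to be looked at by hand.

```r
# Hypothetical helper: convert labels such as "May-06" to the first day of
# that month. Assumes two-digit years in the 2000s and English month
# abbreviations; anything that does not fit becomes NA.
to_month_start <- function(x) {
  parts <- strsplit(as.character(x), "-", fixed = TRUE)
  mon <- match(vapply(parts, `[`, "", 1), month.abb)
  yr <- 2000L + suppressWarnings(as.integer(vapply(parts, `[`, "", 2)))
  as.Date(sprintf("%04d-%02d-01", yr, mon), format = "%Y-%m-%d")
}

# Made-up factor standing in for the real dos column
dos <- factor(c("May-06", "May-06", "Jun-06", "-00-00", "Mar-07"))
dos_date <- to_month_start(dos)

# Year-month keys of the kind used for grouping elsewhere in this thread
format(dos_date, "%Y%m")
```

Matching against month.abb instead of using "%b" in as.Date() keeps the conversion independent of the locale R is running in.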

If dos were a Date, then

> aggregate(ds[,-1], list(format(ds$dos, "%Y%m")), function(x) sum(x==0))
   Group.1 pat1 pat2
1   200605   12   16
2   200606   20   18
3   200607   12   13
4   200608   18   15
5   200609   18   11
6   200610   17   15
7   200611   19   17
8   200612   14   15
9   200701   14   18
10  200702   13   13
11  200703   16   19

could do the trick if the patientinformation variables had the structure 
you anticipated (0 marking a missing value), which is not the case; the 
original loop compares against "":

> >> >> >> >> > *for(i in 1:length(dos))
> >> >> >> >> > for(j in 1:length(patientinformation1)
> >> >> >> >> > if(dos[i]=="May-06" && patientinformation1[j]=="")
> >> >> >> >> > a <- j+1

Well, if Shreyasee manages to convert dos to class Date (which will not be 
straightforward if "dos" has an awkward structure), then something like

aggregate(ds[,-1], list(format(ds$dos, "%Y%m")), function(x) sum(x==""))

could do the trick.
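
Since the question asked for percentages rather than counts, here is a small sketch of the same aggregate() call returning the percentage of missing entries per month. The data frame is made up and only stands in for the real file; it assumes dos has already been converted to Date and that a missing entry is either an empty string or NA.

```r
# Made-up data standing in for the real file: dos is already a Date and a
# missing entry is assumed to be "" (or NA).
ds <- data.frame(
  dos = as.Date(c("2006-05-03", "2006-05-17", "2006-06-02", "2006-06-20")),
  patientinformation1 = c("", "a", "b", ""),
  patientinformation2 = c("", "", "x", "y"),
  stringsAsFactors = FALSE
)

# Percentage of missing entries per month, for every column except dos.
# mean() of a logical vector is the proportion of TRUE values.
pct_missing <- aggregate(ds[, -1],
                         list(month = format(ds$dos, "%Y%m")),
                         function(x) 100 * mean(is.na(x) | x == ""))
pct_missing
```

The result has one row per month and one column per variable, which is a convenient shape for plotting.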

Regards
Petr

> 
> On Sun, Jan 25, 2009 at 11:48 PM, Shreyasee wrote:
> > Hi Jim,
> >
> > I run the following code
> >
> > ds <- read.csv(file="D:/Shreyasee laptop data/ASC Dataset/Subset of the ASC
> > Dataset.csv", header=TRUE)
> >> attach(ds)
> >> str(dos)
> >
> > I am getting the following message:
> >
> >  Factor w/ 12 levels "-00-00","6-Aug",..: 6 6 6 6 6 6 6 6 6 6 ...
> >
> > Thanks,
> > Shreyasee
> >
> >
> >
> > On Mon, Jan 26, 2009 at 12:20 PM, jim holtman wrote:
> >>
> >> do:
> >>
> >> str(dos)
> >> str(patientinformation1)
> >>
> >> They must be the same length for the command to work: must be a one to
> >> one match of the data.
> >>
> >> On Sun, Jan 25, 2009 at 10:23 PM, Shreyasee wrote:
> >> > Hi Jim,
> >> >
> >> > I tried the code which u provided.
> >> > In place of "dos" in command "pat1 <- rbinom(length(dos), 1, .5)  # generate
> >> > some data"
> >> > I added "patientinformation1" variable and then I gave the command for
> >> > "tapply" but its giving me the following error:
> >> >
> >> > Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
> >> >   arguments must have same length
> >> >
> >> >
> >> > Thanks,
> >> > Shreyasee
> >> >
> >> >
> >> >
> >> > On Mon, Jan 26, 2009 at 10:50 AM, jim holtman wrote:
> >> >>
> >> >> You can save the output of the tapply and then replicate it for each
> >> >> of the variables.  The data can be used to plot the graphs.
> >> >>
> >> >> On Sun, Jan 25, 2009 at 9:38 PM, Shreyasee wrote:
> >> >> > Hi Jim,
> >> >> >
> >> >> > I need to calculate the missing values in variable "patientinformation1"
> >> >> > for the period of May 2006 to March 2007 and then plot the graph of the
> >> >> > percentage of the missing values over these months.
> >> >> > This has to be done for each variable.
> >> >> > The code which you have provided, calculates the missing values for the
> >> >> > months variable, am I right?
> >> >> > I need to calculate for all the variables for each month.
> >> >> >
> >> >> > Thanks,
> >> >> > Shreyasee
> >> >> >
> >> >> >
> >> >> > On Mon, Jan 26, 2009 at 10:29 AM, jim holtman wrote:
> >> >> >>
> >> >> >> Here is an example of how you might approach it:
> >> >> >>
> >> >> >> > dos <- seq(

Re: [R] Plotting graph for Missing values

2009-01-26 Thread bartjoosen

> I added "patientinformation1" variable and then I gave the command for
> "tapply" but its giving me the following error:
>
> Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
>   arguments must have same length



It seems you added patientinformation1 but still use pat1 in the tapply
call.

Bart
-- 
View this message in context: 
http://www.nabble.com/Plotting-graph-for-Missing-values-tp21659322p21666790.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting graph for Missing values

2009-01-26 Thread jim holtman
From your original posting:

> I tried the code which u provided.
> In place of "dos" in command "pat1 <- rbinom(length(dos), 1, .5)  # generate
> some data"
> I added "patientinformation1" variable and then I gave the command for
> "tapply" but its giving me the following error:
>
> Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
>   arguments must have same length

I would say that "pat1" and "dos" were not of the same length.  Check
your code and objects to verify this; that is what the error message
is saying.  You said you added the "patientinformation1" variable, but
it does not seem to appear in the error message.
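
As an illustration of what the message means, the following sketch reproduces the error with made-up data. The length of 100 for pat1 is an arbitrary assumption, chosen only so that it differs from the 335 days in dos.

```r
dos <- seq(as.Date("2006-05-01"), as.Date("2007-03-31"), by = "1 day")
length(dos)  # 335 days

# A data vector of a different length (100 is an arbitrary choice here)
pat1 <- rbinom(100, 1, 0.5)

# tapply() refuses to group when data and index differ in length;
# tryCatch() captures the error message instead of stopping the script
msg <- tryCatch(
  tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)),
  error = function(e) conditionMessage(e)
)
msg

# Comparing the lengths before calling tapply() makes the problem obvious
length(pat1) == length(dos)
```

Taking both vectors from the same data frame, for example ds$dos and ds$patientinformation1, guarantees equal lengths and avoids picking up a leftover object from the workspace.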

On Sun, Jan 25, 2009 at 11:48 PM, Shreyasee  wrote:
> Hi Jim,
>
> I run the following code
>
> ds <- read.csv(file="D:/Shreyasee laptop data/ASC Dataset/Subset of the ASC
> Dataset.csv", header=TRUE)
>> attach(ds)
>> str(dos)
>
> I am getting the following message:
>
>  Factor w/ 12 levels "-00-00","6-Aug",..: 6 6 6 6 6 6 6 6 6 6 ...
>
> Thanks,
> Shreyasee
>
>
>
> On Mon, Jan 26, 2009 at 12:20 PM, jim holtman  wrote:
>>
>> do:
>>
>> str(dos)
>> str(patientinformation1)
>>
>> They must be the same length for the command to work: must be a one to
>> one match of the data.
>>
>> On Sun, Jan 25, 2009 at 10:23 PM, Shreyasee 
>> wrote:
>> > Hi Jim,
>> >
>> > I tried the code which u provided.
>> > In place of "dos" in command "pat1 <- rbinom(length(dos), 1, .5)  #
>> > generate
>> > some data"
>> > I added "patientinformation1" variable and then I gave the command for
>> > "tapply" but its giving me the following error:
>> >
>> > Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
>> >   arguments must have same length
>> >
>> >
>> > Thanks,
>> > Shreyasee
>> >
>> >
>> >
>> > On Mon, Jan 26, 2009 at 10:50 AM, jim holtman 
>> > wrote:
>> >>
>> >> YOu can save the output of the tapply and then replicate it for each
>> >> of the variables.  The data can be used to plot the graphs.
>> >>
>> >> On Sun, Jan 25, 2009 at 9:38 PM, Shreyasee
>> >> 
>> >> wrote:
>> >> > Hi Jim,
>> >> >
>> >> > I need to calculate the missing values in variable
>> >> > "patientinformation1"
>> >> > for
>> >> > the period of May 2006 to March 2007 and then plot the graph of the
>> >> > percentage of the missing values over these months.
>> >> > This has to be done for each variable.
>> >> > The code which you have provided, calculates the missing values for
>> >> > the
>> >> > months variable, am I right?
>> >> > I need to calculate for all the variables for each month.
>> >> >
>> >> > Thanks,
>> >> > Shreyasee
>> >> >
>> >> >
>> >> > On Mon, Jan 26, 2009 at 10:29 AM, jim holtman 
>> >> > wrote:
>> >> >>
>> >> >> Here is an example of how you might approach it:
>> >> >>
>> >> >> > dos <- seq(as.Date('2006-05-01'), as.Date('2007-03-31'), by='1 day')
>> >> >> > pat1 <- rbinom(length(dos), 1, .5)  # generate some data
>> >> >> > # partition by month and then list out the number of zero values (missing)
>> >> >> > tapply(pat1, format(dos, "%Y%m"), function(x) sum(x==0))
>> >> >> 200605 200606 200607 200608 200609 200610 200611 200612 200701 200702 200703
>> >> >>     21     22     16     18     16     15     16     17     14     16     13
>> >> >> >
>> >> >>
>> >> >>
>> >> >> On Sun, Jan 25, 2009 at 8:51 PM, Shreyasee
>> >> >> 
>> >> >> wrote:
>> >> >> > Hi Jim,
>> >> >> >
>> >> >> > The dataset has 4 variables (dos, patientinformation1,
>> >> >> > patientinformation2,
>> >> >> > patientinformation3).
>> >> >> > In dos variable ther are months (May 2006 to March 2007) when the
>> >> >> > surgeries
>> >> >> > were formed.
>> >> >> > I need to calculate the percentage of missing values for each
>> >> >> > variable
>> >> >> > (patientinformation1, patientinformation2, patientinformation3)
>> >> >> > for
>> >> >> > each
>> >> >> > month.
>> >> >> > I need a common script to calculate that for each variable.
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Shreyasee
>> >> >> >
>> >> >> >
>> >> >> > On Mon, Jan 26, 2009 at 9:46 AM, jim holtman 
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> What does you data look like?  You could use 'split' and then
>> >> >> >> examine
>> >> >> >> the data in each range to count the number missing.  Would have
>> >> >> >> to
>> >> >> >> have some actual data to suggest a solution.
>> >> >> >>
>> >> >> >> On Sun, Jan 25, 2009 at 8:30 PM, Shreyasee
>> >> >> >> 
>> >> >> >> wrote:
>> >> >> >> > Hi,
>> >> >> >> >
>> >> >> >> > I have imported one dataset in R.
>> >> >> >> > I want to calculate the percentage of missing values for each
>> >> >> >> > month
>> >> >> >> > (May
>> >> >> >> > 2006 to March 2007) for each variable.
>> >> >> >> > Just to begin with I tried the following code :
>> >> >> >> >
>> >> >> >> > *for(i in 1:length(dos))
>> >> >> >> > for(j in 1:length(patientinformation1)
>> >> >> >> > if(dos[i]=="May-06" && patientinformation1[j]=="")
>> >> >> >> > a <- j+1
>> >> >> >> > a*
>> >> >> >> >
>> >> >> >> > The ab

Re: [R] Plotting graph for Missing values

2009-01-25 Thread Shreyasee
Hi Jim,

I ran the following code:

ds <- read.csv(file="D:/Shreyasee laptop data/ASC Dataset/Subset of the ASC
Dataset.csv", header=TRUE)
> attach(ds)
> str(dos)

I am getting the following message:

 Factor w/ 12 levels "-00-00","6-Aug",..: 6 6 6 6 6 6 6 6 6 6 ...

Thanks,
Shreyasee



On Mon, Jan 26, 2009 at 12:20 PM, jim holtman  wrote:

> do:
>
> str(dos)
> str(patientinformation1)
>
> They must be the same length for the command to work: must be a one to
> one match of the data.
>
> On Sun, Jan 25, 2009 at 10:23 PM, Shreyasee 
> wrote:
> > Hi Jim,
> >
> > I tried the code which u provided.
> > In place of "dos" in command "pat1 <- rbinom(length(dos), 1, .5)  #
> generate
> > some data"
> > I added "patientinformation1" variable and then I gave the command for
> > "tapply" but its giving me the following error:
> >
> > Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
> >   arguments must have same length
> >
> >
> > Thanks,
> > Shreyasee
> >
> >
> >
> > On Mon, Jan 26, 2009 at 10:50 AM, jim holtman 
> wrote:
> >>
> >> YOu can save the output of the tapply and then replicate it for each
> >> of the variables.  The data can be used to plot the graphs.
> >>
> >> On Sun, Jan 25, 2009 at 9:38 PM, Shreyasee  >
> >> wrote:
> >> > Hi Jim,
> >> >
> >> > I need to calculate the missing values in variable
> "patientinformation1"
> >> > for
> >> > the period of May 2006 to March 2007 and then plot the graph of the
> >> > percentage of the missing values over these months.
> >> > This has to be done for each variable.
> >> > The code which you have provided, calculates the missing values for
> the
> >> > months variable, am I right?
> >> > I need to calculate for all the variables for each month.
> >> >
> >> > Thanks,
> >> > Shreyasee
> >> >
> >> >
> >> > On Mon, Jan 26, 2009 at 10:29 AM, jim holtman 
> >> > wrote:
> >> >>
> >> >> Here is an example of how you might approach it:
> >> >>
> >> >> > dos <- seq(as.Date('2006-05-01'), as.Date('2007-03-31'), by='1 day')
> >> >> > pat1 <- rbinom(length(dos), 1, .5)  # generate some data
> >> >> > # partition by month and then list out the number of zero values (missing)
> >> >> > tapply(pat1, format(dos, "%Y%m"), function(x) sum(x==0))
> >> >> 200605 200606 200607 200608 200609 200610 200611 200612 200701 200702 200703
> >> >>     21     22     16     18     16     15     16     17     14     16     13
> >> >> >
> >> >>
> >> >>
> >> >> On Sun, Jan 25, 2009 at 8:51 PM, Shreyasee
> >> >> 
> >> >> wrote:
> >> >> > Hi Jim,
> >> >> >
> >> >> > The dataset has 4 variables (dos, patientinformation1,
> >> >> > patientinformation2,
> >> >> > patientinformation3).
> >> >> > In dos variable ther are months (May 2006 to March 2007) when the
> >> >> > surgeries
> >> >> > were formed.
> >> >> > I need to calculate the percentage of missing values for each
> >> >> > variable
> >> >> > (patientinformation1, patientinformation2, patientinformation3) for
> >> >> > each
> >> >> > month.
> >> >> > I need a common script to calculate that for each variable.
> >> >> >
> >> >> > Thanks,
> >> >> > Shreyasee
> >> >> >
> >> >> >
> >> >> > On Mon, Jan 26, 2009 at 9:46 AM, jim holtman 
> >> >> > wrote:
> >> >> >>
> >> >> >> What does you data look like?  You could use 'split' and then
> >> >> >> examine
> >> >> >> the data in each range to count the number missing.  Would have to
> >> >> >> have some actual data to suggest a solution.
> >> >> >>
> >> >> >> On Sun, Jan 25, 2009 at 8:30 PM, Shreyasee
> >> >> >> 
> >> >> >> wrote:
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > I have imported one dataset in R.
> >> >> >> > I want to calculate the percentage of missing values for each
> >> >> >> > month
> >> >> >> > (May
> >> >> >> > 2006 to March 2007) for each variable.
> >> >> >> > Just to begin with I tried the following code :
> >> >> >> >
> >> >> >> > *for(i in 1:length(dos))
> >> >> >> > for(j in 1:length(patientinformation1)
> >> >> >> > if(dos[i]=="May-06" && patientinformation1[j]=="")
> >> >> >> > a <- j+1
> >> >> >> > a*
> >> >> >> >
> >> >> >> > The above code was written to calculate the number of missing
> >> >> >> > values
> >> >> >> > for
> >> >> >> > May
> >> >> >> > 2006, but I am not getting the correct results.
> >> >> >> > Can anybody help me?
> >> >> >> >
> >> >> >> > Thanks,
> >> >> >> > Shreyasee
> >> >> >> >
> >> >> >> >[[alternative HTML version deleted]]
> >> >> >> >
> >> >> >> > __
> >> >> >> > R-help@r-project.org mailing list
> >> >> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> >> >> > PLEASE do read the posting guide
> >> >> >> > http://www.R-project.org/posting-guide.html
> >> >> >> > and provide commented, minimal, self-contained, reproducible
> code.
> >> >> >> >
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >> Jim Holtman
> >> >> >> Cincinnati, OH
> >> >> >> +1 513 646 9390
> >> >> >>
> >> >> >> What is the problem that you are trying to solve?
> >> >

Re: [R] Plotting graph for Missing values

2009-01-25 Thread jim holtman
do:

str(dos)
str(patientinformation1)

They must be the same length for the command to work: must be a one to
one match of the data.

On Sun, Jan 25, 2009 at 10:23 PM, Shreyasee  wrote:
> Hi Jim,
>
> I tried the code which u provided.
> In place of "dos" in command "pat1 <- rbinom(length(dos), 1, .5)  # generate
> some data"
> I added "patientinformation1" variable and then I gave the command for
> "tapply" but its giving me the following error:
>
> Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
>   arguments must have same length
>
>
> Thanks,
> Shreyasee
>
>
>
> On Mon, Jan 26, 2009 at 10:50 AM, jim holtman  wrote:
>>
>> YOu can save the output of the tapply and then replicate it for each
>> of the variables.  The data can be used to plot the graphs.
>>
>> On Sun, Jan 25, 2009 at 9:38 PM, Shreyasee 
>> wrote:
>> > Hi Jim,
>> >
>> > I need to calculate the missing values in variable "patientinformation1"
>> > for
>> > the period of May 2006 to March 2007 and then plot the graph of the
>> > percentage of the missing values over these months.
>> > This has to be done for each variable.
>> > The code which you have provided, calculates the missing values for the
>> > months variable, am I right?
>> > I need to calculate for all the variables for each month.
>> >
>> > Thanks,
>> > Shreyasee
>> >
>> >
>> > On Mon, Jan 26, 2009 at 10:29 AM, jim holtman 
>> > wrote:
>> >>
>> >> Here is an example of how you might approach it:
>> >>
>> >> > dos <- seq(as.Date('2006-05-01'), as.Date('2007-03-31'), by='1 day')
>> >> > pat1 <- rbinom(length(dos), 1, .5)  # generate some data
>> >> > # partition by month and then list out the number of zero values (missing)
>> >> > tapply(pat1, format(dos, "%Y%m"), function(x) sum(x==0))
>> >> 200605 200606 200607 200608 200609 200610 200611 200612 200701 200702 200703
>> >>     21     22     16     18     16     15     16     17     14     16     13
>> >> >
>> >>
>> >>
>> >> On Sun, Jan 25, 2009 at 8:51 PM, Shreyasee
>> >> 
>> >> wrote:
>> >> > Hi Jim,
>> >> >
>> >> > The dataset has 4 variables (dos, patientinformation1,
>> >> > patientinformation2,
>> >> > patientinformation3).
>> >> > In dos variable ther are months (May 2006 to March 2007) when the
>> >> > surgeries
>> >> > were formed.
>> >> > I need to calculate the percentage of missing values for each
>> >> > variable
>> >> > (patientinformation1, patientinformation2, patientinformation3) for
>> >> > each
>> >> > month.
>> >> > I need a common script to calculate that for each variable.
>> >> >
>> >> > Thanks,
>> >> > Shreyasee
>> >> >
>> >> >
>> >> > On Mon, Jan 26, 2009 at 9:46 AM, jim holtman 
>> >> > wrote:
>> >> >>
>> >> >> What does you data look like?  You could use 'split' and then
>> >> >> examine
>> >> >> the data in each range to count the number missing.  Would have to
>> >> >> have some actual data to suggest a solution.
>> >> >>
>> >> >> On Sun, Jan 25, 2009 at 8:30 PM, Shreyasee
>> >> >> 
>> >> >> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > I have imported one dataset in R.
>> >> >> > I want to calculate the percentage of missing values for each
>> >> >> > month
>> >> >> > (May
>> >> >> > 2006 to March 2007) for each variable.
>> >> >> > Just to begin with I tried the following code :
>> >> >> >
>> >> >> > *for(i in 1:length(dos))
>> >> >> > for(j in 1:length(patientinformation1)
>> >> >> > if(dos[i]=="May-06" && patientinformation1[j]=="")
>> >> >> > a <- j+1
>> >> >> > a*
>> >> >> >
>> >> >> > The above code was written to calculate the number of missing
>> >> >> > values
>> >> >> > for
>> >> >> > May
>> >> >> > 2006, but I am not getting the correct results.
>> >> >> > Can anybody help me?
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Shreyasee
>> >> >> >
>> >> >> >[[alternative HTML version deleted]]
>> >> >> >
>> >> >> > __
>> >> >> > R-help@r-project.org mailing list
>> >> >> > https://stat.ethz.ch/mailman/listinfo/r-help
>> >> >> > PLEASE do read the posting guide
>> >> >> > http://www.R-project.org/posting-guide.html
>> >> >> > and provide commented, minimal, self-contained, reproducible code.
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Jim Holtman
>> >> >> Cincinnati, OH
>> >> >> +1 513 646 9390
>> >> >>
>> >> >> What is the problem that you are trying to solve?
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Jim Holtman
>> >> Cincinnati, OH
>> >> +1 513 646 9390
>> >>
>> >> What is the problem that you are trying to solve?
>> >
>> >
>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem that you are trying to solve?
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] Plotting graph for Missing values

2009-01-25 Thread Shreyasee
Hi Jim,

I tried the code which you provided.
In place of "dos" in the command "pat1 <- rbinom(length(dos), 1, .5)  # generate
some data"
I added the "patientinformation1" variable and then I gave the command for
"tapply", but it is giving me the following error:

Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
  arguments must have same length


Thanks,
Shreyasee



On Mon, Jan 26, 2009 at 10:50 AM, jim holtman  wrote:

> YOu can save the output of the tapply and then replicate it for each
> of the variables.  The data can be used to plot the graphs.
>
> On Sun, Jan 25, 2009 at 9:38 PM, Shreyasee 
> wrote:
> > Hi Jim,
> >
> > I need to calculate the missing values in variable "patientinformation1"
> for
> > the period of May 2006 to March 2007 and then plot the graph of the
> > percentage of the missing values over these months.
> > This has to be done for each variable.
> > The code which you have provided, calculates the missing values for the
> > months variable, am I right?
> > I need to calculate for all the variables for each month.
> >
> > Thanks,
> > Shreyasee
> >
> >
> > On Mon, Jan 26, 2009 at 10:29 AM, jim holtman 
> wrote:
> >>
> >> Here is an example of how you might approach it:
> >>
> >> > dos <- seq(as.Date('2006-05-01'), as.Date('2007-03-31'), by='1 day')
> >> > pat1 <- rbinom(length(dos), 1, .5)  # generate some data
> >> > # partition by month and then list out the number of zero values (missing)
> >> > tapply(pat1, format(dos, "%Y%m"), function(x) sum(x==0))
> >> 200605 200606 200607 200608 200609 200610 200611 200612 200701 200702 200703
> >>     21     22     16     18     16     15     16     17     14     16     13
> >> >
> >>
> >>
> >> On Sun, Jan 25, 2009 at 8:51 PM, Shreyasee  >
> >> wrote:
> >> > Hi Jim,
> >> >
> >> > The dataset has 4 variables (dos, patientinformation1,
> >> > patientinformation2,
> >> > patientinformation3).
> >> > In dos variable ther are months (May 2006 to March 2007) when the
> >> > surgeries
> >> > were formed.
> >> > I need to calculate the percentage of missing values for each variable
> >> > (patientinformation1, patientinformation2, patientinformation3) for
> each
> >> > month.
> >> > I need a common script to calculate that for each variable.
> >> >
> >> > Thanks,
> >> > Shreyasee
> >> >
> >> >
> >> > On Mon, Jan 26, 2009 at 9:46 AM, jim holtman 
> wrote:
> >> >>
> >> >> What does you data look like?  You could use 'split' and then examine
> >> >> the data in each range to count the number missing.  Would have to
> >> >> have some actual data to suggest a solution.
> >> >>
> >> >> On Sun, Jan 25, 2009 at 8:30 PM, Shreyasee
> >> >> 
> >> >> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > I have imported one dataset in R.
> >> >> > I want to calculate the percentage of missing values for each month
> >> >> > (May
> >> >> > 2006 to March 2007) for each variable.
> >> >> > Just to begin with I tried the following code :
> >> >> >
> >> >> > *for(i in 1:length(dos))
> >> >> > for(j in 1:length(patientinformation1)
> >> >> > if(dos[i]=="May-06" && patientinformation1[j]=="")
> >> >> > a <- j+1
> >> >> > a*
> >> >> >
> >> >> > The above code was written to calculate the number of missing
> values
> >> >> > for
> >> >> > May
> >> >> > 2006, but I am not getting the correct results.
> >> >> > Can anybody help me?
> >> >> >
> >> >> > Thanks,
> >> >> > Shreyasee
> >> >> >
> >> >> >[[alternative HTML version deleted]]
> >> >> >
> >> >> > __
> >> >> > R-help@r-project.org mailing list
> >> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> >> > PLEASE do read the posting guide
> >> >> > http://www.R-project.org/posting-guide.html
> >> >> > and provide commented, minimal, self-contained, reproducible code.
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Jim Holtman
> >> >> Cincinnati, OH
> >> >> +1 513 646 9390
> >> >>
> >> >> What is the problem that you are trying to solve?
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Jim Holtman
> >> Cincinnati, OH
> >> +1 513 646 9390
> >>
> >> What is the problem that you are trying to solve?
> >
> >
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>



Re: [R] Plotting graph for Missing values

2009-01-25 Thread jim holtman
You can save the output of the tapply and then replicate it for each
of the variables.  The data can be used to plot the graphs.
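
A rough sketch of what that could look like, using simulated columns in the spirit of the earlier example. The names pat1 to pat3, the seed, and the plotting choices are assumptions made for illustration, and 0 again plays the role of a missing value.

```r
set.seed(1)  # arbitrary seed so the made-up data are reproducible
dos <- seq(as.Date("2006-05-01"), as.Date("2007-03-31"), by = "1 day")

# Simulated 0/1 columns; 0 plays the role of "missing"
vars <- data.frame(pat1 = rbinom(length(dos), 1, 0.5),
                   pat2 = rbinom(length(dos), 1, 0.5),
                   pat3 = rbinom(length(dos), 1, 0.5))
month <- format(dos, "%Y%m")

# One tapply() per column; sapply() binds the saved results into a matrix
# with one row per month and one column per variable
pct <- sapply(vars, function(v) tapply(v, month, function(x) 100 * mean(x == 0)))

# One line per variable, months along the x axis
matplot(pct, type = "b", pch = 1:3, lty = 1, col = 1:3, xaxt = "n",
        xlab = "Month", ylab = "% missing")
axis(1, at = seq_len(nrow(pct)), labels = rownames(pct))
legend("topright", legend = colnames(pct), pch = 1:3, lty = 1, col = 1:3)
```

Keeping the results in one matrix means a single matplot() call draws all variables, so nothing has to be repeated by hand when more columns are added.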

On Sun, Jan 25, 2009 at 9:38 PM, Shreyasee  wrote:
> Hi Jim,
>
> I need to calculate the missing values in variable "patientinformation1" for
> the period of May 2006 to March 2007 and then plot the graph of the
> percentage of the missing values over these months.
> This has to be done for each variable.
> The code which you have provided, calculates the missing values for the
> months variable, am I right?
> I need to calculate for all the variables for each month.
>
> Thanks,
> Shreyasee
>
>
> On Mon, Jan 26, 2009 at 10:29 AM, jim holtman  wrote:
>>
>> Here is an example of how you might approach it:
>>
>> > dos <- seq(as.Date('2006-05-01'), as.Date('2007-03-31'), by='1 day')
>> > pat1 <- rbinom(length(dos), 1, .5)  # generate some data
>> > # partition by month and then list out the number of zero values (missing)
>> > tapply(pat1, format(dos, "%Y%m"), function(x) sum(x==0))
>> 200605 200606 200607 200608 200609 200610 200611 200612 200701 200702 200703
>>     21     22     16     18     16     15     16     17     14     16     13
>> >
>>
>>
>> On Sun, Jan 25, 2009 at 8:51 PM, Shreyasee 
>> wrote:
>> > Hi Jim,
>> >
>> > The dataset has 4 variables (dos, patientinformation1,
>> > patientinformation2,
>> > patientinformation3).
>> > In dos variable ther are months (May 2006 to March 2007) when the
>> > surgeries
>> > were formed.
>> > I need to calculate the percentage of missing values for each variable
>> > (patientinformation1, patientinformation2, patientinformation3) for each
>> > month.
>> > I need a common script to calculate that for each variable.
>> >
>> > Thanks,
>> > Shreyasee
>> >
>> >
>> > On Mon, Jan 26, 2009 at 9:46 AM, jim holtman  wrote:
>> >>
>> >> What does you data look like?  You could use 'split' and then examine
>> >> the data in each range to count the number missing.  Would have to
>> >> have some actual data to suggest a solution.
>> >>
>> >> On Sun, Jan 25, 2009 at 8:30 PM, Shreyasee
>> >> 
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > I have imported one dataset in R.
>> >> > I want to calculate the percentage of missing values for each month
>> >> > (May
>> >> > 2006 to March 2007) for each variable.
>> >> > Just to begin with I tried the following code :
>> >> >
>> >> > *for(i in 1:length(dos))
>> >> > for(j in 1:length(patientinformation1)
>> >> > if(dos[i]=="May-06" && patientinformation1[j]=="")
>> >> > a <- j+1
>> >> > a*
>> >> >
>> >> > The above code was written to calculate the number of missing values
>> >> > for
>> >> > May
>> >> > 2006, but I am not getting the correct results.
>> >> > Can anybody help me?
>> >> >
>> >> > Thanks,
>> >> > Shreyasee
>> >> >
>> >> >[[alternative HTML version deleted]]
>> >> >
>> >> > __
>> >> > R-help@r-project.org mailing list
>> >> > https://stat.ethz.ch/mailman/listinfo/r-help
>> >> > PLEASE do read the posting guide
>> >> > http://www.R-project.org/posting-guide.html
>> >> > and provide commented, minimal, self-contained, reproducible code.
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Jim Holtman
>> >> Cincinnati, OH
>> >> +1 513 646 9390
>> >>
>> >> What is the problem that you are trying to solve?
>> >
>> >
>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem that you are trying to solve?
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Plotting graph for Missing values

2009-01-25 Thread Shreyasee
Hi Jim,

I need to calculate the missing values in variable "patientinformation1" for
the period of May 2006 to March 2007 and then plot the graph of the
percentage of the missing values over these months.
This has to be done for each variable.
The code which you have provided calculates the missing values for the
months variable, am I right?
I need to calculate for all the variables for each month.

Thanks,
Shreyasee


On Mon, Jan 26, 2009 at 10:29 AM, jim holtman  wrote:

> Here is an example of how you might approach it:
>
> > dos <- seq(as.Date('2006-05-01'), as.Date('2007-03-31'), by='1 day')
> > pat1 <- rbinom(length(dos), 1, .5)  # generate some data
> > # partition by month and then list out the number of zero values (missing)
> > tapply(pat1, format(dos, "%Y%m"), function(x) sum(x==0))
> 200605 200606 200607 200608 200609 200610 200611 200612 200701 200702 200703
>     21     22     16     18     16     15     16     17     14     16     13
> >
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting graph for Missing values

2009-01-25 Thread jim holtman
Here is an example of how you might approach it:

> dos <- seq(as.Date('2006-05-01'), as.Date('2007-03-31'), by='1 day')
> pat1 <- rbinom(length(dos), 1, .5)  # generate some data
> # partition by month and then list out the number of zero values (missing)
> tapply(pat1, format(dos, "%Y%m"), function(x) sum(x==0))
200605 200606 200607 200608 200609 200610 200611 200612 200701 200702 200703
    21     22     16     18     16     15     16     17     14     16     13
>
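
Since the original question asks for percentages rather than counts, the same tapply() call can be sketched with mean() in place of sum(). This is only an illustration on simulated data: the set.seed() call and the name pct_zero are assumptions added here, and zeros stand in for missing values as in the example above.

```r
set.seed(42)  # assumption: fixed seed so the simulated data are reproducible
dos  <- seq(as.Date("2006-05-01"), as.Date("2007-03-31"), by = "1 day")
pat1 <- rbinom(length(dos), 1, 0.5)  # simulated 0/1 data; 0 plays the role of "missing"

# mean() of a logical vector is the proportion of TRUE values,
# so this gives the percentage of zeros within each month
pct_zero <- tapply(pat1, format(dos, "%Y%m"), function(x) 100 * mean(x == 0))
round(pct_zero, 1)
```

With real data, the test x == 0 would be replaced by whatever marks a missing entry, for example is.na(x).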





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Plotting graph for Missing values

2009-01-25 Thread Shreyasee
Hi Jim,

The dataset has 4 variables (dos, patientinformation1, patientinformation2,
patientinformation3).
In the dos variable there are the months (May 2006 to March 2007) when the
surgeries were performed.
I need to calculate the percentage of missing values for each variable
(patientinformation1, patientinformation2, patientinformation3) for each
month.
I need a common script to calculate that for each variable.

Thanks,
Shreyasee
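
A common script for all three variables could be sketched with aggregate(), as below. The toy data frame ds, the "May-06"-style month labels, the helper name pct_missing, and the coding of missing values as empty strings or NA are all assumptions made for illustration; the real dataset may be coded differently.

```r
# Hypothetical toy data; the real file is not shown in the thread
ds <- data.frame(
  dos = c("May-06", "May-06", "Jun-06", "Jun-06"),
  patientinformation1 = c("a", "",  "b", "c"),
  patientinformation2 = c("",  "",  "x", NA),
  patientinformation3 = c("p", "q", "r", "s"),
  stringsAsFactors = FALSE
)

# Percentage of entries that are NA or empty strings
pct_missing <- function(x) 100 * mean(is.na(x) | x == "")

# Apply the same function to every column except dos, grouped by month
res <- aggregate(ds[, -1], by = list(month = ds$dos), FUN = pct_missing)
res
```

Because the function is applied column by column, adding further patientinformation columns would not require changing the script.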





Re: [R] Plotting graph for Missing values

2009-01-25 Thread jim holtman
What does your data look like?  You could use 'split' and then examine
the data in each range to count the number missing.  I would need to
see some actual data to suggest a solution.
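
The split() approach mentioned above might look like the following sketch. The vectors here are made-up stand-ins (no actual data were posted), and treating empty strings and NA as missing is an assumption.

```r
# Made-up stand-ins for the real columns
dos <- c("May-06", "May-06", "Jun-06", "Jun-06", "Jun-06")
patientinformation1 <- c("a", "", "", "b", NA)

# split() partitions the values into one vector per month
by_month <- split(patientinformation1, dos)

# Count the missing entries (NA or empty string) within each month
n_missing <- sapply(by_month, function(x) sum(is.na(x) | x == ""))
n_missing
```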

On Sun, Jan 25, 2009 at 8:30 PM, Shreyasee wrote:
> Hi,
>
> I have imported one dataset in R.
> I want to calculate the percentage of missing values for each month (May
> 2006 to March 2007) for each variable.
> Just to begin with I tried the following code :
>
> for(i in 1:length(dos))
> for(j in 1:length(patientinformation1))
> if(dos[i]=="May-06" && patientinformation1[j]=="")
> a <- j+1
> a
>
> The above code was written to calculate the number of missing values for May
> 2006, but I am not getting the correct results.
> Can anybody help me?
>
> Thanks,
> Shreyasee
>
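
One possible reading of why the loop above gives wrong results: the nested loops compare every dos[i] with every patientinformation1[j] rather than matching values row by row, and a <- j+1 stores an index instead of incrementing a counter. A vectorized sketch of the intended count, using made-up vectors and assuming missing values are stored as empty strings, could be:

```r
# Made-up stand-ins for the imported columns
dos <- c("May-06", "May-06", "Jun-06", "May-06")
patientinformation1 <- c("", "a", "", "")

# Compare the two columns element by element (same row), then count the matches
n_missing_may <- sum(dos == "May-06" & patientinformation1 == "")
n_missing_may

# Percentage of May-06 rows with a missing value
100 * n_missing_may / sum(dos == "May-06")
```

If dos was read in as a factor, the comparison dos == "May-06" still works, provided the level is spelled exactly that way in the file.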



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
