Re: [R] Is this a bug or am I making a mistake?

2014-01-12 Thread Patrick Connolly
On Mon, 06-Jan-2014 at 07:38PM +, William Dunlap wrote:

| You could compare the outputs of
| z1 <- with(dd, dd$EVYEAR==2012 & dd$EVMONTH=='02')

Wouldn't with(dd, EVYEAR==2012 & EVMONTH=='02')
be sufficient when using with()?



| (which is like subset()) and that of
| z2 <- dd$EVYEAR==2012 & dd$EVMONTH=='02'
| (evaluated from within the same context) with
|  table(z1, z2, exclude=NULL)
| That may show something useful.
| 
| Bill Dunlap
| Spotfire, TIBCO Software
| wdunlap tibco.com

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is this a bug or am I making a mistake?

2014-01-12 Thread William Dunlap
> Wouldn't with(dd, EVYEAR==2012 & EVMONTH=='02')
> be sufficient when using with()?

It probably would be sufficient to get the right answer, but I
thought the OP was wondering why there was a difference.
Comparing the results of his original code with new code
would help uncover the reason.

Bill Dunlap
TIBCO Software
wdunlap tibco.com





Re: [R] Is this a bug or am I making a mistake?

2014-01-12 Thread David Winsemius

On Jan 6, 2014, at 11:16 AM, Walter Anderson wrote:

> Thanks everyone for the response.  I didn't provide a reproducible test,
> since the data I experienced this issue with was quite large (> 40MB) and I
> have not been able to reproduce the problem with any other data set.  I have
> also performed the subset using Microsoft Access on the original dbf file I
> use for the data frame and confirmed that the second query format
> (dd[QUERY,]) is producing the correct results.  It doesn't appear that any of
> the impacted (or any in the data frame) contain NA records.

What does it mean to say "It doesn't appear that any of the impacted (or any in
the data frame) contain NA records"? Where are the code and output to support
that appearance?

What does this show?

table( is.na(dd$EVYEAR==2012), is.na(dd$EVMONTH=='02') )

The other difference between `[` and `subset` is that `subset` uses drop=FALSE
by default, although it is not clear how that would affect the results here.
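
As a rough illustration of the NA check asked for above, here is a minimal sketch. The data frame is invented (the poster's data was never shared); only the column names EVYEAR and EVMONTH come from the thread, and `na_counts` and `cond` are names made up for the example.

```r
# Invented stand-in for the poster's data frame; the values are assumptions.
dd <- data.frame(
  EVYEAR  = c(2012, NA, 2011),
  EVMONTH = c("02", "02", NA),
  stringsAsFactors = FALSE
)

# Count missing values per column used in the condition.
na_counts <- colSums(is.na(dd[, c("EVYEAR", "EVMONTH")]))
print(na_counts)

# Count NA results in the combined condition itself.
# Note that FALSE & NA is FALSE, so row 3 does not produce an NA here.
cond <- dd$EVYEAR == 2012 & dd$EVMONTH == "02"
print(sum(is.na(cond)))
```

A zero from both checks on the real data would support the claim that NA values are not involved.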

 
> I am not really looking for any particular solution, but was surprised by the
> different results from what I presumed to be the same query.  If it is
> believed to be a possible bug, I would be glad to package up the data that is
> generating the issue, but not sure where to place such a large data set.

I don't think you have yet demonstrated a bug.

-- 

David Winsemius
Alameda, CA, USA



[R] Is this a bug or am I making a mistake?

2014-01-06 Thread Walter Anderson
I have a data frame that I am extracting some records from and noticed 
the following issue.

I originally used tmp <- subset(dd, dd$EVYEAR==2012 & dd$EVMONTH=='02')

and noticed that I wasn't ending up with all of the records I should 
have; however, when I used

tmp <- dd[dd$EVYEAR==2012 & dd$EVMONTH=='02',]

I did get all of the records I should have.

I thought the two forms were equivalent; am I mistaken?



Re: [R] Is this a bug or am I making a mistake?

2014-01-06 Thread Sarah Goslee
Hi Walter,

I can't reproduce your results. Please provide some data that
demonstrates the problem.

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

subset() and [ differ in their handling of NA values, and you don't
need the dd$ in the arguments to subset().

But those don't explain your result given the information provided.
Please provide more information.
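
To make the NA point above concrete, here is a minimal sketch on invented data (the column names follow the thread; the values, and the names `keep`, `by_subset` and `by_bracket`, are assumptions for illustration). It is not known whether this is what happened with the original data.

```r
# Toy data frame; the values are invented for illustration.
dd <- data.frame(
  EVYEAR  = c(2012, 2012, NA, 2011),
  EVMONTH = c("02", "02", "02", "02"),
  stringsAsFactors = FALSE
)

# The logical condition is NA wherever EVYEAR is NA and EVMONTH matches.
keep <- dd$EVYEAR == 2012 & dd$EVMONTH == "02"

# subset() treats NA in the condition as FALSE and drops those rows.
by_subset <- subset(dd, EVYEAR == 2012 & EVMONTH == "02")

# `[` keeps a row for each NA in the index, filled entirely with NA.
by_bracket <- dd[keep, ]

print(nrow(by_subset))
print(nrow(by_bracket))
```

With NA values present, `[` returns more rows than subset(), but the extra rows are all-NA rather than real records.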

Sarah




-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] Is this a bug or am I making a mistake?

2014-01-06 Thread Walter Anderson

Thanks everyone for the response.  I didn't provide a reproducible test, 
since the data I experienced this issue with was quite large (> 40MB) 
and I have not been able to reproduce the problem with any other data 
set.  I have also performed the subset using Microsoft Access on the 
original dbf file I use for the data frame and confirmed that the second 
query format (dd[QUERY,]) is producing the correct results.  It doesn't 
appear that any of the impacted (or any in the data frame) contain NA 
records.


I am not really looking for any particular solution, but was surprised 
by the different results from what I presumed to be the same query.  If 
it is believed to be a possible bug, I would be glad to package up the 
data that is generating the issue, but not sure where to place such a 
large data set.




Re: [R] Is this a bug or am I making a mistake?

2014-01-06 Thread William Dunlap
You could compare the outputs of
z1 <- with(dd, dd$EVYEAR==2012 & dd$EVMONTH=='02')
(which is like subset()) and that of
z2 <- dd$EVYEAR==2012 & dd$EVMONTH=='02'
(evaluated from within the same context) with
 table(z1, z2, exclude=NULL)
That may show something useful.
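
A minimal sketch of that comparison on invented data follows (the real `dd` was never posted, so the values here are assumptions; `tab` is a name made up for the example). On this toy data the two forms agree, so all counts fall on the diagonal of the table.

```r
# Invented stand-in for the poster's data frame.
dd <- data.frame(
  EVYEAR  = c(2012, 2012, 2011, NA),
  EVMONTH = c("02", "03", "02", "02"),
  stringsAsFactors = FALSE
)

# Evaluated with the columns of dd visible first, as subset() does.
z1 <- with(dd, dd$EVYEAR == 2012 & dd$EVMONTH == "02")

# Evaluated directly in the calling environment.
z2 <- dd$EVYEAR == 2012 & dd$EVMONTH == "02"

# exclude = NULL keeps NA as its own level, so NA results are counted too.
tab <- table(z1, z2, exclude = NULL)
print(tab)

# TRUE here means the two forms agree element by element.
print(identical(z1, z2))
```

On the real data, any non-zero count off the diagonal would identify rows where the two forms disagree.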

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


