Re: [R] why data.frame, mutate package and not lists

2016-09-15 Thread Rolf Turner

On 15/09/16 14:04, S Ellison wrote:




> > If you want
> > to add variable to data.frame you have to use attach, detach. Right?
>
> I'd have said "not at all", not "not quite". attach and detach have
> almost exactly nothing to do with adding to a data frame. You can add
> to a data frame using
> dfrm$newvar <-
> dfrm['newvar'] <-
> cbind(dfrm, newvar=) #adds a new variable called 'newvar'
> rbind #to add rows
> merge #to add columns and/or rows from another data frame
> ... and a few other things.
>
> The only relevance of attach/detach is to do with the behaviour of
> attached objects, not to do with adding to data frames. If you have
> attach()ed something, changing the original object does not
> automatically update the copy of its variables in the current
> environment, or vice versa, because attach(), as documented, creates
> a _copy_. So _if_ you have attach()ed a data frame - or a list - you
> can't change the copy by changing the original object and you can't
> change the original object by changing the copy.  Only if you need to
> change both do you need to detach and reattach.
>
> As a rule, I generally avoid attach() for that and other reasons
> (most of which are listed in ?attach). attach() is only sensible if
> you have already completed all the manipulation needed on the
> attached object first. Even then, using with() is safer.


Extremely well and clearly put.  This is one of those "I wish *I* had 
said that!" posts.


cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why data.frame, mutate package and not lists

2016-09-15 Thread S Ellison


>If you want
> to add variable to data.frame you have to use attach, detach. Right?

I'd have said "not at all", not "not quite". attach and detach have almost 
exactly nothing to do with adding to a data frame. 
You can add to a data frame using  
dfrm$newvar <- 
dfrm['newvar'] <-  
cbind(dfrm, newvar=) #adds a new variable called 'newvar'
rbind #to add rows
merge #to add columns and/or rows from another data frame
... and a few other things.
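
The options listed above can be sketched as follows. This is only an illustration: the name dfrm follows the post, while the columns, values and the second data frame 'other' are made up for the example.

```r
# Hypothetical example data; only the name 'dfrm' comes from the post.
dfrm <- data.frame(id = 1:3, x = c(2.5, 3.1, 4.7))

dfrm$newvar <- dfrm$x * 2                     # add a column with $<-
dfrm["flag"] <- dfrm$x > 3                    # add a column with [<-
dfrm <- cbind(dfrm, grp = c("a", "b", "a"))   # add a column with cbind()

# rbind() adds rows; the new row needs the same column names
dfrm <- rbind(dfrm,
              data.frame(id = 4L, x = 5.0, newvar = 10, flag = TRUE, grp = "b"))

# merge() brings in columns from another data frame via a shared key
other <- data.frame(id = c(1L, 2L, 4L), site = c("north", "south", "east"))
merged <- merge(dfrm, other, by = "id")
```

Note that none of these needs attach() or detach().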

The only relevance of attach/detach is to do with the behaviour of attached 
objects, not to do with adding to data frames. If you have attach()ed 
something, changing the original object does not automatically update the copy 
of its variables in the current environment, or vice versa, because attach(), 
as documented, creates a _copy_. So _if_ you have attach()ed a data frame - or 
a list - you can't change the copy by changing the original object and you 
can't change the original object by changing the copy.  Only if you need to 
change both do you need to detach and reattach.
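
A small sketch of the copy behaviour described above, assuming a fresh session with no other object called 'x' (the data are made up; attached_x and original_x are just helper names for the illustration):

```r
dfrm <- data.frame(x = 1:3)   # hypothetical example data
attach(dfrm)                  # copies the columns into a new entry on the search path

dfrm$x <- dfrm$x * 10         # change the original data frame
attached_x <- x               # the attached copy is unchanged: 1 2 3

x <- c(7, 8, 9)               # creates a separate 'x' that masks the attached copy
original_x <- dfrm$x          # the data frame is unchanged by that: 10 20 30

detach(dfrm)                  # tidy up the search path
rm(x)                         # remove the masking variable
```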

As a rule, I generally avoid attach() for that and other reasons (most of which 
are listed in ?attach). attach() is only sensible if you have already completed 
all the manipulation needed on the attached object first. Even then, using 
with() is safer.
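
For illustration, a minimal sketch of the with() alternative, using made-up data. within() is not mentioned in the post; it is a related base R function shown here as an assumption about what is useful alongside with().

```r
dfrm <- data.frame(x = c(1, 2, 3), y = c(2, 4, 6))  # hypothetical example data

# with() evaluates the expression using the columns of dfrm,
# without putting anything on the search path
ratio <- with(dfrm, y / x)

# within() returns a modified copy, so the result has to be assigned back
dfrm <- within(dfrm, total <- x + y)
```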

S Ellison




Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread Duncan Murdoch

On 14/09/2016 2:40 PM, jeremiah rounds wrote:

> "If you want to add variable to data.frame you have to use attach, detach.
> Right?"
>
> Not quite.  Use it like a list to add a variable to a data.frame
>
> e.g.
> df = list()
> df$var1 = 1:10
> df = as.data.frame(df)
> df$var2 = 1:10
> df[["var3"]] = 1:10
> df
> df = as.list(df)
> df$var4 = 1:10
> as.data.frame(df)
>
> Ironically the primary reason to use a data.frame in my head is to signal
> that you are thinking of your data as a row-oriented tabular storage.
>  "Ironic" because in technical detail that is not a requirement to be a
> data.frame, but when I reflect on the typical way a seasoned R programmer
> approaches list and data.frames that is basically what they are
> communicating.


I believe it is intended to be a requirement.  You can construct things 
with class "data.frame" that don't have that structure, but lots of 
stuff will go wrong if you do.


Duncan Murdoch
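
To illustrate the point about structure, here is a hedged sketch with made-up objects. It shows that data.frame() enforces equal column lengths, while setting the class attribute by hand does not. The object 'bad' is for illustration only and should not be passed to real code.

```r
# data.frame() refuses columns whose lengths cannot be reconciled
res <- tryCatch(data.frame(a = 1:3, b = 1:2), error = function(e) e)
inherits(res, "error")              # TRUE

# Setting the class by hand bypasses that check
bad <- structure(list(a = 1:3, b = 1:2), class = "data.frame")
inherits(bad, "data.frame")         # TRUE, although it is not rectangular
lengths(unclass(bad))               # a = 3, b = 2
```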









Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread jeremiah rounds
There is also this syntax for adding variables
df[, "var5"] = 1:10

and the syntax sugar for row-oriented storage:
df[1:5,]
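
A self-contained version of the two lines above, assuming a small starting data frame (var1 and first_rows are names chosen for the illustration):

```r
df <- data.frame(var1 = 1:10)   # hypothetical starting point
df[, "var5"] <- 1:10            # matrix-style column assignment
first_rows <- df[1:5, ]         # first five rows, all columns
```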



Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread jeremiah rounds
"If you want to add variable to data.frame you have to use attach, detach.
Right?"

Not quite.  Use it like a list to add a variable to a data.frame

e.g.
df = list()
df$var1 = 1:10
df = as.data.frame(df)
df$var2 = 1:10
df[["var3"]] = 1:10
df
df = as.list(df)
df$var4 = 1:10
as.data.frame(df)

Ironically the primary reason to use a data.frame in my head is to signal
that you are thinking of your data as a row-oriented tabular storage.
 "Ironic" because in technical detail that is not a requirement to be a
data.frame, but when I reflect on the typical way a seasoned R programmer
approaches list and data.frames that is basically what they are
communicating.

I was going to post that a reason to use data.frames is to take advantage
of optimizations and syntax sugar for data.frames, but in reality if code
does not assume a row-oriented data structure in a data.frame there is not
much I can think of that exists in the way of optimization.  For example,
we could point to "subset" and say that is a reason to use data.frames and
not list, but that only works if you use data.frame in a conventional way.
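
As a rough illustration of the subset() point, with made-up data: on a data frame the filter is applied to rows, while on a plain list the same filter has to be applied to each element by hand.

```r
scores <- data.frame(name = c("ann", "bob", "cy"),
                     score = c(90, 72, 85))      # made-up data
high <- subset(scores, score > 80)               # keeps whole rows

# On a plain list there is no row concept, so filter each element separately
scores_list <- as.list(scores)
keep <- scores_list$score > 80
high_list <- lapply(scores_list, function(col) col[keep])
```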

In the end, my advice to you is if it is a table make it a data.frame and
if it is not easily thought of as a table or row-oriented data structure
keep it as a list.

Thanks,
Jeremiah







Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread Alaios via R-help
Thanks for all the answers. I think ggplot2 also requires data.frames. If you 
want to add a variable to a data.frame you have to use attach, detach. Right? 
Any more links that discuss those two different approaches?
Alex


Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread Bert Gunter
This is partially a matter of subjective opinion, and so pointless; but
I would point out that data frames are the canonical structure for a
great many of R's modeling and graphics functions, e.g. lm, xyplot,
etc.
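
For instance, a minimal sketch with simulated data (the names dat, x and y are made up) of how a modelling function such as lm() takes its variables from a data frame:

```r
set.seed(1)                            # made-up data for illustration
dat <- data.frame(x = 1:20)
dat$y <- 3 + 2 * dat$x + rnorm(20)

fit <- lm(y ~ x, data = dat)           # y and x are looked up in 'dat'
coef(fit)                              # intercept and slope estimates
```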

As for mutate() etc., that's about UI's and user friendliness, and
imho my ho is meaningless.

Best,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )




Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread Marc Schwartz


Hi,

Presuming that you are referring to the mutate() **function**, which is in the 
dplyr package on CRAN, that package provides a variety of functions to 
manipulate data in R.
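
As a rough sketch of what mutate() does, here is a made-up example next to a base R counterpart, transform(), which is my choice of comparison rather than something from the thread. The dplyr call is guarded, on the assumption that the package may not be installed.

```r
dat <- data.frame(x = 1:3, y = c(10, 20, 30))   # made-up data

# Base R: transform() adds or changes columns and returns a new data frame
base_result <- transform(dat, z = x + y)

# dplyr: mutate() does the same job; skipped if dplyr is not installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr_result <- dplyr::mutate(dat, z = x + y)
}
```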

Data frames **are** lists with a data.frame class attribute, but with the 
proviso that each column in the data frame, which is a list element, has the 
same length, but like a list, may have different data types (e.g. character, 
numeric, etc.). 

Thus, a data frame is effectively a rectangular data structure, conceptually in 
the same manner as an Excel worksheet.

A list, which is a more generic data structure, can contain list elements of 
variable lengths and data types. 
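
The relationship can be checked directly. A short sketch with made-up objects:

```r
dat <- data.frame(id = 1:3, label = c("a", "b", "c"))  # made-up data
is.list(dat)        # TRUE: a data frame is a list underneath
class(dat)          # "data.frame"
lengths(dat)        # every column has length 3

ragged <- list(id = 1:3, label = c("a", "b"))          # fine for a plain list
lengths(ragged)     # id = 3, label = 2
```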

You might want to begin by reviewing:

  
https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Lists-and-data-frames

which is the section on lists and data frames in the manual "An Introduction to R".

It would be surprising, to me at least, that you have been using R for several 
years and have not come across data frames, since they are used in many typical 
operations, including regression models and the like.

Regards,

Marc Schwartz
 


[R] why data.frame, mutate package and not lists

2016-09-14 Thread Alaios via R-help
Hi all,
I have seen data.frames and operations from the mutate package becoming 
really popular. In recent years I have been using lists extensively. Is there 
any reason not to use lists, and to use other data types for data manipulation 
and storage instead?
Is there any article that describes their differences? Thank you for your 
reply.
Regards
Alex