Re: [R] Making objects global in a package

2018-07-16 Thread Michael Hannon
Thanks to all for your replies.  So far as I can see, there was
nothing wrong with my original approach, but I've decided to stuff all
the relevant definitions into a function (or functions), as this seems
to make "devtools::check()" happier.

-- Mike
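
For readers following the thread, here is a minimal sketch of what "definitions in a function" could look like. The function name `col_types_for`, the second source "beta", and the column names are assumptions for illustration, not taken from the actual package. The types are written as single-letter abbreviations and collapsed into the compact string form that readr documents for `col_types` ("D" = date, "d" = double, "c" = character).

```r
# Hypothetical accessor: returns the column specification for one source.
# Nothing is created at the top level of the package; the definitions
# only exist while the function runs.
col_types_for <- function(source = c("alpha", "beta")) {
  source <- match.arg(source)
  columns <- switch(source,
    alpha = c(Date = "D", var1 = "d"),  # illustrative columns only
    beta  = c(Date = "D", code = "c")   # hypothetical second source
  )
  # Collapse the named abbreviations into a compact string such as "Dd".
  paste(columns, collapse = "")
}

alpha_spec <- col_types_for("alpha")

# Intended use in another file of the package (requires readr):
# foo_alpha <- readr::read_csv("alpha.csv", col_types = col_types_for("alpha"))
```

A real specification would need one letter per column in the file, in file order.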



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Making objects global in a package

2018-07-13 Thread Jeff Newmiller
Avoiding rda files because they don't track well with version control seems 
weak to me, since you should be creating the rda with an R file in the tools 
directory.
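
As a rough sketch of that workflow, the script below keeps the definitions in plain text and regenerates the binary file from them, so only the script needs to be reviewed in version control. The file name `tools/make_sysdata.R` and the column names are assumptions; a temporary directory stands in for the package's `R/` directory so the sketch can run anywhere.

```r
# tools/make_sysdata.R (hypothetical file name)
# Plain-text source of truth for the package's internal data.
col_types_alpha <- c(Date = "D", var1 = "d")  # illustrative columns only

# In a real package the target would be "R/sysdata.rda".
pkg_r_dir <- file.path(tempdir(), "R")
dir.create(pkg_r_dir, showWarnings = FALSE, recursive = TRUE)
rda_path <- file.path(pkg_r_dir, "sysdata.rda")
save(col_types_alpha, file = rda_path)

# Reload into a scratch environment to confirm the round trip.
check_env <- new.env()
load(rda_path, envir = check_env)
```

Objects saved in `R/sysdata.rda` are made available inside the package namespace, so package code can refer to `col_types_alpha` directly without exporting it.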


-- 
Sent from my phone. Please excuse my brevity.



Re: [R] Making objects global in a package

2018-07-13 Thread William Dunlap via R-help
What the OP is doing looks fine to me.

The environment holding the data vectors is not necessary, but it helps
organize things - you know where to look for this sort of data vector.

I would avoid the *.rda file, since it is not text, hence not readily
editable or trackable with most source control systems.


Bill Dunlap
TIBCO Software
wdunlap tibco.com
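
A minimal sketch of the text-only environment approach, for illustration. The accessor `get_col_types`, the source "beta", and the column names are assumptions; the types use the single-letter abbreviations that readr accepts in a compact `col_types` string.

```r
# R/000_define_data_attributes.R, following the original post's layout.
data_env <- new.env(parent = emptyenv())
data_env$col_types_alpha <- c(Date = "D", var1 = "d")  # illustrative only
data_env$col_types_beta  <- c(Date = "D", code = "c")  # hypothetical source

# Hypothetical accessor used by the other files. It fails loudly on an
# unknown source instead of silently returning NULL.
get_col_types <- function(source) {
  key <- paste0("col_types_", source)
  if (!exists(key, envir = data_env, inherits = FALSE)) {
    stop("no column types defined for source '", source, "'")
  }
  paste(get(key, envir = data_env), collapse = "")
}

alpha_spec <- get_col_types("alpha")
```

Because the lookup happens when `get_col_types()` is called rather than when the files are sourced, the other files do not depend on this one being collated first.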




Re: [R] Making objects global in a package

2018-07-13 Thread Jeff Newmiller
a) There is a mailing list for package development questions: R-package-devel.

b) This seems like a job for the sysdata.rda file... no explicit environments 
needed. See the Writing R Extensions manual.


-- 
Sent from my phone. Please excuse my brevity.



Re: [R] Making objects global in a package

2018-07-13 Thread R. Mark Sharp via R-help
I would usually use a function for this. It may not be more R-like, but it is 
more readable to me. If you want to keep the columns in a file, you could have 
the function initialize itself on the first call. 

Mark
R. Mark Sharp, Ph.D.
Data Scientist and Biomedical Statistical Consultant
7526 Meadow Green St.
San Antonio, TX 78251
mobile: 210-218-2868
rmsh...@me.com
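
One way to read "initialize itself on the first call" is a closure that caches what it reads. This is a sketch under assumptions: the helper name `make_col_type_getter`, the two-column definitions file, and its contents are all hypothetical.

```r
# Hypothetical helper: builds a getter that reads its definitions from
# `path` on the first call and reuses the cached result afterwards.
make_col_type_getter <- function(path) {
  cache <- NULL
  function() {
    if (is.null(cache)) {
      # Read everything as character so abbreviations such as "T" are
      # not mistaken for logical values.
      spec <- read.csv(path, colClasses = "character")
      cache <<- stats::setNames(spec$type, spec$column)
    }
    cache
  }
}

# Demo with a temporary definitions file (columns are illustrative).
spec_file <- tempfile(fileext = ".csv")
writeLines(c("column,type", "Date,D", "var1,d"), spec_file)
get_alpha_types <- make_col_type_getter(spec_file)

first <- get_alpha_types()   # reads the file
unlink(spec_file)
second <- get_alpha_types()  # served from the cache; the file is gone
```

The second call succeeds even though the file has been deleted, which shows that the file is read only once.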

> On Jul 13, 2018, at 7:51 PM, Michael Hannon wrote:
> 
> Greetings.  I'm putting together a small package in which I use
> `readr::read_csv()` to read CSV files from several different sources.  I do
> this in several different files, but with various kinds of subsequent
> processing, depending on the file.
> 
> I find it useful to specify column types, as the apparent data type of a given
> column sometimes changes unexpectedly deep into the file.  E.g., a field that
> consistently looks like an integer suddenly becomes a fraction:
> 
>     1, 1, ..., 1, 1/2, 1, ...
> 
> Hence, the column type has to be treated as a character, rather than as an
> integer (with the possibility of later conversion to double, if necessary).
> (This is just an example.)
> 
> Therefore I use the `col_types` argument in all of the calls to `read_csv()`.
> 
> These calls are spread over several files, but I want to keep all of the
> column types in a single place, yet have them available in each of the several
> files.  This is just for the sake of maintainability.
> 
> At the moment I do this by putting the column-type definitions into a single
> file:
> 
>     000_define_data_attributes.R
> 
> that:
> 
>     (1) is named so that it's parsed first by `devtools::build()`
>     (2) sets up an environment and stuffs the column types into it:
> 
>     data_env <- new.env(parent=emptyenv())
>     data_env$col_types_alpha <- list(
>         Date = col_date(),
>         var1 = col_double(),
>         ...
>     )
> 
> There are a few other things that go into the file as well.
> 
> Then I pick off the appropriate stuff from the environment in the other files:
> 
>     foo_alpha <- read_csv("alpha.csv", col_types = data_env$col_types_alpha)
> 
> This seems to work, but it doesn't "feel" right to me.  (If this were Python,
> people would accuse me of being "non-pythonic").
> 
> Hence, I'm seeking suggestions for the best practice for this kind of thing.
> 
> BTW, I note that both the sources of data ("alpha", etc.) and the column types
> are more or less guaranteed to be static for the foreseeable future.  Hence,
> there really isn't much danger in just replicating the column-type definitions
> in each of the various files, which would obviate the need for the "000..."
> file.  In other words, this is mostly a style thing.
> 
> Thanks for any advice you can provide.
> 
> -- Mike
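
The "read as character, convert to double later" step from the original post can be sketched as follows. The helper `parse_fraction` is hypothetical and only handles plain numbers and simple "a/b" fractions; anything else becomes `NA`.

```r
# Hypothetical helper: converts strings such as "1" or "1/2" to double.
parse_fraction <- function(x) {
  parts <- strsplit(x, "/", fixed = TRUE)
  vapply(parts, function(p) {
    nums <- as.numeric(trimws(p))
    if (length(nums) == 1L) {
      nums                 # plain number
    } else if (length(nums) == 2L) {
      nums[1] / nums[2]    # numerator / denominator
    } else {
      NA_real_             # empty or malformed entry
    }
  }, numeric(1))
}

raw <- c("1", "1", "1/2", "1")

# Treating the column as integer loses the fraction (it becomes NA).
as_int <- suppressWarnings(as.integer(raw))

# Keeping it as character and converting afterwards preserves it.
as_dbl <- parse_fraction(raw)
```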
