Re: [R-pkg-devel] Working with connections - What is correct?
On Mon, Aug 10, 2015 at 10:18 PM, Glenn Schultz wrote:
> Hi Dirk,
> Thanks for your response, I get the point on return(). For me, it is a
> security blanket - I just need to let that go rather than justify keeping
> it. I will refactor the connections and just get comfortable without
> return().
>

I'm not sure you get Dirk's point about return(). He did not suggest that you avoid using return(). He provided a simple example to illustrate that code in a function body that occurs after a call to return() is not evaluated. There is nothing inherently wrong with using return(). The only thing that was wrong was how you used it.

> Thanks,
> Glenn
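To make the reply above concrete, here is a minimal R sketch. It assumes only base R; the names `early_exit`, `read_then_close`, and the temporary file are made up for illustration and are not part of BondLab.

```r
# An expression placed after return() is never evaluated.
reached <- FALSE
early_exit <- function() {
  x <- 2
  return(x)
  reached <<- TRUE  # dead code: the function has already exited
}
value <- early_exit()
print(value)    # prints [1] 2
print(reached)  # prints [1] FALSE

# return() itself is fine when the cleanup comes before it.
# Hypothetical stand-in file so the sketch runs without BondLab.
tmp <- tempfile(fileext = ".rds")
saveRDS(women, tmp)

read_then_close <- function(path) {
  con <- gzfile(path, open = "rb")
  obj <- readRDS(con)
  close(con)       # runs because it precedes return()
  return(obj)
}
restored <- read_then_close(tmp)
```

Note that this second version still leaves the connection open if readRDS() fails; the on.exit() approach suggested earlier in the thread also covers that case.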
Re: [R-pkg-devel] Working with connections - What is correct?
Hi Dirk,

Thanks for your response, I get the point on return(). For me, it is a security blanket - I just need to let that go rather than justify keeping it. I will refactor the connections and just get comfortable without return().

Thanks,
Glenn

On Aug 10, 2015, at 09:59 PM, Dirk Eddelbuettel wrote:
> On 11 August 2015 at 02:09, Glenn Schultz wrote:
> | All,
> | Is my function just plain wrong or is it just programming style? I use connections because SODA (software for data analysis) recommends using connections when working with serialized files.
>
> Nothing wrong with connections. Many of us use them.
>
> | First, after researching the use of return() - following Joshua's comment and others I found a post on SO regarding return().
> | http://stackoverflow.com/questions/11738823/explicitly-calling-return-in-a-function-or-not
>
> Let's try once more. The function foo()
>
>     foo <- function() {
>         x <- 2
>         return(x)
>         x <- 3
>     }
>
> will return what value? Ie what does 'print(foo())' show?
>
> | Second, Hadley's point regarding the use of RDS in gz function being a little strange. Following the help files the function is fashioned after the following which I found in R help files:
>
> IIRC what Hadley was trying to say is that you do _not_ need to compress before inserting into a rds file as the rds file format compresses by default.
>
> In other words compression comes for free, and effortlessly so.
>
> Hth, Dirk

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
Re: [R-pkg-devel] Working with connections - What is correct?
On 11 August 2015 at 02:09, Glenn Schultz wrote:
| All,
| Is my function just plain wrong or is it just programming style? I use connections because SODA (software for data analysis) recommends using connections when working with serialized files.

Nothing wrong with connections. Many of us use them.

| First, after researching the use of return() - following Joshua's comment and others I found a post on SO regarding return().
| http://stackoverflow.com/questions/11738823/explicitly-calling-return-in-a-function-or-not

Let's try once more. The function foo()

    foo <- function() {
        x <- 2
        return(x)
        x <- 3
    }

will return what value? Ie what does 'print(foo())' show?

| Second, Hadley's point regarding the use of RDS in gz function being a little strange. Following the help files the function is fashioned after the following which I found in R help files:

IIRC what Hadley was trying to say is that you do _not_ need to compress before inserting into a rds file as the rds file format compresses by default.

In other words compression comes for free, and effortlessly so.

Hth, Dirk

| ## Less convenient ways to restore the object
| ## which demonstrate compatibility with unserialize()
| con <- gzfile("women.rds", "rb")
| identical(unserialize(con), women)
| close(con)
| con <- gzfile("women.rds", "rb")
| wm <- readBin(con, "raw", n = 1e4) # size is a guess
| close(con)
| identical(unserialize(wm), women)
|
| With respect to the first, I understand why my function would be considered "buggy" - that can be fixed. It is the case, generally speaking, one does not need return() when programming in R; it is a functional language. However, some use return() and one finds return(value) in the R help files. Is it a strict no-no to use return()?
|
| With respect to the second, The function can be refactored and reduced to readRDS(). My question, is using gz file overkill or just plain wrong?
| What is considered "best practice"?
|
| -Glenn

--
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
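Dirk's remark about compression can be checked with a short sketch. It assumes only base R and the built-in `women` data set from the quoted help example; the temporary file path is an illustration, not a BondLab file.

```r
path <- tempfile(fileext = ".rds")

# saveRDS() compresses by default (compress = TRUE, i.e. gzip),
# so no separate compression step is needed before writing.
saveRDS(women, path)

# readRDS() takes the file name directly and handles decompression;
# no explicit gzfile() connection has to be opened or closed.
restored <- readRDS(path)

# The first two bytes of a gzip stream are 0x1f 0x8b.
magic <- readBin(path, what = "raw", n = 2)
```

Reading by file name also sidesteps the "closing unused connection" warning, since there is no user-managed connection left to close.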
Re: [R-pkg-devel] Working with connections - What is correct?
All,

Is my function just plain wrong or is it just programming style? I use connections because SODA (software for data analysis) recommends using connections when working with serialized files.

First, after researching the use of return() - following Joshua's comment and others I found a post on SO regarding return().
http://stackoverflow.com/questions/11738823/explicitly-calling-return-in-a-function-or-not

Second, Hadley's point regarding the use of RDS in gz function being a little strange. Following the help files the function is fashioned after the following which I found in R help files:

    ## Less convenient ways to restore the object
    ## which demonstrate compatibility with unserialize()
    con <- gzfile("women.rds", "rb")
    identical(unserialize(con), women)
    close(con)
    con <- gzfile("women.rds", "rb")
    wm <- readBin(con, "raw", n = 1e4) # size is a guess
    close(con)
    identical(unserialize(wm), women)

With respect to the first, I understand why my function would be considered "buggy" - that can be fixed. It is the case, generally speaking, one does not need return() when programming in R; it is a functional language. However, some use return() and one finds return(value) in the R help files. Is it a strict no-no to use return()?

With respect to the second, the function can be refactored and reduced to readRDS(). My question: is using a gz file overkill or just plain wrong?
What is considered "best practice"?

-Glenn

On Aug 09, 2015, at 09:04 AM, Joshua Ulrich wrote:
> Your call to return() exits the function, so the close.connection() call is never evaluated. Consider using on.exit() to close the connection, since it will close the connection regardless of how the function exits (e.g. because of an error).
Re: [R-pkg-devel] Working with connections
Hi Hadley,

Thanks for answering. I basically followed what was outlined in Software for Data Analysis. Although, I must admit it was a little light in this area (about a paragraph). I will have to do some additional reading/research on the RDS inside a gz issue. If it is not the correct application then I need to correct what I am doing.

To the second issue, I am exposing the data via a function to provide data storage/retrieval flexibility because cusip data for MBS securities can go to the terabyte. Additionally, in the case of a REMIC (a resecuritization) the cusip detail is assembled from many objects (a one to many relationship) before analysis begins.

So, based on the above I was thinking that one could have the data stored remotely (SQL, etc.) and edit the connection function as needed to point it to the required cusip data. Is there a better way to do this?

-Glenn

On Aug 09, 2015, at 09:23 AM, Hadley Wickham wrote:
> Also it's a little strange to put an RDS file _inside_ a gz, since normally the compression is done internally. And are you sure you should be exposing this data via a function, rather than using the regular package data mechanism?
>
> Hadley
Re: [R-pkg-devel] Working with connections
Also it's a little strange to put an RDS file _inside_ a gz, since normally the compression is done internally. And are you sure you should be exposing this data via a function, rather than using the regular package data mechanism?

Hadley

On Sun, Aug 9, 2015 at 7:04 AM, Joshua Ulrich wrote:
> Your call to return() exits the function, so the close.connection()
> call is never evaluated. Consider using on.exit() to close the
> connection, since it will close the connection regardless of how the
> function exits (e.g. because of an error).

--
http://had.co.nz/
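One way the accessor might be reduced to a plain readRDS() call on a file path, with no explicit connection to manage, is sketched below. This is an illustration under assumptions: `load_mbs`, its `data_dir` argument, and the demo file contents are hypothetical; only the BondData folder and the bondlabMBS4 file name come from the thread. It is also not the regular package data mechanism that Hadley mentions, just a simpler version of the existing function.

```r
# Hypothetical accessor: builds the file path and lets readRDS() open,
# decompress, and close the file itself.
load_mbs <- function(MBS.id,
                     data_dir = system.file("BondData", package = "BondLab")) {
  path <- file.path(data_dir, paste0(MBS.id, ".rds"))
  if (!file.exists(path)) {
    stop("no .rds file found for id '", MBS.id, "'")
  }
  readRDS(path)
}

# Demo with a temporary directory standing in for the installed package folder.
demo_dir <- tempfile("BondData")
dir.create(demo_dir)
saveRDS(list(cusip = "demo"), file.path(demo_dir, "bondlabMBS4.rds"))
mbs <- load_mbs("bondlabMBS4", data_dir = demo_dir)
```

Keeping the data location as an argument leaves room to point the function at a different store later, which is the flexibility Glenn asks about elsewhere in the thread.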
Re: [R-pkg-devel] Working with connections
Hi Joshua,

Thank you. I did not realize that the call to return() exits the function before the connection is closed.

-Glenn

> On Aug 9, 2015, at 9:04 AM, Joshua Ulrich wrote:
>
> Your call to return() exits the function, so the close.connection()
> call is never evaluated. Consider using on.exit() to close the
> connection, since it will close the connection regardless of how the
> function exits (e.g. because of an error).
Re: [R-pkg-devel] Working with connections
On Sun, Aug 9, 2015 at 8:59 AM, Glenn Schultz wrote:
> Hi All,
>
> I use connections to open and close data folders needed by my package.
> After each function closes I get the following warnings (depending on the
> connection that has been opened).
>
> 10: closing unused connection 3
> (/Library/Frameworks/R.framework/Versions/3.2/Resources/library/BondLab/BondData/bondlabMBS4.rds)
>
> Below is the connection function that is related to the above warning:
>
> #
> #' A connection function to BondData calling MBS cusps
> #'
> #' Opens a connection to the BondData folder to call MBS cusip data
> #' @param MBS.id A character string the MBS.id or cusip number current MBS.id is supported
> #' @export
> MBS <- function(MBS.id = "character"){
>   MBS.Conn <- gzfile(description = paste(system.file(package = "BondLab"),
>                      "/BondData/", MBS.id, ".rds", sep = ""), open = "rb")
>   MBS <- readRDS(MBS.Conn)
>   return(MBS)
>   close.connection(MBS.Conn)
> }
>
> I have googled this warning and it seems to be triggered when a function
> terminates and the connection is open. But, I think the connection function
> closes the connection once the object is returned. What am I doing wrong?
>

Your call to return() exits the function, so the close.connection() call is never evaluated. Consider using on.exit() to close the connection, since it will close the connection regardless of how the function exits (e.g. because of an error).

> -Glenn

--
Joshua Ulrich | about.me/joshuaulrich
FOSS Trading | www.fosstrading.com
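The on.exit() suggestion might look like the following sketch. It is an illustration, not the BondLab code: `read_with_connection` and the temporary file are hypothetical stand-ins so that the example runs without the package installed.

```r
# Hypothetical stand-in for a packaged data file.
rds_path <- tempfile(fileext = ".rds")
saveRDS(women, rds_path)

read_with_connection <- function(path) {
  con <- gzfile(path, open = "rb")
  # Registered right after opening, so the connection is closed whether the
  # function returns normally or stops with an error in readRDS().
  on.exit(close(con), add = TRUE)
  readRDS(con)
}

result <- read_with_connection(rds_path)
```

Because the cleanup is registered before any work is done, the order of the remaining statements no longer matters, and an explicit return() at the end would be equally safe.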
[R-pkg-devel] Working with connections
Hi All,

I use connections to open and close data folders needed by my package. After each function closes I get the following warnings (depending on the connection that has been opened).

    10: closing unused connection 3
    (/Library/Frameworks/R.framework/Versions/3.2/Resources/library/BondLab/BondData/bondlabMBS4.rds)

Below is the connection function that is related to the above warning:

    #
    #' A connection function to BondData calling MBS cusps
    #'
    #' Opens a connection to the BondData folder to call MBS cusip data
    #' @param MBS.id A character string the MBS.id or cusip number current MBS.id is supported
    #' @export
    MBS <- function(MBS.id = "character"){
      MBS.Conn <- gzfile(description = paste(system.file(package = "BondLab"),
                         "/BondData/", MBS.id, ".rds", sep = ""), open = "rb")
      MBS <- readRDS(MBS.Conn)
      return(MBS)
      close.connection(MBS.Conn)
    }

I have googled this warning and it seems to be triggered when a function terminates and the connection is open. But, I think the connection function closes the connection once the object is returned. What am I doing wrong?

-Glenn