Bumping this thread in the hope to catch the attention from R core. As I try to argue in my original post, given the existing internal structure of connections, I don't think it's too hard to add protection against the use of corrupted R connection.
Writing to corrupted connection is a mistake that currently may pass silently while corrupting a non-intended target. Henrik On Tue, Oct 30, 2018, 19:51 Henrik Bengtsson <henrik.bengts...@gmail.com wrote: > SUMMARY: > > I'm proposing that R assert that 'connection' options have not changed > since first created such that R will produce the following error: > > > fh <- file("a.txt", open = "w+") > > cat("hello\n", file = fh) > > close(fh) > > > fh2 <- file("b.txt", open = "w+") > > cat("world\n", file = fh2) > > > cat("hello again\n", file = fh) > Error in cat("hello again\n", file = fh) : > invalid connection (non-existing 'conn_id') > > Note that, currently in R, the latter silently writes to 'b.txt' - not > 'a.txt' (for more details, see > https://github.com/HenrikBengtsson/Wishlist-for-R/issues/81). > > > BACKGROUND: > > In R, connections are indexed by their (zero-based) row indices in the > table of available connections. For example, > > > fh <- file("a.txt", open = "w") > > showConnections(all = TRUE) > description class mode text isopen can read can write > 0 "stdin" "terminal" "r" "text" "opened" "yes" "no" > 1 "stdout" "terminal" "w" "text" "opened" "no" "yes" > 2 "stderr" "terminal" "w" "text" "opened" "no" "yes" > 3 "a.txt" "file" "w" "text" "opened" "no" "yes" > > con <- getConnection(3) > > identical(con, fh) > [1] TRUE > > > ISSUE: > > The problem with the current design/implementation where connections > are referred to by their index (only), is that > > (i) the table of connections changes over time and > (ii) connection indices are recycled. > > Because a `connection` object holds the connection row index, it means > that *the actual underlying connection that a `connection` object > refers to may change over its lifetime*. > > > SUGGESTION: > > Make use of the 'Rconn' struct field 'id', which is unique, to assert > that the 'connection' object used is referring to the > original/expected connection. The 'id' field is available via > attribute 'conn_id' part of a 'connection' object. > > > PATCH: > > See attached 'connection.patch' file (or > > https://github.com/HenrikBengtsson/Wishlist-for-R/issues/81#issuecomment-434210222 > ). > The patch introduces a new SEXP R_GetConnection2(SEXP sConn) function, > which looks up a connection by its index *and* the 'id' field. This > function is backward compatible with R_GetConnection(), which looks up > a connection by its index (only). In addition, R_GetConnection2() also > accepts 'sConn' of type integer, which the looks up the connection > similar to how the internal getConnection() function does it. > > Comment: The patch is just one of many alternatives. Hopefully, it > helps clarify what I'm suggesting. It passes 'make check' and I've > tested it on a few packages of mine that make heavy use of different > types of connections. > > In addition to "overridden" connections, the patch protects against > invalid 'connection':s that have been serialized, e.g. > > > fh2 <- file("b.txt", open = "w+") > > saveRDS(fh2, file = "fh2.rds") > > fh3 <- readRDS("fh2.rds") > > attr(fh2, "conn_id") > <pointer: 0x78> > > attr(fh3, "conn_id") > <pointer: (nil)> #<== NIL because external pointer was lost when > serialized > > isOpen(fh2) > [1] TRUE > > isOpen(fh3) > Error in isOpen(fh3) : invalid connection ('conn_id' is NULL) > > This is useful, when for instance 'connection':s are (incorrectly) > passed to background R sessions (e.g. PSOCK cluster nodes). > > > SEE ALSO: > > * More details of the above are scribbled down on > https://github.com/HenrikBengtsson/Wishlist-for-R/issues/81 > * R-devel post 'closeAllConnections() can really mess things up', > 2016-10-30, > https://stat.ethz.ch/pipermail/r-devel/2016-October/073331.html > > All the best, > > Henrik > [[alternative HTML version deleted]] ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel