On 10/19/2017 09:24 PM, Charles Plessy wrote:
(Just sharing my thoughts, as these days I am spending quite
some time preparing the upgrade of a Bioconductor package.)
On Fri, Oct 20, 2017 at 12:50:48AM +0000, Ryan Thompson wrote:
gene_client <- BioThingsClient("gene")
query("CDK2", client=gene_client)
In addition, since the piping operator (%>%) of dplyr and magrittr is
gaining traction, I would recommend carefully considering which
argument of the function comes first:
With the client as first argument, one can then write things like:
gene_client %>% query("CDK2") # similar to query(gene_client, "CDK2")
The Bioconductor convention would use S4 objects with CamelCase
constructors.
geneClient = BioThingsGeneClient() ## or just GeneClient()
I agree with enabling the use of the pipe, and think the generic +
methods should have a signature where the first argument is the client
rather than the pattern against which the query occurs. There is to some
extent an argument for name-mangling in the generic (other knowledgeable
people disagree) so that one is free to implement contracts unique to
the package in question, and avoid conflicts with other generics with
identical names in different packages (AnnotationDbi::select() /
dplyr::select()).
setGeneric(
    "btQuery",
    function(x, query, ...) standardGeneric("btQuery")
)
setMethod(
    "btQuery", "GeneClient",
    function(x, query)
{
    ## implementation
})
btQuery(geneClient, "CDK2") ## maybe btquery(...)
Yes, one could use BioThings::query(), or
semanticallyInformativeAlternativeToQuery(), but these seem cumbersome
to me, and the first at least has rough edges (that of course should be
fixed...), e.g.,
> methods(AnnotationHub::query)
Error in .S3methods(generic.function, class, parent.frame()) :
no function 'AnnotationHub::query' is visible
I think Michael is arguing for something like plain-old-functions (and
the original examples and problems of multiplying methods seemed somehow
to be plain old functions rather than S4 generics and methods?)
geneQuery <- function(x, query) ...
A downside is that one cannot discover programmatically what one can do
with a GeneClient object (if it were a method, one could ask for
methods(class=class(geneClient))); as a developer one also needs to
validate the incoming argument, which requires a certain but not
insurmountable discipline.
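To make the trade-off concrete, here is a minimal sketch of both styles side by side. The 'endpoint' slot, the constructor default, and the pasted-string return value are assumptions made up for illustration; a real client would perform a web request instead.

```r
library(methods)

## Hypothetical class; the 'endpoint' slot is an assumption for illustration
setClass("GeneClient", representation(endpoint = "character"))
GeneClient <- function(endpoint = "gene") new("GeneClient", endpoint = endpoint)

## S4 style: generic with the client as first argument, plus one method
setGeneric("btQuery", function(x, query, ...) standardGeneric("btQuery"))
setMethod("btQuery", "GeneClient", function(x, query, ...) {
    ## placeholder result instead of a real web request
    paste0(x@endpoint, ":", query)
})

## Plain-function style: checking the argument is the developer's job
geneQuery <- function(x, query) {
    stopifnot(is(x, "GeneClient"), is.character(query))
    paste0(x@endpoint, ":", query)
}

geneClient <- GeneClient()
btQuery(geneClient, "CDK2")
geneQuery(geneClient, "CDK2")

## S4 dispatch rejects the wrong class without a hand-written check
tryCatch(btQuery("not a client", "CDK2"), error = function(e) "no method")

## The S4 method can be discovered from the class
existsMethod("btQuery", "GeneClient")
showMethods(classes = "GeneClient")
```

Nothing comparable to the last two calls can find geneQuery() starting from the class alone, which is the discoverability point above.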
Michael didn't mention it, but these slides of his are relevant
https://bioconductor.org/help/course-materials/2017/BioC2017/DDay/BOF/usability.pdf
One other lesson from the annotation world is to think carefully about
the structure of the return value, in particular about 1:1 versus
1:many mappings between the elements of a vector-valued 'pattern=' and
the results. While it's tempting to return, say, a character vector or
named list, these days one probably wants to take the lessons of tidy
data and return a data.frame-like object (e.g., DataFrame(), but maybe
that's not 'necessary'; nothing wrong with a tibble, but a data.table is
not likely necessary or particularly advised [because of the novel
syntax and reference semantics]) where the first column is the query and
the second and subsequent columns are the result of the query. One wants
to pay particular attention to dealing with 1:0 and 1:many mappings in
ways that do not confuse users; some use cases (e.g., adding annotations
to the rowData() of a SummarizedExperiment) are really facilitated by a
1:1 mapping between query and response.
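As a rough illustration of the two return shapes, here is a base R sketch. The annotation table, its identifiers, and the helper names queryMany() and queryOne() are all made up for the example; "first hit wins" is just one possible 1:1 policy, not a recommendation.

```r
## Toy annotation table (made-up values, for illustration only)
annotation <- data.frame(
    symbol = c("CDK2", "TP53", "TP53"),
    id     = c("id_a", "id_b", "id_c"),
    stringsAsFactors = FALSE
)

## Hypothetical helper: one row per (query, hit); a query without any
## hit (1:0) is kept as a row with NA rather than silently dropped
queryMany <- function(query, table = annotation) {
    hits <- merge(
        data.frame(query = query, stringsAsFactors = FALSE),
        table, by.x = "query", by.y = "symbol", all.x = TRUE, sort = FALSE
    )
    ## restore the order of the input query
    hits <- hits[order(match(hits$query, query)), , drop = FALSE]
    rownames(hits) <- NULL
    hits
}

## Hypothetical helper: exactly one row per query element (first hit
## wins), so the result lines up with, e.g., the rows of another table
queryOne <- function(query, table = annotation) {
    data.frame(
        query = query,
        id = table$id[match(query, table$symbol)],
        stringsAsFactors = FALSE
    )
}

query <- c("TP53", "MISSING", "CDK2")
many <- queryMany(query)  # TP53 contributes two rows, MISSING one NA row
one <- queryOne(query)    # always length(query) rows
```

In both cases the query is the first column, so the user can see which input each row belongs to.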
Martin
With the gene symbol as first argument:
"CDK2" %>% query(gene_client) # similar to query("CDK2", gene_client)
If gene symbols may come as output from other commands and the query
function is able to work smartly with a vector of gene symbols as input,
then the second pattern might be useful. Otherwise the first pattern
probably makes more sense.
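A small sketch of the two argument orders follows. The toy client and both functions are hypothetical stand-ins, not part of any package. To stay free of dependencies the sketch uses the native pipe |> (R 4.1 or later); for simple calls like these, magrittr's %>% inserts the left-hand side as first argument in the same way.

```r
## Hypothetical stand-ins; the values are made up for illustration
toy_client <- list(known = c(CDK2 = "id_a", TP53 = "id_b"))

## Same lookup, two argument orders
query_client_first <- function(client, symbols) unname(client$known[symbols])
query_symbol_first <- function(symbols, client) unname(client$known[symbols])

## Client first: the pipe starts from the client
res1 <- toy_client |> query_client_first("CDK2")

## Symbols first: a vector of symbols produced by an earlier step
## flows straight into the query
res2 <- c("cdk2", "tp53") |> toupper() |> query_symbol_first(toy_client)
```

The second form only pays off when the function accepts a whole vector of symbols, as noted above.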
See https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html
for details.
(Note however that the piped and non-piped functions are not exactly
equivalent, and that piped commands can be harder to debug; therefore
it may be better to only use them in interactive sessions.)
Have a nice day,
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel