On Thu, Oct 19, 2017 at 9:23 PM, Chunlei Wu <c...@scripps.edu> wrote:
> Thank you all for the feedback. Just to give some extra context, here we
> have the Python and JavaScript versions of the biothings_client:
>
>   https://github.com/biothings/biothings_client.py
>   https://github.com/biothings/biothings_client.js
>
> And here is the work-in-progress R client:
>
>   https://github.com/biothings/biothings_client.R
>
> You can find some examples in the README and the test code to see how
> the client works in Python and JavaScript.
>
> One of the nice features of both the Python and JS clients is that they
> allow users to use the same client instance for any new "BioThings" API in
> the future, which can be created by another user, not just by us. In this
> case, one can do this to work with a new API in the Python client:
>
>   from biothings_client import get_client
>
>   mything_client = get_client("mything", url="http://example.com/v1/api")
>   # could have some extra parameters
>
>   mything_client.query(...)
>   mything_client.get_mything(...)
>   ...
>
> As the developers of all three biothings_clients, we, of course, would like
> to keep the same pattern for R, and R6 looks the closest to me. But it
> looks like, from R users' perspective, this is not a popular pattern to use.

Yes, there will probably be way more users of R wanting to use BioThings
than BioThings users wanting to use R.

> With your suggestion, I think it can work this way in R:
>
>   library(biothings)
>
>   gene_client = BioThingsClient('gene')  # a gene client with a preset config
>
>   queryBioThings(gene_client, "CDK2")  # whether we should keep client as
>   # the first argument is still TBD, based on the previous pipe comment
>
>   mything_client = BioThingsClient('mything', url="http://example.com/v1/api")
>
>   queryBioThings(mything_client, "something")
>
> Another thing I should mention: in the Python client, each client has these
> methods:
>
>   gene_client.getgene
>   gene_client.getgenes
>   gene_client.query
>   gene_client.querymany
>   gene_client.metadata
>
> Then in R, we will have to create these generic methods (hope this is the
> right term):
>
>   getBioThing(mything_client, ...)
>   getBioThings

As Herve points out, R users will expect queries to be vectorized
implicitly. queryBioThings() or whatever should probably return a tabular
structure describing the things. There is no need for distinguishing
singular and plural.

>   queryBioThings
>   queryManyBioThings
>   BioThingsMetadata
>
> I personally still like the Python/JS pattern, as you can have
> client-specific names like "getgene" and "getgenes", instead of the generic
> getBioThing and getBioThings names. Plus, users can just name the
> "gene_client" part "gc" or whatever, so there is much less to type :-)
> in the code. In the R S4 case, the function names have to be more verbose
> because they are global.

There seems to be a misconception here. S4 has two types of classes,
conventional value classes and reference classes. The reference classes
have the same syntax as the R6 classes. R6 is mostly a stripped down
version of S4 reference classes. In this particular case, R is sufficiently
flexible that it would be easy to support the reference class syntax on
ordinary value classes. So you could use the reference class syntax, but I
wouldn't recommend it, for the aforementioned reasons.
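
As a rough illustration of the two syntaxes (a sketch only: the class
names GeneClientRef and GeneClient, the base_url field, and the placeholder
result are hypothetical and not part of any BioThings package; btQuery is
the generic name floated elsewhere in this thread), the same toy query
could be written with an S4 reference class or with a value class plus a
generic:

```r
library(methods)

## Hypothetical reference class: methods live on the object (R6/Python style).
GeneClientRef <- setRefClass(
    "GeneClientRef",
    fields = list(base_url = "character"),
    methods = list(
        query = function(q) {
            ## Placeholder result; a real client would issue an HTTP request.
            data.frame(query = q, base_url = base_url,
                       stringsAsFactors = FALSE)
        }
    )
)

## Hypothetical value class with a generic: conventional S4 style.
setClass("GeneClient", slots = c(base_url = "character"))

setGeneric("btQuery", function(x, query, ...) standardGeneric("btQuery"))

setMethod("btQuery", "GeneClient", function(x, query, ...) {
    ## Same placeholder result; 'query' may be a vector of any length.
    data.frame(query = query, base_url = x@base_url,
               stringsAsFactors = FALSE)
})

ref_client <- GeneClientRef$new(base_url = "http://example.com/v1/api")
val_client <- new("GeneClient", base_url = "http://example.com/v1/api")

res_ref <- ref_client$query(c("CDK2", "CDK4"))     # reference class syntax
res_val <- btQuery(val_client, c("CDK2", "CDK4"))  # generic function syntax
```

Both calls accept a vector of queries and return the same tabular
structure; only the calling convention differs.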

Moreover, be careful about carrying over notions from Python and JS into R.
R is unique in fundamental ways.

> Does this sound good to the group? Any more suggestions?
>
> Chunlei
>
> ------------------------------
> *From:* Michael Lawrence <lawrence.mich...@gene.com>
> *Sent:* Thursday, October 19, 2017 8:32 PM
> *To:* Martin Morgan
> *Cc:* Charles Plessy; bioc-devel@r-project.org; Chunlei Wu
> *Subject:* Re: [Bioc-devel] R6 class v.s. S4 class
>
> API discoverability is a big problem in languages with a functional
> syntax. Namespaces are verbose, but they do provide for constrained
> autocompletion. Prefixing all symbols with an abbreviation like "bt_" seems
> too ad hoc to me, but it is common practice. Explicitly querying for methods
> takes the user out of the flow.
>
> One could imagine an IDE showing available methods in the tooltip of
> function symbols.
>
> I guess an IDE could support autocompleting on "(object)" or "(object,",
> where <tab> would display generics with applicable methods and fill in the
> name in front of the "(". Not very intuitive though.
>
> By simplifying our APIs we make discoverability less of an issue, because
> they are easily listed on cheat sheets and memorized.
>
> I wonder if there are ideas to steal from Julia.
>
> On Thu, Oct 19, 2017 at 7:36 PM, Martin Morgan <
> martin.mor...@roswellpark.org> wrote:
>
>> On 10/19/2017 09:24 PM, Charles Plessy wrote:
>>
>>> (Just sharing my thoughts, as these days I am spending quite
>>> some time preparing the upgrade of a Bioconductor package).
>>>
>>> On Fri, Oct 20, 2017 at 12:50:48AM +0000, Ryan Thompson wrote:
>>>
>>>>
>>>>      gene_client <- BioThingsClient("gene")
>>>>      query("CDK2", client=gene_client)
>>>>
>>>
>>> In addition, since the piping operator (%>%) of dplyr and magrittr is
>>> gaining traction, I would recommend carefully considering which will be
>>> the first argument of the function:
>>>
>>> With the client as first argument, one can then write things like:
>>>
>>>      gene_client %>% query("CDK2")   # similar to query(gene_client, "CDK2")
>>>
>>
>> The Bioconductor convention would use S4 objects with CamelCase
>> constructors.
>>
>>   geneClient = BioThingsGeneClient()   ## or just GeneClient()
>>
>> I agree with enabling the use of pipe, and think the generic + methods
>> should have a signature where the first argument is the client rather than
>> the pattern against which the query occurs. There is to some extent an
>> argument for name-mangling in the generic (other knowledgeable people
>> disagree) so that one is free to implement contracts unique to the package
>> in question, and avoid conflicts with other generics with identical names
>> in different packages (AnnotationDbi::select() / dplyr::select()).
>>
>>   setGeneric(
>>       "btQuery",
>>       function(x, query, ...) standardGeneric("btQuery")
>>   )
>>
>>   setMethod(
>>       "btQuery", "GeneClient",
>>       function(x, query)
>>   {
>>       ## implementation
>>   })
>>
>>   btQuery(geneClient, "CDK2")   ## maybe btquery(...)
>>
>> Yes, one could use BioThings::query(), or
>> semanticallyInformativeAlternativeToQuery(),
>> but these seem cumbersome to me, and the first at least has rough edges
>> (that of course should be fixed...), e.g.,
>>
>>   > methods(AnnotationHub::query)
>>   Error in .S3methods(generic.function, class, parent.frame()) :
>>     no function 'AnnotationHub::query' is visible
>>
>> I think Michael is arguing for something like plain-old-functions (and
>> the original examples and problems of multiplying methods seemed somehow to
>> be plain old functions rather than S4 generics and methods?)
>>
>>   geneQuery <- function(x, query) ...
>>
>> A downside is that one cannot discover programmatically what one can do
>> with a GeneClient object (if it were a method, one could ask for
>> methods(class=class(geneClient))); as a developer one also needs to
>> validate the incoming argument, which requires a certain but not
>> insurmountable discipline.
>>
>> Michael didn't mention it, but these slides of his are relevant:
>>
>>   https://bioconductor.org/help/course-materials/2017/BioC2017/DDay/BOF/usability.pdf
>>
>> One other lesson from the annotation world is to think carefully about
>> the structure of the return, in particular thinking about 1:1 versus 1:many
>> mappings between vector-valued 'pattern='.
>> While it's tempting to return, say, a character vector or named list,
>> probably one wants these days to take the lessons of tidy data and return
>> a data.frame-like object (e.g., DataFrame(), but maybe that's not
>> 'necessary'; nothing wrong with a tibble, but a data.table is not likely
>> necessary or particularly advised [because of the novel syntax and
>> reference semantics]) where the first column is the query and the second
>> and subsequent columns the result of the query; one wants to pay
>> particular attention to dealing with 1:0 and 1:many mappings in ways that
>> do not confuse users; some use cases (e.g., adding annotations to the
>> rowData() of SummarizedExperiment) are really facilitated by a 1:1
>> mapping between query and response.
>>
>> Martin
>>
>>
>>> With the gene symbol as first argument:
>>>
>>>      "CDK2" %>% query(gene_client)   # similar to query("CDK2", gene_client)
>>>
>>> If gene symbols may come as output from other commands and the query
>>> function is able to work smartly with a vector of gene symbols as input,
>>> then the second pattern might be useful. Otherwise the first pattern
>>> probably makes more sense.
>>>
>>> See https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html
>>> for details.
>>>
>>> (Note however that the piped and non-piped functions are not exactly
>>> equivalent, and that piped commands can be harder to debug; therefore
>>> it may be better to only use them in interactive sessions.)
>>>
>>> Have a nice day,
>>>
>>
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
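
Martin's point about 1:0 and 1:many mappings can be sketched in base R as
follows. This is only an illustration under stated assumptions: toy_hits
is a made-up lookup table standing in for an API response, the identifiers
in it are invented, and queryLong() and queryFirst() are hypothetical
helper names rather than functions from any existing package.

```r
## Made-up lookup table standing in for a remote API response.
toy_hits <- list(
    CDK2  = "id_a",                 # 1:1 mapping
    geneB = c("id_b1", "id_b2"),    # 1:many mapping
    geneC = character(0)            # 1:0 mapping
)

## Long form: one row per (query, hit) pair; the first column is the query.
## Queries without a hit keep a row with NA so that they are not lost.
queryLong <- function(query, hits) {
    found <- lapply(query, function(q) {
        h <- hits[[q]]
        if (length(h) == 0L) NA_character_ else h
    })
    data.frame(
        query = rep(query, lengths(found)),
        hit = unlist(found, use.names = FALSE),
        stringsAsFactors = FALSE
    )
}

## 1:1 form: exactly one row per query, keeping only the first hit (or NA).
queryFirst <- function(query, hits) {
    first <- vapply(query, function(q) {
        h <- hits[[q]]
        if (length(h) == 0L) NA_character_ else h[[1L]]
    }, character(1), USE.NAMES = FALSE)
    data.frame(query = query, hit = first, stringsAsFactors = FALSE)
}

queries <- c("CDK2", "geneB", "geneC")
long_result <- queryLong(queries, toy_hits)
first_result <- queryFirst(queries, toy_hits)
```

The 1:1 form has as many rows as there are queries, in the same order,
which is what makes it convenient for attaching annotations to something
like rowData(); the long form keeps every hit at the cost of repeated
queries.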