Re: [Rd] Feature request: non-dropping regmatches/strextract

2019-08-29 Thread Michael Lawrence via R-devel
Just started thinking about this. The name of regmatches() suggests
that it will only extract the matches but not return anything for the
non-matches. We might need another function that returns a value for
non-matches. Perhaps the value should be the empty string for
non-matches and NA for matches to NA. The rationale is that we
delegate to regexpr() (at least conceptually), and it returns a
"matching region" which would be empty when there is no match. We
could allow strcapture() to accept an atomic vector as a prototype,
which would do what you want for regexec() (NA on no match, empty
string on empty capture). Then we could call the regexpr()-based
function strextract().
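
The semantics proposed above can be sketched in a few lines of base R. This is only an illustration, not the actual strextract() in utils: the helper name extract_first() and its nomatch argument are hypothetical. It returns the empty string for a non-match, NA for NA input, and lets the caller ask for NA on non-matches instead.

```r
# Hypothetical helper (not part of base R): first match per element,
# keeping exactly one output element per input element.
extract_first <- function(pattern, x, nomatch = "", ...) {
  m <- regexpr(pattern, x, ...)
  hit <- !is.na(m) & m != -1L            # elements that actually matched
  out <- rep(as.character(nomatch), length(x))
  out[is.na(x)] <- NA_character_         # NA input stays NA
  out[hit] <- regmatches(x, m)           # regmatches() returns only the hits
  out
}

x <- c("a1", "b", NA, "c22")
extract_first("[0-9]+", x)                           # "" for "b", NA for NA
extract_first("[0-9]+", x, nomatch = NA_character_)  # NA for "b" as well
```

Making the fill value an argument keeps both conventions discussed in this thread available without committing to either.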

What do you think?

Michael

On Thu, Aug 29, 2019 at 3:27 PM Cyclic Group Z_1 wrote:
>
> Thank you! I greatly appreciate your consideration, though of course it is up 
> to you. I think many people switch to stringr/stringi simply because 
> functions in those packages have some consistent design choices, for example, 
> they do not drop empty/missing matches, which facilitates array-based 
> programming. For example, in the cases where one needs to make a new column 
> in a data.frame (data.table, tibble, etc.) of regex extractions. Or in any 
> other case where there needs to be an element-wise correspondence between 
> input and output. I think insertion of NA_character_ to prevent dropping 
> indices seems like the natural choice for an array language (which, I think, 
> motivated the creation of stringr/stringi). While those are great packages 
> and this behavior can be easily replicated with simple wrappers, string 
> operations are normally easy to accomplish in base languages, so this seems 
> like something that would be appropriate to have in base. For example, MATLAB 
> and Pandas regex both allow non-dropping empty matches (though of course I
> acknowledge Pandas is not a base language).
>
> Best,
> CG
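
The data.frame case described in the quoted message can be reproduced with a small made-up example (the column names and values are illustrative only). It shows the dropped element and one possible base R workaround.

```r
df <- data.frame(id = c("a1", "b", "c22"), stringsAsFactors = FALSE)
m <- regexpr("[0-9]+", df$id)

# regmatches() silently drops the non-matching element "b".
digits <- regmatches(df$id, m)
length(digits) == nrow(df)   # FALSE

# Direct column assignment therefore fails with a length error.
assign_failed <- inherits(try(df$digits <- digits, silent = TRUE), "try-error")

# Workaround: pre-fill with NA and assign only into the matching rows.
df$digits <- NA_character_
df$digits[m != -1L] <- digits
```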



-- 
Michael Lawrence
Scientist, Bioinformatics and Computational Biology
Genentech, A Member of the Roche Group
Office +1 (650) 225-7760
micha...@gene.com


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Feature request: non-dropping regmatches/strextract

2019-08-29 Thread Michael Lawrence via R-devel
I'd be happy to entertain patches or at least more specific
suggestions to improve strextract() and strcapture(). I hadn't
exported strextract(), because I wasn't quite sure how it should
behave. This feedback should be helpful.
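
For reference, the exported utils::strcapture() already keeps one row per input element and fills non-matches with NA. A minimal sketch with made-up input (the pattern, values and column names are illustrative only):

```r
x <- c("chr1:100", "bad", "chr2:5")

# proto gives the names and types of the capture groups.
res <- utils::strcapture("^(chr[0-9]+):([0-9]+)$", x,
                         proto = data.frame(chrom = character(),
                                            pos = integer()))

nrow(res) == length(x)   # one row per input element
is.na(res$pos)           # the non-matching element becomes NA
```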

Thanks,
Michael

On Thu, Aug 29, 2019 at 2:20 PM Cyclic Group Z_1 via R-devel wrote:
>
> Thank you, I am aware that there are packages that can accomplish this. I 
> mentioned stringr::str_extract as a function that does not drop empty 
> matches. I think that the behavior of regmatches(..., regexpr(...)) in base R 
> should permit an option to prevent dropping of empty matches both for sake of 
> consistency with the rest of the language (missing data does not yield a 
> dropped index in other sorts of R functions, and an empty match conceptually 
> corresponds with missing data) and facility of use in data.frames. The 
> behavior of regmatches(..., gregexpr(...)) is not objectionable to me, as 
> lists do not drop indices when they contain character(0) vectors. 
> Alternatively, perhaps this should be reflected in the (currently 
> non-exported) strextract.
>
> Best,
> CG
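
The gregexpr() behavior mentioned in the quoted message can be checked directly. The sketch below uses made-up input and also shows how the list result can be collapsed to one value per element; the name first_or_na is illustrative only.

```r
x <- c("a1b2", "none", "c3")

# With gregexpr(), regmatches() returns a list with one entry per input
# element; a non-match is kept as character(0) instead of being dropped.
all_matches <- regmatches(x, gregexpr("[0-9]", x))
lengths(all_matches)

# Collapse to the first match per element, using NA where there is none.
first_or_na <- vapply(
  all_matches,
  function(v) if (length(v) > 0L) v[1L] else NA_character_,
  character(1)
)
```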





Re: [Bioc-devel] Error: DLL 'rgl' not found: maybe not installed for this architecture?

2019-08-29 Thread Pages, Herve
Hi,

On 8/23/19 14:38, Venu Thatikonda wrote:
> Hi,
> 
> During the Bioconductor review of one of my R packages, I see the following
> two errors.
> 
> one:
> 
> Error in library.dynam(dynlib, pkg, lib) :
>   DLL 'rgl' not found: maybe not installed for this architecture?
> Error: .onLoad failed in loadNamespace() for 'rgl', details:
>   call: NULL
>   error: Loading rgl's DLL failed.
> Execution halted
> ERROR: lazy loading failed for package 'ALPS'
> * removing 
> 'C:/Users/pkgbuild/packagebuilder/workers/jobs/1215/2a61338/ALPS.buildbin-libdir/ALPS'
> 

For some reason we had a broken installation of rgl on the Windows
builders. This should be fixed now. Sorry for that.

> I'm not using any functions from 'rgl', so I'm not sure why it is a required
> dependency. Or is there a key detail I am missing? R CMD build, check, and
> BiocCheck are all clean on my system.

You're not using any function from rgl but I suspect you depend on some 
package that depends on rgl. So your package depends **indirectly** on 
rgl. This means that when you load your package, rgl gets loaded too 
(see sessionInfo() after loading your package).
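
One way to check for such an indirect dependency without loading anything is to walk the recursive dependencies of an installed package. The sketch below is only an illustration: depends_on() is a hypothetical helper, and it assumes the package of interest is installed locally.

```r
# Hypothetical helper: does `pkg` depend, directly or indirectly, on `target`?
# Only looks at locally installed packages.
depends_on <- function(pkg, target, db = installed.packages()) {
  deps <- tools::package_dependencies(
    pkg, db = db,
    which = c("Depends", "Imports", "LinkingTo"),
    recursive = TRUE
  )[[pkg]]
  target %in% deps
}

# Example with a base package: stats imports graphics.
depends_on("stats", "graphics")

# For the package in this thread one would call (not run here):
# depends_on("ALPS", "rgl")
```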

> 
> Two:
> 
>  * ERROR: System Files found that should not be git tracked:
>  ALPS.Rproj
> 
> 
> Based on this discussion (
> https://urldefense.proofpoint.com/v2/url?u=https-3A__community.rstudio.com_t_should-2Drproj-2Dfiles-2Dbe-2Dadded-2Dto-2Dgitignore_1269=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=8Zrvz7lJB0ilLGOQT4Xj-V07HpjBtgN2hCX9TAtwztk=bNlefMI8tSDVTK1Yk3jOBMNAiNMaE5A2NXTJo-iSNg0=
>  ),
> it is in fact somewhat useful not to add PACKAGE.Rproj to .gitignore.

That check seems unjustified to me. Also it seems that adding ALPS.Rproj 
to your .gitignore won't make this error go away. I would suggest that 
you open an issue for it at:

   https://github.com/Bioconductor/BiocCheck/issues

In the meantime hopefully you can convince the reviewer of your package 
to ignore this error.

> Do all bioc packages require adding *Rproj to gitignore?

FWIW I see more than 200 BioC software packages with a PACKAGE.Rproj 
file in their git repo. AFAIK this file has never caused any problem. 
It's automatically ignored by 'R CMD build' and 'R CMD INSTALL' so 
everything is fine.

Best,
H.

> 
> Here is the report, if it's helpful.
> 
> https://urldefense.proofpoint.com/v2/url?u=http-3A__bioconductor.org_spb-5Freports_ALPS-5Fbuildreport-5F20190823145032.html=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=8Zrvz7lJB0ilLGOQT4Xj-V07HpjBtgN2hCX9TAtwztk=KMJ8PJKE39eKYpi42dIbQlGrfQdm1CDHvdrFq-M-rww=
> 
> Thank you & have a nice day!
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[Rd] ?Syntax wrong about `?`'s precedence ?

2019-08-29 Thread Ant F
Dear all,

`?Syntax` documents that `?` has the lowest precedence, right under `=`.

Indeed it reads:

  The following unary and binary operators are defined. They are listed in
  precedence groups, from highest to lowest.

and ends the list with

  <- <<-   assignment (right to left)
  =        assignment (right to left)
  ?        help (unary and binary)

I believe it to be wrong: `=` has lower precedence than `?`.

See the following example :

`?` <- `+`
x = 2 ? 3
x
#> [1] 5

We see that `2 ? 3` is evaluated first, then the result is assigned to x,
showing higher precedence for `?`.

Compare it to the similar code using `<-` :

x <- 2 ? 3
#> [1] 5
x
#> [1] 2

Here `x <- 2` is evaluated first, then its value is added to 3, and the
result `5` is printed. We then verify that `x` is still `2`, which shows lower
precedence for `?`, consistent with the doc.

Hadley Wickham's package `lobstr` makes it easy to compare the parse trees:

lobstr::ast({x = 2 ? 3})
#> o-`{`
#> \-o-`=`
#>   +-x
#>   \-o-`?`
#>     +-2
#>     \-3

lobstr::ast({x <- 2 ? 3})
#> o-`{`
#> \-o-`?`
#>   +-o-`<-`
#>   | +-x
#>   | \-2
#>   \-3
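
The same comparison can be made with base R only, by parsing each statement and looking at the function at the root of the resulting call. This is a minimal sketch; top_level_op() is a hypothetical helper name, and the result for the `=` case may depend on the R version in use.

```r
# Hypothetical helper: name of the operator at the root of the parse tree.
top_level_op <- function(code) {
  expr <- parse(text = code)[[1]]
  as.character(expr[[1]])
}

top_level_op("x <- 2 ? 3")   # root operator when `<-` is used
top_level_op("x = 2 ? 3")    # root operator when `=` is used
```

If `?` really had the lowest precedence in both cases, both calls would return "?".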

Best regards,

Antoine
