Excellent - thanks David!
Regex syntax never fails to scare the crap out of me :)

David absolutely solved my problem (in record time, no less), so it
can be put to rest. However, if anyone knows how to accomplish the
same thing through non-base packages, like stringr or stringi, I'd be
interested in seeing those solutions as well.
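
A minimal sketch of what a stringr version might look like, assuming the
package is installed and that "BT up until the next space" is the intended
match. The names x, pattern and bt are only illustrative, and the \\S*
pattern is an assumption, not something taken from David's reply:

```r
# Plain character vector standing in for test$x
x <- c("abc", "Sample BT-1501-2E stuff", "Bt-1599-3E stuff")

# Word boundary, "BT", optional hyphen, then every non-space character
pattern <- "\\bBT-?\\S*"

if (requireNamespace("stringr", quietly = TRUE)) {
  # str_extract() yields NA for elements with no match
  bt <- stringr::str_extract(x, stringr::regex(pattern, ignore_case = TRUE))
  print(bt)
} else {
  message("stringr is not installed; install.packages(\"stringr\") would be needed")
}
```

Because str_extract() returns NA where nothing matches, no separate test
for unmatched rows should be needed with this approach.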

Thanks,
Joe


On Mon, Oct 24, 2016 at 3:42 PM, David Wolfskill <da...@catwhisker.org> wrote:
>
> On Mon, Oct 24, 2016 at 03:33:20PM -0600, Joe Ceradini wrote:
> > R Helpers,
> >
> > I would like to extract the entire word beginning with "BT" (or "BT-")
> > and not anything else in the string. Or, I would like to extract from
> > BT up until the next space.
> >
> > test <- data.frame(x = c("abc", "Sample BT-1501-2E stuff", "Bt-1599-3E stuff"))
> > test
> >
> > So, from test$x I would like to only extract "BT-1501-2E" and "Bt-1599-3E".
> >
> > I started with straight grep but of course that is not what I need.
> > grep("BT", test$x, value = TRUE, ignore.case = TRUE)
> > "Sample BT-1501-2E stuff" "Bt-1599-3E stuff"
> >
> > In a somewhat similar post, the solution involved boundaries or
> > anchors, but I haven't been able to adapt it to my needs, so I won't
> > even bother including my boundary attempts :)
> > http://stackoverflow.com/questions/7227976/using-grep-in-r-to-find-strings-as-whole-words-but-not-strings-as-part-of-words
> >
> > If possible, it would also be helpful if something was returned, like
> > NA, for rows without a "BT" match. So, conceptually, test$x would
> > return:
> > NA, "BT-1501-2E", "Bt-1599-3E".
> >
> > Thanks!
> > Joe
> > ....
>
> This is not exactly what you requested, as it returns the original
> unmodified string when there's no match; I expect you can come up with
> some code to test for that.  It does, however, meet the rest of your
> requirements -- or so I believe:
>
> > test
>                         x
> 1                     abc
> 2 Sample BT-1501-2E stuff
> 3        Bt-1599-3E stuff
> > sub("^.*(BT-?\\w*).*$", "\\1", test$x, ignore.case = TRUE, perl = TRUE)
> [1] "abc"     "BT-1501" "Bt-1599"
> >
>
> Peace,
> david
> --
> David H. Wolfskill                              da...@catwhisker.org
> Those who would murder in the name of God or prophet are blasphemous cowards.
>
> See http://www.catwhisker.org/~david/publickey.gpg for my public key.
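
In the sub() output quoted above, the match stops at the second hyphen
("BT-1501") because \\w does not match "-", and "abc" comes back unchanged.
One possible base R sketch that returns NA for non-matching rows and takes
everything up to the next space is below. It assumes \\S* is what is wanted;
x, m and bt are illustrative names, not part of the original thread:

```r
# Plain character vector standing in for test$x
x <- c("abc", "Sample BT-1501-2E stuff", "Bt-1599-3E stuff")

# regexpr() gives the position of the first match per element, or -1 if none
m <- regexpr("\\bBT-?\\S*", x, ignore.case = TRUE, perl = TRUE)

# Start with all NA, then fill in only the elements that matched;
# regmatches() returns the matched substrings for those elements only
bt <- rep(NA_character_, length(x))
bt[m != -1] <- regmatches(x, m)

print(bt)
```

If test$x is a factor (the data.frame() default at the time of this
thread), wrapping it in as.character() before the regexpr() call keeps the
sketch working.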

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
