Re: [R] Using regular expressions to detect clusters of consonants in a string

2009-07-01 Thread Mark Heckmann
Hi Gabor, thanks for this great advice. Just one more question: I cannot find in the documentation for gsubfn or strapply how to switch off case sensitivity for the regex, as e.g. the ignore.case = TRUE argument does in gregexpr. Is there a way? TIA, Mark --- Mark

Re: [R] Using regular expressions to detect clusters of consonants in a string

2009-07-01 Thread Gabor Grothendieck
strapply and gsubfn pass the ... argument to gsub, so they accept all the same arguments. See ?strapply and ?gsubfn. e.g. strapply(MyString, "[bcdfghjklmnpqrstvwxyz]+", nchar, ignore.case = TRUE) [[1]] [1] 5 2 gsubfn("[bcdfghjklmnpqrstvwxyz]+", "X", MyString, ignore.case = TRUE) [1] "XiX" On Wed, Jul
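For readers without gsubfn installed, the same case-insensitive matching can be sketched in base R. This is only an illustration: the input string below is hypothetical (the original MyString is not shown in the thread) and was chosen so that the results match the output quoted above.

```r
# Hypothetical input; the thread does not show the real MyString.
x <- "BCDFGiJK"
pat <- "[bcdfghjklmnpqrstvwxyz]+"

# Locate every run of consonants, ignoring case, and measure each run.
m <- gregexpr(pat, x, ignore.case = TRUE)
run_lengths <- nchar(regmatches(x, m)[[1]])
run_lengths                       # 5 2

# Replace every run of consonants with "X", ignoring case.
replaced <- gsub(pat, "X", x, ignore.case = TRUE)
replaced                          # "XiX"
```

Without ignore.case = TRUE the lower-case character class would match none of the upper-case letters, so the string would come back unchanged.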

[R] Using regular expressions to detect clusters of consonants in a string

2009-06-30 Thread Mark Heckmann
Hi, I want to parse a string, extracting the number of occurrences where two consonants clump together. Consider for example the word "hallo". Here I want the algorithm to return 1. For "chess" I want it to return 2. For the word "screw" the result should be negative as it is a clump of three

Re: [R] Using regular expressions to detect clusters of consonants in a string

2009-06-30 Thread Greg Hirson
Mark, "Abstraction" also has a valid two-consonant cluster (ct). Some logic could be added to reject words that have valid twos if they also have longer strings of consonants. This may work as a starting-off point, using strsplit: twocons = function(word){ chars = strsplit(word, "[aeiou]")
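The function in the message above is cut off in the archive, so the following is not Greg's code but a minimal sketch of the same idea under stated assumptions: split on vowels with strsplit, measure the pieces, and treat any piece longer than two characters as a reason to reject the word. The name two_cons_strict and the choice to return 0 on rejection are hypothetical.

```r
# Hypothetical stand-in for the truncated twocons(); not the original code.
two_cons_strict <- function(word) {
  # Splitting on vowels leaves the consonant runs (plus empty strings
  # where a vowel starts the word or two vowels are adjacent).
  chunks <- strsplit(tolower(word), "[aeiou]")[[1]]
  lens <- nchar(chunks)
  # Assumed rejection rule: any run longer than two disqualifies the word.
  if (any(lens > 2)) return(0L)
  sum(lens == 2)
}

two_cons_strict("hallo")        # 1
two_cons_strict("chess")        # 2
two_cons_strict("Abstraction")  # 0, because "bstr" outweighs the valid "ct"
```

Note that splitting on vowels treats every non-vowel character, including digits and punctuation, as part of a consonant run, so this sketch assumes the input is a single alphabetic word.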

Re: [R] Using regular expressions to detect clusters of consonants in a string

2009-06-30 Thread Gabor Grothendieck
Try this: library(gsubfn) s <- "mystring" strapply(s, "[bcdfghjklmnpqrstvwxyz]+", nchar)[[1]] which returns a vector of consonant string lengths. Now apply your algorithm to that. See http://gsubfn.googlecode.com for more. On Tue, Jun 30, 2009 at 11:30 AM, Mark Heckmann <mark.heckm...@gmx.de> wrote:
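One way to "apply your algorithm" to the vector of run lengths might look like the sketch below. It uses base R (gregexpr and its match.length attribute) instead of strapply, and the helper names consonant_run_lengths and summarise_runs are hypothetical. Because the original question is truncated, the sketch simply reports how many runs have exactly two consonants and how many are longer, leaving the final rule to the caller.

```r
# Hypothetical helper: length of each maximal run of consonants in a word.
consonant_run_lengths <- function(word) {
  m <- gregexpr("[bcdfghjklmnpqrstvwxyz]+", word)[[1]]
  if (m[1] == -1L) return(integer(0))   # gregexpr signals "no match" with -1
  as.integer(attr(m, "match.length"))
}

# Hypothetical helper: tabulate runs of exactly two and runs of three or more.
summarise_runs <- function(word) {
  lens <- consonant_run_lengths(word)
  c(pairs = sum(lens == 2), longer = sum(lens > 2))
}

consonant_run_lengths("hallo")  # 1 2
summarise_runs("chess")         # pairs 2, longer 0
summarise_runs("screw")         # pairs 0, longer 1
```

Keeping the two counts separate leaves room for either rule discussed in the thread: counting pairs only, or rejecting a word outright when the longer count is non-zero.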