On Jan 15, 2014, at 4:37 PM, Fisher Dennis wrote:

> R 3.0.2
> OS X
> 
> Colleagues
> 
> I am writing code to read a large number of files in a particular folder.  In 
> some situations, there may be two versions of the file with different 
> extensions, e.g.:
>       FILE.csv
>       FILE.xls
> I extracted the portion before the extension with:
>       sub("\\..*$", "", basename(FILELIST))
> then used 
>       duplicated
> to find duplicates.  All was well until I encountered files named:
>       FILE.XXX.csv
>       FILE.YYY.xls
> 
> My regular expression extracted only the “FILE” portion of the text and 
> claimed that the filenames (without the extensions) matched.  Can someone 
> provide me with the appropriate regular expression to deal with this?  Thanks.

Why not:

sub("\\..{3}$", "", basename(FILELIST))

See ?regex
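
The pattern above assumes every extension is exactly three characters long. If that is not guaranteed, a minimal sketch of a more general approach is to strip only the text after the last dot. The file names below are hypothetical, made up for illustration; `tools::file_path_sans_ext()` ships with R and should give the same result for alphanumeric extensions.

```r
# Hypothetical file list, for illustration only (not from the original post)
FILELIST <- c("/data/FILE.XXX.csv", "/data/FILE.YYY.xls",
              "/data/OTHER.csv", "/data/OTHER.xls")

# Original pattern: removes everything from the FIRST dot onward,
# so FILE.XXX.csv and FILE.YYY.xls both collapse to "FILE"
naive <- sub("\\..*$", "", basename(FILELIST))

# Remove only the final extension: a dot followed by non-dot
# characters anchored at the end of the string
stems <- sub("\\.[^.]*$", "", basename(FILELIST))

# Same idea using the helper from the 'tools' package
stems2 <- tools::file_path_sans_ext(basename(FILELIST))

# Flag files that share a stem with an earlier file
dups <- duplicated(stems)
```

With these inputs, `naive` treats the two FILE entries as the same stem, while `stems` keeps `FILE.XXX` and `FILE.YYY` distinct and still flags `OTHER.xls` as a duplicate of `OTHER.csv`.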

-- 

David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
