James

Yes, I just did much the same myself, assuming it might be a wait of a few seconds - similar result, less than a second. Brute force is less prone to making my brain hurt than regular expressions, too!
Thanks for the input.

_____
Ian Thomas
Victoria Park, Western Australia
_____

From: [email protected] [mailto:[email protected]] On Behalf Of James Chapman-Smith
Sent: Friday, November 26, 2010 3:19 PM
To: 'ozDotNet'
Subject: RE: One for next week

Hi Ian,

I just did a test of the speed of removing the invalid chars using brute force. Here's my code:

    var invalids = System.IO.Path.GetInvalidPathChars()
        .Union(System.IO.Path.GetInvalidFileNameChars());
    var text = new string('x', 200000);
    var query = from c in text where !invalids.Contains(c) select c;
    var clean = new string(query.ToArray());

My computer manages to strip the chars from a 139,000 character string in about a second - timed using System.Diagnostics.Stopwatch. So for many circumstances I think that a brute force approach is quite workable.

What do you think?

Cheers.

James.

From: [email protected] [mailto:[email protected]] On Behalf Of Ian Thomas
Sent: Friday, 26 November 2010 17:22
To: 'ozDotNet'
Subject: One for next week

My regex is very irregular, so some ideas would be nice.

Problem: excluding the prohibited characters from file paths and file names.

I started off thinking that \ / : * ? " < > | would be about the maximum, and I would just pass the filenames (generated from text titles - eg, books, videos, etc) through a simple looping routine looking for the 9 prohibited characters.

Using a simple regex, Regex.Replace(strIn, "[^\...@-]", "") is too restrictive - for example, bracketed numbers (1), [23], etc are very common. I've devoted too long to expanding this without much joy, and would appreciate help.

In my researches, I discovered these two helpful methods in System.IO - which is why my first approach, comparing characters and arrays, was abandoned to explore if regular expressions might help.
Path.GetInvalidPathChars() - Get a list of invalid path characters (returns an array of Char)

and

Path.GetInvalidFileNameChars() - Get a list of invalid filename characters (returns an array of Char)

The number returned is surprisingly large, so iterating through even a 50-character long filename / path name and checking for the undesirable characters would take considerably longer than doing the same for 9 characters.

_____
Ian Thomas
Victoria Park, Western Australia
_____
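
On the worry that the combined list of invalid characters is large: one possible way to keep the per-character check cheap is to load both arrays into a HashSet<char> once, so each lookup no longer depends on how many invalid characters there are. In the brute-force snippet quoted in this thread, invalids is a deferred Union sequence, so each Contains call re-enumerates it; building the set up front avoids that. What follows is only a minimal sketch under those assumptions. The names FileNameCleaner and Clean are hypothetical and do not come from the thread.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

// Hypothetical helper, named here for illustration only.
public static class FileNameCleaner
{
    // Built once: every character that is invalid in a path or a file name
    // on the current platform (the set differs between operating systems).
    private static readonly HashSet<char> Invalid = new HashSet<char>(
        Path.GetInvalidPathChars().Concat(Path.GetInvalidFileNameChars()));

    // Returns a copy of text with every invalid character removed.
    public static string Clean(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            // HashSet lookup instead of scanning the full array each time.
            if (!Invalid.Contains(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

Calling FileNameCleaner.Clean("Title (1) [23]") should leave the parentheses and square brackets alone, since neither method reports them as invalid. Exactly which other characters get removed depends on the platform the code runs on.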
