Sam, that's a method I had never seen - I learn something every day! (The problem is forgetting it within a week.)
_____

Ian Thomas
Victoria Park, Western Australia

_____

From: [email protected] [mailto:[email protected]] On Behalf Of Sam Lai
Sent: Friday, November 26, 2010 3:26 PM
To: ozDotNet
Subject: Re: One for next week

I wonder if using String.IndexOfAny would be any faster, e.g.

    string filename = @"some\invalid+:file^*>?name.txt";
    if (filename.IndexOfAny(System.IO.Path.GetInvalidFileNameChars()) >= 0)
        System.Console.WriteLine("contains invalid filename chars!");
    else
        System.Console.WriteLine("valid!");

On 26 November 2010 18:18, James Chapman-Smith <[email protected]> wrote:
> Hi Ian,
>
> I just did a test of the speed of removing the invalid chars using brute
> force. Here's my code:
>
>     var invalids = System.IO.Path.GetInvalidPathChars()
>         .Union(System.IO.Path.GetInvalidFileNameChars());
>
>     var text = new string('x', 200000);
>
>     var query = from c in text
>                 where !invalids.Contains(c)
>                 select c;
>
>     var clean = new string(query.ToArray());
>
> My computer manages to strip the chars from a 139,000 character string in
> about a second - timed using System.Diagnostics.Stopwatch. So for many
> circumstances I think that a brute force approach is quite workable. What do
> you think?
>
> Cheers.
>
> James.
>
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Ian Thomas
> Sent: Friday, 26 November 2010 17:22
> To: 'ozDotNet'
> Subject: One for next week
>
> My regex is very irregular, so some ideas would be nice
>
> Problem: excluding the prohibited characters from file paths and file names.
> I started off thinking that \ / : * ? " < > | would be about the maximum,
> and I would just pass the filenames (generated from text titles - eg, books,
> videos, etc) through a simple looping routine looking for the 9 prohibited
> characters.
>
> Using a simple regex, Regex.Replace(strIn, "[^\...@-]", "") is too
> restrictive - for example, bracketed numbers (1), [23], etc are very common.
> I've devoted too long to expanding this without much joy, and would
> appreciate help.
>
> In my researches, I discovered these two helpful methods in System.IO -
> which is why my first approach, comparing characters and arrays, was
> abandoned to explore if regular expressions might help.
>
> Path.GetInvalidPathChars() - Get a list of invalid path characters (returns
> an array of Char)
>
> and
>
> Path.GetInvalidFileNameChars() - Get a list of invalid filename characters
> (returns an array of Char)
>
> The number returned is surprisingly large, so iterating through even a
> 50-character long filename / path name and checking for the undesirable
> characters would be considerably longer than doing the same for 9
> characters.
>
> ________________________________
>
> Ian Thomas
> Victoria Park, Western Australia
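
A possible variation on the two approaches in this thread, offered only as a sketch. Enumerable.Union returns a lazily evaluated sequence, so calling Contains on it inside the query may rebuild the union for every character tested, which could account for part of the one-second timing. Copying the invalid characters into a HashSet<char> once avoids that. The names FileNameSanitizer, Strip and HasInvalidChars below are hypothetical and chosen for illustration. The contents of both invalid-character arrays depend on the platform, so no expected output is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

// Hypothetical helper class; not part of System.IO.
static class FileNameSanitizer
{
    // Built once: both invalid-character arrays merged into a set,
    // so each membership test is a hash lookup instead of a scan.
    private static readonly HashSet<char> Invalid = new HashSet<char>(
        Path.GetInvalidPathChars().Concat(Path.GetInvalidFileNameChars()));

    // Returns a copy of the input with every invalid character removed.
    public static string Strip(string text)
    {
        return new string(text.Where(c => !Invalid.Contains(c)).ToArray());
    }

    // Detection only, using String.IndexOfAny as suggested in the thread.
    public static bool HasInvalidChars(string text)
    {
        return text.IndexOfAny(Path.GetInvalidFileNameChars()) >= 0;
    }
}

static class Program
{
    static void Main()
    {
        // Verbatim string (@"...") so the backslash is taken literally.
        string title = @"some\invalid+:file^*>?name.txt";
        Console.WriteLine(FileNameSanitizer.HasInvalidChars(title));
        Console.WriteLine(FileNameSanitizer.Strip(title));

        // Same size of test string as in the brute-force test above.
        string big = new string('x', 200000);
        Stopwatch watch = Stopwatch.StartNew();
        string clean = FileNameSanitizer.Strip(big);
        watch.Stop();
        Console.WriteLine(clean.Length + " chars kept in "
            + watch.ElapsedMilliseconds + " ms");
    }
}
```

IndexOfAny only reports whether (and where) an invalid character occurs, so it suits validation; Strip is the one to use when the titles have to be turned into usable file names.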
