I wonder if using String.IndexOfAny would be any faster, e.g.

// Verbatim string (@) so the backslash is not read as an escape sequence.
string filename = @"some\invalid+:file^*>?name.txt";
if (filename.IndexOfAny(System.IO.Path.GetInvalidFileNameChars()) >= 0)
    System.Console.WriteLine("contains invalid filename chars!");
else
    System.Console.WriteLine("valid!");
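
If the check does pay off, it could also serve as a fast path before any stripping is done. Here is a minimal sketch, assuming that clean names are the common case; the class and method names (FileNameCleaner, StripInvalidFileNameChars) are made up for illustration and are not part of the framework. I haven't timed it against the LINQ version.

```csharp
using System;
using System.IO;
using System.Text;

static class FileNameCleaner
{
    // Hypothetical helper, for illustration only.
    public static string StripInvalidFileNameChars(string name)
    {
        char[] invalid = Path.GetInvalidFileNameChars();

        // Fast path: nothing to remove, so return the input untouched.
        if (name.IndexOfAny(invalid) < 0)
            return name;

        // Slow path: copy across only the characters that are allowed.
        var sb = new StringBuilder(name.Length);
        foreach (char c in name)
        {
            if (Array.IndexOf(invalid, c) < 0)
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Note that the set returned by GetInvalidFileNameChars() depends on the platform, so the cleaned result can differ between operating systems.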

On 26 November 2010 18:18, James Chapman-Smith <[email protected]> wrote:
> Hi Ian,
>
> I just did a test of the speed of removing the invalid chars using brute
> force. Here’s my code:
>
> var invalids = System.IO.Path.GetInvalidPathChars()
>      .Union(System.IO.Path.GetInvalidFileNameChars());
>
> var text = new string('x', 200000);
>
> var query = from c in text
>                 where !invalids.Contains(c)
>                 select c;
>
> var clean = new string(query.ToArray());
>
> My computer manages to strip the chars from a 139,000 character string in
> about a second – timed using System.Diagnostics.Stopwatch.  So for many
> circumstances I think that a brute force approach is quite workable. What do
> you think?
>
> Cheers.
>
> James.
>
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Ian Thomas
> Sent: Friday, 26 November 2010 17:22
> To: 'ozDotNet'
> Subject: One for next week
>
> My regex is very irregular, so some ideas would be nice
>
> Problem: excluding the prohibited characters from file paths and file names.
> I started off thinking that \ / : * ? " < > | would be about the maximum,
> and I would just pass the filenames (generated from text titles – e.g. books,
> videos, etc.) through a simple looping routine looking for the 9 prohibited
> characters.
>
> Using a simple regex, Regex.Replace(strIn, "[^\...@-]", "") is too
> restrictive – for example, bracketed numbers (1), [23], etc are very common.
> I’ve devoted too long to expanding this without much joy, and would
> appreciate help.
>
> In my research, I discovered these two helpful methods in System.IO –
> which is why my first approach, comparing characters and arrays, was
> abandoned to explore if regular expressions might help.
>
> Path.GetInvalidPathChars() - Get a list of invalid path characters (returns
> an array of Char)
>
> and
>
> Path.GetInvalidFileNameChars() - Get a list of invalid filename characters
> (returns an array of Char)
>
> The number returned is surprisingly large, so iterating through even a
> 50-character-long filename / path name and checking for the undesirable
> characters would take considerably longer than doing the same for 9
> characters.
>
> ________________________________
>
> Ian Thomas
> Victoria Park, Western Australia
