Sam, that's a method I had never seen - I learn something every day!
(The problem is that I forget it within a week.)

Ian Thomas
Victoria Park, Western Australia

From: [email protected] [mailto:[email protected]]
On Behalf Of Sam Lai
Sent: Friday, November 26, 2010 3:26 PM
To: ozDotNet
Subject: Re: One for next week

 

I wonder if using String.IndexOfAny would be any faster, e.g.

// Verbatim string (@"...") so the backslash is not parsed as an escape sequence.
string filename = @"some\invalid+:file^*>?name.txt";
if (filename.IndexOfAny(System.IO.Path.GetInvalidFileNameChars()) >= 0)
    System.Console.WriteLine("contains invalid filename chars!");
else
    System.Console.WriteLine("valid!");
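
The check above only detects invalid characters. If the goal is to strip them, as in the original question, something like the following might work. It is a minimal sketch, not a tested benchmark: the class name FileNameSanitizer and the method Strip are made up for illustration, and it assumes only filename characters (not path characters) matter.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, named here for illustration only.
static class FileNameSanitizer
{
    // Built once; a HashSet gives constant-time membership tests,
    // so the size of the invalid-character list matters much less.
    private static readonly char[] InvalidArray = Path.GetInvalidFileNameChars();
    private static readonly HashSet<char> InvalidSet = new HashSet<char>(InvalidArray);

    // Returns name with every invalid filename character removed.
    public static string Strip(string name)
    {
        // Fast path: most names contain nothing to remove.
        if (name.IndexOfAny(InvalidArray) < 0)
            return name;

        var sb = new StringBuilder(name.Length);
        foreach (char c in name)
        {
            if (!InvalidSet.Contains(c))
                sb.Append(c);
        }
        return sb.ToString();
    }

    static void Main()
    {
        // '/' is invalid on every platform; brackets and parentheses are kept.
        Console.WriteLine(Strip("some/invalid+file^name (1) [23].txt"));
    }
}
```

The set of invalid characters depends on the operating system, so the result for characters such as ':' or '?' will differ between Windows and other platforms.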

On 26 November 2010 18:18, James Chapman-Smith <[email protected]>
wrote:
> Hi Ian,
>
>
>
> I just did a test of the speed of removing the invalid chars using brute
> force. Here's my code:
>
> // Needs: using System.Linq;
> // Combined set of characters invalid in paths or in file names.
> var invalids = System.IO.Path.GetInvalidPathChars()
>     .Union(System.IO.Path.GetInvalidFileNameChars());
>
> // Test input.
> var text = new string('x', 200000);
>
> // Keep only the characters that are not in the invalid set.
> var query = from c in text
>             where !invalids.Contains(c)
>             select c;
>
> var clean = new string(query.ToArray());
>
> My computer manages to strip the chars from a 139,000 character string in
> about a second - timed using System.Diagnostics.Stopwatch.  So for many
> circumstances I think that a brute force approach is quite workable. What
> do you think?
>
>
>
> Cheers.
>
>
>
> James.
>
>
>
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Ian Thomas
> Sent: Friday, 26 November 2010 17:22
> To: 'ozDotNet'
> Subject: One for next week
>
>
>
> My regex is very irregular, so some ideas would be nice
>
> Problem: excluding the prohibited characters from file paths and file
> names. I started off thinking that \ / : * ? " < > | would be about the
> maximum, and I would just pass the filenames (generated from text titles -
> e.g., books, videos, etc.) through a simple looping routine looking for the
> 9 prohibited characters.
>
> Using a simple regex, Regex.Replace(strIn, "[^\...@-]", "") is too
> restrictive - for example, bracketed numbers (1), [23], etc. are very
> common. I've devoted too long to expanding this without much joy, and
> would appreciate help.
>
> In my research, I discovered these two helpful methods in System.IO -
> which is why my first approach, comparing characters and arrays, was
> abandoned to explore whether regular expressions might help.
>
> Path.GetInvalidPathChars() - Get a list of invalid path characters
> (returns an array of Char)
>
> and
>
> Path.GetInvalidFileNameChars() - Get a list of invalid filename
> characters (returns an array of Char)
>
> The number returned is surprisingly large, so iterating through even a
> 50-character long filename / path name and checking for the undesirable
> characters would take considerably longer than doing the same for 9
> characters.
>
> ________________________________
>
> Ian Thomas
> Victoria Park, Western Australia 
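
On the regex side of the original question: rather than whitelisting allowed characters, one option is to build a character class from whatever Path.GetInvalidFileNameChars() returns, so bracketed numbers and other legal punctuation survive. This is a rough sketch only; the names InvalidCharRegex and Clean are invented for the example, and it assumes removing the characters (rather than replacing them with something) is what is wanted.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, named here for illustration only.
static class InvalidCharRegex
{
    // Regex.Escape protects characters such as \ * ? | so they are
    // treated literally inside the [...] character class.
    private static readonly Regex InvalidChars = new Regex(
        "[" + Regex.Escape(new string(Path.GetInvalidFileNameChars())) + "]");

    // Returns title with every invalid filename character removed.
    public static string Clean(string title)
    {
        return InvalidChars.Replace(title, "");
    }

    static void Main()
    {
        // '/' is removed everywhere; '?' is removed on Windows only.
        Console.WriteLine(Clean("Who? What/Where (1) [23]"));
    }
}
```

The regex is constructed once and reused, so the length of the invalid-character list is paid for when the pattern is built rather than on every filename.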
