Anže Vidmar wrote:
>
> I have some nasty, non-ascii characters in some files that contain PHP
> code (actually somewhere in my SVN branch). What I want to do here is
> to recursively find all the files that contain a specific non-ascii
> character sequence. And most importantly - I need to know the names
> of the files containing it.
>
> So far, I found a script that looks into a file for non-ascii
> characters and prints these characters in hex:
>
> while (<>) {
> s/([\x80-\xff])/sprintf "\\x{%02x}",ord($1)/eg;
> print;
> }
>
> Ok, this is good, the non-ascii character (in hex) that I'm looking
> for is:
>
> \x{ef}\x{bb}\x{bf}
>
> The problem here is that I can't get this script to run recursively
> and I don't get the name of the file that actually contains these
> characters.
>
> I've tried with bash, but since it's standard output, I can't get any
> result from this. Here is what I've tried:
>
> find |xargs /usr/local/bin/check_for_non-ascii_characters.sh |grep -l
> 'x{ef}\\x{bb}\\x{bf}'
>
> So, I need a way to recursively find non-ascii characters (a specific
> pattern, mentioned before) in all files and I need the name of the
> files containing it.
>
> It would be enough if I were only able to see which files contain
> this character sequence.

As you have established, finding characters within the given range is trivial,
but I think you need to be clear about what you mean by recursion.

Without further information I would have written something that worked on a
given .php file and also on everything that file referred to via include() or
require(). Chas has assumed differently and I hope you will tell us what it is
that you need.

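
If what you mean is recursion over a directory tree, a minimal sketch might
look something like the following. It assumes you only care about the byte
sequence EF BB BF (which is the UTF-8 byte order mark) and that the files are
small enough to read whole. The sub names file_has_bytes and
find_matching_files are made up for illustration; File::Find is the core
module that does the directory walking.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Byte sequence to look for: EF BB BF (the UTF-8 byte order mark).
my $pattern = qr/\xEF\xBB\xBF/;

# Hypothetical helper: true if the file at $path contains the byte sequence.
sub file_has_bytes {
    my ($path) = @_;
    open my $fh, '<:raw', $path or return 0;   # skip unreadable files
    local $/;                                   # slurp mode: read whole file
    my $data = <$fh>;
    close $fh;
    return 0 unless defined $data;
    return $data =~ $pattern ? 1 : 0;
}

# Hypothetical helper: walk the given directories and return the sorted
# names of all plain files that contain the byte sequence.
sub find_matching_files {
    my @dirs = @_;
    my @hits;
    find({
        no_chdir => 1,                          # keep $_ as the full path
        wanted   => sub {
            return unless -f $_;                # plain files only
            push @hits, $_ if file_has_bytes($_);
        },
    }, @dirs);
    return sort @hits;
}

# Print one matching file name per line; default to the current directory.
print "$_\n" for find_matching_files(@ARGV ? @ARGV : '.');
```

Run it with the top of your working copy as the argument, or with no argument
to search the current directory. In an SVN checkout the copies kept under the
.svn directories may be reported as well, so you may want to filter those out.
This does not follow include() or require(), so it only covers the
directory-tree reading of "recursively".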
Rob