I've tried this wonderful command:
hexdump -c file.txt
and found that I have to keep adding more and more chars to erase, as
follows:

$text=~s/\177//g;
$text=~s/\377//g;
$text=~s/\335//g;
$text=~s/\360//g;
$text=~s/\204//g;
$text=~s/\222/\n/g;
$text=~s/\214//g;
$text=~s/\216//g;
$text=~s/\224//g;
$text=~s/\240//g;
$text=~s/\237//g;
$text=~s/\234//g;
$text=~s/\325//g;
$text=~s/\351//g;
$text=~s/\352//g;
$text=~s/\355//g;
$text=~s/\361//g;
$text=~s/\362//g;
$text=~s/\366//g;

Is there a way to erase all the chars whose code is higher than, say,
octal 300? Does this make sense?
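
One possible approach, sketched under some assumptions: the codes in the
list above are octal escapes, and several of them (\177, \204, \222, ...)
are below octal 300, so this sketch uses a range starting at \177 rather
than at 300. The helper names clean_bytes and keep_printable are
hypothetical, and $text is assumed to hold raw bytes (for example, read
from a filehandle after binmode), not decoded Unicode text.

```perl
use strict;
use warnings;

# Hypothetical helper: remove unwanted bytes with a single range
# instead of one substitution per byte value.
sub clean_bytes {
    my ($text) = @_;
    $text =~ s/\222/\n/g;      # special case from the list above: octal 222 becomes a newline
    $text =~ tr/\177-\377//d;  # delete DEL (octal 177) and every byte from octal 200 to 377
    return $text;
}

# Hypothetical alternative: state what to keep rather than what to drop.
# Keeps printable ASCII plus tab and newline; everything else is removed.
sub keep_printable {
    my ($text) = @_;
    $text =~ s/[^\x20-\x7e\t\n]//g;
    return $text;
}
```

Either helper would be called as $text = clean_bytes($text); in place of
the long list. Two caveats: keep_printable keeps tab and newline, whereas
s/[\x00-\x1f]//g removes them, and if the file holds Latin-1 text, bytes
such as \351 (e with an acute accent) are legitimate letters that a
blanket range will also remove.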

Thank you again!

On Oct 3, 2008, at 7:33 AM, Brian Raven wrote:

> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of
> Alejandro Santillan Iturres
> Sent: 02 October 2008 19:45
> To: activeperl@listserv.ActiveState.com
> Cc: [EMAIL PROTECTED]
> Subject: Re: regexp to "clean" a text file
>
>> Thank you William, Bill and Tim. Finally s/[\x00-\x1f]//g did the
>> trick, almost perfectly.
>> The original file is the Palm database of memo pads. The text is
>> there, plain. Several mixed control characters were present.
>> The system I am working on is a Fedora Linux box. I have no hex utility
>> installed to make the dump, so I don't know if the ^E is really a ^E.
>
> I find that a little hard to believe. Try 'hexdump', or if that isn't
> present you should at least have 'od'. If neither of them is installed,
> your Linux installation sounds a bit broken. Unless you can identify
> which characters are to be kept or discarded, you will find it difficult
> to 'clean' your data effectively.
>
> HTH
>
> -- 
> Brian Raven
>
> _______________________________________________
> ActivePerl mailing list
> ActivePerl@listserv.ActiveState.com
> To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
