Re: VIM removal of Unicode ('bomb')

2009-02-05 Thread Avraham Rosenberg
On Wed, Feb 04, 2009 at 04:53:26PM +0200, Noam Rathaus wrote:
> I am using:
> VIM - Vi IMproved 6.1 (2002 Mar 24, compiled Jan 15 2003 08:05:27)
> 
> And I have a few unicode characters (unicode encoding 'bomb'/marker) at the 
> beginning of the file that I want to remove.
> 000 bbef 3cbf 4421 434f 5954 4550 6820 6d74
> 
> I'm referring to 0xbb, 0xef, 0x3c, 0xbf
> 
> I would like to remove them, but opening the file with vim doesn't show them,
> since vim treats them as a marker and hides them.
> 
> Can anyone help me get rid of them?
> 
> Any solution (not just using VIM) would be great.
> 
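Hi,
Try filtering with "tr":
LC_ALL=C tr -d '\357\273\277' < Sick-file > Healed-file
tr takes octal escapes, not hexadecimal, so 0xef, 0xbb and 0xbf are written
as \357, \273 and \277. I left out 0x3c because it is an ordinary "<". Note
that this deletes those three bytes wherever they occur, not only at the
beginning, so it is safe only if the rest of the file is plain ASCII.
Cheers, Avraham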
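
A filter like the one above acts on every matching byte in the file. As a
minimal sketch of a filter that touches only the start of the stream, assuming
the marker is the UTF-8 BOM (EF BB BF), something like the following could be
used. The function name copy_without_bom is made up for this illustration.

```python
import codecs
import shutil


def copy_without_bom(src, dst):
    """Copy binary stream src to dst, dropping one leading UTF-8 BOM if present."""
    head = src.read(len(codecs.BOM_UTF8))
    if head != codecs.BOM_UTF8:
        # No BOM at the start: keep the bytes we peeked at.
        dst.write(head)
    # Everything after the first three bytes is copied unchanged.
    shutil.copyfileobj(src, dst)
```

To use it as a shell filter, one could call
copy_without_bom(sys.stdin.buffer, sys.stdout.buffer) from a small script and
run it as: python script.py < Sick-file > Healed-file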

-- 
Please avoid sending Excel or PowerPoint attachments to this address.

___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il


Re: VIM removal of Unicode ('bomb')

2009-02-04 Thread Noam Rathaus
Shachar,

Not my invention :)

In vim, run
 :help bomb

'bomb' is the name of the vim option. The help text agrees with you that the
mark itself is called a BOM.

On Wednesday 04 February 2009 19:21:41 Shachar Shemesh wrote:
> Noam Rathaus wrote:
> > I am using:
> > VIM - Vi IMproved 6.1 (2002 Mar 24, compiled Jan 15 2003 08:05:27)
> >
> > And I have a few unicode characters (unicode encoding 'bomb'/marker) at
> > the beginning of the file that I want to remove.
> > 000 bbef 3cbf 4421 434f 5954 4550 6820 6d74
> >
> > I'm referring to 0xbb, 0xef, 0x3c, 0xbf
> >
> > I would like to remove them, but opening the file with vim doesn't show
> > them, since vim treats them as a marker and hides them.
> >
> > Can anyone help me get rid of them?
> >
> > Any solution (not just using VIM) would be great.
>
> I would use khexedit, myself, or set the locale to C and open with vim.
>
> Just a technical note: It's "BOM" - Byte Order Mark. I've never heard of
> bomb (in the Unicode context).
>
> Shachar


-- 
Noam Rathaus
CTO
no...@beyondsecurity.com
http://www.beyondsecurity.com

"Know that you are safe."

Beyond Security Finalist for the "Red Herring 100 Global" Awards 2007



Re: VIM removal of Unicode ('bomb')

2009-02-04 Thread Shachar Shemesh

Noam Rathaus wrote:
> I am using:
> VIM - Vi IMproved 6.1 (2002 Mar 24, compiled Jan 15 2003 08:05:27)
>
> And I have a few unicode characters (unicode encoding 'bomb'/marker) at the
> beginning of the file that I want to remove.
>
> 000 bbef 3cbf 4421 434f 5954 4550 6820 6d74
>
> I'm referring to 0xbb, 0xef, 0x3c, 0xbf

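Supplementing my previous answer, your stated byte order is wrong. The dump
shows 16-bit words, apparently from a little-endian machine, so the two bytes
of each word appear swapped. The actual order is 0xef 0xbb 0xbf 0x3c. The
first three bytes of the file are the BOM. The fourth byte, 0x3c, encodes
a "<" and is not part of the metadata.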
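
To check the byte order by hand, the dump can be converted back into file
order. This is only a sketch, and it assumes the dump lists 16-bit words as
read on a little-endian machine, which the thread does not state outright. The
helper name words_to_bytes is made up for this illustration.

```python
def words_to_bytes(dump):
    """Turn space-separated 16-bit hex words from a little-endian dump
    into bytes in file order."""
    return b"".join(int(word, 16).to_bytes(2, "little") for word in dump.split())


# The words from the dump in the question, without the offset column.
data = words_to_bytes("bbef 3cbf 4421 434f 5954 4550 6820 6d74")
print(data[:3].hex())  # efbbbf, the UTF-8 BOM
print(data[3:])        # b'<!DOCTYPE htm', the start of the real content
```

Under that assumption the first three bytes come out as ef bb bf, and the
fourth byte, 0x3c, is the "<" that opens the document.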


Shachar



Re: VIM removal of Unicode ('bomb')

2009-02-04 Thread Shachar Shemesh

Noam Rathaus wrote:
> I am using:
> VIM - Vi IMproved 6.1 (2002 Mar 24, compiled Jan 15 2003 08:05:27)
>
> And I have a few unicode characters (unicode encoding 'bomb'/marker) at the
> beginning of the file that I want to remove.
>
> 000 bbef 3cbf 4421 434f 5954 4550 6820 6d74
>
> I'm referring to 0xbb, 0xef, 0x3c, 0xbf
>
> I would like to remove them, but opening the file with vim doesn't show
> them, since vim treats them as a marker and hides them.
>
> Can anyone help me get rid of them?
>
> Any solution (not just using VIM) would be great.

I would use khexedit, myself, or set the locale to C and open with vim.

Just a technical note: It's "BOM" - Byte Order Mark. I've never heard of 
bomb (in the Unicode context).


Shachar



Re: VIM removal of Unicode ('bomb')

2009-02-04 Thread Yedidyah Bar-David
On Wed, Feb 04, 2009 at 04:53:26PM +0200, Noam Rathaus wrote:
> I am using:
> VIM - Vi IMproved 6.1 (2002 Mar 24, compiled Jan 15 2003 08:05:27)
> 
> And I have a few unicode characters (unicode encoding 'bomb'/marker) at the 
> beginning of the file that I want to remove.
> 000 bbef 3cbf 4421 434f 5954 4550 6820 6d74
> 
> I'm referring to 0xbb, 0xef, 0x3c, 0xbf
> 
> I would like to remove them, but opening the file with vim doesn't show them,
> since vim treats them as a marker and hides them.
> 
> Can anyone help me get rid of them?
> 
> Any solution (not just using VIM) would be great.

You can use xxd, which ships with vim, to view and edit the file as a hex
dump. The basics are simple; search for 'vim xxd' for more.
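
For a quick look at the first bytes without a hex editor, a rough sketch
along these lines could also do, assuming the marker in question is the UTF-8
BOM. The function names has_utf8_bom and first_bytes_hex are invented for this
example.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at path starts with the UTF-8 BOM (EF BB BF)."""
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def first_bytes_hex(path, count=16):
    """Return the first count bytes of the file as hex, in file order."""
    with open(path, "rb") as f:
        return " ".join(f"{b:02x}" for b in f.read(count))
```

Unlike a dump of 16-bit words, first_bytes_hex prints one byte at a time, so
the bytes appear in the order they are stored in the file.
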
-- 
Didi




Re: VIM removal of Unicode ('bomb')

2009-02-04 Thread Gabor Szabo
On Wed, Feb 4, 2009 at 4:53 PM, Noam Rathaus wrote:
> I am using:
> VIM - Vi IMproved 6.1 (2002 Mar 24, compiled Jan 15 2003 08:05:27)
>
> And I have a few unicode characters (unicode encoding 'bomb'/marker) at the
> beginning of the file that I want to remove.
> 000 bbef 3cbf 4421 434f 5954 4550 6820 6d74
>
> I'm referring to 0xbb, 0xef, 0x3c, 0xbf
>
> I would like to remove them, but opening the file with vim doesn't show them,
> since vim treats them as a marker and hides them.
>
> Can anyone help me get rid of them?
>
> Any solution (not just using VIM) would be great.
>

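Try this one:

perl -i.bak -0777 -pe 's/^\xef\xbb\xbf//' file.txt

It edits the file in place and also creates a backup file with a .bak
extension. The bytes are given in file order (ef bb bf); 0x3c is the "<" that
starts the content, so it is left alone.

See perldoc perlrun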
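
For anyone without perl at hand, roughly the same idea can be sketched in
Python. This is an illustration under assumptions, not a drop-in equivalent:
it assumes the marker is the UTF-8 BOM, the function name strip_bom_in_place
and the backup_suffix parameter are made up here, and unlike perl -i.bak it
writes the backup only when it actually changes the file.

```python
import codecs
import shutil


def strip_bom_in_place(path, backup_suffix=".bak"):
    """Remove a leading UTF-8 BOM from the file at path.

    A backup copy is written first. Returns True if a BOM was removed.
    """
    with open(path, "rb") as f:
        data = f.read()  # whole file in memory, much like perl -0777
    if not data.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, file left untouched
    shutil.copy2(path, str(path) + backup_suffix)
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True
```

Calling strip_bom_in_place("file.txt") would leave the original bytes in
file.txt.bak and the stripped content in file.txt.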

Gabor

--
Gabor Szabo http://szabgab.com/blog.html
Perl Training in Israel http://www.pti.co.il/
Test Automation Tips http://szabgab.com/test_automation_tips.html



VIM removal of Unicode ('bomb')

2009-02-04 Thread Noam Rathaus
I am using:
VIM - Vi IMproved 6.1 (2002 Mar 24, compiled Jan 15 2003 08:05:27)

And I have a few unicode characters (unicode encoding 'bomb'/marker) at the 
beginning of the file that I want to remove.
000 bbef 3cbf 4421 434f 5954 4550 6820 6d74

I'm referring to 0xbb, 0xef, 0x3c, 0xbf

I would like to remove them, but opening the file with vim doesn't show them,
since vim treats them as a marker and hides them.

Can anyone help me get rid of them?

Any solution (not just using VIM) would be great.

-- 
Noam Rathaus
CTO
no...@beyondsecurity.com
http://www.beyondsecurity.com

"Know that you are safe."

Beyond Security Finalist for the "Red Herring 100 Global" Awards 2007
