RE: Help with end of line charaters

Westcott Andrew-AWESTCO1 Mon, 23 Feb 2004 06:46:46 -0800

Hi,

You have solved my problem; the key was in your script. I had not been
reading the file in binary mode, and so the different LF and CR characters
were getting lost.

Thanks

I did run the script you sent, which was very helpful, and I will store it
away as I'm sure it will come in useful.

Script output

Obj_ID
09
In_link
0d    0a
WBX586
09
2SV_WBX123
0d    0a
WBX2367
09
2SV_WBX123
0d    0a
WBX2428
09
2SV_WBX123
0a
WBX_APPS169
0d    0a
WBX588
09
2SV_WBX123
0d    0a
WBX2432
09
2SV_WBX123
0d    0a
WBX589
09
2SV_WBX123
0d    0a
WBX2433
09
2SV_WBX123
0d    0a

so it would appear I have 

field1_1 <TAB> field2_1 <LF> field2_2 <LF> .... field2_x <CR> <LF>

With the binmode I can replace the LF with commas and then my script is ok.
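
For the archive, a minimal sketch of that last step. It assumes the layout
described above (CR LF ends a record, a bare LF separates items within a
field). The sub names and the sample data are made up for illustration and
are not taken from the real script or file.

```perl
use strict;
use warnings;

# Read a whole file as raw bytes, so CR and LF arrive untranslated.
sub slurp_binary {
    my ($path) = @_;
    open my $in, '<', $path or die "Couldn't open $path: $!";
    binmode $in;
    local $/;                 # undef record separator: read everything
    my $raw = <$in>;
    close $in;
    return $raw;
}

# Split raw content into records on CR LF, then turn each bare LF
# left inside a record into a comma.
sub reformat_records {
    my ($raw) = @_;
    my @records = split /\r\n/, $raw;
    s/\n/,/g for @records;
    return @records;
}

# Hypothetical sample shaped like the dump above.
my $raw = "WBX2428\t2SV_WBX123\nWBX_APPS169\r\nWBX588\t2SV_WBX123\r\n";
print "$_\n" for reformat_records($raw);
```

On a real file the last two lines would become something like
reformat_records(slurp_binary($filename)), with $filename being whatever
file is to be processed.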

Thanks once more

Andy 

-----Original Message-----
From: R. Joseph Newton [mailto:[EMAIL PROTECTED] 
Sent: 23 February 2004 05:50
To: Westcott Andrew-AWESTCO1
Cc: [EMAIL PROTECTED]
Subject: Re: Help with end of line charaters

Westcott Andrew-AWESTCO1 wrote:

> Hi,
>
> I'm new to perl but need to write a script that takes a file and formats
> lines.
>
> The file has 2 fields that are tab separated and each field is made up of
> items separated by some type of linefeed character. The end of the second
> field is identified by another type of linefeed character.
>
> When I view the file in VIM the second linefeed shows as a ^M so there must
> be a way of identifying these separately.
>
> I have tried searching for \r \n  %CR %LF $VT $FF but nothing seems to give
> the required effect.
>
> I need to run the script on a PC.
>
> Please can you offer some advice or possible places to look.
>
> Thanks
>
> Andy

Maybe you should do a binary/text dump of the file.  Choose a set of
meaningful printing characters to print as characters, and print anything
outside of this range as hex.  Something like:

open IN, 'hello.obj' or die "Couldn't open damnfool file: $!";
binmode IN;
local $/;

my $whole_durn_thang = <IN>;
my @chars = split //, $whole_durn_thang;
my $alphabetic = 0;
foreach $char (@chars) {
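   # NOTE: the start of this character class was mangled by the list archive
   # ("[EMAIL PROTECTED]"); the \w!\@#\$%^ part is an assumed reconstruction.
   if ($char =~ /[\w!\@#\$%^&*()_+ ,]/) {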
      print "\n" unless $alphabetic;
      print $char;
      $alphabetic = 1;
   } else {
      print "\n" if $alphabetic;
      printf "%02x    ", ord $char;
      $alphabetic = 0;
   }
}

It doesn't make for pretty output, nor does it exactly slap you in the face
with its meaning, but it does give you some straight, solid information about
the actual binary content of your file, in rough outline.  Take a chunk of the
output and examine it, and you should be able to make some reasonable
conjectures about the way the file is structured.  Presumably, most of your
output should come as lines of plain text.  It is the hex numbers between the
lines that you want to watch for.  Then you can filter out the unwanted ones
when you process the file.

Note:  the code above is not intended for the processing itself, but as a
utility for your preliminary analysis of the material.
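
If the full dump is too long to read through, a tally of the non-printing
bytes can serve the same preliminary purpose.  This is only a sketch along
the same lines, not part of the script above: the sub name and the sample
string are invented for illustration, and "printable" is assumed here to
mean ASCII 0x20 through 0x7e.

```perl
use strict;
use warnings;

# Count every byte outside printable ASCII (0x20-0x7e), keyed by hex code.
sub tally_control_bytes {
    my ($data) = @_;
    my %count;
    while ($data =~ /([^\x20-\x7e])/g) {
        $count{ sprintf '%02x', ord $1 }++;
    }
    return %count;
}

# Hypothetical sample; for a real file, read it with binmode first.
my %seen = tally_control_bytes("Obj_ID\tIn_link\r\nWBX2428\t2SV_WBX123\nWBX_APPS169\r\n");
printf "%s  %d\n", $_, $seen{$_} for sort keys %seen;
```

A file where 0a turns up more often than 0d is a hint that some linefeeds
stand alone rather than as part of a CR LF pair.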

Joseph

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>
