On Wednesday 29 August 2007 17:01:20 Adam W wrote:
> Single quotes go around the whole sed script unless you are using a
> separate sed script file.
> try      sed 's/\n//' 1.txt > 2.txt
>
>  - Adam
>
> On 8/29/07, Florian Kulzer <[EMAIL PROTECTED]> wrote:
> > On Wed, Aug 29, 2007 at 15:17:46 +0200, Joe Hart wrote:
> > > I am having trouble using sed to edit text files. Here's a good example
> > > of what I am looking for:
> > >
> > > <begin 1.txt>
> > > This is a test
> > > file, what I am
> > > trying to do is get the lines to join.
> > >
> > > It isn't a complicated thing,
> > > but I also want to keep the paragraphs
> > > separate.
> > > </end 1.txt>
> >
> > [...]
> >
> > > But ideally I'd like to just have a script to do it, but cannot figure
> > > out how to go about it, as sed doesn't seem to be working.
> >
> > Why not use Perl?
> >
> > $ perl -p0e '$_=~s/(.)\n(.)/$1 $2/g' < 1.txt
> > This is a test file, what I am trying to do is get the lines to join.
> >
> > It isn't a complicated thing, but I also want to keep the paragraphs
> > separate.
> >
> > $ perl -p0e '$_=~s/(.)\n(.|\n)/$1 $2/g;$_=~s/ \n/\n/g' < 1.txt
> > This is a test file, what I am trying to do is get the lines to join.
> > It isn't a complicated thing, but I also want to keep the paragraphs
> > separate.
> >
> > --
> > Regards,            | http://users.icfo.es/Florian.Kulzer
> >           Florian   |

Both are very good solutions for the example I gave, although the first Perl
snippet seems to skip a line when I try it.

However, on the real files, I am afraid they do not work.  What I am actually
trying to do is reformat books that I downloaded from Project Gutenberg.
Many of them are full of hard returns, either because the OCR software was
poorly written or because they were typed by people used to old-fashioned
typewriters, which require hitting the return/enter key at the end of every
line, a practice frowned upon in the modern world.

If I have to, I'll write a program that reads character by character and looks
for the line breaks, but as I said before, there should be tools that can
already do this.  It seems that a regex matching \n or even $ should be
enough, but alas that doesn't give me decent output.
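
For what it's worth, such a program would not need to go character by
character.  Below is a minimal sketch in Python, not something taken from
this thread.  The function name unwrap is made up for illustration, and the
sketch assumes that paragraphs are separated by at least one blank line and
that a file may use either Unix or DOS line endings.

```python
import re


def unwrap(text: str) -> str:
    """Join hard-wrapped lines inside each paragraph.

    Assumption: a paragraph break is one or more blank lines
    (lines that are empty or contain only spaces/tabs).
    """
    # Normalise DOS line endings so the paragraph split sees plain "\n".
    text = text.replace("\r\n", "\n")

    # Split on a newline followed by one or more blank lines.
    paragraphs = re.split(r"\n(?:[ \t]*\n)+", text.strip())

    # str.split() with no arguments splits on any run of whitespace,
    # newlines included, so re-joining with " " puts each paragraph
    # on a single line.
    joined = [" ".join(p.split()) for p in paragraphs if p.strip()]

    # Keep exactly one blank line between paragraphs.
    return "\n\n".join(joined) + "\n"
```

It could be used along the lines of unwrap(open("1.txt").read()), writing the
result to 2.txt.  Note that it also collapses runs of spaces inside a line,
which may or may not be wanted for things like indented verse.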

At least I have been pointed in the right direction, and I have learned some 
regex in the process.  Always good to learn new things.

Joe

