On 15 Jun 2001 16:04:55 -0400, Tim Musson wrote:
> Hey Perlers,
>
> I have pulled some books from Project Gutenberg (www.Gutenberg.net).
> What I want to do is take all the Paragraphs and put them on one
> line, then put them into the Palm DOC format. This way they wrap
> correctly on the Palm screen.
>
> There is no indenting that I care about, so I would say that I could
> take any block of text and put it all on one line, starting a new
> block after each blank line.
>
> I would also like to take the PG headers (disclaimers, how to
> donate, etc.) and move them to the bottom of the text, so a way to do
> that would be cool too!
>
> Any suggestions would be appreciated.
>
> Thanks, you all are great!
<snip />
I tested this against several ebooks downloaded from Project Gutenberg
and it seems to do what you want, except for the conversion to Palm DOC
format. I will post an update when I find documentation for the format.
<code>
#!/usr/bin/perl -w
use strict;
my $header; #hold the header until the end
{
#the ebooks seem to have a line like:
#*END*THE SMALL PRINT! FOR PUBLIC DOMAIN ETEXTS*Ver.04.29.93*END*\r\n
#at the end of the header, so read everything up to and including
#the first *END*\r\n
local($/) = "*END*\r\n";
$header = <>;
}
{
#set the record separator to a blank line
local($/) = "\r\n\r\n";
#read in all paragraphs
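while (<>) {
#strip the trailing blank-line separator so it does not become spaces
chomp;
#replace the remaining <CR><LF> pairs in the paragraph with spaces
s/\r\n/ /g;
#skip records that are empty, e.g. after extra blank lines
next unless /\S/;
#print paragraph with a blank line after it
print "$_\r\n\r\n";
}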
}
#print the header last so it ends up below the text
print $header if defined $header;
</code>
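The script above assumes the file uses <CR><LF> line endings throughout. If
some of your downloads have plain <LF> endings instead, something along these
lines might be a starting point. It is only a sketch: split_ebook is a name I
made up for illustration, the sample text is invented, and I am assuming the
header ends at the first line that finishes with *END*. It normalises
everything to <LF> first, so the output uses <LF> endings as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): takes the whole
# text as one string and returns ($header, @paragraphs).
sub split_ebook {
    my ($text) = @_;

    # Normalise CRLF and bare CR to LF so one code path handles all files.
    $text =~ s/\r\n?/\n/g;

    # Assumption: the header ends at the first line finishing with *END*.
    my $header = '';
    if ($text =~ s/\A(.*?\*END\*[ \t]*\n)//s) {
        $header = $1;
    }

    # Split on runs of blank lines, then fold each paragraph onto one line.
    my @paragraphs;
    for my $block (split /\n(?:[ \t]*\n)+/, $text) {
        $block =~ s/\s*\n\s*/ /g;    # line breaks become single spaces
        $block =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
        push @paragraphs, $block if length $block;
    }
    return ($header, @paragraphs);
}

# Invented sample input, standing in for a real ebook.
my $sample = "Small print line one\r\n*END*THE SMALL PRINT!*END*\r\n\r\n"
           . "First line\r\nsecond line.\r\n\r\n\r\nNext paragraph.\r\n";

my ($header, @paragraphs) = split_ebook($sample);
print "$_\n\n" for @paragraphs;    # one paragraph per line, blank line after
print $header;                     # header moved to the bottom
```

For a real file you would slurp the input (local $/; my $text = <>;) and pass
that to split_ebook instead of the sample string.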
--
Today is Sweetmorn, the 20th day of Confusion in the YOLD 3167
Keep the Lasagna flying!