> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Meredydd
> Sent: Thursday, July 22, 2004 11:37 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Project Gutenberg
> 
> On Thursday 22 July 2004 14:51, Lambert, Mark wrote:
> > I use a txt2html converter, jEdit with regular expression search and
> > replace to put <H2> tags around chapters, and then use HTMLSplitter 
> > http://www.rekenwonder.com/htmlsplitter.htm to split it up into 
> > separate files before Plucking.
> Ooh, now that sounds more promising. Do you have the 
> particular expressions &c that I could try out? Are jEdit's 
> regexps perlable? (or even sedable?)

The only problem is that I find I have to tweak them for different books.
If the headers are all in caps I can search for something like
"^([A-Z ]+)<BR>" and replace with "<H2>$1</H2><BR>", but others need
something more like "(Chapter\s*.*|Prologue|Epilogue)<BR>" with the same
replacement, "<H2>$1</H2><BR>".  Multi-line ones are a royal pain.
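
For anyone who would rather script this than run it by hand in jEdit, here
is a rough Python sketch of the same two search-and-replace passes. It is
only an illustration: the function name tag_headings and the sample text
are made up, Python writes the backreference as \1 where jEdit uses $1,
and the second pattern is assumed to be a plain group anchored at the
start of a line (square brackets would form a character class). The lazy
".*?" is also an assumption, so a line holding several <BR> tags stops at
the first one.

```python
import re

# All-caps headings such as "CHAPTER ONE<BR>" at the start of a line.
ALL_CAPS = re.compile(r"^([A-Z ]+)<BR>", re.MULTILINE)

# Named headings: "Chapter ...", "Prologue" or "Epilogue" at line start.
# Assumption: a plain group with alternation, anchored with ^.
NAMED = re.compile(r"^(Chapter\s*.*?|Prologue|Epilogue)<BR>", re.MULTILINE)


def tag_headings(html, pattern):
    """Wrap every heading matched by `pattern` in <H2> tags."""
    return pattern.sub(r"<H2>\1</H2><BR>", html)


if __name__ == "__main__":
    # Hypothetical txt2html-style output, one <BR> per source line.
    sample = "CHAPTER ONE<BR>\nIt was a dark night.<BR>\nEpilogue<BR>\n"
    tagged = tag_headings(tag_headings(sample, ALL_CAPS), NAMED)
    print(tagged)
```

As with the jEdit version, the patterns will still need tweaking per book,
and headings that span more than one line are not handled here.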

Mark





_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
