Hello Ronald,
Excellent! Very well done! Works great! Thank you so much! Not being a Perl
programmer, I would never have figured out how to do this.
WW
On Sep 16, 2011, at 5:41 AM, Ronald J Kimball wrote:
> On Fri, Sep 16, 2011 at 12:06:37AM +1000, Webmaster wrote:
>> I have 31,102 HTML documents. Each one contains one verse from the KJV Bible.
>>
>> Currently, these files are named consecutively like this:
>>
>> KJV.00001.html
>> KJV.00002.html
>> KJV.00003.html
>> KJV.00004.html
>> KJV.00005.html
>>
>> The title tag in each HTML document contains the actual Bible reference,
>> followed by a short description. For example, "KJV.00001.html" has
>> "Genesis 1:1 - KJV ( King James Version ) Bible Verse" in the title tag.
>>
>> I need a way to take just the reference part of the title tag, and make
>> it the actual file name, so that "KJV.00001.html" becomes
>> "Genesis_1-1.html". In other words, I want to drop the " - KJV ( King
>> James Version ) Bible Verse" part of the title tag when I create the
>> actual file names.
>>
>> I want the book name to be followed by an underscore, instead of a space,
>> and the colon in each verse reference will have to be replaced with a
>> hyphen, since we cannot use the colon in file names on the Macintosh.
>
> Here's a script to do this in Perl. You can save it to a file, perhaps
> named rename_bible_files.pl, and then run it on a directory, perhaps named
> my_bible_files, like this:
>
> perl rename_bible_files.pl my_bible_files
>
> To preview without actually renaming any files:
>
> perl rename_bible_files.pl -p my_bible_files
>
> To verbosely list the renaming as it's done:
>
> perl rename_bible_files.pl -v my_bible_files
>
>
> As written, it only processes files in the top level of the directory. It
> could be modified to descend into sub-directories if needed.
>
> If it encounters an error opening the directory, opening a file, or
> renaming a file, it will abort. If it can't find a title in the file, or
> finds the title but can't parse it, it will output a warning and continue
> processing the remaining files.
>
>
> #!perl
>
> use strict;
> use warnings;
>
> use Getopt::Long;
>
> # Slurp mode: read each file in one go.
> local $/;
>
> GetOptions("verbose" => \ my $verbose,
>            "preview" => \ my $preview,
>           )
>   or die "Invalid options";
>
> my $dir = shift
>   or die "Must specify directory.\n";
>
> opendir my $dh, $dir
>   or die "Can't open directory $dir: $!\n";
>
> while (defined(my $file = readdir $dh)) {
>   # Skip anything that isn't a plain .html file.
>   next unless -f "$dir/$file" && $file =~ /\.html$/;
>
>   open my $fh, '<', "$dir/$file"
>     or die "Can't open $dir/$file: $!\n";
>
>   my $contents = <$fh>;
>
>   close $fh;
>
>   # Pull the text out of the title tag.
>   my ($title) = $contents =~ m,<title[^>]*>\s*(.*?)\s*</title>,
>     or do {
>       warn "Could not find title in $dir/$file\n";
>       next;
>     };
>
>   # Split the start of the title into book, chapter, and verse.
>   my ($book, $chapter, $verse) = $title =~ /^([\w ]*?)\s+(\d+):(\d+)/
>     or do {
>       warn "Could not parse title '$title' in $dir/$file\n";
>       next;
>     };
>
>   # Spaces in the book name become underscores.
>   $book =~ tr/ /_/;
>
>   my $new = "$book\_$chapter-$verse.html";
>
>   # Nothing to do if the file already has the right name.
>   if ($new eq $file) {
>     next;
>   }
>
>   if ($verbose || $preview) {
>     print "$file => $new\n";
>   }
>
>   if (!$preview) {
>     rename("$dir/$file", "$dir/$new")
>       or die "Can't rename $dir/$file to $dir/$new: $!\n";
>   }
> }
>
> __END__
>
>
> Ronald
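
Ronald notes above that the script only looks at the top level of the
directory and could be modified to descend into sub-directories. Here is a
rough, preview-only sketch of one way that might look, assuming the core
File::Find module. The helper names new_name_for_title and find_html_files
are made up for illustration and are not part of his script, and this version
only prints what it would rename; it never touches a file.

```perl
#!perl

use strict;
use warnings;

use File::Find;

# Hypothetical helper: turn a title such as
# "Genesis 1:1 - KJV ( King James Version ) Bible Verse"
# into "Genesis_1-1.html". Returns nothing if the title can't be parsed.
sub new_name_for_title {
    my ($title) = @_;

    # Same pattern as the original script: book, chapter, verse.
    my ($book, $chapter, $verse) = $title =~ /^([\w ]*?)\s+(\d+):(\d+)/
        or return;

    $book =~ tr/ /_/;

    return "${book}_$chapter-$verse.html";
}

# Hypothetical helper: collect every .html file under $dir,
# including those in sub-directories.
sub find_html_files {
    my ($dir) = @_;
    my @files;

    find(
        {
            # With no_chdir, $_ holds the full path, same as $File::Find::name.
            wanted => sub {
                push @files, $File::Find::name
                    if -f $File::Find::name && /\.html$/;
            },
            no_chdir => 1,
        },
        $dir
    );

    @files = sort @files;
    return @files;
}

# Preview only: print each proposed rename for the directories given
# on the command line. Does nothing if no directory is given.
for my $dir (@ARGV) {
    for my $path (find_html_files($dir)) {
        open my $fh, '<', $path
            or die "Can't open $path: $!\n";
        my $contents = do { local $/; <$fh> };
        close $fh;

        my ($title) = $contents =~ m,<title[^>]*>\s*(.*?)\s*</title>,
            or next;
        my $new = new_name_for_title($title)
            or next;

        print "$path => $new\n";
    }
}
```

Saved under a name of your choosing (say, preview_recursive.pl, also just an
example), it would be run the same way as the original:

perl preview_recursive.pl my_bible_files

Files whose title is missing or can't be parsed are silently skipped here,
whereas the original script prints a warning for them.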