Hello Ronald,

Excellent! Very well done! Works great! Thank you so much! Not being a Perl
programmer, I would never have figured out how to do this.

WW

On Sep 16, 2011, at 5:41 AM, Ronald J Kimball wrote:

> On Fri, Sep 16, 2011 at 12:06:37AM +1000, Webmaster wrote:
>> I have 31,102 HTML documents. Each one contains one verse from the KJV Bible.
>> 
>> Currently, these files are named consecutively like this:
>> 
>> KJV.00001.html
>> KJV.00002.html
>> KJV.00003.html
>> KJV.00004.html
>> KJV.00005.html
>> 
>> The title tag in each HTML document contains the actual Bible reference,
>> followed by a short description. For example, "KJV.00001.html" has
>> "Genesis 1:1 - KJV ( King James Version ) Bible Verse" in the title tag.
>> 
>> I need a way to take just the reference part of the title tag, and make
>> it the actual file name, so that "KJV.00001.html" becomes
>> "Genesis_1-1.html". In other words, I want to drop the " - KJV ( King
>> James Version ) Bible Verse" part of the title tag when I create the
>> actual file names.
>> 
>> I want the book name to be followed by an underscore, instead of a space,
>> and the colon in each verse reference will have to be replaced with a
>> hyphen, since we cannot use the colon in file names on the Macintosh.
> 
> Here's a script to do this in Perl.  You can save it to a file, perhaps
> named rename_bible_files.pl, and then run it on a directory, perhaps named
> my_bible_files, like this:
> 
>  perl rename_bible_files.pl my_bible_files
> 
> To preview without actually renaming any files:
> 
>  perl rename_bible_files.pl -p my_bible_files
> 
> To verbosely list the renaming as it's done:
> 
>  perl rename_bible_files.pl -v my_bible_files
> 
> 
> As written, it only processes files in the top level of the directory.  It
> could be modified to descend into sub-directories if needed.
> 
> If it encounters an error opening the directory, opening a file, or
> renaming a file, it will abort.  If it can't find a title in the file, or
> finds the title but can't parse it, it will output a warning and continue
> processing the remaining files.
> 
> 
> #!perl
> 
> use strict;
> use warnings;
> 
> use Getopt::Long;
> 
> local $/;
> 
> GetOptions("verbose" => \my $verbose,
>            "preview" => \my $preview,
>           )
>  or die "Invalid options\n";
> 
> my $dir = shift
>  or die "Must specify directory.\n";
> 
> opendir my $dh, $dir
>  or die "Can't open directory $dir: $!\n";
> 
> while (defined(my $file = readdir $dh)) {
>  next unless -f "$dir/$file" && $file =~ /\.html$/;
> 
>  open my $fh, '<', "$dir/$file"
>    or die "Can't open $dir/$file: $!\n";
> 
>  my $contents = <$fh>;
> 
>  close $fh;
> 
>  my ($title) = $contents =~ m,<title[^>]*>\s*(.*?)\s*</title>,is
>    or do {
>      warn "Could not find title in $dir/$file\n";
>      next;
>    };
> 
>  my ($book, $chapter, $verse) = $title =~ /^([\w ]*?)\s+(\d+):(\d+)/
>    or do {
>      warn "Could not parse title '$title' in $dir/$file\n";
>      next;
>    };
> 
>  $book =~ tr/ /_/;
> 
>  my $new = "$book\_$chapter-$verse.html";
> 
>  if ($new eq $file) {
>    next;
>  }
> 
>  if ($verbose || $preview) {
>    print "$file => $new\n";
>  }
> 
>  if (!$preview) {
>    rename("$dir/$file", "$dir/$new")
>      or die "Can't rename $dir/$file to $dir/$new: $!\n";
>  }
> }
> 
> __END__
> 
> 
> Ronald

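The renaming rule in the script above can be tried on its own, without touching any files. This is a minimal sketch, assuming a hypothetical helper named title_to_filename (it is not part of Ronald's script) and made-up sample titles in the same format as the one quoted for KJV.00001.html:

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): map a title such as
# "Genesis 1:1 - KJV ( King James Version ) Bible Verse" to a file name.
# Returns undef when the title does not start with "Book chapter:verse".
sub title_to_filename {
    my ($title) = @_;

    # Same pattern as the script: book name, whitespace, chapter:verse.
    my ($book, $chapter, $verse) = $title =~ /^([\w ]*?)\s+(\d+):(\d+)/
        or return;

    $book =~ tr/ /_/;    # spaces inside the book name become underscores

    return "${book}_$chapter-$verse.html";
}

# Sample titles are assumptions, modeled on the one quoted in the thread.
for my $title ("Genesis 1:1 - KJV ( King James Version ) Bible Verse",
               "1 Samuel 17:4 - KJV ( King James Version ) Bible Verse",
               "No reference here") {
    my $name = title_to_filename($title);
    print defined $name
        ? "$title => $name\n"
        : "Could not parse title '$title'\n";
}
```

Book names that contain spaces are handled by the same rule, so a title beginning "1 Samuel 17:4" maps to 1_Samuel_17-4.html.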

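Ronald mentions that the script could be modified to descend into sub-directories. One possible way, sketched here as an assumption rather than as his intended change, is to collect the files with the core File::Find module instead of readdir. The helper name find_html_files is hypothetical:

```perl
use strict;
use warnings;

use File::Find;

# Hypothetical helper: return the path of every .html file under $dir,
# including files in sub-directories, sorted by path.
sub find_html_files {
    my ($dir) = @_;
    my @files;

    find({
        no_chdir => 1,    # leave the working directory alone; $_ is the full path
        wanted   => sub {
            push @files, $_ if -f $_ && /\.html$/;
        },
    }, $dir);

    @files = sort @files;
    return @files;
}

# Usage: perl find_html.pl my_bible_files
if (@ARGV) {
    print "$_\n" for find_html_files($ARGV[0]);
}
```

Each returned path already includes its directory, so the open and rename calls in the script would need to work from that full path rather than from "$dir/$file".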
-- 
You received this message because you are subscribed to the 
"BBEdit Talk" discussion group on Google Groups.