On Mon, 16 Oct 2017 20:17:01 +0000, Pew, Curtis G wrote:

>On Oct 16, 2017, at 1:39 PM, John McKown wrote:
>> 
>> The above is why I love *IX symlinks. What would be _really_ nice, IMO,
>> would be if your PDF generation process placed the actual document number
>> & title in the PDF information title section of the pdf file. So that I
>> could get it out of the "pdfinfo" program. It would make generation of the
>> symlinks simple. Something like:
>> 
>> for i in *.pdf ; do ln -s "${i}" "$(pdfinfo "${i}" |
>>   awk '/^Title: / {print substr($0,17);}' | sed -r 's/^ +//; s/ +$//')" ; done
>
>Yes, please. I tried doing something like this once, but the PDF information 
>sections are inconsistent or incomplete. Everything’s easier to manage when 
>you have consistent, reliable metadata.
>
Of course.  In the interim, here's a script that creates the symlinks by
scraping the *index.htm file:

#! /bin/sh -x

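# Make symbolic links for anchors in ../*index.htm
# The awk step below only prints "ln -s" commands on stdout; review them,
# then pipe the output to sh to actually create the links.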

# Wholeheartedly empirical.  I tweaked it until it mostly worked.

# Run this in an expendable subdirectory of the doc archive.
ln -s ../pdf .  # Make a symlink to the real PDF directory.

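awk '
# Row holding the PDF anchor: rewrite it as "ln -s <path-to-pdf> \".
/td.*href=.pdf.*\.pdf/ {
        Target = $0
        sub( /.*href=./, "ln -s ", Target )   # drop everything through href="
        sub( /. target=.*/, " \\", Target )   # drop closing quote and the rest
        print( Target )
        next }
# Next table cell after an anchor: its text becomes the quoted link name.
/td/ && Target {
        L = $0
        sub( /.*<td *>/, "    \"", L )
        sub( /<\/td>.*/, "\"", L )
        gsub( /\//, "-", L )                  # "/" cannot appear in a file name
        print( L )
        Target = ""
        next }
' ../*index.htm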
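
For anyone who wants to try the technique without touching a real archive,
here is a minimal sketch that runs the same two awk rules against a made-up
two-line table. Everything about the sample is an assumption for
illustration: the file name sample-index.htm, the PDF name abc1234.pdf, the
document number, and the title are hypothetical, and the real index markup
may well differ.

```shell
#!/bin/sh
# Sketch only: exercise the awk rules on a hypothetical index fragment.
tmp=$(mktemp -d) || exit 1

# Stand-in for the real PDF directory (hypothetical file name).
mkdir "$tmp/pdf"
: > "$tmp/pdf/abc1234.pdf"

# Hypothetical markup: one cell with the anchor, the next with the title.
cat > "$tmp/sample-index.htm" <<'EOF'
<tr><td><a href="pdf/abc1234.pdf" target="_blank">SA12-3456-00</a></td>
<td>Sample Guide and/or Reference</td></tr>
EOF

# Same two rules as the script above; the output is saved, not executed yet.
awk '
/td.*href=.pdf.*\.pdf/ {
        Target = $0
        sub( /.*href=./, "ln -s ", Target )
        sub( /. target=.*/, " \\", Target )
        print( Target )
        next }
/td/ && Target {
        L = $0
        sub( /.*<td *>/, "    \"", L )
        sub( /<\/td>.*/, "\"", L )
        gsub( /\//, "-", L )
        print( L )
        Target = ""
        next }
' "$tmp/sample-index.htm" > "$tmp/links.sh"

cat "$tmp/links.sh"                 # inspect the generated commands first
( cd "$tmp" && sh links.sh )        # then run them to create the symlink
ls -l "$tmp"
```

Saving the generated commands to a file before running them keeps the
"tweak until it mostly works" loop safe: nothing is linked until the output
looks right.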

-- gil

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
