On Thu, Jul 31, 2014 at 11:20 AM, Nordlund, Eric <[email protected]> wrote:
> Hello docbook-apps.
>
> I have a large set of projects that I am looking to scrub for unused
> graphics and XML files prior to sending off to localization.
>
> Some of my colleagues have created some very basic bash and batch
> scripts to scan through the folders and find files that aren’t referenced
> in any of the source files so we can delete them, but I worry that these
> scripts don’t catch everything (unused XML files in the base directory that
> reference images will ‘bless’ these images) and we could still have
> extraneous files left over or accidentally delete important ones
> unknowingly.
>
> Each project has a book.xml file that is the gold master for the
> outputs. If the book.xml file or any of its includes doesn’t reference a
> file in the project, it’s safe to delete. I was hoping that I could use
> xmllint to tell me which files are loaded when I try to validate the
> book.xml, but I haven’t found the magic formula yet.
>
> I’ve tried the following command to reference all of the loaded files
> during a pass, but it doesn’t seem to list the image files referenced,
> which is mostly the point of this exercise, and I get a lot of noise from
> the module files for the DTD on every include.
>
> $ xmllint --load-trace book.xml --xinclude --noout &> test1
>
> Has anyone had a similar problem to solve? Am I going about this the
> right way?
>
> Thanks, and I’m open to any suggestion. If bash and xmllint don’t work
> here, I am partial to Python as an alternative. Just saying.
>
> *Eric Nordlund*
> Senior Technical Writer
> Amazon Web Services
> Ph: 206-266-8048 | [email protected]

https://gist.github.com/reflexing/3184e28a315ed0cc4a1c

--
Kirill Churin
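
For readers of the archive, here is a minimal Python sketch of the approach described in the question: start at book.xml, follow every xi:include, collect every fileref attribute along the way, and report project files that were never reached. It is an illustration only and not necessarily what the linked gist does. The function names (collect_referenced, find_unused) are made up for this example. It assumes that includes use XInclude href attributes with local relative paths, that images are referenced through fileref attributes, and that the files parse without an external DTD (entities defined only in the DTD would raise a parse error). It ignores xml:base and remote URLs.

```python
import os
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"


def collect_referenced(root_file):
    """Return absolute paths of every file reachable from root_file."""
    seen = set()
    stack = [os.path.abspath(root_file)]
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        try:
            tree = ET.parse(path)
        except OSError:
            continue  # included file is missing; nothing more to follow
        base = os.path.dirname(path)
        for elem in tree.iter():
            if elem.tag == XI_INCLUDE and elem.get("href"):
                target = os.path.normpath(os.path.join(base, elem.get("href")))
                if elem.get("parse") == "text":
                    seen.add(target)      # included as text, do not parse
                else:
                    stack.append(target)  # included as XML, follow it
            fileref = elem.get("fileref")
            if fileref:
                # Assumption: images are referenced via fileref attributes.
                seen.add(os.path.normpath(os.path.join(base, fileref)))
    return seen


def find_unused(project_dir, root_file="book.xml"):
    """List files under project_dir that root_file never reaches."""
    referenced = collect_referenced(os.path.join(project_dir, root_file))
    unused = []
    for dirpath, _dirs, files in os.walk(project_dir):
        for name in files:
            full = os.path.abspath(os.path.join(dirpath, name))
            if full not in referenced:
                unused.append(full)
    return sorted(unused)
```

Calling find_unused("path/to/project") returns the candidates for deletion. Because only files reachable from book.xml count as referenced, an orphaned XML file cannot "bless" the images it points to. Review the list before deleting anything, since build files and stylesheets in the project directory would also be reported.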
