Hi,

As I mentioned at the DBpedia meetup yesterday, it would be great if there were 
checksum files for the dump files (for example, one in each of the folders).

My use case is mostly to be able to quickly check whether I have the current 
version of the files.
It has happened to me a couple of times already that I downloaded files early in 
the release process, but later found out that some of them had been modified online.
Not knowing what exactly was modified, I ended up re-downloading everything.
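
With a checksum file in place, spotting what changed would only require 
re-fetching the small checksum file and comparing it against the local copies, 
roughly like this (just a sketch, assuming the MD5SUMS file proposed below has 
already been downloaded into the local folder, here "core" as an example):

cd core/
# list only the files whose checksum differs or which are missing locally
md5sum -c MD5SUMS | grep -v ': OK$'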

Another benefit of checksums would be that it becomes easier to spot files that 
appear in multiple folders (core, en, links, ...).
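
To illustrate: once every folder has its checksum file, duplicates across 
folders show up as repeated hashes. A minimal sketch, assuming GNU coreutils 
and the per-folder MD5SUMS files produced by the find-based script below 
(which prefixes entries with "./"):

shopt -s globstar
# collect all checksum lines, prefix each entry with its folder,
# and print groups of lines that share the same 32-character MD5 hash
for f in **/MD5SUMS; do
  sed "s|\./|${f%/*}/|" "$f"
done | sort | uniq -w32 --all-repeated=separate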

To generate the checksum file for all files in a single folder, simply run:
md5sum * > MD5SUMS
or, more secure but slower:
sha256sum * > SHA256SUMS

To verify them, one could then simply run
md5sum -c MD5SUMS
or
sha256sum -c SHA256SUMS

To initially put the checksums in all folders:

#!/bin/bash
set -e
checksumbin=md5sum
checksumfile=MD5SUMS
shopt -s globstar                 # make **/ match all (sub)folders
for d in **/ ; do
  pushd "$d" > /dev/null
  # checksum all regular files in this folder, excluding the checksum file itself
  find . -maxdepth 1 -type f ! -name "$checksumfile" -print0 \
    | xargs -0 "$checksumbin" > "$checksumfile"
  popd > /dev/null
done
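
And the counterpart after downloading, verifying every folder that contains a 
checksum file (again just a sketch, mirroring the script above):

#!/bin/bash
checksumbin=md5sum
checksumfile=MD5SUMS
shopt -s globstar
for d in **/ ; do
  if [ -f "$d$checksumfile" ]; then
    # --quiet prints only the files whose checksum does not match
    ( cd "$d" && "$checksumbin" -c --quiet "$checksumfile" ) \
      || echo "mismatches in $d"
  fi
done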


Cheers,
Jörn

