2019-07-31 22:36:18 -0500, Peng Yu:
> Hi,
>
> Suppose that I know a md5sum that is derived one of the timestamps
> computed below. Is there a way to quickly derive what the original
> timestamp is? I could make a database of all the timestamps and their
> md5sums. But as the total number of entries increases, this solution
> will not be scalable as the database can be big. Is it there any
> better solution to this problem?
>
> for i in {1..2563200}; do date -d "-$i minutes" +%Y%m%d_%I%M%p; done
[...]

    seq -f '-%.0f minutes' 2563200 | date -f - +%Y%m%d_%I%M%p

would be an improvement as it would only run one date
invocation, but you'd still need to run one md5sum for each of
those lines. (Note %.0f rather than %g there: with %g, seq would
switch to scientific notation from 1000000 upwards.)

coreutils md5sum in itself is not slow, but forking a process,
loading a command and linking its libraries is; that's not a
bug in coreutils itself.

You'd be better off using perl/python, which can also compute
MD5 sums by themselves without having to invoke a separate
utility.

If you want to do it in a shell, you can use ksh93, which, if
built as part of ast-open, will have a builtin md5sum command
(which you enable with "builtin md5sum" or can invoke with
"command /opt/ast/bin/md5sum").

Something like:

    #! /bin/ksh93
    sum_to_find=${1?}

    # use the ksh93 builtins so that no external command is run
    builtin md5sum mktemp || exit

    tmp=${ mktemp; }
    trap 'rm -f -- "$tmp"' EXIT

    # current time as seconds since the epoch
    now=${ printf '%(%s)T' now; }

    for ((i = 1; i <= 2563200; i++)) {
      # timestamp for i minutes ago
      t=${ printf '%(%Y%m%d_%I%M%p)T' "#$((now - 60*i))"; }
      printf %s "$t" > "$tmp"
      sum=${ md5sum < "$tmp"; }
      if [ "$sum" = "$sum_to_find" ]; then
        printf '%s\n' "Timestamp: $t ($i minutes ago)"
        exit 0
      fi
    }
    exit 1

(here using a tempfile, as using a pipe would mean forking
extra processes)

That would be orders of magnitude faster than running one
coreutils md5sum for each timestamp, but still orders of
magnitude slower than doing it in python/perl because of all
the shell interpretation and I/O overhead.

(in any case, note that the builtin versions of mktemp, printf,
md5sum and [ in ksh93 have nothing to do with GNU coreutils').

-- 
Stephane
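
P.S. For comparison, here is a minimal Python sketch of the same brute-force search, done entirely in-process with the standard hashlib and time modules. The names find_timestamp, FMT and MAX_MINUTES are only illustrative. It assumes the MD5 was computed over the timestamp text with no trailing newline (as in the ksh93 script above) and that %p yields AM/PM as in the C locale; adjust if the original sums were produced differently.

```python
import hashlib
import time

FMT = "%Y%m%d_%I%M%p"   # same format as: date +%Y%m%d_%I%M%p
MAX_MINUTES = 2563200   # search range taken from the original question


def find_timestamp(sum_to_find, now=None, max_minutes=MAX_MINUTES):
    """Return (timestamp, minutes_ago) for the first match, or None.

    Assumption: the digest covers the timestamp text only, with no
    trailing newline.
    """
    if now is None:
        now = int(time.time())
    target = sum_to_find.strip().lower()
    for i in range(1, max_minutes + 1):
        # local-time timestamp for i minutes before `now`
        t = time.strftime(FMT, time.localtime(now - 60 * i))
        if hashlib.md5(t.encode()).hexdigest() == target:
            return t, i
    return None
```

Calling find_timestamp("<32 hex digits>") walks back one minute at a time from the current time, like the shell loop, and returns None if nothing in the range matches. If the sums were computed with a trailing newline (for instance by piping the date output straight into md5sum), hashing t + "\n" instead would be the corresponding change.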