Hello,

Regarding this comment:

Pádraig Brady wrote, On 03/07/2013 06:26 PM:
> On 03/07/2013 07:32 PM, Assaf Gordon wrote:
>> The second attached patch is experimental - it tries to assess the
>> randomness of 'shuf' output by running it 1,000 times and checking
>> if the output is (very roughly) uniformly distributed. 
> 
> Cool, I was considering testing with rngtest or something, so it'll
> be good to have something independent.
(  http://lists.gnu.org/archive/html/coreutils/2013-03/msg00030.html )

Using rngtest is probably much more reliable than my independent test,
so attached are tests for sort and shuf that use rngtest.
They are marked 'expensive' because they require an external program and
run each test 10 times.
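
As a rough sketch of the sizes involved (not part of the patch): each input
line contributes one bit, and the FIPS 140-2 tests work on 20,000-bit blocks,
so 20,000 lines pack into a single 2,500-byte block preceded by the 4 bootstrap
bytes. The variable names below are illustrative only.

```shell
# Illustrative only: one input line per bit, as in the attached tests.
bits=$( { yes 1 | head -n 10000; yes 0 | head -n 10000; } | wc -l )
bits=$((bits + 0))          # normalize any padding printed by wc
header_bytes=4              # the 32 bootstrap bits rngtest consumes first
block_bytes=$((bits / 8))   # 8 bits are packed into each output byte
total_bytes=$((header_bytes + block_bytes))
echo "$bits bits -> $block_bytes bytes (+$header_bytes header) = $total_bytes"
```

This is why the shuf test feeds 20,000 lines (or 'shuf -n 20000') and the sort
test uses 'seq 1 20000'.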

-gordon



From 15392de8f0ffa0746c9fd338ed14d15b614029a3 Mon Sep 17 00:00:00 2001
From: Assaf Gordon <assafgor...@gmail.com>
Date: Fri, 8 Mar 2013 15:54:24 -0500
Subject: [PATCH] tests: test sort,shuf with rngtest

rngtest checks the randomness of data using FIPS 140-2 tests.
http://sourceforge.net/projects/gkernel/

If rngtest is not installed (or is not available in the PATH),
the tests will be skipped.

These tests are marked 'expensive'. To run directly:

  $ make check TESTS=tests/misc/sort-rand-rngtest.sh \
               SUBDIRS=. RUN_EXPENSIVE_TESTS=yes
  $ make check TESTS=tests/misc/shuf-rand-rngtest.sh \
               SUBDIRS=. RUN_EXPENSIVE_TESTS=yes

* tests/misc/shuf-rand-rngtest.sh: Test shuf with rngtest.
* tests/misc/sort-rand-rngtest.sh: Test sort with rngtest.
* tests/local.mk: Add the above tests.
---
 tests/local.mk                  |    2 +
 tests/misc/shuf-rand-rngtest.sh |   78 +++++++++++++++++++++++++++++++++++++++
 tests/misc/sort-rand-rngtest.sh |   71 +++++++++++++++++++++++++++++++++++
 3 files changed, 151 insertions(+), 0 deletions(-)
 create mode 100755 tests/misc/shuf-rand-rngtest.sh
 create mode 100755 tests/misc/sort-rand-rngtest.sh

diff --git a/tests/local.mk b/tests/local.mk
index 607ddc4..21d347a 100644
--- a/tests/local.mk
+++ b/tests/local.mk
@@ -313,6 +313,7 @@ all_tests =					\
   tests/misc/shred-passes.sh			\
   tests/misc/shred-remove.sh			\
   tests/misc/shuf.sh				\
+  tests/misc/shuf-rand-rngtest.sh		\
   tests/misc/sort.pl				\
   tests/misc/sort-benchmark-random.sh		\
   tests/misc/sort-compress.sh			\
@@ -329,6 +330,7 @@ all_tests =					\
   tests/misc/sort-month.sh			\
   tests/misc/sort-exit-early.sh			\
   tests/misc/sort-rand.sh			\
+  tests/misc/sort-rand-rngtest.sh		\
   tests/misc/sort-spinlock-abuse.sh		\
   tests/misc/sort-stale-thread-mem.sh		\
   tests/misc/sort-unique.sh			\
diff --git a/tests/misc/shuf-rand-rngtest.sh b/tests/misc/shuf-rand-rngtest.sh
new file mode 100755
index 0000000..9ad2797
--- /dev/null
+++ b/tests/misc/shuf-rand-rngtest.sh
@@ -0,0 +1,78 @@
+#!/bin/sh
+# Test shuf's random output with rngtest
+#
+# NOTE:
+#  rngtest must be installed, or the test will be skipped.
+#  rngtest is available here: http://sourceforge.net/projects/gkernel/
+
+# Copyright (C) 2013 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ shuf
+expensive_
+
+if ! command -v rngtest > /dev/null 2>&1 ; then
+  skip_ "rngtest not found - skipping test."
+fi
+
+# Test for randomness several times.
+# On the rare occasion when the randomly sorted data doesn't pass rngtest,
+# it should be just one failure out of 10 rounds.
+# If more rounds fail in a single run - there's likely a real problem.
+ROUNDS=10
+
+( yes 1 | head -n 10000 ; yes 0 | head -n 10000 ) > in || framework_failure_
+
+# rngtest always reads the first 32 bits as bootstrap data
+printf '\0\0\0\0' > rngtest_header || framework_failure_
+
+
+# Sanity check:
+#  unsorted data should not be random
+cat in | tr -d '\n' | \
+       perl -npe '$_=pack("b*",$_)' > out_non_random || framework_failure_
+
+echo "Testing rngtest on non-random input:" 1>&2
+cat rngtest_header out_non_random | rngtest &&
+  { fail=1 ; echo "rngtest failed to detect non-random data." 1>&2 ; }
+
+#
+# Check randomness of shuf's output
+# (using the 'read-entire-file' code path)
+for i in $(seq $ROUNDS) ; do
+  cat in | shuf | tr -d '\n' | \
+       perl -npe '$_=pack("b*",$_)' > out_random$i || framework_failure_
+
+  echo "Testing rngtest on randomly-sorted input (round $i of $ROUNDS):" 1>&2
+  cat rngtest_header out_random$i | rngtest ||
+      { fail=1 ; echo "shuf random output did not pass rngtest" \
+                      " (round $i of $ROUNDS)." 1>&2 ; }
+done
+
+#
+# Check randomness of shuf's output
+# (using the '--head-count=N' code path)
+for i in $(seq $ROUNDS) ; do
+  cat 'in' 'in' 'in' 'in' | shuf -n 20000 | tr -d '\n' | \
+       perl -npe '$_=pack("b*",$_)' > out_n_random$i || framework_failure_
+
+  echo "Testing rngtest on randomly-sorted input (round $i of $ROUNDS):" 1>&2
+  cat rngtest_header out_n_random$i | rngtest ||
+      { fail=1 ; echo "shuf -n random output did not pass rngtest" \
+                      " (round $i of $ROUNDS)." 1>&2 ; }
+done
+
+Exit $fail
diff --git a/tests/misc/sort-rand-rngtest.sh b/tests/misc/sort-rand-rngtest.sh
new file mode 100755
index 0000000..19508dd
--- /dev/null
+++ b/tests/misc/sort-rand-rngtest.sh
@@ -0,0 +1,71 @@
+#!/bin/sh
+# Test sort's random output with rngtest
+#
+# NOTE:
+#  rngtest must be installed, or the test will be skipped.
+#  rngtest is available here: http://sourceforge.net/projects/gkernel/
+
+# Copyright (C) 2013 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ sort
+expensive_
+
+if ! command -v rngtest > /dev/null 2>&1 ; then
+  skip_ "rngtest not found - skipping test."
+fi
+
+# Test for randomness several times.
+# On the rare occasion when the randomly sorted data doesn't pass rngtest,
+# it should be just one failure out of 10 rounds.
+# If more rounds fail in a single run - there's likely a real problem.
+ROUNDS=10
+
+# NOTE:
+# We sort 20000 integers, but use only their LSB for the test.
+# Before sorting, their LSB should not be random (alternating 0/1 for odd/even
+# integers). After sorting, the LSBs should be randomly ordered.
+seq 1 20000 > in || framework_failure_
+
+# rngtest always reads the first 32 bits as bootstrap data
+printf '\0\0\0\0' > rngtest_header || framework_failure_
+
+
+
+# Sanity check:
+#  unsorted data should not be random
+cat in | awk '{printf "%d", $0 % 2}' | \
+       perl -npe '$_=pack("b*",$_)' > out_non_random || framework_failure_
+
+echo "Testing rngtest on non-random input:" 1>&2
+cat rngtest_header out_non_random | rngtest &&
+    { fail=1 ; echo "rngtest failed to detect non-random data." 1>&2 ; }
+
+
+#
+# Check randomness of sort's random output
+#
+for i in $(seq $ROUNDS) ; do
+  cat in | sort --random-sort | awk '{printf "%d", $0 % 2}' | \
+       perl -npe '$_=pack("b*",$_)' > out_random$i || framework_failure_
+
+  echo "Testing rngtest on randomly-sorted input (round $i of $ROUNDS):" 1>&2
+  cat rngtest_header out_random$i | rngtest ||
+      { fail=1 ; echo "sort random output did not pass rngtest" \
+                      " (round $i of $ROUNDS)." 1>&2 ; }
+done
+
+Exit $fail
-- 
1.7.7.4
