Hi Jason,

On Thu, May 29, 2025 at 11:08 AM Jason McIntyre <j...@kerhand.co.uk> wrote:
>
> On Wed, May 28, 2025 at 05:22:55PM -0300, K R wrote:
> > >Synopsis:      fortune(6): fortunes2 file has duplicate entries
> > >Category:      system games
> > >Environment:
> >         System      : OpenBSD 7.7
> >         Details     : OpenBSD 7.7 (GENERIC) #0: Sun May  4 11:10:16 MDT 2025
> >
> > r...@syspatch-77-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> >
> >         Architecture: OpenBSD.amd64
> >         Machine     : amd64
> > >Description:
> >
> >         There are 100+ entries in the fortunes2 file that are already
> >         present in the fortunes file.
> >
> > >How-To-Repeat:
> >
> >         cd /tmp
> >         cp /usr/share/games/fortune/{fortunes,fortunes2} .
> >         split -a 4 -p '^%$' fortunes fortunes.
> >         split -a 4 -p '^%$' fortunes2 fortunes2.
> >         sha256 fortunes.* > SHA256.fortunes
> >         sha256 fortunes2.* > SHA256.fortunes2
> >         # compare the two SHA256 files...
> >
> > >Fix:
> >         diff below removes the duplicate entries from fortunes2.
> >
> > Thanks,
> > --Kor
>
> hi.
>
> this methodology is too smart for me! if another obsd dev wants to
> confirm it's sound, i'd be happy to remove dups (or said dev could
> kindly take care of it themselves ;)

Sorry for the delay.  You're right, the methodology can be simplified.

Please find attached a simple Python script that detects duplicate
entries in fortune files using sets.  Given two files, file1 and
file2, it finds the entries in file2 that are already present in
file1, warns about them on stderr, and writes file2 to stdout with
the duplicates removed.
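
For readers without the attachment, the idea can be sketched roughly
as follows.  This is not the attached script itself, only a minimal
illustration under a few assumptions: entries are separated by a line
containing only '%', two entries count as duplicates only when they
match byte for byte, and the function names (split_entries,
remove_dups, dedup_files) are made up for this example.

```python
import sys


def split_entries(data):
    """Split the raw bytes of a fortune file into a list of entries.

    Assumption: a line containing only '%' ends the current entry.
    """
    entries, current = [], []
    for line in data.splitlines(keepends=True):
        if line.rstrip(b"\r\n") == b"%":
            entries.append(b"".join(current))
            current = []
        else:
            current.append(line)
    if current:  # last entry with no trailing '%' line
        entries.append(b"".join(current))
    return entries


def remove_dups(entries1, entries2):
    """Return (kept, dups) for entries2, checked against entries1."""
    seen = set(entries1)  # set membership makes each lookup cheap
    kept, dups = [], []
    for entry in entries2:
        if entry in seen:
            dups.append(entry)
        else:
            kept.append(entry)
    return kept, dups


def dedup_files(path1, path2):
    """Warn on stderr about dups, write the cleaned file2 to stdout."""
    with open(path1, "rb") as f1, open(path2, "rb") as f2:
        kept, dups = remove_dups(split_entries(f1.read()),
                                 split_entries(f2.read()))
    for entry in dups:
        print(f"{path2}: entry already in {path1}: {entry[:40]!r}",
              file=sys.stderr)
    for entry in kept:
        if entry and not entry.endswith(b"\n"):
            entry += b"\n"
        sys.stdout.buffer.write(entry + b"%\n")
    return len(dups)
```

Calling dedup_files("fortunes", "fortunes2") with stdout redirected
to a new file would give a cleaned copy of fortunes2.  Working on
bytes avoids guessing the encoding of the fortune files.  Note that
this sketch does not remove entries repeated within file2 itself, and
entries that differ only in whitespace are not treated as duplicates.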

Looking into other fortune(6) files, it's not only fortunes2 that has
duplicates:

/usr/share/games/fortune/fortunes2: 104 dups, from fortunes
/usr/share/games/fortune/fortunes2-o: 73 dups, from fortunes-o
/usr/share/games/fortune/limerick: 10 dups, from fortunes
/usr/share/games/fortune/zippy: 8 dups, from fortunes

I hope it helps.

Thanks,
--Kor

>
> jmc
>

Attachment: fortunes_rm_dups.py
Description: Binary data
