As per the subject title, or do they have to be in standard Unix format?
Yes, sa-learn accepts mail in mbox format with the --mbox flag.
Without the --mbox flag, it expects each file to be a single message in RFC 822 format, and each directory to be in Maildir format.
As for "standard Unix format", technically, the mbox format used by Mozilla is a standard Unix format. It's the same format a lot of sendmail/procmail setups use for /var/spool/mail, among other things.
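If you want to script the choice of flag, here is a rough sketch in Python. It is not part of SpamAssassin; the helper names looks_like_mbox and sa_learn_args are made up for illustration. It assumes that a file whose first line begins with "From " (the mbox message separator) is an mbox, which is only a heuristic, and it just builds the command line without running it.

```python
import os


def looks_like_mbox(path):
    """Heuristic: an mbox file begins with a "From " separator line."""
    with open(path, "rb") as f:
        return f.readline().startswith(b"From ")


def sa_learn_args(path, kind="spam"):
    """Build (but do not run) an sa-learn command line for `path`.

    `kind` is "spam" or "ham", matching sa-learn's --spam / --ham options.
    Hypothetical helper: only regular files that look like mbox get --mbox;
    directories and single-message files are passed through unchanged.
    """
    args = ["sa-learn", "--" + kind]
    if os.path.isfile(path) and looks_like_mbox(path):
        args.append("--mbox")
    args.append(path)
    return args
```

You could then hand the resulting list to subprocess.run(), e.g. subprocess.run(sa_learn_args("/path/to/Junk"), check=True), where /path/to/Junk is a placeholder for one of your Mozilla folder files.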