You have the option of uploading your corpus to the central server, which then processes it every night. But most people have privacy concerns about doing that with their own personal ham. For this reason you also have the option of running the mass-check script yourself every night on your own server and uploading only the resulting logs to the SpamAssassin central server with rsync.
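To make the privacy point concrete, here is a minimal Python sketch of the "upload only the logs" step. It is an illustration under assumptions, not the project's actual upload procedure: the destination string is a made-up placeholder, the function names are hypothetical, and it assumes the nightly run leaves its results as `*.log` files in a single directory. The only rsync options used are the standard `-a` (archive) and `-z` (compress).

```python
import subprocess
from pathlib import Path

# Hypothetical placeholder. The real destination and credentials come from
# the SpamAssassin project, not from this sketch.
DESTINATION = "user@rsync.example.org::corpus/"


def build_upload_command(log_dir, destination=DESTINATION):
    """Build an rsync command that sends only the *.log files in log_dir.

    Nothing else in the directory (in particular, no mail) is listed,
    so the messages themselves never leave the machine.
    """
    logs = sorted(str(p) for p in Path(log_dir).glob("*.log"))
    if not logs:
        raise FileNotFoundError(f"no .log files found in {log_dir}")
    # -a preserves timestamps and permissions, -z compresses in transit.
    return ["rsync", "-az", *logs, destination]


def upload_logs(log_dir, destination=DESTINATION):
    """Run the upload; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_upload_command(log_dir, destination), check=True)
```

Building the command separately from running it means you can print and inspect exactly what would be sent before trusting it to cron.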
https://fedorahosted.org/auto-mass-check/

I run this script every night from cron on my corpora. I wrote it as a friendlier wrapper around SpamAssassin's own scripts, which are confusing and difficult to configure.

And yes, a ham-only corpus is extremely useful. You must confirm that it is 100% human-verified. Start small, make sure the script is working properly, and then sort more ham into that folder.

Warren