Reproduced below is an example message I have scraped out of spam's mailbox file. Do I need to remove anything or is all the extra routing info for the encapsulating mail OK?
You need to strip the forwarding headers. Otherwise SA will interpret the second set of headers as part of the message body, which is not what you want.
It will also learn tokens from the first set of headers quite aggressively, so eliminate them. sa-learn needs to see the message with headers pretty close to the original. An extra Received: or two is OK, but other than that, no changes are good changes.
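
As a rough illustration of stripping the encapsulating headers, here is a minimal Python sketch using only the standard library. The function name `extract_original` is hypothetical, and the sketch makes two assumptions that may not match your mailbox: either the original message was forwarded as a `message/rfc822` attachment, or it sits inline directly after the wrapper's headers (as described above). Check the output by eye before feeding it to sa-learn.

```python
import email
import re
from email import policy


def extract_original(raw: bytes) -> bytes:
    """Return the encapsulated message without the forwarding headers.

    Hypothetical helper: handles a message/rfc822 attachment, or an
    original message inlined right after the wrapper's header block.
    """
    wrapper = email.message_from_bytes(raw, policy=policy.compat32)

    # Case 1: the original was forwarded as an attachment.
    for part in wrapper.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a list holding
            # exactly one Message object: the original mail.
            inner = part.get_payload()[0]
            return inner.as_bytes()

    # Case 2 (assumption): the original headers and body follow the
    # first blank line, so drop everything up to and including it.
    pieces = re.split(rb"\r?\n\r?\n", raw, maxsplit=1)
    return pieces[1] if len(pieces) == 2 else raw


if __name__ == "__main__":
    sample = (
        b"From: me@example.com\n"
        b"Subject: Fwd: spam\n"
        b"\n"
        b"Received: from mail.example.net\n"
        b"From: spammer@example.net\n"
        b"Subject: Buy now\n"
        b"\n"
        b"body text\n"
    )
    print(extract_original(sample).decode())
```

The result can then be saved to a file and passed to `sa-learn --spam`. If the forwarding mail client also rewrote or re-wrapped the original headers, this sketch cannot restore them.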