I have numbers for text size and conversion performance of BOCU-1 and SCSU relative to UTF-8.

Quick summary: For Latin text, UTF-8 is best. For CJK, BOCU-1 and SCSU produce smaller text, at some cost in speed. For other scripts, BOCU-1 and SCSU are much better than UTF-8 in both speed and size.

Note that BOCU-1 encoded text could be used directly in emails, for CVS, etc., since it preserves control characters and spaces.

Please see
http://oss.software.ibm.com/icu/dropbox/bocuperf.html

Best regards,
markus