John, thanks for the clarification!  So the issue is the data in
Twitter's database as a whole, not just the Streaming API.  One
question is whether you should accept illegal Unicode at all.
Accepting it is probably the safer thing to do, to avoid scaring the
clients, but maybe you'd want to apply some filter before storing it
in the database?  I.e., is it reasonable to have a policy of accepting
or storing only legal Unicode?  I know some folks use Twitter for
machine/sensor data, but perhaps that use isn't intended?  I can
envision Twitter allowing non-Unicode data if it is marked as such,
perhaps on a closed stream for machines talking to each other, but
not for humans.
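
To make the "filter before storing" idea concrete, here is a minimal sketch in Python.  It assumes input arrives as raw bytes that are supposed to be UTF-8; the function names are made up for illustration and say nothing about how Twitter actually handles input.  One function implements a strict "legal Unicode only" policy, the other a lenient policy that accepts everything but repairs it before storage.

```python
def is_legal_utf8(raw: bytes) -> bool:
    """Strict policy: True only if raw is well-formed UTF-8.

    Python's strict decoder rejects stray bytes, overlong forms,
    truncated sequences, and UTF-8-encoded surrogates.
    """
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def sanitize_utf8(raw: bytes) -> str:
    """Lenient policy: accept the input, but replace each ill-formed
    sequence with U+FFFD before it reaches the database."""
    return raw.decode("utf-8", errors="replace")
```

The strict check suits a reject-at-the-door policy; the lenient one keeps clients from seeing errors while still guaranteeing that only legal Unicode is stored.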

