It appears Twitter returns a 'source' field that contains a user agent.

Most of the time this is an HTML anchor (HREF) string. I have found one case where that is not true: "web" appears as just that bare string, with no markup.

Is "web" the only non-HTML string I can expect at this time? I log the event whenever I hit another edge case, but wanted to check.

Regarding regexes to get just the user agent name and strip the HTML bits away...

I've heard some noise on this list about the format changing. Some values have nofollow, others do not, and there are many other variations.

I don't want to chase this around, though I also log non-matches here.

My regex looks for '\>.+<\i'

I believe this to be pretty solid, but it feels a little too easy. I'm also left with the returned match having > and < in the string. Of course, those are easy enough to replace/trim off.
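For illustration, here is a minimal sketch of a capture-group variant in Python, which avoids the leftover > and < entirely. The language, the helper name client_name, and the pattern itself are all assumptions for the sake of the example; it presumes the value is either a single <a ...>name</a> element or a bare string like "web", and it is not based on any documented guarantee from Twitter.

```python
import re

# Hypothetical pattern: capture only the text between <a ...> and </a>.
# [^>]* skips over whatever attributes are present (href, rel="nofollow", ...),
# so the attribute variations do not matter as long as none contains a '>'.
ANCHOR_TEXT = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)


def client_name(source):
    """Return the client name from a 'source' value, or None if unrecognized."""
    match = ANCHOR_TEXT.search(source)
    if match:
        # group(1) is the anchor text only, without the surrounding > and <.
        return match.group(1).strip()
    # Bare strings such as "web" carry no markup; pass them through unchanged.
    if "<" not in source and ">" not in source:
        return source.strip()
    # Markup that is not a simple anchor: signal the caller to log it.
    return None


print(client_name('<a href="http://example.com" rel="nofollow">SomeClient</a>'))
print(client_name("web"))
print(client_name("<b>odd"))
```

The three calls print SomeClient, web, and None. The None case is where the edge-case logging would go.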

What do you think of this approach? Any cases others have seen that would lead this to failure?
--
Scott
Iphone says hello.
