Re: [twitter-dev] t.co issue -- querying for original url in streaming & search apis

2010-06-09 Thread Jeffrey Greenberg
Just to say it, this matching of "actual" URL as well as the  
shortened, supplied URL has been regarded as a bug by our users; it  
confuses them.  I would prefer it if it were optional to search so  
that I could turn it off... They only want to match the literal  
text... We provide means for them to deal with the actual/final URL by  
other means and in a different context.


The entire t.co idea is not really of use to us as a monitoring service.



On Jun 9, 2010, at 2:46 PM, Abraham Williams <4bra...@gmail.com> wrote:

Currently Search (and probably Streaming) returns results that match  
text in unshortened URLs not just in the text of the status. I doubt  
it would change now, especially with t.co coming.


Abraham
-
Abraham Williams | Developer for hire | http://abrah.am
@abraham | http://projects.abrah.am | http://blog.abrah.am
This email is: [ ] shareable [x] ask first [ ] private.


On Wed, Jun 9, 2010 at 12:13, Jim Gilliam  wrote:
I'm creating a new thread for this because a few others have  
mentioned it, and we haven't gotten a response yet.  My hunch is  
that changing those APIs involve other teams within Twitter, so  
figuring out a solution could be challenging.


Here is the issue.  We need to be able to get matches on the  
original URL through the streaming and search APIs.   For me, I'm  
tracking "act" so I can match tweets that link to 'http://act.ly'.   
This is not a link shortener service, the actual pages live at  
act.ly, and it was all designed specifically for Twitter so there  
would be no need for url shorteners.


As far as I'm concerned, it's fine if that link changes to t.co, as  
long as I can still get matches on act.ly (or act) through the  
streaming API (the search API is going to be important for people  
too, but less of an issue for me personally).


The most elegant way to fix this would be to allow tracking of the  
original URL.  So I can put in a domain name, or URL substring, and  
match everything that way.  Same with search. This would be useful  
to a lot of people, and virtually all link oriented web apps with  
APIs provide a way to get all the matches for a particular domain.  
(digg, google, yahoo, etc)


I'm sure there are other workaround ways of doing this, and I'm all  
ears.  It would be SUPER NICE (wink wink) to hear some kind of  
assurance that there will be a way for us to query this type of  
information before the t.co changes go live.


Thanks guys...

Jim Gilliam
http://act.ly/
http://twitter.com/jgilliam

On Tue, Jun 8, 2010 at 4:43 PM, Jim Gilliam  wrote:
Will we be able to get matches on the original URL through the  
streaming API?


For example, I'm tracking "act" so I can match tweets that link to
'http://act.ly'.  Will I still be able to do that?


Jim Gilliam
http://act.ly/
http://twitter.com/jgilliam


On Tue, Jun 8, 2010 at 4:33 PM, Dewald Pretorius wrote:

Raffi,

I'm fine with everything up to the new 140 character count.

If you count the characters *after* link wrapping, you are seriously
going to mess up my system. My short URLs are currently 18 characters
long, and they will be 18 long for quite some time to come. After that
they will be 19 for a very long time to come.

If you implement this change, a ton, and I mean a *huge* number of my
system's updates are going to be rejected for being over 140
characters.
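
The arithmetic behind this concern can be sketched as follows. This is only an illustration: the wrapped-link length of 20 characters is an assumption (the announcement does not state how long a t.co link will be), and the names `length_after_wrapping` and `fits` are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+")  # rough URL matcher, good enough for a sketch
WRAPPED_LEN = 20   # ASSUMED length of a wrapped link; not stated in the announcement
LIMIT = 140        # status length limit discussed in the thread


def length_after_wrapping(text, wrapped_len=WRAPPED_LEN):
    """Length of `text` if every URL were replaced by a link of `wrapped_len` chars."""
    return len(URL_RE.sub("x" * wrapped_len, text))


def fits(text, limit=LIMIT, wrapped_len=WRAPPED_LEN):
    """True if the status would still be within `limit` after link wrapping."""
    return length_after_wrapping(text, wrapped_len) <= limit


# A status that is exactly 140 characters with an 18-character short URL.
status = "a" * 121 + " " + "http://example.org"
print(len(status), length_after_wrapping(status), fits(status))
```

Under that assumption, a status that fits today with an 18-character URL goes over the limit once the link is counted at its wrapped length.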

On Jun 8, 7:57 pm, Raffi Krikorian  wrote:
> hi all.
>
> twitter has been wrapping links in e-mailed DMs for a couple months
> now.
> with that feature, we're trying to protect users against phishing and other
> malicious attacks. the way that we're doing this is that any URL that comes
> through in a DM gets currently wrapped with a twt.tl URL -- if the URL turns
> out to be malicious, Twitter can simply shut it down, and whoever follows
> that link will be presented with a page that warns them of potentially
> malicious content. in a few weeks, we're going to start slowly enabling this
> throughout the API for all statuses as well, but instead of twt.tl, we will
> be using t.co.
>
> practically, any tweet that is sent through statuses/update that has a link
> on it will have the link automatically converted to a t.co link on its way
> through the Twitter platform. if you fetch any tweet created after this
> change goes live, then its text field will have all its links automatically
> wrapped with t.co links. when a user clicks on that link, Twitter will
> redirect them to the original URL after first confirming with our database
> that that URL is not malicious.  on top of the end-user benefit, we hope to
> eventually provide all developers with aggregate usage data around your
> applications such as the number of clicks people make on URLs you display
> (it will, of course, be in aggregate and not identifiable manner).
> additionally, we want to be able to build services and APIs that can make
> algorithmic re

Re: [twitter-dev] t.co issue -- querying for original url in streaming & search apis

2010-06-09 Thread Jim Gilliam
Fantastic, thank you!

On Wed, Jun 9, 2010 at 2:48 PM, Mark McBride  wrote:

> We will have this support in the streaming API.  Track terms will work
> against tweet text as well as entity text.  Currently streaming does
> *not* work as Abraham describes below.  We only match against tweet
> text, and don't do any link expansion/contraction.
>
>   ---Mark
>
> http://twitter.com/mccv
>
>
>
> On Wed, Jun 9, 2010 at 12:13 PM, Jim Gilliam  wrote:
> > I'm creating a new thread for this because a few others have mentioned
> it,
> > and we haven't gotten a response yet.  My hunch is that changing those
> APIs
> > involve other teams within Twitter, so figuring out a solution could be
> > challenging.
> > Here is the issue.  We need to be able to get matches on the original URL
> > through the streaming and search APIs.   For me, I'm tracking "act" so
> I can
> > match tweets that link to 'http://act.ly'.  This is not a link shortener
> > service, the actual pages live at act.ly, and it was all designed
> > specifically for Twitter so there would be no need for url shorteners.
> > As far as I'm concerned, it's fine if that link changes to t.co, as long
> as
> > I can still get matches on act.ly (or act) through the streaming API
> (the
> > search API is going to be important for people too, but less of an issue
> for
> > me personally).
> > The most elegant way to fix this would be to allow tracking of the
> original
> > URL.  So I can put in a domain name, or URL substring, and match
> everything
> > that way.  Same with search. This would be useful to a lot of people, and
> > virtually all link oriented web apps with APIs provide a way to get all
> the
> > matches for a particular domain. (digg, google, yahoo, etc)
> > I'm sure there are other workaround ways of doing this, and I'm all ears.
> >  It would be SUPER NICE (wink wink) to hear some kind of assurance that
> > there will be a way for us to query this type of information before the
> t.co
> > changes go live.
> > Thanks guys...
> > Jim Gilliam
> > http://act.ly/
> > http://twitter.com/jgilliam
> > On Tue, Jun 8, 2010 at 4:43 PM, Jim Gilliam  wrote:
> >>
> >> Will we be able to get matches on the original URL through the streaming
> >> API?
> >> For example, I'm tracking "act" so I can match tweets that link to
> >> 'http://act.ly'.  Will I still be able to do that?
> >> Jim Gilliam
> >> http://act.ly/
> >> http://twitter.com/jgilliam
> >>
> >> On Tue, Jun 8, 2010 at 4:33 PM, Dewald Pretorius wrote:
> >>>
> >>> Raffi,
> >>>
> >>> I'm fine with everything up to the new 140 character count.
> >>>
> >>> If you count the characters *after* link wrapping, you are seriously
> >>> going to mess up my system. My short URLs are currently 18 characters
> >>> long, and they will be 18 long for quite some time to come. After that
> >>> they will be 19 for a very long time to come.
> >>>
> >>> If you implement this change, a ton, and I mean a *huge* number of my
> >>> system's updates are going to be rejected for being over 140
> >>> characters.
> >>>
> >>> On Jun 8, 7:57 pm, Raffi Krikorian  wrote:
> >>> > hi all.
> >>> >
> >>> > twitter has been wrapping links in e-mailed DMs for a couple months
> >>> > now.
> >>> > with that feature, we're trying to protect users against phishing and
> >>> > other
> >>> > malicious attacks. the way that we're doing this is that any URL that
> >>> > comes
> >>> > through in a DM gets currently wrapped with a twt.tl URL -- if the
> URL
> >>> > turns
> >>> > out to be malicious, Twitter can simply shut it down, and whoever
> >>> > follows
> >>> > that link will be presented with a page that warns them of
> potentially
> >>> > malicious content. in a few weeks, we're going to start slowly
> enabling
> >>> > this
> >>> > throughout the API for all statuses as well, but instead of twt.tl,
> we
> >>> > will
> >>> > be using t.co.
> >>> >
> >>> > practically, any tweet that is sent through statuses/update that has
> a
> >>> > link
> >>> > on it will have the link automatically converted to a t.co link on
> its
> >>> > way
> >>> > through the Twitter platform. if you fetch any tweet created after
> this
> >>> > change goes live, then its text field will have all its links
> >>> > automatically
> >>> > wrapped with t.co links. when a user clicks on that link, Twitter
> will
> >>> > redirect them to the original URL after first confirming with our
> >>> > database
> >>> > that that URL is not malicious.  on top of the end-user benefit, we
> >>> > hope to
> >>> > eventually provide all developers with aggregate usage data around
> your
> >>> > applications such as the number of clicks people make on URLs you
> >>> > display
> >>> > (it will, of course, be in aggregate and not identifiable manner).
> >>> > additionally, we want to be able to build services and APIs that can
> >>> > make
> >>> > algorithmic recommendations to users based on the content they are
> >>> > consuming. gathering the data from t.co will 

Re: [twitter-dev] t.co issue -- querying for original url in streaming & search apis

2010-06-09 Thread Mark McBride
We will have this support in the streaming API.  Track terms will work
against tweet text as well as entity text.  Currently streaming does
*not* work as Abraham describes below.  We only match against tweet
text, and don't do any link expansion/contraction.

   ---Mark

http://twitter.com/mccv
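
A rough sketch of what matching "against tweet text as well as entity text" could look like from a client's point of view. The tokenization rule here (split on anything that is not a letter or digit, compare case-insensitively) is an assumption for illustration, not a description of how the streaming API tokenizes; `tokens` and `matches_track` are hypothetical names.

```python
import re


def tokens(s):
    """Lowercased alphanumeric tokens of `s` (ASSUMED tokenization rule)."""
    return {t for t in re.split(r"[^0-9A-Za-z]+", s.lower()) if t}


def matches_track(term, text, entity_urls=()):
    """True if `term` is a token of the tweet text or of any entity URL."""
    haystack = tokens(text)
    for url in entity_urls:
        haystack |= tokens(url)
    return term.lower() in haystack


wrapped_text = "sign this http://t.co/s9gfk2d4"

# Text only: the wrapped link hides the original domain, so "act" is not found.
print(matches_track("act", wrapped_text))

# Text plus entity text: the original URL contributes the token "act".
print(matches_track("act", wrapped_text, ["http://act.ly/abc"]))
```

With text-only matching the track term is lost once the link is wrapped; including the entity URL brings the match back, which is the behavior being asked for in this thread.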



On Wed, Jun 9, 2010 at 12:13 PM, Jim Gilliam  wrote:
> I'm creating a new thread for this because a few others have mentioned it,
> and we haven't gotten a response yet.  My hunch is that changing those APIs
> involve other teams within Twitter, so figuring out a solution could be
> challenging.
> Here is the issue.  We need to be able to get matches on the original URL
> through the streaming and search APIs.   For me, I'm tracking "act" so I can
> match tweets that link to 'http://act.ly'.  This is not a link shortener
> service, the actual pages live at act.ly, and it was all designed
> specifically for Twitter so there would be no need for url shorteners.
> As far as I'm concerned, it's fine if that link changes to t.co, as long as
> I can still get matches on act.ly (or act) through the streaming API (the
> search API is going to be important for people too, but less of an issue for
> me personally).
> The most elegant way to fix this would be to allow tracking of the original
> URL.  So I can put in a domain name, or URL substring, and match everything
> that way.  Same with search. This would be useful to a lot of people, and
> virtually all link oriented web apps with APIs provide a way to get all the
> matches for a particular domain. (digg, google, yahoo, etc)
> I'm sure there are other workaround ways of doing this, and I'm all ears.
>  It would be SUPER NICE (wink wink) to hear some kind of assurance that
> there will be a way for us to query this type of information before the t.co
> changes go live.
> Thanks guys...
> Jim Gilliam
> http://act.ly/
> http://twitter.com/jgilliam
> On Tue, Jun 8, 2010 at 4:43 PM, Jim Gilliam  wrote:
>>
>> Will we be able to get matches on the original URL through the streaming
>> API?
>> For example, I'm tracking "act" so I can match tweets that link to
>> 'http://act.ly'.  Will I still be able to do that?
>> Jim Gilliam
>> http://act.ly/
>> http://twitter.com/jgilliam
>>
>> On Tue, Jun 8, 2010 at 4:33 PM, Dewald Pretorius  wrote:
>>>
>>> Raffi,
>>>
>>> I'm fine with everything up to the new 140 character count.
>>>
>>> If you count the characters *after* link wrapping, you are seriously
>>> going to mess up my system. My short URLs are currently 18 characters
>>> long, and they will be 18 long for quite some time to come. After that
>>> they will be 19 for a very long time to come.
>>>
>>> If you implement this change, a ton, and I mean a *huge* number of my
>>> system's updates are going to be rejected for being over 140
>>> characters.
>>>
>>> On Jun 8, 7:57 pm, Raffi Krikorian  wrote:
>>> > hi all.
>>> >
>>> > twitter has been wrapping links in e-mailed DMs for a couple months
>>> > now.
>>> > with that feature, we're trying to protect users against phishing and
>>> > other
>>> > malicious attacks. the way that we're doing this is that any URL that
>>> > comes
>>> > through in a DM gets currently wrapped with a twt.tl URL -- if the URL
>>> > turns
>>> > out to be malicious, Twitter can simply shut it down, and whoever
>>> > follows
>>> > that link will be presented with a page that warns them of potentially
>>> > malicious content. in a few weeks, we're going to start slowly enabling
>>> > this
>>> > throughout the API for all statuses as well, but instead of twt.tl, we
>>> > will
>>> > be using t.co.
>>> >
>>> > practically, any tweet that is sent through statuses/update that has a
>>> > link
>>> > on it will have the link automatically converted to a t.co link on its
>>> > way
>>> > through the Twitter platform. if you fetch any tweet created after this
>>> > change goes live, then its text field will have all its links
>>> > automatically
>>> > wrapped with t.co links. when a user clicks on that link, Twitter will
>>> > redirect them to the original URL after first confirming with our
>>> > database
>>> > that that URL is not malicious.  on top of the end-user benefit, we
>>> > hope to
>>> > eventually provide all developers with aggregate usage data around your
>>> > applications such as the number of clicks people make on URLs you
>>> > display
>>> > (it will, of course, be in aggregate and not identifiable manner).
>>> > additionally, we want to be able to build services and APIs that can
>>> > make
>>> > algorithmic recommendations to users based on the content they are
>>> > consuming. gathering the data from t.co will help make these possible.
>>> >
>>> > our current plan is that no user will see a t.co URL on twitter.com but
>>> > we
>>> > still have some details to work through. the links will still be
>>> > displayed
>>> > as they were sent in, but the target of the link will be the t.co link
>>> > instead. and, we want to pro

Re: [twitter-dev] t.co issue -- querying for original url in streaming & search apis

2010-06-09 Thread Abraham Williams
Currently Search (and probably Streaming) returns results that match text in
unshortened URLs not just in the text of the status. I doubt it would change
now, especially with t.co coming.

Abraham
-
Abraham Williams | Developer for hire | http://abrah.am
@abraham | http://projects.abrah.am | http://blog.abrah.am
This email is: [ ] shareable [x] ask first [ ] private.


On Wed, Jun 9, 2010 at 12:13, Jim Gilliam  wrote:

> I'm creating a new thread for this because a few others have mentioned it,
> and we haven't gotten a response yet.  My hunch is that changing those APIs
> involve other teams within Twitter, so figuring out a solution could be
> challenging.
>
> Here is the issue.  We need to be able to get matches on the original URL
> through the streaming and search APIs.   For me, I'm tracking "act" so I can
> match tweets that link to 'http://act.ly'.  This is not a link shortener
> service, the actual pages live at act.ly, and it was all designed
> specifically for Twitter so there would be no need for url shorteners.
>
> As far as I'm concerned, it's fine if that link changes to t.co, as long
> as I can still get matches on act.ly (or act) through the streaming API
> (the search API is going to be important for people too, but less of an
> issue for me personally).
>
> The most elegant way to fix this would be to allow tracking of the original
> URL.  So I can put in a domain name, or URL substring, and match everything
> that way.  Same with search. This would be useful to a lot of people, and
> virtually all link oriented web apps with APIs provide a way to get all the
> matches for a particular domain. (digg, google, yahoo, etc)
>
> I'm sure there are other workaround ways of doing this, and I'm all ears.
>  It would be SUPER NICE (wink wink) to hear some kind of assurance that
> there will be a way for us to query this type of information before the
> t.co changes go live.
>
> Thanks guys...
>
> Jim Gilliam
> http://act.ly/
> http://twitter.com/jgilliam
>
> On Tue, Jun 8, 2010 at 4:43 PM, Jim Gilliam  wrote:
>
>> Will we be able to get matches on the original URL through the streaming
>> API?
>>
>> For example, I'm tracking "act" so I can match tweets that link to
>> 'http://act.ly'.  Will I still be able to do that?
>>
>> Jim Gilliam
>> http://act.ly/
>> http://twitter.com/jgilliam
>>
>>
>> On Tue, Jun 8, 2010 at 4:33 PM, Dewald Pretorius wrote:
>>
>>> Raffi,
>>>
>>> I'm fine with everything up to the new 140 character count.
>>>
>>> If you count the characters *after* link wrapping, you are seriously
>>> going to mess up my system. My short URLs are currently 18 characters
>>> long, and they will be 18 long for quite some time to come. After that
>>> they will be 19 for a very long time to come.
>>>
>>> If you implement this change, a ton, and I mean a *huge* number of my
>>> system's updates are going to be rejected for being over 140
>>> characters.
>>>
>>> On Jun 8, 7:57 pm, Raffi Krikorian  wrote:
>>> > hi all.
>>> >
>>> > twitter has been wrapping links in e-mailed DMs for a couple months
>>> > now.
>>> > with that feature, we're trying to protect users against phishing and
>>> other
>>> > malicious attacks. the way that we're doing this is that any URL that
>>> comes
>>> > through in a DM gets currently wrapped with a twt.tl URL -- if the URL
>>> turns
>>> > out to be malicious, Twitter can simply shut it down, and whoever
>>> follows
>>> > that link will be presented with a page that warns them of potentially
>>> > malicious content. in a few weeks, we're going to start slowly enabling
>>> this
>>> > throughout the API for all statuses as well, but instead of twt.tl, we
>>> will
>>> > be using t.co.
>>> >
>>> > practically, any tweet that is sent through statuses/update that has a
>>> link
>>> > on it will have the link automatically converted to a t.co link on its
>>> way
>>> > through the Twitter platform. if you fetch any tweet created after this
>>> > change goes live, then its text field will have all its links
>>> automatically
>>> > wrapped with t.co links. when a user clicks on that link, Twitter will
>>> > redirect them to the original URL after first confirming with our
>>> database
>>> > that that URL is not malicious.  on top of the end-user benefit, we
>>> hope to
>>> > eventually provide all developers with aggregate usage data around your
>>> > applications such as the number of clicks people make on URLs you
>>> display
>>> > (it will, of course, be in aggregate and not identifiable manner).
>>> > additionally, we want to be able to build services and APIs that can
>>> make
>>> > algorithmic recommendations to users based on the content they are
>>> > consuming. gathering the data from t.co will help make these possible.
>>> >
>>> > our current plan is that no user will see a t.co URL on twitter.com but we
>>> > still have some details to work through. the links will still be
>>> displayed
>>> > as they were sent in, but the target 

[twitter-dev] t.co issue -- querying for original url in streaming & search apis

2010-06-09 Thread Jim Gilliam
I'm creating a new thread for this because a few others have mentioned it,
and we haven't gotten a response yet.  My hunch is that changing those APIs
involves other teams within Twitter, so figuring out a solution could be
challenging.

Here is the issue.  We need to be able to get matches on the original URL
through the streaming and search APIs.   For me, I'm tracking "act" so I can
match tweets that link to 'http://act.ly'.  This is not a link shortener
service, the actual pages live at act.ly, and it was all designed
specifically for Twitter so there would be no need for url shorteners.

As far as I'm concerned, it's fine if that link changes to t.co, as long as
I can still get matches on act.ly (or act) through the streaming API (the
search API is going to be important for people too, but less of an issue for
me personally).

The most elegant way to fix this would be to allow tracking of the original
URL.  So I can put in a domain name, or URL substring, and match everything
that way.  Same with search. This would be useful to a lot of people, and
virtually all link oriented web apps with APIs provide a way to get all the
matches for a particular domain. (digg, google, yahoo, etc)
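
One client-side workaround for the domain case can be sketched like this. It assumes the original (unwrapped) URLs are available to the client in some form, which is exactly what is not yet guaranteed; `links_to_domain` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def links_to_domain(original_urls, domain):
    """True if any original (unwrapped) URL points at `domain` or one of its subdomains."""
    domain = domain.lower()
    for url in original_urls:
        # hostname is None for strings without a scheme/host; treat that as no match
        host = (urlparse(url).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


print(links_to_domain(["http://act.ly/abc"], "act.ly"))
print(links_to_domain(["http://t.co/s9gfk2d4"], "act.ly"))
```

Comparing hostnames rather than raw substrings avoids false hits on unrelated domains that merely contain the same characters.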

I'm sure there are other workaround ways of doing this, and I'm all ears.
 It would be SUPER NICE (wink wink) to hear some kind of assurance that
there will be a way for us to query this type of information before
the t.co changes go live.

Thanks guys...

Jim Gilliam
http://act.ly/
http://twitter.com/jgilliam

On Tue, Jun 8, 2010 at 4:43 PM, Jim Gilliam  wrote:

> Will we be able to get matches on the original URL through the streaming
> API?
>
> For example, I'm tracking "act" so I can match tweets that link to
> 'http://act.ly'.  Will I still be able to do that?
>
> Jim Gilliam
> http://act.ly/
> http://twitter.com/jgilliam
>
>
> On Tue, Jun 8, 2010 at 4:33 PM, Dewald Pretorius  wrote:
>
>> Raffi,
>>
>> I'm fine with everything up to the new 140 character count.
>>
>> If you count the characters *after* link wrapping, you are seriously
>> going to mess up my system. My short URLs are currently 18 characters
>> long, and they will be 18 long for quite some time to come. After that
>> they will be 19 for a very long time to come.
>>
>> If you implement this change, a ton, and I mean a *huge* number of my
>> system's updates are going to be rejected for being over 140
>> characters.
>>
>> On Jun 8, 7:57 pm, Raffi Krikorian  wrote:
>> > hi all.
>> >
>> > twitter has been wrapping links in e-mailed DMs for a couple months
>> > now.
>> > with that feature, we're trying to protect users against phishing and
>> other
>> > malicious attacks. the way that we're doing this is that any URL that
>> comes
>> > through in a DM gets currently wrapped with a twt.tl URL -- if the URL
>> turns
>> > out to be malicious, Twitter can simply shut it down, and whoever
>> follows
>> > that link will be presented with a page that warns them of potentially
>> > malicious content. in a few weeks, we're going to start slowly enabling
>> this
>> > throughout the API for all statuses as well, but instead of twt.tl, we
>> will
>> > be using t.co.
>> >
>> > practically, any tweet that is sent through statuses/update that has a
>> link
>> > on it will have the link automatically converted to a t.co link on its
>> way
>> > through the Twitter platform. if you fetch any tweet created after this
>> > change goes live, then its text field will have all its links
>> automatically
>> > wrapped with t.co links. when a user clicks on that link, Twitter will
>> > redirect them to the original URL after first confirming with our
>> database
>> > that that URL is not malicious.  on top of the end-user benefit, we hope
>> to
>> > eventually provide all developers with aggregate usage data around your
>> > applications such as the number of clicks people make on URLs you
>> display
>> > (it will, of course, be in aggregate and not identifiable manner).
>> > additionally, we want to be able to build services and APIs that can
>> make
>> > algorithmic recommendations to users based on the content they are
>> > consuming. gathering the data from t.co will help make these possible.
>> >
>> > our current plan is that no user will see a t.co URL on twitter.com but
>> we
>> > still have some details to work through. the links will still be
>> displayed
>> > as they were sent in, but the target of the link will be the t.co link
>> > instead. and, we want to provide the same ability to display original
>> links
>> > to developers. we're going to use the entities attribute to make this
>> > possible.
>> >
>> > let's say i send out the following tweet: "you have to check out
>> > http://dev.twitter.com!"
>> >
>> > a returned (and truncated) status object may look like:
>> >
>> > {
>> >   "text" : "you have to check out http://t.co/s9gfk2d4!",
>> >   ...
>> >   "user" : {
>> >     "screen_name" : "raffi",
>> >     ...
>> >   },
>> >   ...
>> >   "entiti