Melvin Carvalho wrote:


On 17 July 2010 17:00, Kingsley Idehen wrote:

    Nathan wrote:

        So, after seeing this question on stack overflow...

        '''Geiitng Adresses of Contact Us page of any web site
        I want to capture address given on the contact us page. Is
        there any php script to do so. I am struck coz of it. my
        client want to store adresse of the web sites given on contact
        us page. I am able to get content from contact us page. but i
        am quite confuse how to get only address from this page.'''

        .. and preceded by seeing 50-100+ projects advertised (daily)
        to scrape contact details and create databases for spam
        purposes on most freelance websites - I quickly came to the
        realisation that with Linked Data, the tasks of these people
        just became a whole lot easier; indeed the data is all MRD for
        them and linked up to more.

        Thus, in addition to nudging at general awareness of these
        issues, I do wonder who (if anybody) is working on spam (and
        unethical usage) solutions for the web of data?


Agreed, it's quite easy to make a very effective spam filter using LOD

Entity -> Spam Rank

Unknown Entity -> Spam Rank = 0%

Has WebID -> Spam Rank = 50%

Has WebID in LOD (e.g. sindice) -> Spam Rank = 75%

WebID is 3 links away from you -> Spam Rank = 85%

WebID is 2 links away from you -> Spam Rank = 90%

WebID is one link away from you -> Spam Rank = 95%

It's not perfect, but you can go from zero to very good, in under a day ...
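
A minimal sketch of the tiers above, in Python. The function and parameter names (spam_rank, has_webid, in_lod_index, link_distance) are illustrative assumptions and do not come from any WebID or LOD library; the link distance is assumed to be computed elsewhere. It also assumes that the highest matching tier wins. Note that, as in the list, a higher percentage means a more trusted sender.

```python
from typing import Optional

# Percentages copied from the list above, keyed by link distance.
DISTANCE_RANK = {1: 95, 2: 90, 3: 85}


def spam_rank(has_webid: bool,
              in_lod_index: bool = False,
              link_distance: Optional[int] = None) -> int:
    """Return the rank (in percent) for an entity; highest matching tier wins."""
    if not has_webid:
        return 0  # unknown entity
    if link_distance in DISTANCE_RANK:
        return DISTANCE_RANK[link_distance]  # 1, 2 or 3 links away from you
    if in_lod_index:
        return 75  # WebID found in a LOD index (e.g. sindice)
    return 50  # has a WebID, nothing else known


if __name__ == "__main__":
    print(spam_rank(False))                          # unknown entity
    print(spam_rank(True))                           # WebID only
    print(spam_rank(True, in_lod_index=True))        # WebID in LOD
    print(spam_rank(True, True, link_distance=2))    # two links away
```

A receiver could then pick its own cut-off, which is where the subjectivity comes in: the same sender gets a different rank depending on whose links are used.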

Most important of all, the subjectivity of spam ranking is catered for when you have WebIDs and Linked Data in the mix.

One person's Spam is another's Ham :-)

Kingsley

        Best,

        Nathan



    Nathan,

    SPAM busting is something Linked Data will handle very well. A
    few moons ago when TimBL put out the GGG post [1], we had a little
    experiment whereby you could only comment if you were at least
    one degree of separation from an individual in his FOAF file. The
    post has zillions of readers and not a single SPAM comment :-)
    Sadly, the platform went down and the sole comment (mine) was lost :-(
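
A rough sketch of such a comment gate, in Python, assuming the foaf:knows links have already been fetched into a plain dict. The names (degrees_of_separation, may_comment), the example.org WebIDs and the max_depth cut-off are all hypothetical; the platform used for the experiment is not described here, so this is not its implementation.

```python
from collections import deque


def degrees_of_separation(knows, start, target, max_depth):
    """Breadth-first search over foaf:knows-style links.

    Returns the number of links from start to target, or None when the
    target is not reachable within max_depth links.
    """
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the allowed distance
        for friend in knows.get(node, ()):
            if friend == target:
                return depth + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))
    return None


def may_comment(knows, owner, commenter, max_depth=1):
    """Allow a comment only if the commenter is close enough to the owner."""
    return degrees_of_separation(knows, owner, commenter, max_depth) is not None


# Hypothetical WebIDs; a real deployment would read these from FOAF files.
OWNER = "https://example.org/owner#me"
ALICE = "https://example.org/alice#me"
BOB = "https://example.org/bob#me"
KNOWS = {OWNER: [ALICE], ALICE: [BOB]}

if __name__ == "__main__":
    print(may_comment(KNOWS, OWNER, ALICE))               # direct contact
    print(may_comment(KNOWS, OWNER, BOB))                 # friend of a friend
    print(may_comment(KNOWS, OWNER, BOB, max_depth=2))    # wider gate
```

The sketch leaves out the hard part, which is proving that the commenter actually controls the WebID; that is what the WebID protocol is for.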

    The WebID protocol emergence marks the beginning of the end for
    easy SPAM.

    Make note of this re. WebID protocol use cases as we continue our
    development of use-case collateral for the protocol :-)

    Links:

    1. http://dig.csail.mit.edu/breadcrumbs/node/215 -- this post used
    to have a single comment, platform upgrade lost the comment

--
    Regards,

    Kingsley Idehen
    President & CEO, OpenLink Software
    Web: http://www.openlinksw.com
    Weblog: http://www.openlinksw.com/blog/~kidehen
    Twitter/Identi.ca: kidehen
