[PHP] Connect to Google
I'm a teacher. I want to use PHP to interface with Google and see if a student has plagiarized. I don't see many open-source projects on the subject, so I want to create my own script.

How can I use PHP to interface with Google and see if this text exists on the internet? If this is possible, I need some ideas on how to parse the text and input it into Google. Then I might like to get a percentage idea of how this text compares to a site that Google has indexed.

$SampleText = Lorem ipsum dolor sit amet, test link adipiscing elit. Nullam dignissim convallis est. Quisque aliquam. Donec faucibus. Nunc iaculis suscipit dui. Nam sit amet sem. Aliquam libero nisi, imperdiet at, tincidunt nec, gravida vehicula, nisl. Praesent mattis, massa quis luctus fermentum, turpis mi volutpat justo, eu volutpat enim diam eget metus. Maecenas ornare tortor. Donec sed tellus eget sapien fringilla nonummy. Mauris a ante. Suspendisse quam sem, consequat at, commodo vitae, feugiat in, nunc. Morbi imperdiet augue quis tellus.

John

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Connect to Google
> I'm a teacher. I want to use PHP to interface with Google and see if a student has plagiarized.

Hi. Why not just enter the suspected text into a search engine and see if any close matches come up? If you use the advanced search tools you can choose verbatim and see if the exact phrase matches.

If that's not good enough, can you explain how you would like it to function? Would the whole paper be scanned phrase-by-phrase for matches and then spit out a report?

Marc
Re: [PHP] Connect to Google
On Wed, 2012-02-15 at 21:56 -0500, John Taylor-Johnston wrote:
> I'm a teacher. I want to use PHP to interface with Google and see if a student has plagiarized. I don't see many open-source projects on the subject, so I want to create my own script.
> How can I use PHP to interface with Google and see if this text exists on the internet? If this is possible, I need some ideas on how to parse the text and input it into Google. Then I might like to get a percentage idea of how this text compares to a site that Google has indexed.
> $SampleText = Lorem ipsum dolor sit amet, test link adipiscing elit. Nullam dignissim convallis est. Quisque aliquam. Donec faucibus. Nunc iaculis suscipit dui. Nam sit amet sem. Aliquam libero nisi, imperdiet at, tincidunt nec, gravida vehicula, nisl. Praesent mattis, massa quis luctus fermentum, turpis mi volutpat justo, eu volutpat enim diam eget metus. Maecenas ornare tortor. Donec sed tellus eget sapien fringilla nonummy. Mauris a ante. Suspendisse quam sem, consequat at, commodo vitae, feugiat in, nunc. Morbi imperdiet augue quis tellus.
> John

Wow, that's a pretty big project you're chewing on there. A quick search shows that there are some projects out there to detect plagiarism, but I think anything of university calibre requires a hefty sum of money.

To get a rough idea, you could break a text into sentences, and then query each one of those to see if it occurs verbatim. You can use cURL to grab search results pages for this sort of thing; there's no need for a special interface. There are a few things to bear in mind though:

* Google's terms and conditions may prohibit using their search engine like this, or may impose a limit on how much you can do this.
* Some sentences will be intentionally copied, as quotes. Maybe add some sort of check against the source to see if the sentence is in a quote context.
* What if only part of a sentence is copied? Maybe after you've searched for exact matches from the sentences in the source, you could remove them from the source, then re-check every remaining sentence against Google's fuzzy search. It may produce many false positives though.

There are plenty of other factors too. Students may copy from books which don't exist in a search engine's archives, and some subjects may unintentionally result in the same wording, particularly technical subjects, which tend to be removed from more creative and flowery description.

--
Thanks,
Ash
http://www.ashleysheridan.co.uk
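The "break a text into sentences" step above could look roughly like the sketch below. The function name splitIntoSentences and the minimum-word threshold are assumptions made for illustration, not anything proposed in this thread, and the regular expression is a naive splitter that will trip over abbreviations such as "e.g.".

```php
<?php
// Hypothetical helper: naive sentence splitter.
// Splits wherever ., ! or ? is followed by whitespace.
function splitIntoSentences(string $text, int $minWords = 4): array
{
    $parts = preg_split('/(?<=[.!?])\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $sentences = [];
    foreach ($parts as $part) {
        $part = trim($part);
        // Skip very short fragments: they match far too many pages to be useful.
        if (str_word_count($part) >= $minWords) {
            $sentences[] = $part;
        }
    }
    return $sentences;
}

$sample = 'Nullam dignissim convallis est. Quisque aliquam. '
        . 'Donec sed tellus eget sapien fringilla nonummy.';

foreach (splitIntoSentences($sample) as $sentence) {
    // Each sentence would become one exact-phrase query.
    echo $sentence, PHP_EOL;
}
```

With the sample above, the two-word sentence "Quisque aliquam." is dropped and the other two are kept. The threshold of four words is arbitrary; a real checker would probably want to tune it.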
Re: [PHP] Connect to Google
> If you use the advanced search tools you can choose verbatim and see if the exact phrase matches.

Just correcting myself here: the way to do this is by simply wrapping the words in quotes, like this: "hey now". The verbatim tool is something else.

Marc
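As a small illustration of the quoting trick, here is one way to build an exact-phrase search URL in PHP. The helper name buildExactPhraseUrl is made up for this sketch; the only behaviour relied on is that http_build_query() URL-encodes the value, quotes included.

```php
<?php
// Hypothetical helper: turn a phrase into an exact-match search URL.
function buildExactPhraseUrl(string $phrase): string
{
    // Remove double quotes inside the phrase so they cannot end the quoted query early.
    $clean = str_replace('"', '', $phrase);
    // Wrap the phrase in quotes, then let http_build_query() encode it.
    return 'https://www.google.com/search?' . http_build_query(['q' => '"' . $clean . '"']);
}

echo buildExactPhraseUrl('hey now'), PHP_EOL;
// prints https://www.google.com/search?q=%22hey+now%22
```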
Re: [PHP] Connect to Google
Can I use PHP to interface with Google? Any possible examples of this? Let's start with the first step. :)

I'm sure proprietary sites like http://www.compilatio.net/, for example, connect to search engines. They cannot be crawling the net too. That would be crazy.

(I'm a top quoter. It's more intuitive.)

Thanks Ash.
John

Ashley Sheridan wrote:
> On Wed, 2012-02-15 at 21:56 -0500, John Taylor-Johnston wrote:
> > How can I use PHP to interface with Google and see if this text exists on the internet?
>
> Wow, that's a pretty big project you're chewing on there. A quick search shows that there are some projects out there to detect plagiarism, but I think anything of university calibre requires a hefty sum of money.
Re: [PHP] Connect to Google
I'm a top quoter.

I would parse the text first. Phrase by phrase, or phrase segments. Then spit out a report.

Marc Guay wrote:
> If that's not good enough, can you explain how you would like it to function? Would the whole paper be scanned phrase-by-phrase for matches and then spit out a report?
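The "spit out a report" step, together with the percentage asked for in the original question, could be sketched as follows. Everything here is hypothetical: the $results array stands in for whatever the search step returns, and the match flags are invented for the example.

```php
<?php
// Hypothetical report: $results maps each checked sentence to
// true (an exact match was found online) or false (no match).
function matchPercentage(array $results): float
{
    if (count($results) === 0) {
        return 0.0;
    }
    // array_filter() without a callback keeps only the truthy entries.
    $matched = count(array_filter($results));
    return round($matched / count($results) * 100, 1);
}

// Made-up flags, for illustration only.
$results = [
    'Nullam dignissim convallis est.'                 => true,
    'Donec sed tellus eget sapien fringilla nonummy.' => false,
    'Morbi imperdiet augue quis tellus.'              => true,
    'Maecenas ornare tortor.'                         => false,
];

foreach ($results as $sentence => $found) {
    printf("[%s] %s\n", $found ? 'MATCH' : 'ok', $sentence);
}
printf("%.1f%% of sentences had an exact match\n", matchPercentage($results));
```

A per-sentence percentage like this is crude: it treats a quoted, cited sentence the same as a copied one, so it is a starting point for a human reader rather than a verdict.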
Re: [PHP] Connect to Google
On Thu, 2012-02-16 at 14:47 -0500, John Taylor-Johnston wrote:
> Can I use PHP to interface with Google? Any possible examples of this? Let's start with the first step. :)
> I'm sure proprietary sites like http://www.compilatio.net/, for example, connect to search engines. They cannot be crawling the net too. That would be crazy.
> (I'm a top quoter. It's more intuitive.)
> Thanks Ash.
> John
>
> Ashley Sheridan wrote:
> > On Wed, 2012-02-15 at 21:56 -0500, John Taylor-Johnston wrote:
> > > How can I use PHP to interface with Google and see if this text exists on the internet?
> >
> > Wow, that's a pretty big project you're chewing on there. A quick search shows that there are some projects out there to detect plagiarism, but I think anything of university calibre requires a hefty sum of money.

It might seem more intuitive to you, but it really, really screws up the archives.

Like I said before, cURL is the way to interface with Google. Basically, cURL can be used to request resources, in this case a web page, from the web. You can call a URL and parse the page of results to determine whatever you need to. As you've not really hashed out any firm ideas of what exactly you want, it's a little difficult to say exactly what you need to do.

--
Thanks,
Ash
http://www.ashleysheridan.co.uk
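For reference, requesting a page with PHP's cURL extension looks roughly like this. It is only a sketch: fetchPage is a made-up helper name, the user-agent string is invented, and the script takes the URL from the command line rather than assuming any particular search address. As noted earlier in the thread, Google's terms may not allow fetching its results pages this way, so check them before pointing this at a search engine.

```php
<?php
// Hypothetical helper: fetch a URL with cURL and return the body, or null on failure.
function fetchPage(string $url, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,            // follow redirects
        CURLOPT_TIMEOUT        => $timeoutSeconds, // give up after this many seconds
        CURLOPT_USERAGENT      => 'plagiarism-check-sketch/0.1', // made-up identifier
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // Treat transport errors and non-200 responses as failure.
    if ($body === false || $status !== 200) {
        return null;
    }
    return $body;
}

// Command-line usage: php fetch.php <url>
if (isset($argv[1])) {
    $html = fetchPage($argv[1]);
    echo $html === null ? "request failed\n" : strlen($html) . " bytes received\n";
}
```

The returned HTML still has to be parsed to find the results, and that markup can change at any time, which is one reason a documented API is usually less fragile than scraping.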
Re: [PHP] Connect to Google
On Thu, 2012-02-16 at 14:50 -0500, John Taylor-Johnston wrote:
> I'm a top quoter. I would parse the text first. Phrase by phrase, or phrase segments. Then spit out a report.
>
> Marc Guay wrote:
> > If that's not good enough, can you explain how you would like it to function? Would the whole paper be scanned phrase-by-phrase for matches and then spit out a report?

You might be a top quoter but, please, to get the best from this list and not annoy people, post at the bottom. The list gets archived online at many places, and it's annoying to read things in this order:

reply 4
reply 2
question
reply 1
reply 3

Almost every email client I know of allows bottom posting. This is just one of the rules of this list. Please don't be offended, but do try to keep to the rules: it keeps everyone happy, and happy people are helpful people!

--
Thanks,
Ash
http://www.ashleysheridan.co.uk
Re: [PHP] Connect to Google
2012/2/16 John Taylor-Johnston jt.johns...@usherbrooke.ca:
> Can I use PHP to interface with Google? Any possible examples of this?

There's the Google Custom Search API: http://code.google.com/intl/nl-NL/apis/customsearch/v1/overview.html

It interfaces in JSON, and PHP has had JSON functions included since PHP 5.2 [1]. It's free up to 100 queries a day; after that you have to pay $5 per 1000 queries.

- Matijn

[1] www.php.net/json
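A rough idea of what the JSON route involves: build a request URL with an API key and search engine ID, then decode the response with json_decode(). The function names, the placeholder credentials, and the sample response below are all invented for illustration; the sample is a trimmed-down guess at the shape of a reply (an "items" list whose entries have a "link"), so check the API reference for the real fields before relying on it.

```php
<?php
// Builds a request URL for the Custom Search JSON API.
// $apiKey and $engineId are placeholders for credentials obtained from Google.
function buildCustomSearchUrl(string $apiKey, string $engineId, string $query): string
{
    return 'https://www.googleapis.com/customsearch/v1?' . http_build_query([
        'key' => $apiKey,
        'cx'  => $engineId,
        'q'   => $query,
    ]);
}

// Extracts the result links from a JSON response; returns [] when there are no hits
// or the input is not valid JSON.
function extractLinks(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['items']) || !is_array($data['items'])) {
        return [];
    }
    return array_column($data['items'], 'link');
}

// Made-up response used only to illustrate the parsing step.
$sampleJson = '{"items":[{"title":"Example","link":"http://www.example.com/essay","snippet":"..."}]}';

echo buildCustomSearchUrl('YOUR_API_KEY', 'YOUR_ENGINE_ID', '"hey now"'), PHP_EOL;
print_r(extractLinks($sampleJson));
```

An empty list from extractLinks() would mean no indexed page contains the quoted phrase, which is the signal a sentence-by-sentence check is looking for.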
Re: [PHP] Connect to Google
This is the first time I've been surprised that a Drupal module existed for something... http://drupal.org/project/authenticate
Re: [PHP] Connect to Google
Sort of off topic but here's a list of existing services (some of which are free) in case you don't want to reinvent the wheel. http://www.justfitstudio.com/articles/plagiarism-detection.html