Sure. You're pretty darn close to what I said.

But that's an important point: I doubt that any of these list providers (phishing, email blocklists, RSS feeds, etc.) want you to actually hit their server every time you receive a request of your own. Most of them expect you to pull their data occasionally and cache it locally, and they will even block you if you request it too often.
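A minimal sketch of that pull-and-cache pattern in Python (the cache path, feed URL, and 24-hour window below are all assumptions, not anything the list providers mandate):

```python
import os
import time
import urllib.request

CACHE_PATH = "phishtank_feed.xml"   # hypothetical local cache file
FEED_URL = "http://data.phishtank.com/data/online-valid.xml"  # example feed URL
MAX_AGE_SECONDS = 24 * 60 * 60      # re-fetch at most once a day

def cache_is_fresh(path, max_age, now=None):
    """Return True if the cached copy exists and is younger than max_age."""
    if not os.path.exists(path):
        return False
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) < max_age

def get_feed():
    """Serve the list from disk, hitting the provider only when the cache is stale."""
    if not cache_is_fresh(CACHE_PATH, MAX_AGE_SECONDS):
        urllib.request.urlretrieve(FEED_URL, CACHE_PATH)
    with open(CACHE_PATH, "rb") as f:
        return f.read()
```

Every local request then reads the cached file; the provider sees at most one download per day instead of one per lookup.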

Al


On 12/16/2010 12:07 PM, Jason King wrote:
Gotcha.

So rather than hitting the original XML doc, maybe create a custom-tailored SQL table that holds only the info I need, and just import the XML data into that? I was actually considering that...  This way, I could write all my searching and such in standard SQL.
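That import step could look something like this sketch, using SQLite for brevity. The element names in the sample XML are assumptions about the feed's layout, not the real PhishTank schema:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical slice of the feed; the real document's tags may differ.
SAMPLE_XML = """
<entries>
  <entry><phish_id>1</phish_id><url>http://bad.example/login</url></entry>
  <entry><phish_id>2</phish_id><url>http://evil.example/paypal</url></entry>
</entries>
"""

def load_into_sql(xml_text, conn):
    """Flatten just the fields we care about into a small lookup table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS phish (phish_id INTEGER PRIMARY KEY, url TEXT)")
    root = ET.fromstring(xml_text)
    rows = [(int(e.findtext("phish_id")), e.findtext("url"))
            for e in root.findall("entry")]
    conn.executemany(
        "INSERT OR REPLACE INTO phish (phish_id, url) VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load_into_sql(SAMPLE_XML, conn)
# From here on, all searching is plain SQL against the local table.
hit = conn.execute("SELECT phish_id FROM phish WHERE url = ?",
                   ("http://bad.example/login",)).fetchone()
```

Once the data is in the table, matching, counting, and reporting are ordinary SQL queries, with no XML parsing on the request path.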

-jason

On Thu, Dec 16, 2010 at 1:40 PM, Alan Holden <[email protected]> wrote:
 So, pull and parse the PhishTank document every day or so. XML, JSON, whatever floats your boat. It'll be an offline process.
Store ALL of the document's data in a data object for later.
Then take out just the URLs - all of the unique ones - and store them in memory as an application-scoped list or array.
Each time a URL is submitted, scan the in-memory list for a match.
If you find a match, go back to the full data, find out why it's listed, and act on that.
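The steps above can be sketched like so; the record fields and URLs are illustrative placeholders, not the actual feed format:

```python
# Pretend these came from the daily parse of the feed (step 1 and 2);
# the field names here are assumptions for illustration.
records = [
    {"url": "http://bad.example/login", "phish_id": 1, "verified": "yes"},
    {"url": "http://evil.example/paypal", "phish_id": 2, "verified": "yes"},
]

# Step 3: keep just the unique URLs in memory for fast scanning,
# plus a map back to the full record for the "find out why" step.
url_set = {r["url"] for r in records}
details = {r["url"]: r for r in records}

def check(url):
    """Step 4 and 5: return the full record on a match, else None."""
    if url in url_set:
        return details[url]
    return None
```

Using a set for the membership test keeps the per-request cost constant, so the expensive work (downloading and parsing) stays in the daily offline job.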

--
Open BlueDragon Public Mailing List
http://www.openbluedragon.org/ http://twitter.com/OpenBlueDragon
official manual: http://www.openbluedragon.org/manual/
Ready2Run CFML http://www.openbluedragon.org/openbdjam/
 
mailing list - http://groups.google.com/group/openbd?hl=en
