Sure. You're pretty darn close to what I said.

But that's an important point: I doubt that any of these list providers (phishing, email blocklists, RSS feeds, etc.) want you to actually hit their server every time you receive a request of your own. Most of them expect you to pull their data occasionally and cache it locally, and they will even block you if you request it too often.
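A minimal sketch of that pull-and-cache pattern in Python (the cache path, feed URL, and 24-hour window below are all assumptions, not anything the list providers mandate):

```python
import os
import time
import urllib.request

CACHE_PATH = "phishtank_feed.xml"   # hypothetical local cache file
FEED_URL = "http://data.phishtank.com/data/online-valid.xml"  # example feed URL
MAX_AGE_SECONDS = 24 * 60 * 60      # re-fetch at most once a day

def cache_is_fresh(path, max_age, now=None):
    """Return True if the cached copy exists and is younger than max_age."""
    if not os.path.exists(path):
        return False
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) < max_age

def get_feed():
    """Serve the list from disk, hitting the provider only when the cache is stale."""
    if not cache_is_fresh(CACHE_PATH, MAX_AGE_SECONDS):
        urllib.request.urlretrieve(FEED_URL, CACHE_PATH)
    with open(CACHE_PATH, "rb") as f:
        return f.read()
```

Every local request then reads the cached file; the provider sees at most one download per day instead of one per lookup.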

Al


On 12/16/2010 12:07 PM, Jason King wrote:
Gotcha.

So rather than hitting the original XML doc, maybe create a custom-tailored SQL table that holds only the info I need, and just import the XML data into that? I was actually considering that...  This way, I could write all my searching and such in standard SQL.
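That import step could look something like this sketch, using SQLite for brevity. The element names in the sample XML are assumptions about the feed's layout, not the real PhishTank schema:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical slice of the feed; the real document's tags may differ.
SAMPLE_XML = """
<entries>
  <entry><phish_id>1</phish_id><url>http://bad.example/login</url></entry>
  <entry><phish_id>2</phish_id><url>http://evil.example/paypal</url></entry>
</entries>
"""

def load_into_sql(xml_text, conn):
    """Flatten just the fields we care about into a small lookup table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS phish (phish_id INTEGER PRIMARY KEY, url TEXT)")
    root = ET.fromstring(xml_text)
    rows = [(int(e.findtext("phish_id")), e.findtext("url"))
            for e in root.findall("entry")]
    conn.executemany(
        "INSERT OR REPLACE INTO phish (phish_id, url) VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load_into_sql(SAMPLE_XML, conn)
# From here on, all searching is plain SQL against the local table.
hit = conn.execute("SELECT phish_id FROM phish WHERE url = ?",
                   ("http://bad.example/login",)).fetchone()
```

Once the data is in the table, matching, counting, and reporting are ordinary SQL queries, with no XML parsing on the request path.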

-jason

On Thu, Dec 16, 2010 at 1:40 PM, Alan Holden <[email protected]> wrote:
 So, pull and parse the PhishTank document every day or so. XML, JSON, whatever floats your boat. It'll be an offline process.
Store ALL of the document's data in a data object for later.
Then take out just the URLs - all of the unique ones - and store them in memory as an application-scoped list or array.
Each time a URL is submitted, scan the in-memory list for a match.
If you find a match, go back to the full data, find out why it's listed, and act on that.
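The steps above can be sketched like so; the record fields and URLs are illustrative placeholders, not the actual feed format:

```python
# Pretend these came from the daily parse of the feed (step 1 and 2);
# the field names here are assumptions for illustration.
records = [
    {"url": "http://bad.example/login", "phish_id": 1, "verified": "yes"},
    {"url": "http://evil.example/paypal", "phish_id": 2, "verified": "yes"},
]

# Step 3: keep just the unique URLs in memory for fast scanning,
# plus a map back to the full record for the "find out why" step.
url_set = {r["url"] for r in records}
details = {r["url"]: r for r in records}

def check(url):
    """Step 4 and 5: return the full record on a match, else None."""
    if url in url_set:
        return details[url]
    return None
```

Using a set for the membership test keeps the per-request cost constant, so the expensive work (downloading and parsing) stays in the daily offline job.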

--
Open BlueDragon Public Mailing List
http://www.openbluedragon.org/ http://twitter.com/OpenBlueDragon
official manual: http://www.openbluedragon.org/manual/
Ready2Run CFML http://www.openbluedragon.org/openbdjam/
 
mailing list - http://groups.google.com/group/openbd?hl=en
