Looks like there are approx. 830k entries in the XML file. Some have only a few attributes; others have hundreds.
So basically, every time I scan the doc it has to parse through nearly 1 million unique parent elements.

On Thu, Dec 16, 2010 at 12:13 PM, Jason King wrote:
> Here is a link to the XML file.
>
> http://data.phishtank.com/data/online-valid.xml
>
> It is approx. 15 MB of entries like this:
>
> <output>
>   <meta>
>     <generated_at>2009-06-19T16:18:40+00:00</generated_at>
>     <total_entries>1234</total_entries>
>   </meta>
>   <entries>
>     <entry>
>       <url><![CDATA[http://www.example.com/]]></url>
>       <phish_id>123456</phish_id>
>       <phish_detail_url>http://www.phishtank.com/phish_detail.php?phish_id=123456</phish_detail_url>
>       <details>
>         <detail>
>           <ip_address>1.2.3.4</ip_address>
>           <cidr_block>1.2.3.0/24</cidr_block>
>           <announcing_network>1234</announcing_network>
>           <rir>arin</rir>
>           <detail_time>2009-06-20T15:37:31+00:00</detail_time>
>         </detail>
>       </details>
>       <submission>
>         <submission_time>2009-06-19T15:15:47+00:00</submission_time>
>       </submission>
>       <verification>
>         <verified>yes</verified>
>         <verification_time>2009-06-19T15:37:31+00:00</verification_time>
>       </verification>
>       <status>
>         <online>yes</online>
>       </status>
>       <target>1st National Example Bank</target>
>     </entry>
>     ...
>   </entries>
> </output>
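
One way to avoid walking the whole tree on every scan might be to stream through the file once and build a lookup keyed by phish_id, then query that lookup instead of the document. Below is a minimal sketch of that idea in Python rather than CFML. It assumes the element names from the quoted sample; index_entries and SAMPLE are hypothetical names used only for illustration, and the real feed may need more fields than url.

```python
import io
import xml.etree.ElementTree as ET


def index_entries(source):
    """Stream the feed once and return a {phish_id: url} lookup."""
    index = {}
    # iterparse yields each element as its closing tag is read,
    # so the full document never has to be held as one tree.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "entry":
            index[elem.findtext("phish_id")] = elem.findtext("url")
            # Free this entry's subtree; the emptied element itself
            # stays attached to its parent.
            elem.clear()
    return index


# Tiny stand-in for the real feed (assumption: same element names).
SAMPLE = b"""<output>
  <entries>
    <entry>
      <url><![CDATA[http://www.example.com/]]></url>
      <phish_id>123456</phish_id>
    </entry>
  </entries>
</output>"""

index = index_entries(io.BytesIO(SAMPLE))
print(index["123456"])
```

With the lookup built once, each later check is a dictionary access rather than another pass over every entry. Whether that fits depends on how often the feed is refreshed and which fields the scan actually needs.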
