It looks like there are approximately 830k entries in the XML file. Some have
only a few attributes; others have hundreds.

So basically, every time I scan the doc, it has to parse through nearly
1 million unique parent elements.
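
One way to avoid walking the whole tree on every scan is to stream the file and handle one <entry> at a time. The sketch below is only an illustration, not OpenBD/CFML code: it uses Python's standard xml.etree.ElementTree.iterparse, the element names come from the sample quoted below, and count_entries and SAMPLE are hypothetical names made up for the example.

```python
import io
import xml.etree.ElementTree as ET

def count_entries(source):
    """Stream the feed and count <entry> elements (hypothetical helper).

    Returns (total entries, entries whose <status><online> is "yes").
    """
    total = 0
    online = 0
    # "end" events fire once an element and all of its children are parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "entry":
            total += 1
            if elem.findtext("status/online") == "yes":
                online += 1
            # Drop this entry's children so memory does not grow with
            # every entry; the emptied <entry> stays attached to its parent.
            elem.clear()
    return total, online

# Tiny stand-in for the real feed, following the structure of the sample.
SAMPLE = b"""<output>
  <entries>
    <entry>
      <url><![CDATA[http://www.example.com/]]></url>
      <phish_id>123456</phish_id>
      <status><online>yes</online></status>
    </entry>
    <entry>
      <url><![CDATA[http://www.example.org/]]></url>
      <phish_id>123457</phish_id>
      <status><online>no</online></status>
    </entry>
  </entries>
</output>"""

if __name__ == "__main__":
    print(count_entries(io.BytesIO(SAMPLE)))
```

For the real file, count_entries could be given the path to a downloaded copy instead of the in-memory sample. Whether an equivalent streaming parser is practical from CFML on OpenBD is a separate question that this sketch does not answer.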

On Thu, Dec 16, 2010 at 12:13 PM, Jason King <[email protected]> wrote:

> Here is a link to the XML file.
>
> http://data.phishtank.com/data/online-valid.xml
>
> It is approx. 15 MB of entries like this:
>
> <output>
>       <meta>
>               <generated_at>2009-06-19T16:18:40+00:00</generated_at>
>               <total_entries>1234</total_entries>
>       </meta>
>       <entries>
>               <entry>
>                       <url><![CDATA[http://www.example.com/]]></url>
>                       <phish_id>123456</phish_id>
>                       <phish_detail_url>http://www.phishtank.com/phish_detail.php?phish_id=123456</phish_detail_url>
>                       <details>
>                               <detail>
>                                       <ip_address>1.2.3.4</ip_address>
>                                       <cidr_block>1.2.3.0/24</cidr_block>
>                                       <announcing_network>1234</announcing_network>
>                                       <rir>arin</rir>
>                                       <detail_time>2009-06-20T15:37:31+00:00</detail_time>
>                               </detail>
>                       </details>
>                       <submission>
>                               <submission_time>2009-06-19T15:15:47+00:00</submission_time>
>                       </submission>
>                       <verification>
>                               <verified>yes</verified>
>                               <verification_time>2009-06-19T15:37:31+00:00</verification_time>
>                       </verification>
>                       <status>
>                               <online>yes</online>
>                       </status>
>                       <target>1st National Example Bank</target>
>               </entry>
>               ...
>       </entries>
> </output>
>
>
>

-- 
Open BlueDragon Public Mailing List
 http://www.openbluedragon.org/   http://twitter.com/OpenBlueDragon
 official manual: http://www.openbluedragon.org/manual/
 Ready2Run CFML http://www.openbluedragon.org/openbdjam/

 mailing list - http://groups.google.com/group/openbd?hl=en
