We're using MySQL. I found out that this file is really only sent out
quarterly. I suggested that he use some sort of shell script (we're on
Linux) to parse and import the file.

<!----------------//------
andy matthews
web developer
certified advanced coldfusion programmer
ICGLink, Inc.
[EMAIL PROTECTED]
615.370.1530 x737
--------------//--------->

-----Original Message-----
From: Jim Davis [mailto:[EMAIL PROTECTED]
Sent: Monday, July 17, 2006 2:57 PM
To: CF-Talk
Subject: RE: Parsing Extra-large XML files?


> -----Original Message-----
> From: Andy Matthews [mailto:[EMAIL PROTECTED]
> Sent: Monday, July 17, 2006 2:20 PM
> To: CF-Talk
> Subject: Parsing Extra-large XML files?
>
> I've got a co-worker who is working with a client that receives product
> information from a third-party vendor in XML format. He needs to take the
> information contained in the XML file and dump it into our local database.
>
> The XML file is 50 megs and contains lots of "columns". He's been trying to
> dump the file into memory using XMLParse but CF keeps crapping out on him.

What does "crapping out" mean?

CF should, I think, handle this just fine (it'll take a while of course).
There are some potential issues however:

+) Are you sure that the XML file is valid?  ColdFusion will definitely crap
out if the file is corrupt.

+) Are you giving CF enough time?  By default CF kills what it suspects are
"hanging" threads.  What this means is that any thread taking longer than
the set timeout (default is 60 seconds I believe) will be killed.  Set the
timeout high enough or try turning it off completely for this task and see
if that helps.

+) Is the machine he's trying this on underpowered for the task?  Even a
simple XML file is represented internally by a significantly complex
structure.  If the machine is too small (especially if it has too little
memory) it could have problems.
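
One way to rule out the first point without involving CF at all is to stream
through the file and see whether a parser can get to the end of it.  Below is
a rough sketch in Python (not CF), assuming Python is available on the box.
The function name check_well_formed is made up for illustration.  It reads the
file incrementally rather than building the whole document in memory, and it
only checks that the XML is well formed, not that it matches the vendor's
schema.

```python
import xml.etree.ElementTree as ET


def check_well_formed(source):
    """Stream through an XML file (path or file object).

    Returns (True, None) if the parser reaches the end of the document,
    or (False, message) with the parser's error message otherwise.
    """
    try:
        # iterparse reads the input incrementally instead of loading it whole.
        for _event, elem in ET.iterparse(source, events=("end",)):
            # Drop the element's text and children once it has been seen,
            # so finished records do not pile up in memory.
            elem.clear()
    except ET.ParseError as err:
        return False, str(err)
    return True, None
```

Calling check_well_formed on the vendor file's path would return a pair like
(False, "...") carrying the parser's message if the file is truncated or has
mismatched tags, which is a lot more specific than "crapping out".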

> Does anyone have any ideas what my co-worker can do with this file?

Why do you have to do this in CF?  Most databases (and all enterprise DBs
that I know of) will import XML directly with just a little coaxing.

SQL Server, for example, has a "Bulk XML Load" component.  You give it the
XML file and a schema for it (which matches a DB table) and run a command
line utility to import.  Here's an MSKB article on it:

http://support.microsoft.com/Default.aspx?scid=316005

The third party vendor should be able to give you the schema but you'll have
to modify it a bit to add the SQL relationships.

Actually, is this something that needs to be done automatically at all?  If
this is only a development one-off then there are probably even easier ways
to deal with it.

Later versions of MS Office Excel or Access will, for example, import XML
files directly.  You may have to fiddle with them but they do a pretty good
job.  From there you can set up a CF datasource directly or export from them
into something your dev database can handle directly (CSV for example).

To skip that step you could run an XSL filter on the file to convert it to
something more usable as well.  XSL isn't super approachable (at least to
me) but it's not rocket science.
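
If XSL feels like too much, the same kind of conversion can be done with a
small script instead.  This is a rough sketch in Python rather than XSL, so it
is a swapped-in technique, not the filter described above.  The function name
xml_to_csv and the record_tag and fields parameters are made up for
illustration; the real element names would have to come from the vendor's
file.  It assumes each record is a single element whose values sit in direct
child elements.

```python
import csv
import xml.etree.ElementTree as ET


def xml_to_csv(source, out, record_tag, fields):
    """Write one CSV row per <record_tag> element found in source.

    source: path or binary file object containing the XML
    out: text file object to write CSV to
    record_tag: hypothetical name of the element that wraps one record
    fields: hypothetical names of the child elements to export, in order
    Returns the number of data rows written.
    """
    writer = csv.writer(out)
    writer.writerow(fields)  # header row
    count = 0
    # Stream the input so the whole document never has to fit in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            # A missing or empty child element becomes an empty CSV cell.
            writer.writerow([elem.findtext(name, default="") for name in fields])
            count += 1
            elem.clear()  # release this record's children before moving on
    return count
```

The resulting CSV can then go through whatever bulk loader the database
offers, which keeps CF out of the picture entirely.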

In any case don't involve CF unless you NEED to for some reason.  A modern
database is more tuned for the work and cutting out unneeded middlemen is
always a good thing.

Jim Davis




Archive: 
http://www.houseoffusion.com/cf_lists/message.cfm/forumid:4/messageid:246820