Hi,

I've searched this and other app engine groups as well as general
googling and I haven't found a solution, so posting here.  I'm writing
an app to provide a web service API on top of the free the USDA
nutrition database, and it's a fairly large data set - a few of the
tables have 500K rows due to the many-to-many relationships between
food and nutrients.  Any suggestions for a more efficient way to get
this data into the database than the vanilla bulk loader?  That's
taking hours to complete and running up against my CPU limit.  The
data is separated into CSV files by table, and the relationships
between tables in the CSVs are foreign key strings that make it
straightforward to generate a db.Key for the relationship.
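To make that concrete, here's roughly the mapping step, simplified down to plain Python (the column names and the key-name scheme here are just illustrative, not the actual USDA export headers; in the real loader the composed name would be passed to db.Key.from_path()):

```python
import csv
from io import StringIO

def key_name(*parts):
    # Compose a deterministic key name from the CSV foreign-key strings,
    # e.g. ("01001", "203") -> "01001_203".  Feeding this name to
    # db.Key.from_path(kind, name) yields a stable db.Key for the
    # many-to-many row without an extra datastore lookup.
    return "_".join(parts)

# Rows in the shape of the food<->nutrient join table
# (food number, nutrient number, amount); headers are made up here.
sample = "ndb_no,nutr_no,value\n01001,203,0.85\n01001,204,81.11\n"

rows = list(csv.DictReader(StringIO(sample)))
names = [key_name(r["ndb_no"], r["nutr_no"]) for r in rows]
# names -> ["01001_203", "01001_204"]
```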

I know I can buy more CPU, but that seems a little goofy since this is
just the initial data load, and, unless my app becomes extremely
popular, I'm likely not going to hit the limit again.  If nothing
else, I suppose I could split the data set and just do this over
multiple days, though if I ever have to load the data again due to a
schema change or something that's a serious annoyance.

I do want to use Python for this.  It's been long enough since I've
used Java that there'd be a learning curve to start back up with it,
and I like Python.

Thanks,
Matt

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~----------~----~----~----~------~----~------~--~---