Hi, I've searched this and other App Engine groups, as well as doing some general googling, and haven't found a solution, so I'm posting here. I'm writing an app to provide a web service API on top of the free USDA nutrition database, and it's a fairly large data set: a few of the tables have 500K rows due to the many-to-many relationships between foods and nutrients. Any suggestions for a more efficient way to get this data into the datastore than the vanilla bulk loader? That's taking hours to complete and running up against my CPU limit. The data is separated into CSV files by table, and the relationships between tables in the CSVs are foreign-key strings, which makes it straightforward to generate a db.Key for each relationship.
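To illustrate what I mean by deriving keys from the foreign-key strings, here's a rough sketch (file layout, column order, and the "f" prefix are made up; the real USDA files have more columns):

```python
import csv
import io

# Toy stand-in for one of the USDA CSV files: each row's first
# column is the food identifier that other tables reference.
SAMPLE_CSV = "01001,Butter salted\n01002,Butter whipped\n"

def key_name_for(food_id):
    # App Engine key_names may not begin with a digit, so prefix
    # the raw USDA id string with a letter.
    return "f%s" % food_id

def rows_with_key_names(csv_text):
    """Pair each CSV row with a deterministic key_name derived from
    its identifier column, so related entities can be wired up
    (e.g. via db.Key.from_path('Food', key_name)) without any
    datastore lookups during the load."""
    reader = csv.reader(io.StringIO(csv_text))
    return [(key_name_for(row[0]), row[1:]) for row in reader]
```

Since the key_names are computed purely from the CSV contents, both sides of a many-to-many row can be built before anything is in the datastore.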
I know I can buy more CPU, but that seems a little goofy since this is just the initial data load, and, unless my app becomes extremely popular, I'm unlikely to hit the limit again. If nothing else, I suppose I could split the data set and do the load over multiple days, though if I ever have to reload the data due to a schema change or the like, that would be a serious annoyance. I do want to use Python for this: it's been long enough since I've used Java that there'd be a learning curve to start back up with it, and I like Python.

Thanks,
Matt

You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en
