Just an update on ways to get a large dataset up to Heroku. I saw that they're working on an update to Taps to make it work better with large datasets. In the meantime, however, here's a hack that essentially lets you upload only individual tables (or parts of tables) with Taps.
Taps does not clear out its target database on either a push or a pull; it just tries to insert everything it receives. If an incoming row duplicates a value under a uniqueness constraint such as a primary key, Taps errors out; otherwise it will presumably add duplicate entries. We can use this to our advantage when pushing a large database up to Heroku: if you do a push with nothing in your local database other than your new entries, Taps will happily send those up, seamlessly merging them into the Heroku database.

I still haven't had the chance to upload the large datasets that were the original topic of this thread, but I have confirmed that the following procedure works on a small test set:

1. Get a full copy of the current database. This is only necessary if the tables you are adding to have primary keys or other unique fields. Having the existing rows in the local database prevents newly added entries from conflicting with them.
2. Add whatever you need to the tables, then remove all entries that are already on the application. Entire tables can be removed; Taps seems to function fine without a full set of tables being pushed.
3. Run db:push, which will now send only your new information.

You can break up your new information into chunks and send them up in batches to avoid a single overly large push.

While a push is going on, it's important to prevent the live app from doing anything that might add entries to the table being pushed into, or you might get primary key conflicts.

On Mar 15, 6:26 pm, Terence Lee <[email protected]> wrote:
> What I've been told for tmp folder size is about 1gb but it's a soft
> limit.
>
> -Terence
>
> On Mon, 2010-03-15 at 08:52 -0700, Mike wrote:
> > Also in that link you linked to:
> > Slug Size: 500MB - Hard
> >
> > Man, that probably includes the temp directory.
> >
> > Getting data onto Heroku is pretty friggin hard....
> > On Mar 15, 9:27 am, Daniele <[email protected]> wrote:
> > > On 15 Mar, 01:13, Mike <[email protected]> wrote:
> > > > That's a really good idea on having a controller take the upload into
> > > > temp. Is there a size limit on the temp directory?
> > >
> > > I don't know, sorry.
> > >
> > > > Or a time limit on
> > > > how long a dyno can be locked to a single upload before being
> > > > restarted?
> > >
> > > "Request Length: 30 seconds - Hard" (http://legal.heroku.com/aup)
> > >
> > > A bit low timeout for a middle-size upload...
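
P.S. For anyone who wants to see the "remove what's already there, then push in batches" part of the procedure spelled out, here is a rough Python sketch. It only covers the local bookkeeping and does not call Taps at all; the push itself would still be done with db:push. Everything in it is an assumption made for illustration: the in-memory SQLite database, the `items` table, the integer `id` primary key, and the helper names `remove_already_pushed` and `id_batches` are all hypothetical stand-ins for whatever your real schema looks like.

```python
import sqlite3


def remove_already_pushed(conn, table, pushed_ids):
    """Delete rows whose primary key already exists on the remote app."""
    conn.executemany(
        f"DELETE FROM {table} WHERE id = ?",
        [(row_id,) for row_id in pushed_ids],
    )
    conn.commit()


def id_batches(conn, table, batch_size):
    """Return the remaining primary keys split into lists of at most batch_size."""
    ids = [row[0] for row in conn.execute(f"SELECT id FROM {table} ORDER BY id")]
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]


# Hypothetical local database standing in for the one being pushed from.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

# Rows 1-3 represent what a full copy of the live database brought down.
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(i, f"old-{i}") for i in (1, 2, 3)],
)
remote_ids = [row[0] for row in conn.execute("SELECT id FROM items")]

# New rows get keys that cannot collide, because the old rows are still present.
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"new-{n}",) for n in range(5)],
)

# Strip everything the app already has, leaving only the new rows to push.
remove_already_pushed(conn, "items", remote_ids)

# Each inner list is one chunk of new rows to send up in its own push.
new_batches = id_batches(conn, "items", batch_size=2)
print(new_batches)  # [[4, 5], [6, 7], [8]]
```

The point of inserting the new rows while the old ones are still in the table is that the database hands out keys above the existing ones, so nothing collides once the old rows are deleted and only the new ones are pushed.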
