Just to chime in again here: there are many alternatives, but don't give up on MySQL straight away.
Solutions such as Cassandra, MongoDB, Hadoop clustering etc. are also lovely, but they all come at a cost (whether that be additional risk, development time, performance/feature trade-offs, the "unknown", etc.). Here is an interesting article by David Mytton over at BoxedIce, which is just one example of the unexpected problems that can come up:

http://blog.serverdensity.com/queueing-mongodb-using-mongodb/

In my own experience, I've seen MySQL handle 400+ million rows without much of a read/write performance hit for one data structure, but I've also seen it suffer terribly on both read and write performance at 900+ million rows for a different data structure, on the same hardware. As always, you need to identify what your data is and how you intend to use it before you can decide how it needs to be stored and how it will be extracted.

Hope this helps - I should also note that I am by no means an expert on the subject, I'm just speaking from my own production experience :)

Cal

On Wed, Aug 8, 2012 at 11:41 AM, Ivo Marcelo Leonardi Zaniolo <[email protected]> wrote:

> Some years ago I had a similar problem using PostgreSQL. To improve the
> bulk performance I chose to disable all keys and indexes while
> bulk-loading the data. On Postgres that solved my problem, but if you
> need more performance, and have more than one machine, I recommend that
> you take a look at Cassandra; it has great insert performance and can
> hold data at that scale.
>
> Take a look at this post - I wrote it in Portuguese, but Google can
> translate it:
>
> http://imarcelolz.blogspot.com/2011/11/cassandra-hadoopy-performance-tunning.html
>
> Ivo Marcelo Leonardi Zaniolo
> +55 71 9302 3400
> [email protected]
> www.informatizzare.com.br
> imarcelolz.blogspot.com.br
>
> On 08/08/2012 01:18, "Cal Leeming [Simplicity Media Ltd]" <[email protected]> wrote:
>
>> Hi Anton,
>>
>> In short, attempting any sort of bulk import "out of the box" with the
>> ORM will always end in bad performance.
>>
>> No matter how fast your disks are (SSDs with 4000 IOPS in RAID 1, for
>> example), you'll still only get around 0.1s per insert via the ORM on a
>> single thread, and if you multi-thread the import on a MySQL backend
>> then you'll end up hitting the notorious high-throughput race condition
>> (https://code.djangoproject.com/ticket/18557).
>>
>> The method used to speed up your bulk imports depends entirely on the
>> actual data itself.
>>
>> Depending on whether or not you absolutely need all the functionality
>> of the ORM during import (signals, hooks etc.), you could also look at
>> using the bulk_create() that comes with Django 1.4.
>>
>> I actually did a webcast about a year ago on how to do bulk updates to
>> achieve massively increased throughput (from 30 rows/sec to 8000
>> rows/sec), and this method was eventually integrated into the DSE
>> plugin (http://pypi.python.org/pypi/dse/3.3.0 - look for bulk_update).
>>
>> As your table and indexes grow, both read and write performance will
>> slowly start to get worse once you pass the 1 million row point - so
>> make sure to put some thought into the initial model structure (using
>> ints to define choices where possible, no indexes on varchar/char if
>> possible, making good use of composite/compound indexes, etc.).
>>
>> Failing that, if you want to drag every single microsecond of
>> performance out of this, you could perform the bulk imports as raw SQL.
>>
>> Hope this helps!
>>
>> Cal
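For anyone finding this thread later, a minimal sketch of the bulk_create() approach Cal mentions above. The Item model, its fields, and the input format are made up for illustration; only the bulk_create() call itself is standard Django 1.4+ API:

    from django.db import models

    class Item(models.Model):  # hypothetical model, for illustration only
        name = models.CharField(max_length=64)
        value = models.IntegerField()

    def bulk_import(rows, chunk_size=1000):
        # rows: iterable of (name, value) tuples.
        # Build unsaved instances in memory, then let bulk_create() insert
        # each chunk with a single multi-row INSERT instead of one INSERT
        # per object.
        buf = []
        for name, value in rows:
            buf.append(Item(name=name, value=value))
            if len(buf) >= chunk_size:
                Item.objects.bulk_create(buf)
                buf = []
        if buf:
            Item.objects.bulk_create(buf)

Note that bulk_create() skips the save() signals and (on MySQL) doesn't set primary keys on the created objects, so it only suits imports that don't need those hooks.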
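And a sketch of the model structure advice above (ints for choices, no indexes on varchar/char unless you really need them) - the model and field names here are hypothetical:

    from django.db import models

    class Hit(models.Model):  # hypothetical model, for illustration only
        STATUS_NEW, STATUS_PROCESSED = 0, 1
        STATUS_CHOICES = (
            (STATUS_NEW, "new"),
            (STATUS_PROCESSED, "processed"),
        )

        # A small int with choices instead of a varchar column: smaller
        # rows and smaller, faster indexes.
        status = models.PositiveSmallIntegerField(
            choices=STATUS_CHOICES, default=STATUS_NEW, db_index=True)
        created = models.DateTimeField(db_index=True)
        # Deliberately NOT indexed - indexes on varchar/char columns are
        # expensive to maintain during bulk writes.
        source = models.CharField(max_length=64)

        # A composite (status, created) index means raw SQL on Django 1.4;
        # Meta.index_together arrived in Django 1.5.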
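For the raw SQL route, the usual trick is batching many rows into a single multi-VALUES INSERT. A rough sketch, where the table and column names are assumptions:

    from django.db import connection, transaction

    def raw_bulk_insert(rows, chunk_size=1000):
        # rows: iterable of (name, value) tuples.
        cursor = connection.cursor()
        rows = list(rows)
        for i in range(0, len(rows), chunk_size):
            chunk = rows[i:i + chunk_size]
            # One INSERT with many VALUES groups instead of many INSERTs.
            placeholders = ", ".join(["(%s, %s)"] * len(chunk))
            params = [v for row in chunk for v in row]
            cursor.execute(
                "INSERT INTO myapp_item (name, value) VALUES " + placeholders,
                params)
        # Django <= 1.5 API; on 1.6+ wrap the loop in transaction.atomic()
        # instead.
        transaction.commit_unless_managed()

The values still go through the DB-API placeholders, so this keeps escaping safe while cutting the per-statement overhead.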
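Ivo's disable-the-indexes trick, translated to MySQL, looks roughly like this. Note that ALTER TABLE ... DISABLE KEYS only affects non-unique indexes on MyISAM tables (InnoDB ignores it); on Postgres the equivalent is dropping and recreating the indexes. Table name and the raw_bulk_insert() helper are from the sketches above:

    from django.db import connection

    def load_with_keys_disabled(rows):
        cursor = connection.cursor()
        # Stop MySQL maintaining non-unique indexes during the load
        # (MyISAM only; InnoDB silently ignores this).
        cursor.execute("ALTER TABLE myapp_item DISABLE KEYS")
        try:
            raw_bulk_insert(rows)
        finally:
            # Rebuild the indexes in one pass at the end.
            cursor.execute("ALTER TABLE myapp_item ENABLE KEYS")

Measure before and after - on InnoDB, sorting the input by primary key and wrapping the load in a single transaction tends to help more.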
>> On Mon, Aug 6, 2012 at 4:14 PM, creecode <[email protected]> wrote:
>>
>>> Hello Anton,
>>>
>>> On Friday, August 3, 2012 9:05:47 AM UTC-7, Anton wrote:
>>>
>>>> My problem: I will populate it with 50000 items.
>>>>
>>>> Does Django really have such bad performance for data insertion?
>>>>
>>>> I can't believe it, so ... can somebody give me a hint?
>>>> Is there a doc page dedicated to speed/performance issues?
>>>>
>>>> Otherwise I'll have to look for another system.
>>>
>>> Before you go looking for another system, you may want to do some
>>> research on inserting lots of rows in Django. Cal Leeming on this list
>>> has had some experience dealing with lots of rows, and there are
>>> several threads on the topic. Check the Google interface for this
>>> group and use the search feature. Also try Googling for inserting lots
>>> of data with Django; several folks have written about their
>>> experiences inserting lots of rows.
>>>
>>> Toodle-loooooooooooooo...................
>>> creecode

--
Cal Leeming
Technical Support | Simplicity Media Ltd
US 310-362-7070 | UK 02476 100401 | Direct 02476 100402
Available 24 hours a day, 7 days a week.

