I agree with you, Cal. Migrating to another DB engine would not solve the problem by itself. I only mentioned it because there are other good opportunities there, but that doesn't mean it would be easy or cheap.
Ivo Marcelo Leonardi Zaniolo
+55 71 9302 3400
[email protected]
www.informatizzare.com.br
imarcelolz.blogspot.com.br

On 08/08/2012 09:26, "Cal Leeming [Simplicity Media Ltd]" <
[email protected]> wrote:

> Just to chime in again here, there are many alternatives, but don't give
> up on MySQL straight away.
>
> Solutions such as Cassandra, MongoDB, Hadoop clustering etc. are also
> lovely, but they all come at a cost (whether that be additional risk,
> development time, performance/feature trade-offs, the "unknown", etc.).
>
> Here is an interesting article by David Mytton over at BoxedIce, which is
> just one example of the unexpected problems that can come up:
>
> http://blog.serverdensity.com/queueing-mongodb-using-mongodb/
>
> In my own experience, I've seen MySQL handle 400+ million rows without
> much read/write performance hit for one data structure, but I've also
> seen it suffer terribly at both read and write performance at 900+
> million rows for a different data structure, on the same hardware.
>
> As always, you need to identify what your data is and how you intend to
> use it before you can decide how it needs to be stored and how it will
> be extracted.
>
> Hope this helps - I should also note that I am by no means an expert on
> the subject, I'm just speaking from my own production experience :)
>
> Cal
>
>
> On Wed, Aug 8, 2012 at 11:41 AM, Ivo Marcelo Leonardi Zaniolo <
> [email protected]> wrote:
>
>> Some years ago I had a similar problem using PostgreSQL. To improve the
>> bulk performance, I chose to disable all keys and indexes during the
>> bulk load and rebuild them afterwards.
>> On Postgres that solved my problem, but if you need more performance
>> and have more than one machine, I recommend you take a look at
>> Cassandra; it has great insert performance and can hold data at that
>> scale.
>>
>> Take a look at this post. I wrote it in Portuguese, but Google can
>> translate it:
>>
>> http://imarcelolz.blogspot.com/2011/11/cassandra-hadoopy-performance-tunning.html
>>
>> Ivo Marcelo Leonardi Zaniolo
>> +55 71 9302 3400
>> [email protected]
>> www.informatizzare.com.br
>> imarcelolz.blogspot.com.br
>>
>> On 08/08/2012 01:18, "Cal Leeming [Simplicity Media Ltd]" <
>> [email protected]> wrote:
>>
>>> Hi Anton,
>>>
>>> In short, attempting to do any sort of bulk import "out of the box"
>>> with the ORM will always end in poor performance.
>>>
>>> No matter how fast your disks are (SSDs with 4000 IOPS in RAID 1, for
>>> example), you'll still only get around 0.1s per insert via the ORM on
>>> a single thread, and if you multi-thread the import on a MySQL backend
>>> you'll end up hitting the notorious high-throughput race condition
>>> (https://code.djangoproject.com/ticket/18557).
>>>
>>> The method used to speed up your bulk imports depends entirely on the
>>> actual data itself.
>>>
>>> If you don't absolutely need all the functionality of the ORM during
>>> the import (signals, hooks, etc.), you could also look at using
>>> bulk_create(), which comes with Django 1.4.
>>>
>>> I actually did a webcast about a year ago on how to do bulk updates to
>>> achieve massively increased throughput (from 30 rows/sec to 8000
>>> rows/sec) - and this method was eventually integrated into the DSE
>>> plugin (http://pypi.python.org/pypi/dse/3.3.0 - look for bulk_update).
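(A minimal sketch of the batched bulk_create() approach described above;
the Reading model, its fields and the 5000-row batch size are illustrative
assumptions, not something from the thread:)

    # A sketch of batched inserts with bulk_create() (Django 1.4+).
    # "Reading" is a hypothetical model with sensor_id/value fields;
    # swap in your own model and fields.
    from myapp.models import Reading

    def bulk_import(rows, batch_size=5000):
        """Insert an iterable of (sensor_id, value) tuples in batches.

        bulk_create() bypasses save(), the pre_save/post_save signals and
        any custom save logic, so only use it when those hooks aren't needed.
        """
        batch = []
        for sensor_id, value in rows:
            batch.append(Reading(sensor_id=sensor_id, value=value))
            if len(batch) >= batch_size:
                Reading.objects.bulk_create(batch)
                batch = []
        if batch:  # flush the remainder
            Reading.objects.bulk_create(batch)

Batching manually keeps memory bounded and works on Django 1.4, which
predates bulk_create()'s built-in batch_size argument.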
>>> As your table and indexes grow, both read and write performance will
>>> slowly start to get worse once you pass the 1 million row mark - so
>>> make sure to put some thought into the initial model structure (use
>>> ints to define choices where possible, avoid indexes on varchar/char
>>> columns if possible, make good use of composite/compound indexes,
>>> etc.).
>>>
>>> Failing that, if you want to drag every single microsecond of
>>> performance out of this, then you could perform the bulk imports as
>>> raw SQL.
>>>
>>> Hope this helps!
>>>
>>> Cal
>>>
>>> On Mon, Aug 6, 2012 at 4:14 PM, creecode <[email protected]> wrote:
>>>
>>>> Hello Anton,
>>>>
>>>> On Friday, August 3, 2012 9:05:47 AM UTC-7, Anton wrote:
>>>>
>>>>> My problem: I will populate it with 50000 items.
>>>>>
>>>>> Does Django really have such bad performance for data insertion?
>>>>>
>>>>> I can't believe it, so ... can somebody give me a hint?
>>>>> Is there a doc page dedicated to speed/performance issues?
>>>>>
>>>>> Otherwise I have to look for another system.
>>>>
>>>> Before you go looking for another system, you may want to do some
>>>> research on inserting lots of rows in Django. Cal Leeming on this
>>>> list has had some experience dealing with lots of rows, and there are
>>>> several threads on the topic. Check the Google interface for this
>>>> group and use the search feature. Also try Googling for inserting
>>>> lots of data with Django; several folks have written about their
>>>> experiences inserting lots of rows.
>>>>
>>>> Toodle-loooooooooooooo...................
>>>> creecode
>
> --
> Cal Leeming
> Technical Support | Simplicity Media Ltd
> US 310-362-7070 | UK 02476 100401 | Direct 02476 100402
>
> Available 24 hours a day, 7 days a week.
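(A rough sketch of the raw SQL route Cal mentions, using Django's DB
connection and executemany(); the table and column names are made up for
illustration, and whether it beats bulk_create() depends on your data and
backend:)

    # A sketch of a raw SQL bulk insert through Django's DB connection.
    # "myapp_reading", "sensor_id" and "value" are illustrative stand-ins
    # for your own table and columns.
    from django.db import connection, transaction

    def raw_bulk_insert(rows, batch_size=5000):
        """Insert an iterable of (sensor_id, value) tuples via executemany().

        This bypasses the ORM completely (no validation, signals or hooks),
        which is the trade-off for the extra throughput.
        """
        sql = "INSERT INTO myapp_reading (sensor_id, value) VALUES (%s, %s)"
        cursor = connection.cursor()
        batch = []
        for row in rows:
            batch.append(row)
            if len(batch) >= batch_size:
                cursor.executemany(sql, batch)
                batch = []
        if batch:
            cursor.executemany(sql, batch)
        # Django 1.4/1.5-era commit for raw cursors; newer versions would
        # wrap the whole function in transaction.atomic() instead.
        transaction.commit_unless_managed()

Each row costs one parameterized statement and there is a single commit at
the end instead of a full ORM save() per row; dropping or disabling
secondary indexes before the load and rebuilding them afterwards (as Ivo
describes above) can squeeze out more throughput, at some risk.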
--
You received this message because you are subscribed to the Google Groups
"Django users" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at
http://groups.google.com/group/django-users?hl=en.

