I agree with you, Cal. Migrating to another DB engine would not, by itself,
solve the problem.
I only mentioned it because it opens up other good opportunities, but that
doesn't mean it would be easy or cheap.

Ivo Marcelo Leonardi Zaniolo
+55 71 9302 3400
[email protected]
www.informatizzare.com.br
imarcelolz.blogspot.com.br
On 08/08/2012 09:26, "Cal Leeming [Simplicity Media Ltd]" <
[email protected]> wrote:

> Just to chime in again here: there are many alternatives, but don't give
> up on MySQL straight away.
>
> Solutions such as Cassandra, MongoDB, Hadoop clustering, etc. are also
> lovely, but they all come at a cost (whether that be additional risk,
> development time, performance/feature trade-offs, the "unknown", etc.).
>
> Here is an interesting article by David Mytton over at BoxedIce, which is
> just one example of the unexpected problems that can come up:
>
> http://blog.serverdensity.com/queueing-mongodb-using-mongodb/
>
> In my own experience, I've seen MySQL handle 400+ million rows without
> much of a read/write performance hit for one data structure, but I've
> also seen it suffer terribly on both read and write performance at 900+
> million rows for a different data structure, on the same hardware.
>
> As always, you need to identify what your data is and how you intend to
> use it before you can decide both how it needs to be stored and how it
> will be extracted.
>
> Hope this helps - I should also note that I am by no means an expert on
> the subject; I'm just speaking from my own production experience :)
>
> Cal
>
>
> On Wed, Aug 8, 2012 at 11:41 AM, Ivo Marcelo Leonardi Zaniolo <
> [email protected]> wrote:
>
>> Some years ago I hit a similar problem using PostgreSQL. To improve the
>> bulk performance I chose to disable all keys and indexes for the bulk
>> load and re-enable them afterwards.
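>>
>> On Postgres that usually means dropping the secondary indexes and
>> recreating them once the load has finished; a rough sketch through
>> Django's raw cursor (the table and index names are just made-up
>> examples):
>>
>>     from django.db import connection
>>
>>     cursor = connection.cursor()
>>     # Drop the secondary index so Postgres doesn't maintain it per row.
>>     cursor.execute("DROP INDEX IF EXISTS myapp_item_name_idx")
>>
>>     # ... run the bulk INSERTs (or COPY) here ...
>>
>>     # Rebuild the index once, after all the rows are in.
>>     cursor.execute(
>>         "CREATE INDEX myapp_item_name_idx ON myapp_item (name)")
>>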
>> That solved my problem, but if you need more performance and have more
>> than one machine, I recommend that you take a look at Cassandra; it has
>> great insert performance and can be used to hold that much data.
>>
>> Take a look at this post; I wrote it in Portuguese, but Google can
>> translate it:
>>
>>
>> http://imarcelolz.blogspot.com/2011/11/cassandra-hadoopy-performance-tunning.html
>>
>> Ivo Marcelo Leonardi Zaniolo
>> +55 71 9302 3400
>> [email protected]
>> www.informatizzare.com.br
>> imarcelolz.blogspot.com.br
>> On 08/08/2012 01:18, "Cal Leeming [Simplicity Media Ltd]" <
>> [email protected]> wrote:
>>
>> Hi Anton,
>>>
>>> In short, attempting to do any sort of bulk import "out of the box" with
>>> the ORM will always end in bad performance.
>>>
>>> No matter how fast your disks are (SSDs with 4000 IOPS in RAID 1, for
>>> example), you'll still only get around 0.1s per insert via the ORM on a
>>> single thread, and if you multi-thread the import on a MySQL backend then
>>> you'll end up hitting the notorious high-throughput race condition
>>> (https://code.djangoproject.com/ticket/18557).
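>>>
>>> To make that concrete, the pattern that hits the per-insert ceiling is
>>> the obvious single-row loop (the model name here is invented for
>>> illustration):
>>>
>>>     # One INSERT and one round trip to the database per object.
>>>     for i in xrange(50000):
>>>         Item.objects.create(name="item %d" % i)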
>>>
>>> The method used to speed up your bulk imports depends entirely on the
>>> actual data itself.
>>>
>>> If you don't absolutely need all the functionality of the ORM during the
>>> import (signals, hooks, etc.), then you could also look at using
>>> bulk_create(), which comes with Django 1.4.
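>>>
>>> For example (again with a made-up model; note that bulk_create() skips
>>> save() and the pre/post save signals):
>>>
>>>     # Inserts the whole batch in as few statements as the backend allows.
>>>     Item.objects.bulk_create(
>>>         [Item(name="item %d" % i) for i in xrange(50000)]
>>>     )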
>>>
>>> I actually did a webcast about a year ago on how to do bulk updates to
>>> achieve massively increased throughput (from 30 rows/sec to 8000 rows/sec),
>>> and this method was eventually integrated into the DSE plugin
>>> (http://pypi.python.org/pypi/dse/3.3.0 ; look for bulk_update).
>>>
>>> As your table and indexes grow, both read and write performance will
>>> slowly start to get worse once you pass the 1 million row mark, so make
>>> sure to put some thought into the initial model structure (using ints to
>>> define choices where possible, avoiding indexes on varchar/char columns
>>> where possible, making good use of composite/compound indexes, etc.).
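>>>
>>> A minimal sketch of that kind of structure (model and field names are
>>> only illustrative):
>>>
>>>     from django.db import models
>>>
>>>     class LogEntry(models.Model):
>>>         STATUS_OK, STATUS_FAILED = 0, 1
>>>         STATUS_CHOICES = ((STATUS_OK, "ok"), (STATUS_FAILED, "failed"))
>>>
>>>         # A small integer instead of an indexed varchar for the choice.
>>>         status = models.PositiveSmallIntegerField(choices=STATUS_CHOICES)
>>>         # Free-text column deliberately left unindexed.
>>>         message = models.CharField(max_length=255)
>>>         created = models.DateTimeField(db_index=True)
>>>
>>>     # Compound indexes still need raw SQL on 1.4; index_together only
>>>     # arrives in a later release.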
>>>
>>> Failing that, if you want to drag every single microsecond of
>>> performance out of this, then you could perform the bulk imports as raw
>>> SQL.
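>>>
>>> Something along these lines, reusing the hypothetical table above (under
>>> 1.4's default transaction handling you may also need to commit manually
>>> afterwards):
>>>
>>>     from django.db import connection
>>>
>>>     rows = [(0, "imported row %d" % i) for i in xrange(50000)]
>>>     cursor = connection.cursor()
>>>     # Same parameterised INSERT for every row, skipping the ORM entirely.
>>>     cursor.executemany(
>>>         "INSERT INTO myapp_logentry (status, message, created) "
>>>         "VALUES (%s, %s, NOW())",
>>>         rows,
>>>     )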
>>>
>>> Hope this helps!
>>>
>>> Cal
>>>
>>> On Mon, Aug 6, 2012 at 4:14 PM, creecode <[email protected]> wrote:
>>>
>>>> Hello Anton,
>>>>
>>>> On Friday, August 3, 2012 9:05:47 AM UTC-7, Anton wrote:
>>>>
>>>>
>>>>> My problem: I will populate it with 50000 items.
>>>>>
>>>>> Does Django really have such bad performance for data insertion?
>>>>>
>>>>> I can't believe it, so ... can somebody give me a hint?
>>>>> Is there a doc page dedicated to speed/performance issues?
>>>>>
>>>>> Otherwise I have to look for another system.
>>>>>
>>>>
>>>> Before you go looking for another system, you may want to do some
>>>> research on inserting lots of rows in Django.  Cal Leeming on this
>>>> list has had some experience dealing with lots of rows, and there are
>>>> several threads on the topic.  Check the Google interface for this group
>>>> and use the search feature.  Also try Googling for inserting lots of data
>>>> with Django; several folks have written about their experiences inserting
>>>> lots of rows.
>>>>
>>>> Toodle-loooooooooooooo...................
>>>> creecode
>>>>
>
>
>
> --
>
> Cal Leeming
> Technical Support | Simplicity Media Ltd
> US 310-362-7070 | UK 02476 100401 | Direct 02476 100402
>
> Available 24 hours a day, 7 days a week.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.
