Re: Re: Massive import in Django database
Hi, John:

Sorry! The pseudo code I wrote is not correct, and it's slow. I will come back tonight.

With Regards,
Qiancong, Mo

From: moqianc...@gmail.com
Date: 2014-06-11 23:47
To: django-users
Subject: Re: Massive import in Django database

Hi, John:

I think your code is right, except that "Doc.object" should be "Doc.objects". The following pseudo code may be faster than what you wrote:

    doc_map = {}
    for each xml:
        extract from the xml data -> mydoc_code, mydoc_text, myRelated_doc_codes
        doc = Doc.objects.create(doc_code=mydoc_code, doc_text=mydoc_text)
        doc_map[mydoc_code] = (doc, myRelated_doc_codes)

    for (doc, rcodes) in doc_map.values():
        for rcode in rcodes:
            doc.related_doc.add(doc_map[rcode])
        doc.save()

I have checked it, and it's okay. The objects are cached in doc_map, so there is no need to re-query the database for the related docs, and the import should speed up.

With Regards.
moqianc...@gmail.com

From: John Carlo
Date: 2014-06-11 21:14
To: django-users
Subject: Massive import in Django database

Hello everybody,

I fell in love with Django two years ago and I've been using it for my job projects. In the past I found very useful information in this group, so a big thank you guys!

I have a little doubt. I have to import about 1.000.000 xml documents into a Django db (sqlite for local development, MySQL on the server). The model class is the following:

    class Doc(models.Model):
        doc_code = models.CharField(max_length=20, unique=True, primary_key=True, db_index=True)
        doc_text = models.TextField(null=True, blank=True)
        related_doc = models.ManyToManyField('self', null=True, blank=True, db_index=True)

From what I know, bulk insertion is not possible because I have a ManyToManyField relation. So I have this simple loop (in pseudo code):

    for each xml:
        extract from the xml data -> mydoc_code, mydoc_text, myRelated_doc_codes
        myDoc = Doc.object.get_or_create(doc_code=mydoc_code)[0]
        myDoc.doc_text = mydoc_text
        for reldoc_code in myRelated_doc_codes:
            myRelDoc = Doc.object.get_or_create(doc_code=reldoc_code)[0]
            myDoc.related_doc.add(myRelDoc)
        myDoc.save()

Am I doing it right? Do you have any suggestions or recommendations? I fear that since I have 1.000.000 docs to import, it will take a lot of time, especially during the get_or_create routines.

Thank you in advance everybody!

John

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/5b88deaf-d806-4a64-9e8d-528d95599c80%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Re: Massive import in Django database
Hi, John:

I think your code is right, except that "Doc.object" should be "Doc.objects". The following pseudo code may be faster than what you wrote:

    doc_map = {}
    for each xml:
        extract from the xml data -> mydoc_code, mydoc_text, myRelated_doc_codes
        doc = Doc.objects.create(doc_code=mydoc_code, doc_text=mydoc_text)
        doc_map[mydoc_code] = (doc, myRelated_doc_codes)

    for (doc, rcodes) in doc_map.values():
        for rcode in rcodes:
            doc.related_doc.add(doc_map[rcode])
        doc.save()

I have checked it, and it's okay. The objects are cached in doc_map, so there is no need to re-query the database for the related docs, and the import should speed up.

With Regards.
moqianc...@gmail.com

From: John Carlo
Date: 2014-06-11 21:14
To: django-users
Subject: Massive import in Django database

Hello everybody,

I fell in love with Django two years ago and I've been using it for my job projects. In the past I found very useful information in this group, so a big thank you guys!

I have a little doubt. I have to import about 1.000.000 xml documents into a Django db (sqlite for local development, MySQL on the server). The model class is the following:

    class Doc(models.Model):
        doc_code = models.CharField(max_length=20, unique=True, primary_key=True, db_index=True)
        doc_text = models.TextField(null=True, blank=True)
        related_doc = models.ManyToManyField('self', null=True, blank=True, db_index=True)

From what I know, bulk insertion is not possible because I have a ManyToManyField relation. So I have this simple loop (in pseudo code):

    for each xml:
        extract from the xml data -> mydoc_code, mydoc_text, myRelated_doc_codes
        myDoc = Doc.object.get_or_create(doc_code=mydoc_code)[0]
        myDoc.doc_text = mydoc_text
        for reldoc_code in myRelated_doc_codes:
            myRelDoc = Doc.object.get_or_create(doc_code=reldoc_code)[0]
            myDoc.related_doc.add(myRelDoc)
        myDoc.save()

Am I doing it right? Do you have any suggestions or recommendations? I fear that since I have 1.000.000 docs to import, it will take a lot of time, especially during the get_or_create routines.

Thank you in advance everybody!

John
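A note on the pseudo code in this thread: doc_map stores (doc, related_codes) tuples, so doc_map[rcode] is a tuple rather than a Doc, and a related code that was never parsed would raise KeyError. Below is a minimal sketch of one possible two-pass alternative, assuming the XML has already been parsed into (doc_code, doc_text, related_codes) tuples. The helper names plan_import and chunks are hypothetical. The Django calls in the trailing comments (bulk_create on Doc and on the auto-created Doc.related_doc.through model, with from_doc_id / to_doc_id columns) are assumptions to check against the Django version in use, and they are not executed here.

```python
def plan_import(parsed_docs):
    """Split parsed records into doc rows and relation pairs.

    parsed_docs: iterable of (doc_code, doc_text, related_codes) tuples.
    Returns (docs, links): docs maps doc_code -> doc_text (None for codes
    that are only referenced, never parsed); links is a sorted list of
    (from_code, to_code) pairs without duplicates.
    """
    docs = {}
    links = set()
    for code, text, related in parsed_docs:
        docs[code] = text  # overwrites a placeholder created earlier
        for rcode in related:
            # The related doc may not have been parsed (yet): keep a placeholder.
            docs.setdefault(rcode, None)
            links.add((code, rcode))
    return docs, sorted(links)


def chunks(rows, size):
    """Yield successive batches so each bulk insert stays a manageable size."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Hypothetical Django usage (assumed API, not run here):
#   docs, links = plan_import(parsed_docs)
#   for batch in chunks(list(docs.items()), 1000):
#       Doc.objects.bulk_create([Doc(doc_code=c, doc_text=t) for c, t in batch])
#   Link = Doc.related_doc.through
#   for batch in chunks(links, 1000):
#       Link.objects.bulk_create(
#           [Link(from_doc_id=a, to_doc_id=b) for a, b in batch])
```

Because related_doc is a symmetrical relation to 'self', related_doc.add() stores both directions; rows inserted directly into the through table would presumably need both (a, b) and (b, a) if that behavior matters.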
Re: Re: Django Python roop
Sometimes I have to use the while statement too. But at all times, I think writing correct code and clean code is a basic rule of our development work.

moqianc...@gmail.com

From: hito koto
Date: 2014-06-11 21:05
To: django-users
Subject: Re: Django Python roop

Hi, qiancong:

Thank you. Do you not use the while statement in Django?

On Wednesday, June 11, 2014 at 21:52:54 UTC+9, Qiancong wrote:

Hi, hito koto:

I think the problems you asked about should be posted to the python-lang mailing list. For Python programs, I prefer "for" over "while"; it's simpler. But if you like while, I think the following code may be helpful:

    def fff(x):
        y = []
        i = 0
        xlen = len(x)
        while i < xlen:
            y.append(x[i])
            i += 1
        return y

But I prefer writing it like this:

    def fff(x):
        return x[:]

or you can just do the following:

    y = x[:]

Python is a powerful & beautiful language; please read https://docs.python.org/2/tutorial/ first, like François said.

With Regards.
moqia...@gmail.com

From: hito koto
Date: 2014-06-11 19:21
To: django-users
Subject: Django Python roop

Hello, all

I want to change a for statement into a while statement. How can I do that? This is my correct code with the for statement:

    def fff(x):
        y = []
        for i in range(len(x)):
            y.append(x[i])
        return y

and I want to change it to a while statement. This code has the error "TypeError: list indices must be integers, not list":

    def fff(x):
        y = []
        while x != []:
            for i in x:
                y.append(x[i])
        return y
Re: Re: Django Python roop
In your function, if x is not an empty list, the while loop is infinite, because x never becomes []. The inner for-loop then keeps appending items to the list y until your computer raises MemoryError to tell you there is not enough memory.

moqianc...@gmail.com

From: hito koto
Date: 2014-06-11 20:49
To: django-users
Subject: Re: Django Python roop

MemoryError. Why? I don't know. When I try it I get these errors:

    Traceback (most recent call last):
      File "", line 1, in
      File "", line 5, in foo
    MemoryError

I changed to this code, and it has the MemoryError:

    def foo(x):
        y = []
        while x != []:
            for i in range(len(x)):
                y.append(x[i])
        return y

On Wednesday, June 11, 2014 at 20:21:42 UTC+9, hito koto wrote:

Hello, all

I want to change a for statement into a while statement. How can I do that? This is my correct code with the for statement:

    def fff(x):
        y = []
        for i in range(len(x)):
            y.append(x[i])
        return y

and I want to change it to a while statement. This code has the error "TypeError: list indices must be integers, not list":

    def fff(x):
        y = []
        while x != []:
            for i in x:
                y.append(x[i])
        return y
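The explanation in this message can be sketched in code: a "while x != []" loop only ends if something inside the loop makes the list shorter. A minimal sketch, assuming it is acceptable to work on a private copy of the input; the name copy_by_draining is made up for illustration and does not come from the thread.

```python
def copy_by_draining(x):
    """Copy a list with `while remaining != []`, shrinking the list each pass."""
    remaining = list(x)  # private copy, so the caller's list is not emptied
    y = []
    while remaining != []:
        # pop(0) removes the first item, so the loop condition eventually fails.
        y.append(remaining.pop(0))
    return y


original = [10, 12, 7]
copied = copy_by_draining(original)
```

pop(0) shifts the remaining items on every call, so the index-based while loop from the earlier reply is likely the better choice for long lists; this variant only shows why the loop condition has to change.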
Re: Django Python roop
Hi, hito koto:

I think the problems you asked about should be posted to the python-lang mailing list. For Python programs, I prefer "for" over "while"; it's simpler. But if you like while, I think the following code may be helpful:

    def fff(x):
        y = []
        i = 0
        xlen = len(x)
        while i < xlen:
            y.append(x[i])
            i += 1
        return y

But I prefer writing it like this:

    def fff(x):
        return x[:]

or you can just do the following:

    y = x[:]

Python is a powerful & beautiful language; please read https://docs.python.org/2/tutorial/ first, like François said.

With Regards.
moqianc...@gmail.com

From: hito koto
Date: 2014-06-11 19:21
To: django-users
Subject: Django Python roop

Hello, all

I want to change a for statement into a while statement. How can I do that? This is my correct code with the for statement:

    def fff(x):
        y = []
        for i in range(len(x)):
            y.append(x[i])
        return y

and I want to change it to a while statement. This code has the error "TypeError: list indices must be integers, not list":

    def fff(x):
        y = []
        while x != []:
            for i in x:
                y.append(x[i])
        return y
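The TypeError in the original question comes from "for i in x", which yields the elements of x rather than their positions; when an element is itself a list, x[i] tries to index with a list. A small sketch of the difference follows. The function name copy_by_element and the sample data are hypothetical, not from the thread.

```python
def copy_by_element(x):
    """Iterate over the elements directly, so no indexing is needed."""
    y = []
    for item in x:
        y.append(item)
    return y


nested = [[1, 2], [3]]

# Reproduce the mistake from the question on a nested list.
error_message = None
try:
    for i in nested:
        nested[i]  # i is a list here, not an integer position
except TypeError as exc:
    error_message = str(exc)
```

On a flat list of integers the same mistake can instead raise IndexError or silently pick the wrong items, since each value is then used as a position.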
Re: Re: Django python fuction
Yeah, I think hito koto needs a function like copy.deepcopy. I think he already knew about copy.deepcopy (as his example shows), but did not know how to write a function that works as copy.deepcopy does for lists.

moqianc...@gmail.com

From: François Schiettecatte
Date: 2014-06-11 00:21
To: django-users
Subject: Re: Django python fuction

Wouldn't the deep copy module handle this for you:

https://docs.python.org/2/library/copy.html

François

On Jun 10, 2014, at 12:17 PM, hito koto <hitokoto2...@gmail.com> wrote:

> Hi, Qiancong:
>
> I was looking for exactly this,
> Thank you very much!
>
> On Tuesday, June 10, 2014 at 23:35:52 UTC+9, Qiancong wrote:
>
> Hi, hito koto:
> I think you want to deep copy the list. The following code may be helpful:
>
>     def dcopy(obj):
>         if not isinstance(obj, list):
>             return obj
>         return [dcopy(x) for x in obj]
>
> But this function only deep copies lists. You can extend the code in the
> same way for dict, set, tuple, etc.
> moqia...@gmail.com
>
> From: François Schiettecatte
> Date: 2014-06-10 22:07
> To: django-users
> Subject: Re: Django python fuction
>
> You need to use .append() to add elements to a list, I suggest you take a
> look at the python tutorial:
>
> https://docs.python.org/2/tutorial/
>
> Not quite sure what you are doing with i or elem either.
>
> François
>
> On Jun 10, 2014, at 10:01 AM, hito koto <hitoko...@gmail.com> wrote:
>
> > So, I also have errors:
> >
> > >>> def foo(x):
> > ...     myList = []
> > ...     if isinstance(x, list):
> > ...         for i in x:
> > ...             elem = i
> > ...             return myList + foo(elem)
> > ...     else:
> > ...         return 1
> > ...
> > >>>
> > >>> foo(a)
> > Traceback (most recent call last):
> >   File "", line 1, in
> >   File "", line 6, in foo
> >   File "", line 6, in foo
> > TypeError: can only concatenate list (not "int") to list
> >
> > On Tuesday, June 10, 2014 at 22:53:28 UTC+9, François Schiettecatte wrote:
> > You are redefining 'list' on line 2, rename list to myList as follows:
> >
> > def foo(x):
> >     myList = []
> >     if isinstance(x, list):
> >         for i in x:
> >             elem = i
> >             return myList + foo(i)
> >     else:
> >         return 1
> >
> > And take this to a python list, this is for django.
> >
> > Cheers
> >
> > François
> >
> > On Jun 10, 2014, at 9:43 AM, hito koto <hitoko...@gmail.com> wrote:
> >
> > > hi,
> > >
> > > I have this error:
> > >
> > > >>> def foo(x):
> > > ...     list = []
> > > ...     if isinstance(x, list):
> > > ...         for i in x:
> > > ...             elem = i
> > > ...             return list + foo(i)
> > > ...     else:
> > > ...         return 1
> > > ...
> > > >>> foo(a)
> > > Traceback (most recent call last):
> > >   File "", line 1, in
> > >   File "", line 3, in foo
> > > TypeError: isinstance() arg 2 must be a class, type, or tuple of classes
> > > and types
> > >
> > > On Tuesday, June 10, 2014 at 22:27:42 UTC+9, Andrew Farrell wrote:
> > > In general, I recommend adding the line "import pdb;pdb.set_trace()"
> > > to the top of your function and walking through it to see why it doesn't
> > > work.
> > >
> > > def foo(x):
> > >     import pdb;pdb.set_trace()
> > >
> > >     list = []
> > >     if isinstance(x, list):
> > >         for i in x:
> > >             elem = i
> > >             return list + foo(i)
> > >     else:
> > >         return 1
> > >
> > > See https://docs.python.org/2/library/pdb.html on how pdb works.
> > > There are faster ways to debug, but when starting out, pdb lets you see
> > > what is happening as you run the function.
> > >
> > > Some questions I have about this function:
> > > - What is the purpose of the "elem" variable? It is never accessed.
> > > - What is the purpose of returning 1 if the argument is not a list?
> > > - Why is it named "foo" rather than something that tells me what the
> > > purpose of the function is?
> > >
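Putting the hints from this thread together (use .append(), do not return from inside the loop, and return the element itself rather than 1), one possible repaired version of foo looks like the sketch below. The name copy_nested and the variable names are not from the thread; this is only an illustration for nested lists.

```python
def copy_nested(x):
    """Recursively copy a list that may contain other lists."""
    if not isinstance(x, list):
        # Not a list: return the value itself instead of the constant 1.
        return x
    my_list = []  # avoid the name `list`, which would shadow the built-in type
    for elem in x:
        my_list.append(copy_nested(elem))
    # Return only after every element has been processed.
    return my_list


y = [10, 12, [13, [14, 9], 16], 7]
z = copy_nested(y)
z[2][1][1] = 88  # changes the copy only
```

After the assignment, y still holds 9 at that position while z holds 88, which is the behavior the copy.deepcopy example in the original question demonstrates.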
Re: Re: Django python fuction
Hi, hito koto:

I think you want to deep copy the list. The following code may be helpful:

    def dcopy(obj):
        if not isinstance(obj, list):
            return obj
        return [dcopy(x) for x in obj]

But this function only deep copies lists. You can extend the code in the same way for dict, set, tuple, etc.

moqianc...@gmail.com

From: François Schiettecatte
Date: 2014-06-10 22:07
To: django-users
Subject: Re: Django python fuction

You need to use .append() to add elements to a list, I suggest you take a look at the python tutorial:

https://docs.python.org/2/tutorial/

Not quite sure what you are doing with i or elem either.

François

On Jun 10, 2014, at 10:01 AM, hito koto <hitokoto2...@gmail.com> wrote:

> So, I also have errors:
>
> >>> def foo(x):
> ...     myList = []
> ...     if isinstance(x, list):
> ...         for i in x:
> ...             elem = i
> ...             return myList + foo(elem)
> ...     else:
> ...         return 1
> ...
> >>>
> >>> foo(a)
> Traceback (most recent call last):
>   File "", line 1, in
>   File "", line 6, in foo
>   File "", line 6, in foo
> TypeError: can only concatenate list (not "int") to list
>
> On Tuesday, June 10, 2014 at 22:53:28 UTC+9, François Schiettecatte wrote:
> You are redefining 'list' on line 2, rename list to myList as follows:
>
> def foo(x):
>     myList = []
>     if isinstance(x, list):
>         for i in x:
>             elem = i
>             return myList + foo(i)
>     else:
>         return 1
>
> And take this to a python list, this is for django.
>
> Cheers
>
> François
>
> On Jun 10, 2014, at 9:43 AM, hito koto <hitoko...@gmail.com> wrote:
>
> > hi,
> >
> > I have this error:
> >
> > >>> def foo(x):
> > ...     list = []
> > ...     if isinstance(x, list):
> > ...         for i in x:
> > ...             elem = i
> > ...             return list + foo(i)
> > ...     else:
> > ...         return 1
> > ...
> > >>> foo(a)
> > Traceback (most recent call last):
> >   File "", line 1, in
> >   File "", line 3, in foo
> > TypeError: isinstance() arg 2 must be a class, type, or tuple of classes
> > and types
> >
> > On Tuesday, June 10, 2014 at 22:27:42 UTC+9, Andrew Farrell wrote:
> > In general, I recommend adding the line "import pdb;pdb.set_trace()"
> > to the top of your function and walking through it to see why it doesn't
> > work.
> >
> > def foo(x):
> >     import pdb;pdb.set_trace()
> >
> >     list = []
> >     if isinstance(x, list):
> >         for i in x:
> >             elem = i
> >             return list + foo(i)
> >     else:
> >         return 1
> >
> > See https://docs.python.org/2/library/pdb.html on how pdb works.
> > There are faster ways to debug, but when starting out, pdb lets you see
> > what is happening as you run the function.
> >
> > Some questions I have about this function:
> > - What is the purpose of the "elem" variable? It is never accessed.
> > - What is the purpose of returning 1 if the argument is not a list?
> > - Why is it named "foo" rather than something that tells me what the
> > purpose of the function is?
> >
> > On Tue, Jun 10, 2014 at 8:16 AM, hito koto <hitoko...@gmail.com> wrote:
> > Hello,
> >
> > I don't know how to write this as a Python function. I want to rewrite
> > the following code as a Python function, or as a recursive definition:
> >
> > >>> y = [10, 12, [13, [14, 9], 16], 7]
> > >>> z = copy.deepcopy(y)
> > >>> y
> > [10, 12, [13, [14, 9], 16], 7]
> > >>> z
> > [10, 12, [13, [14, 9], 16], 7]
> > >>> z[2][1][1]
> > 9
> > >>> z[2][1][1] = 88
> > >>> z
> > [10, 12, [13, [14, 88], 16], 7]
> > >>> y
> > [10, 12, [13, [14, 9], 16], 7]
> > >>>
> >
> > This is the function I use, but it does not work:
> >
> > def foo(x):
> >     list = []
> >     if isinstance(x, list):
> >         for i in x:
> >             elem = i
> >             return list + foo(i)
> >     else:
> >         return 1
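The suggestion to extend dcopy to other container types could look like the sketch below. This is only one possible reading of that suggestion, not the code its author had in mind. It assumes plain built-in containers: unlike copy.deepcopy, it does not handle self-referencing structures, turns tuple subclasses into plain tuples, and leaves any other object shared between the original and the copy.

```python
import copy


def dcopy(obj):
    """Recursively copy lists, tuples, dicts and sets; return anything else as is."""
    if isinstance(obj, list):
        return [dcopy(x) for x in obj]
    if isinstance(obj, tuple):
        return tuple(dcopy(x) for x in obj)
    if isinstance(obj, dict):
        return {dcopy(k): dcopy(v) for k, v in obj.items()}
    if isinstance(obj, set):
        return {dcopy(x) for x in obj}
    return obj


# The example from the original question, using dcopy in place of copy.deepcopy.
y = [10, 12, [13, [14, 9], 16], 7]
z = dcopy(y)
z[2][1][1] = 88  # changes the copy only

# For plain nested containers the result compares equal to copy.deepcopy's.
same_as_deepcopy = dcopy(y) == copy.deepcopy(y)
```

For real projects copy.deepcopy is probably the safer choice, as suggested earlier in the thread; a hand-written version is mainly useful for understanding the recursion.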