Re: Re: Massive import in Django database

2014-06-11 Thread moqianc...@gmail.com
Hi, John:
Sorry! The pseudo code I wrote is not correct, and it's slow. I will 
come back tonight. 

With Regards,
Qiancong,Mo

From: moqianc...@gmail.com
Date: 2014-06-11 23:47
To: django-users
Subject: Re: Massive import in Django database
Hi, John:
I think your code is right, except "Doc.object" should be "Doc.objects";

The following pseudo code may be faster than what you wrote:

doc_map = {}
for each xml:
    extract from the xml data -> mydoc_code, mydoc_text, myRelated_doc_codes
    doc = Doc.objects.create(doc_code=mydoc_code, doc_text=mydoc_text)
    doc_map[mydoc_code] = (doc, myRelated_doc_codes)
for (doc, rcodes) in doc_map.values():
    for rcode in rcodes:
        # doc_map values are (doc, codes) tuples, so take the Doc itself
        doc.related_doc.add(doc_map[rcode][0])
    # add() writes the relation immediately, so no doc.save() is needed here

I have checked, and it's okay.
The objects are cached in doc_map, so there is no need to re-query the 
database for each related doc code, and the import should be faster.
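
A further step, sketched here under assumptions rather than taken from the thread: Django's bulk_create can also fill the many-to-many table if the rows are written to the auto-created "through" model directly. The helper names build_rows and bulk_import are hypothetical, and the column names from_doc_id / to_doc_id assume Django's default naming for a ManyToManyField('self') on a model called Doc. The sketch also assumes the table starts empty and that all parsed documents fit in memory.

```python
def build_rows(parsed_docs):
    """Collect document rows and relation pairs in one pass.

    parsed_docs: iterable of (doc_code, doc_text, related_codes) tuples,
    i.e. whatever the XML extraction step yields (hypothetical format).
    """
    texts = {}     # doc_code -> doc_text
    pairs = set()  # (from_code, to_code), de-duplicated
    for code, text, related in parsed_docs:
        texts[code] = text
        for rcode in related:
            # Placeholder row for a related doc that has not been parsed (yet).
            texts.setdefault(rcode, None)
            # The relation is symmetrical by default, so store both directions.
            pairs.add((code, rcode))
            pairs.add((rcode, code))
    return texts, pairs


def bulk_import(Doc, parsed_docs, batch_size=1000):
    """Insert all docs, then all relations, in batches (sketch, untested on MySQL)."""
    texts, pairs = build_rows(parsed_docs)
    Doc.objects.bulk_create(
        [Doc(doc_code=code, doc_text=text) for code, text in texts.items()],
        batch_size=batch_size,
    )
    # Assumption: the auto-created through model exposes from_doc / to_doc.
    Through = Doc.related_doc.through
    Through.objects.bulk_create(
        [Through(from_doc_id=a, to_doc_id=b) for a, b in pairs],
        batch_size=batch_size,
    )
```

Only build_rows is plain Python; bulk_import needs a configured Django project and would be called as bulk_import(Doc, parsed_docs). For a million documents it may be worth feeding the parser output in chunks instead of one large list.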

With Regards.




moqianc...@gmail.com

From: John Carlo
Date: 2014-06-11 21:14
To: django-users
Subject: Massive import in Django database
Hello everybody, 


I fell in love with Django two years ago and have been using it for my work 
projects. In the past I found very useful information in this group, so a big 
thank you, guys!


I have a small question.
I have to import about 1.000.000 XML documents into the Django db (SQLite for 
local development, MySQL on the server).


The model class is the following:


class Doc(models.Model):
    doc_code = models.CharField(max_length=20, unique=True, primary_key=True,
                                db_index=True)
    doc_text = models.TextField(null=True, blank=True)
    related_doc = models.ManyToManyField('self', null=True, blank=True,
                                         db_index=True)



From what I know, bulk insertion is not possible because I have a 
ManyToManyField relation.


So I have this simple loop (in pseudo code)


for each xml:
    extract from the xml data -> mydoc_code, mydoc_text, myRelated_doc_codes

    myDoc = Doc.object.get_or_create(doc_code = mydoc_code)[0]
    myDoc.doc_text = mydoc_text

    for reldoc_code in myRelated_doc_codes:
        myRelDoc = Doc.object.get_or_create(doc_code = reldoc_code)[0]
        myDoc.related_doc.add(myRelDoc)

    myDoc.save()




Am I doing it right? Do you have any suggestions or recommendations? I fear 
that since I have 1.000.000 docs to import, it will take a lot of time, 
especially during the get_or_create routines.


thank you in advance everybody!


John








 
-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/5b88deaf-d806-4a64-9e8d-528d95599c80%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.




Re: Re: Django Python roop

2014-06-11 Thread moqianc...@gmail.com
Sometimes I have to use a while statement too. But in every case, I think 
writing correct, clean code is a basic rule of our development work.




moqianc...@gmail.com

From: hito koto
Date: 2014-06-11 21:05
To: django-users
Subject: Re: Django Python roop

Hi, qiancong:
Thank you.
Do you not use the while statement in Django?


On Wednesday, June 11, 2014 at 21:52:54 UTC+9, Qiancong wrote:
 
Hi, hito koto:
I think the question you asked should be posted to a Python language mailing 
list. For Python programs, I prefer "for" to "while"; it's simpler.
But if you like while, I think the following code may be helpful:

def fff(x):
    y = []
    i = 0
    xlen = len(x)
    while i < xlen:
        y.append(x[i])
        i += 1
    return y

But I prefer writing it like this:
def fff(x):
    return x[:]

Or you can just do the following:
y = x[:]

Python is a powerful and beautiful language; 
please read https://docs.python.org/2/tutorial/ first, as François 
said.
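
As a small illustration (a sketch, not part of the original reply): the while version, the slice, and list() all produce an equal new outer list, but each is a shallow copy, so nested lists are still shared with the original. The name copy_while is made up for this example.

```python
def copy_while(x):
    """Copy a list with an index-driven while loop (shallow copy)."""
    y = []
    i = 0
    while i < len(x):
        y.append(x[i])
        i += 1
    return y


x = [1, [2, 3]]
a = copy_while(x)  # explicit loop
b = x[:]           # slice copy
c = list(x)        # constructor copy

# All three are new outer lists with the same items as x ...
same_items = (a == b == c == x)
# ... but the nested list is the very same object in every copy.
shares_inner = (a[1] is x[1]) and (b[1] is x[1]) and (c[1] is x[1])
```

If the nested lists must be independent as well, a deep copy (copy.deepcopy) is needed instead.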

With Regards.




moqia...@gmail.com

From: hito koto
Date: 2014-06-11 19:21
To: django-users
Subject: Django Python roop
Hello, all

I want to change a for statement into a while statement. How can I do that?

This is my working for-statement code:

def fff(x):
    y = []
    for i in range(len(x)):
        y.append(x[i])
    return y

and I want to change it to a while statement.

But this code has errors:
TypeError: list indices must be integers, not list
def fff(x):
    y = []
    while x != []:
        for i in x:
            y.append(x[i])
    return y



Re: Re: Django Python roop

2014-06-11 Thread moqianc...@gmail.com
In your function, if x is not an empty list, the while loop is infinite, 
because x never becomes [].
The inner for loop then keeps appending items to the y list until your 
computer runs out of memory and raises MemoryError.
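
One way to keep the "while the list is not empty" condition and still terminate, as a minimal sketch (the name copy_by_draining is made up here): drain a working copy, so the condition eventually becomes false and the caller's list is left untouched.

```python
def copy_by_draining(x):
    """Copy x with a while loop whose condition really changes."""
    remaining = list(x)  # working copy; the caller's list is not modified
    y = []
    while remaining != []:
        # pop(0) removes the first item, so remaining shrinks every pass
        y.append(remaining.pop(0))
    return y
```

pop(0) is slow on long lists, so this is only meant to show why the loop ends; the index-based while loop or x[:] is the better choice in practice.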





moqianc...@gmail.com

From: hito koto
Date: 2014-06-11 20:49
To: django-users
Subject: Re: Django Python roop
MemoryError. Why? I don't know.
When I try this, I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in foo
MemoryError

I changed to this code, and it raises the MemoryError:

def foo(x):
    y = []
    while x != []:
        for i in range(len(x)):
            y.append(x[i])
    return y






Re: Re: Django python fuction

2014-06-10 Thread moqianc...@gmail.com
Yeah, I think hito koto needs a function like copy.deepcopy. I think he 
already knew about copy.deepcopy (as his example shows), but did not know how 
to write a function that works as copy.deepcopy does for lists.




moqianc...@gmail.com

From: François Schiettecatte
Date: 2014-06-11 00:21
To: django-users
Subject: Re: Django python fuction
Wouldn't deepcopy from the copy module handle this for you:

https://docs.python.org/2/library/copy.html

François

On Jun 10, 2014, at 12:17 PM, hito koto <hitokoto2...@gmail.com> wrote:

> Hi,  Qiancong :
> 
> I was looking for exactly this.
> Thank you very much!
> 
> 

Re: Re: Django python fuction

2014-06-10 Thread moqianc...@gmail.com
Hi, hito koto:
I think you want to deep copy the list. The following code may be helpful:
def dcopy(obj):
    if not isinstance(obj, list):
        return obj
    return [dcopy(x) for x in obj]

But this function only deep copies lists. You can extend the code in the same 
way for dict, set, tuple, etc.
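
A possible extension along those lines, as a sketch only: it handles the four built-in containers and returns anything else unchanged, so it is not a full replacement for copy.deepcopy (no cycle handling, no custom classes, no tuple subclasses).

```python
def dcopy(obj):
    """Recursively copy lists, tuples, dicts and sets; return other objects as-is."""
    if isinstance(obj, list):
        return [dcopy(x) for x in obj]
    if isinstance(obj, tuple):
        return tuple(dcopy(x) for x in obj)
    if isinstance(obj, dict):
        return {dcopy(k): dcopy(v) for k, v in obj.items()}
    if isinstance(obj, set):
        return {dcopy(x) for x in obj}
    return obj  # numbers, strings, etc. are immutable, so sharing them is safe


y = [10, 12, [13, [14, 9], 16], 7]
z = dcopy(y)
z[2][1][1] = 88  # changes only the copy; y still holds 9 at that position
```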



moqianc...@gmail.com

From: François Schiettecatte
Date: 2014-06-10 22:07
To: django-users
Subject: Re: Django python fuction
You need to use .append() to add elements to a list, I suggest you take a look 
at the python tutorial:

https://docs.python.org/2/tutorial/

Not quite sure what you are doing with i or elem either.
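
For reference, a sketch of how the quoted function could look once .append() is used and the return statement is moved out of the loop. The name copy_nested is made up for this example; it is one possible fix, not the only one.

```python
def copy_nested(x):
    """Return a recursive copy of a nested list."""
    if not isinstance(x, list):
        return x  # a non-list item is returned as it is, not replaced by 1
    result = []
    for item in x:
        # append the copied item; return only after the whole loop has run
        result.append(copy_nested(item))
    return result
```

Calling copy_nested([10, 12, [13, [14, 9], 16], 7]) gives a new list with the same nested structure.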

François

On Jun 10, 2014, at 10:01 AM, hito koto <hitokoto2...@gmail.com> wrote:

> So, I also have errors:
> 
> 
> >>> def foo(x):
> ...     myList = []
> ...     if isinstance(x, list):
> ...         for i in x:
> ...             elem = i
> ...             return myList + foo(elem)
> ...     else:
> ...         return 1
> ...
> >>>
> >>> foo(a)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 6, in foo
>   File "<stdin>", line 6, in foo
> TypeError: can only concatenate list (not "int") to list
> 
> 
> 
> 
> 
> On Tuesday, June 10, 2014 at 22:53:28 UTC+9, François Schiettecatte wrote:
> You are redefining 'list' on line 2; rename list to myList as follows:
> 
> def foo(x):
>     myList = []
>     if isinstance(x, list):
>         for i in x:
>             elem = i
>             return myList + foo(i)
>     else:
>         return 1
> 
> And please take this to a Python mailing list; this list is for Django.
> 
> Cheers 
> 
> François 
> 
> 
> On Jun 10, 2014, at 9:43 AM, hito koto <hitoko...@gmail.com> wrote: 
> 
> > hi,
> > 
> > I have these errors:
> > 
> > >>> def foo(x):
> > ...     list = []
> > ...     if isinstance(x, list):
> > ...         for i in x:
> > ...             elem = i
> > ...             return list + foo(i)
> > ...     else:
> > ...         return 1
> > ...
> > >>> foo(a)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "<stdin>", line 3, in foo
> > TypeError: isinstance() arg 2 must be a class, type, or tuple of classes 
> > and types
> > 
> > 
> > 
> > On Tuesday, June 10, 2014 at 22:27:42 UTC+9, Andrew Farrell wrote:
> > In general, I recommend adding the line "import pdb;pdb.set_trace()"
> > to the top of your function and walking through it to see why it doesn't
> > work.
> > 
> > def foo(x):
> >     import pdb;pdb.set_trace()
> > 
> >     list = []
> >     if isinstance(x, list):
> >         for i in x:
> >             elem = i
> >             return list + foo(i)
> >     else:
> >         return 1
> > 
> > See https://docs.python.org/2/library/pdb.html on how pdb works.
> > There are faster ways to debug, but when starting out, pdb lets you see
> > what is happening as you run the function.
> > 
> > 
> > Some questions I have about this function:
> > - What is the purpose of the "elem" variable? It is never accessed.
> > - What is the purpose of returning 1 if the argument is not a list?
> > - Why is it named "foo" rather than something that tells me what the
> > purpose of the function is?
> > 
> > 
> > On Tue, Jun 10, 2014 at 8:16 AM, hito koto <hitoko...@gmail.com> wrote: 
> > Hello, 
> > 
> > I don't know how to write this as my own Python function.
> > I want to rewrite the following code as a Python function, or as a
> > recursive definition:
> > >>> y = [10, 12, [13, [14, 9], 16], 7]
> > >>> z = copy.deepcopy(y)
> > >>> y
> > [10, 12, [13, [14, 9], 16], 7]
> > >>> z
> > [10, 12, [13, [14, 9], 16], 7]
> > >>> z[2][1][1]
> > 9
> > >>> z[2][1][1] = 88
> > >>> z
> > [10, 12, [13, [14, 88], 16], 7]
> > >>> y
> > [10, 12, [13, [14, 9], 16], 7]
> > >>>
> > 
> > This is the function I use, but it does not work:
> > 
> > def foo(x):
> >     list = []
> >     if isinstance(x, list):
> >         for i in x:
> >             elem = i
> >             return list + foo(i)
> >     else:
> >         return 1
> > 
> > 
> > 
> > 
> > 