Re: model query get_or_create() thread safety and uniqueness -- proposal for transactional commit before final get

2013-04-04 Thread charettes
There was a discussion on the subject here.

On Thursday, April 4, 2013 23:23:43 UTC-4, Matteius wrote:
>
>
> Greetings Developers,
>
> Today at work I fixed a thread safety issue in our code whereby two
> threads were calling get_or_create on the same URL, which has a uniqueness
> constraint.  In some cases it was raising an IntegrityError in Sentry even
> after I converted our atomically incorrect code to use get_or_create,
> proving to me it must be two different threads performing the same action.
> Let me elaborate ...
>
> Django's get_or_create does a get; failing that, a create; failing that, a
> rollback and another get.  For some reason in MySQL (and, to me,
> conceptually) the code rolls back to a point before the commit: when the
> other thread wins in the create block and saves the object, the object still
> doesn't exist when the losing thread tries to get it a final time, because I
> believe it rolled back to the earlier pre-save savepoint.
>
> Here is the final part of the get_or_create:
>
>         obj = self.model(**params)
>         sid = transaction.savepoint(using=self.db)
>         obj.save(force_insert=True, using=self.db)
>         transaction.savepoint_commit(sid, using=self.db)
>         return obj, True
>     except IntegrityError as e:
>         transaction.savepoint_rollback(sid, using=self.db)
>         exc_info = sys.exc_info()
>         try:
>             return self.get(**lookup), False
>         except self.model.DoesNotExist:
>             # Re-raise the IntegrityError with its original traceback.
>             six.reraise(*exc_info)
>
>
> The question for me is: why wouldn't we want it to be this way?
>
>         obj = self.model(**params)
>         sid = transaction.savepoint(using=self.db)
>         obj.save(force_insert=True, using=self.db)
>         transaction.savepoint_commit(sid, using=self.db)
>         return obj, True
>     except IntegrityError as e:
>         transaction.savepoint_rollback(sid, using=self.db)
>         exc_info = sys.exc_info()
>         try:
>             transaction.commit()  # To refresh the DB view for thread safety
>             # ... Get succeeds now in a thread unsafe world
>             return self.get(**lookup), False
>         except self.model.DoesNotExist:
>             # Re-raise the IntegrityError with its original traceback.
>             six.reraise(*exc_info)
>
>
> Also: on the Django website's Downloads page, the Django GitHub tarball
> for master is a 1.4 zip file.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




model query get_or_create() thread safety and uniqueness -- proposal for transactional commit before final get

2013-04-04 Thread Matteius

Greetings Developers,

Today at work I fixed a thread safety issue in our code whereby two threads
were calling get_or_create on the same URL, which has a uniqueness
constraint.  In some cases it was raising an IntegrityError in Sentry even
after I converted our atomically incorrect code to use get_or_create,
proving to me it must be two different threads performing the same action.
Let me elaborate ...

Django's get_or_create does a get; failing that, a create; failing that, a
rollback and another get.  For some reason in MySQL (and, to me,
conceptually) the code rolls back to a point before the commit: when the
other thread wins in the create block and saves the object, the object still
doesn't exist when the losing thread tries to get it a final time, because I
believe it rolled back to the earlier pre-save savepoint.
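
The get / create / rollback / get sequence described above can be sketched outside Django with the standard-library sqlite3 module. This is only an illustration of the control flow, not Django's implementation: the `page` table, the `goc` savepoint name, and the `before_insert` hook (which stands in for a competing thread that wins the race) are all assumptions made for the example. SQLite is not expected to reproduce the MySQL visibility problem reported in this message.

```python
import sqlite3


def get_or_create(conn, url, before_insert=None):
    """Return (row_id, created) using a get / create / rollback / get flow."""
    row = conn.execute("SELECT id FROM page WHERE url = ?", (url,)).fetchone()
    if row is not None:
        return row[0], False

    if before_insert is not None:
        # Hypothetical hook: simulates another thread inserting the row first.
        before_insert()

    conn.execute("SAVEPOINT goc")
    try:
        cur = conn.execute("INSERT INTO page (url) VALUES (?)", (url,))
        conn.execute("RELEASE SAVEPOINT goc")
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # Undo the failed insert, then look the row up one last time.
        conn.execute("ROLLBACK TO SAVEPOINT goc")
        conn.execute("RELEASE SAVEPOINT goc")
        row = conn.execute("SELECT id FROM page WHERE url = ?", (url,)).fetchone()
        if row is None:
            raise  # Re-raise the IntegrityError: the row really is missing.
        return row[0], False


# isolation_level=None leaves transaction control to the explicit SQL above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, url TEXT UNIQUE)")
```

Passing a `before_insert` callable that inserts the same URL forces the IntegrityError branch deterministically, which is easier to reason about than a real two-thread race.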

Here is the final part of the get_or_create:

        obj = self.model(**params)
        sid = transaction.savepoint(using=self.db)
        obj.save(force_insert=True, using=self.db)
        transaction.savepoint_commit(sid, using=self.db)
        return obj, True
    except IntegrityError as e:
        transaction.savepoint_rollback(sid, using=self.db)
        exc_info = sys.exc_info()
        try:
            return self.get(**lookup), False
        except self.model.DoesNotExist:
            # Re-raise the IntegrityError with its original traceback.
            six.reraise(*exc_info)


The question for me is: why wouldn't we want it to be this way?

        obj = self.model(**params)
        sid = transaction.savepoint(using=self.db)
        obj.save(force_insert=True, using=self.db)
        transaction.savepoint_commit(sid, using=self.db)
        return obj, True
    except IntegrityError as e:
        transaction.savepoint_rollback(sid, using=self.db)
        exc_info = sys.exc_info()
        try:
            transaction.commit()  # To refresh the DB view for thread safety
            # ... Get succeeds now in a thread unsafe world
            return self.get(**lookup), False
        except self.model.DoesNotExist:
            # Re-raise the IntegrityError with its original traceback.
            six.reraise(*exc_info)


Also: on the Django website's Downloads page, the Django GitHub tarball for
master is a 1.4 zip file.





Re: Kickstarter for Django Admin?

2013-04-04 Thread Jason Kraus
Currently you get the links through a link collection provider object 
attached to state; state.links.get__links() which collects 
links from its parents (endpoint and resource) and returns an empty set if 
none exist. I have been thinking about re-factoring this to something more 
robust. I find that links do need some sort of "group" property and I do 
like the idea of having an API where you query for links available, perhaps 
like: state.links(rel='sortby') and state.links(group='filter')
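
To make the proposed query style concrete, here is a minimal sketch of a callable link collection. It is not hyperadmin's current API: the `Link` and `LinkCollection` names, the `group` attribute, and the example URLs are all hypothetical and only illustrate how `links(rel=...)` and `links(group=...)` could behave.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    url: str
    rel: str
    group: str = ""


class LinkCollection:
    """Hypothetical container that can be called to filter its links."""

    def __init__(self, links):
        self._links = list(links)

    def __call__(self, **filters):
        # Keep the links whose attributes match every keyword filter given.
        return [
            link for link in self._links
            if all(getattr(link, name) == value for name, value in filters.items())
        ]


links = LinkCollection([
    Link("/posts/?sort=title", rel="sortby", group="sort"),
    Link("/posts/?author=1", rel="filter", group="filter"),
    Link("/posts/add/", rel="create"),
])
```

With this shape, `links(rel='sortby')` and `links(group='filter')` each return a plain list, and an unmatched query returns an empty list rather than raising.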

hyperadmin.resources.models is a good place to look to see how CRUD
resources are implemented. The two main files in that module are
resources.py, which defines the ModelResource much like ModelAdmin, and
endpoints.py, which contains the business end of executing CRUD actions.
hyperadmin.mediatypes.collectionjson shows how a hypermedia JSON content
type can be implemented.

On Thursday, April 4, 2013 2:19:50 AM UTC-7, Florian Apolloner wrote:
>
> Hi,
>
> I already wanted to look at hyperadmin, but got caught up reading what
> HATEOAS is and how it works first ;) I do have one question if you don't
> mind: With a REST/HATEOAS backend you'd export links with appropriate rel
> attributes to tell the client what we can do. So far so good, but let's
> pick the template version as an example: how does it know where to put which
> links? Does it look solely at rel and figure out what to do with it? Also,
> how do you sensibly provide extra data like the _meta information of the
> model, etc.?
>
> Which files do you think are a good start when looking at hyperadmin?
>
> Thx & Regards,
> Florian
>
> P.S.: I agree that hyperadmin takes a good approach to the admin problem, 
> imo something along this line should power a new admin.
>





Re: Extending the DATABASES = {'xx': {'NAME': construct to carry additional information and / or virtual names

2013-04-04 Thread Michael Manfre
On Thu, Apr 4, 2013 at 10:10 AM, VernonCole wrote:

> In order to meet these goals, is it reasonable to do this?: Are there
> better alternatives?
>
> 1) Assume that a Proxy based connection will be acceptable for inclusion
> in django?
>

The best way to think of this is as a database engine (a.k.a. Django
database backend) that connects to SQL Server, which has different
dependencies on Linux (Pyro) vs Windows (pywin32). Django core shouldn't
care, or even know about the specifics. The database backend would be the
only bit of Django that knows about the proxy or how to connect to it.


> 2) Add more keys to the DATABASES entries as needed by different backends?
> (e.g. "CONNECTION_STRING": "xxx", "PROXY_ADDRESS": "10.0.0.7")
>

The DATABASES dict provides an "OPTIONS" dict [1] that specific backends
can populate as they deem necessary. Django-mssql has a few of these [2]. I
have been contemplating adding a generic "CONNECTION_STRING" to allow full
control over the connection string that would ignore all of the other
database settings when constructing the string. SqlLocalDB with its
different connection string has so far been the only motivator for this
additional OPTIONS key.

[1] https://docs.djangoproject.com/en/dev/ref/settings/#std:setting-OPTIONS
[2] http://django-mssql.readthedocs.org/en/latest/settings.html#options
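
As a rough sketch of the idea above: a backend could check OPTIONS for a full connection string and, if one is present, ignore the other settings. Nothing here is django-mssql's actual code. The "CONNECTION_STRING" key is the one being contemplated, while the "provider" key, the `build_connection_string` helper, the "example_backend" engine name, and the sample values are assumptions made purely for illustration.

```python
def build_connection_string(settings_dict):
    """Return an ADO-style connection string for one DATABASES entry."""
    options = settings_dict.get("OPTIONS", {})

    # Hypothetical key: when present it wins and all other settings are ignored.
    override = options.get("CONNECTION_STRING")
    if override:
        return override

    parts = [
        ("Provider", options.get("provider", "SQLOLEDB")),
        ("Data Source", settings_dict.get("HOST", "")),
        ("Initial Catalog", settings_dict.get("NAME", "")),
        ("User ID", settings_dict.get("USER", "")),
        ("Password", settings_dict.get("PASSWORD", "")),
    ]
    # Skip empty values so unset settings do not appear in the string.
    return ";".join("%s=%s" % (key, value) for key, value in parts if value) + ";"


DATABASES = {
    "default": {
        "ENGINE": "example_backend",  # placeholder, not a real backend path
        "NAME": "warehouse",
        "HOST": "10.0.0.7",
        "USER": "django",
        "PASSWORD": "secret",
        "OPTIONS": {},
    },
    "custom": {
        "ENGINE": "example_backend",
        "NAME": "ignored-when-overridden",
        "OPTIONS": {
            "CONNECTION_STRING": "Provider=ExampleProvider;Data Source=example;",
        },
    },
}
```

Keeping the override inside OPTIONS means Django core never has to know about it; only the backend reads the key.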


> 3) Pass the DATABASES dictionary down-chain to the proxy? (so it gets the
> USER & PORT items, etc. and can process them itself)
>

The database backend has access to the full settings dict and would be able
to connect to the proxy and pass along whatever information is needed.


> 4) Have the proxy have additional routing information so it can add its
> own intelligence to the connection attempt?
>

The proxy and any routing logic would be contained within the specific
database backend.


> 5) (In some future time) have the backend feed back more information to
> the core, such as "I am connected to an XYZ engine, switch to the XYZ
> dialect of SQL"?
>

The database backends are responsible for providing their own SQL
translations by using a combination of custom SQLCompilers,
DatabaseOperations, and DatabaseFeatures. Trying to support many different
SQL dialects within the same database backend should be possible, but would
be complex and require maintaining additional state information on the
DatabaseWrapper and connection to allow all the various places to figure
out what it should be doing. I do this in a very limited way for
django-mssql to allow for newer features in SQL Server 2008 that do not
exist in SQL Server 2005 (e.g. microseconds).
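
A minimal sketch of that kind of version-dependent state, assuming the backend learns the server's major version when it connects (SQL Server 2005 reports major version 9, SQL Server 2008 reports 10). The `ExampleFeatures` class and `datetime_column_type` helper are hypothetical and are not how django-mssql is written.

```python
class ExampleFeatures:
    """Hypothetical stand-in for a backend's per-connection feature flags."""

    def __init__(self, server_major_version):
        self.server_major_version = server_major_version

    @property
    def supports_microseconds(self):
        # True for SQL Server 2008 (major version 10) and later.
        return self.server_major_version >= 10


def datetime_column_type(features):
    # datetime2 arrived with SQL Server 2008; older servers only have datetime.
    return "datetime2" if features.supports_microseconds else "datetime"
```

Every place that emits SQL would have to consult state like this, which is why supporting whole dialects this way gets complicated quickly.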

Regards,
Michael Manfre





Extending the DATABASES = {'xx': {'NAME': construct to carry additional information and / or virtual names

2013-04-04 Thread VernonCole
Dear Gurus:
 
At the bottom of this text there will be a question.  In order for you to 
provide a quality answer to the question, you need more information.  
Perhaps a lot more information. The next few paragraphs will attempt to 
provide that. They will also be an introduction to a couple of concepts, 
and a bit of a "sales pitch."  Please accept my apologies for this and bear
with me.  The question will be delimited by a 'v v v v v' construction...

1) My present situation (not important to the discussion here, just so you 
understand how I got into this):  I am working for a company called eHealth 
Africa. We provide software and logistics support to help stop endemic 
diseases (such as polio) and promote improved health care. Our principal
focus is sub-Saharan Africa, and our principal office is located in the
heart of the last remaining reservoir of wild polio, in Kano, Nigeria.  
Yes, the same Kano where nine girls from a polio vaccination team were shot 
by Islamic extremists a few weeks ago.  It is not really safe here, 
particularly for someone who looks like a cowboy from Wyoming (USA).  But I 
digress.  We use django and Ubuntu whenever we can. However, the present 
project consists of a lot of data collected by field workers using a tool ( 
http://wwwn.cdc.gov/epiinfo/ )  which writes ACCESS data files.  The same 
tool will also be used to read and analyze the data after it has been 
collected into an SQL Server database.  I have to get the data from the 
field, clean and verify it, and get it into the big database.

2) In order to collect, clean and warehouse the data, I need a tool which 
will read and write those databases.  Fortunately, I am the maintainer of 
just such a tool: adodbapi.
Unfortunately, the work would be best done on a Linux web server ... and ADO
is a Windows-only tool.  My past week has been spent cleaning up the code of
adodbapi and writing a remote proxy so that I can call it from a Linux box 
using Pyro.  Last night it executed a query and returned a rowcount and 
description.  Today I hope to return an actual data row, too.

3) An early fork of adodbapi was used as the data engine for django-mssql.  
I rolled the improvements from that fork back into production adodbapi a
couple of releases ago.  I am working with Michael Manfre to re-integrate 
the two, so that the next release of django-mssql will be using the next 
release of adodbapi, and will, therefore, be accessible from Linux. It is 
hoped that the resulting product will be included in a future release of 
mainstream django.

4) Adding a database proxy server into the chain complicates data naming in
Django.  The present model assumes three computers. The first is the user's:
the user types a URL to find the Django server (which may manipulate that URL
to make use of other web resources). The Django server, the second, then uses
the DATABASES dictionary from settings.py to find its data storage using the
"NAME", "USER", "HOST" and "PORT" items.  The database server (or cluster of
servers) is the third computer in the chain. The proxy server adds an extra
bump between the second and third logical nodes.  I need a way to address
it and pass to it the identity of the actual third node.

5) The construction of the actual text used to build an ADO data source is
something between science, magic, and luck.  There is an entire web site
dedicated to this arcane language ( http://connectionstrings.com ).  The 
place where one injects information (such as the USER name) into this 
jumble varies depending on which database engine will be used and which 
route is used to get there. A completely different vocabulary is used to 
open the same database depending on whether ADO or ODBC is used for 
routing. If ODBC is used, the "ODBC Data Source Administrator" tool can add 
its own information, selecting the HOST, USER, PORT, and database NAME for 
you -- all you pass is a DSN. Here is a little sample (slightly altered) 
from my test program:

    from adodbapi import is64bit

    if doMySqlTest:
        c = {'host': "25.116.170.194",
             'database': 'test',
             'user': 'adotest',
             'password': '12345678',
             'provider': '',
             'driver': "MySQL ODBC 5.2a Driver"}
        if is64bit.Python():
            c['provider'] = 'Provider=MSDASQL;'
        cs = '%(provider)sDriver={%(driver)s};Server=%(host)s;Port=3306;' + \
             'Database=%(database)s;user=%(user)s;password=%(password)s;Option=3;'
        print('...Testing MySql login...')
        connStrMySql = test_connect(cs % c, timeout=5)

Note that I had to write a module to detect whether I am running on 32 bit 
or 64 bit Python, because the connection string changes.  
I don't really want to put that in settings.py -- especially the part about 
which python the proxy machine is running.
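
For readers unfamiliar with the 32-bit versus 64-bit check, one common standard-library approach is sketched below. It is not necessarily how adodbapi's is64bit module works; the `python_is_64bit` and `provider_prefix` names are hypothetical, and the provider string simply mirrors the sample above.

```python
import struct
import sys


def python_is_64bit():
    """Return True when the running interpreter uses 64-bit pointers."""
    # struct.calcsize("P") is the size of a C pointer in bytes: 4 or 8.
    return struct.calcsize("P") * 8 == 64


def provider_prefix():
    # Mirrors the sample above: only 64-bit Python gets the MSDASQL provider.
    return "Provider=MSDASQL;" if python_is_64bit() else ""
```

The check describes the interpreter that runs it, so in a proxy setup it would need to run on the proxy machine rather than on the Django server.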

6) ADO is database agnostic. Last week, just to prove the point, I wrote a 
pair of programs, one of them creates a database on an 

Re: Kickstarter for Django Admin?

2013-04-04 Thread Florian Apolloner
Hi,

I already wanted to look at hyperadmin, but got caught up reading what
HATEOAS is and how it works first ;) I do have one question if you don't
mind: With a REST/HATEOAS backend you'd export links with appropriate rel
attributes to tell the client what we can do. So far so good, but let's
pick the template version as an example: how does it know where to put which
links? Does it look solely at rel and figure out what to do with it? Also,
how do you sensibly provide extra data like the _meta information of the
model, etc.?

Which files do you think are a good start when looking at hyperadmin?

Thx & Regards,
Florian

P.S.: I agree that hyperadmin takes a good approach to the admin problem, 
imo something along this line should power a new admin.





Re: Epydoc failed to generate documentation for Django 1.5

2013-04-04 Thread Reinout van Rees

On 04-04-13 05:19, Kevin Veroneau wrote:


| Import failed (but source code parsing was successful).
| Error: ImproperlyConfigured: Requested setting DATABASES, but settings are not configured.
| You must either define the environment variable DJANGO_SETTINGS_MODULE or call
| settings.configure() before accessing settings. (line 1)



Is this Epydoc error related to Epydoc or something that Django
shouldn't have done in its source code?


I first thought "isn't this just the common problem of needing to set 
DJANGO_SETTINGS_MODULE?". I have to do it all the time in my own Sphinx
documentation. A quick os.environ call from Sphinx's conf.py fixes that.
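
For reference, the kind of call meant here is roughly the following, placed near the top of conf.py. The "mysite.settings" value is a placeholder; substitute the dotted path of your own project's settings module.

```python
import os

# "mysite.settings" is a placeholder; use your project's settings module path.
# setdefault leaves any value already present in the environment untouched.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")
```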


Looking at the code of the management command that epydoc crashes on, it 
seems that epydoc simply executes too much. This might be one of the 
cases where there's no handy way of supporting such an introspective tool.


Can you ignore files in epydoc?


Reinout

--
Reinout van Rees     http://reinout.vanrees.org/
rein...@vanrees.org  http://www.nelen-schuurmans.nl/
"If you're not sure what to do, make something. -- Paul Graham"
