Dear Gurus:
 
At the bottom of this text there will be a question.  In order for you to 
provide a quality answer to the question, you need more information.  
Perhaps a lot more information. The next few paragraphs will attempt to 
provide that. They will also be an introduction to a couple of concepts, 
and a bit of a 'sales pitch.'  Please accept my apologies for this and bear 
with me.  The question will be delimited by a 'v v v v v' construction...

1) My present situation (not important to the discussion here, just so you 
understand how I got into this):  I am working for a company called eHealth 
Africa. We provide software and logistics support to help stop endemic 
diseases (such as polio) and promote improved health care. Our principal 
focus is sub-Saharan Africa, and our principal office is located in the 
heart of the last remaining reservoir of wild polio, in Kano, Nigeria.  
Yes, the same Kano where nine girls from a polio vaccination team were shot 
by Islamic extremists a few weeks ago.  It is not really safe here, 
particularly for someone who looks like a cowboy from Wyoming (USA).  But I 
digress.  We use django and Ubuntu whenever we can. However, the present 
project consists of a lot of data collected by field workers using a tool 
(http://wwwn.cdc.gov/epiinfo/) which writes Microsoft Access data files.  The same 
tool will also be used to read and analyze the data after it has been 
collected into an SQL Server database.  I have to get the data from the 
field, clean and verify it, and get it into the big database.

2) In order to collect, clean and warehouse the data, I need a tool which 
will read and write those databases.  Fortunately, I am the maintainer of 
just such a tool: adodbapi <http://sourceforge.net/projects/adodbapi>.  
Unfortunately, it would be best done on a Linux webserver ... and ADO is a 
Windows only tool.  My past week has been spent cleaning up the code of 
adodbapi and writing a remote proxy so that I can call it from a Linux box 
using Pyro.  Last night it executed a query and returned a rowcount and 
description.  Today I hope to return an actual data row, too.

3) An early fork of adodbapi was used as the data engine for django-mssql.  
I rolled the improvements from that fork back in to production adodbapi a 
couple of releases ago.  I am working with Michael Manfre to re-integrate 
the two, so that the next release of django-mssql will be using the next 
release of adodbapi, and will, therefore, be accessible from Linux. It is 
hoped that the resulting product will be included in a future release of 
mainstream django.

4) Adding a database proxy server into the chain complicates data naming in 
django.  The present model assumes three computers: the user's machine, 
which sends a URL to the django server (which may manipulate that URL to 
make use of other web resources).  The django server then uses the 
DATABASES dictionary from settings.py to locate its data storage using the 
"NAME", "USER", "HOST" and "PORT" items.  The database server (or cluster 
of servers) is the third computer in the chain. The proxy server adds an 
extra bump between the second and third logical nodes.  I need a way to 
address it and pass it the identity of the actual third node.
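For concreteness, here is a sketch of what an extended DATABASES entry might look like. The PROXY_HOST and PROXY_PORT key names, and all the values, are invented for illustration; they are not existing django settings:

```python
# settings.py sketch -- the PROXY_* keys are hypothetical additions,
# read only by a proxy-aware backend; the standard keys still describe
# the real (third-node) database server.
DATABASES = {
    "default": {
        "ENGINE": "sqlserver_ado",          # assumed backend name
        "NAME": "field_data",
        "USER": "adotest",
        "HOST": "sqlcluster.example.org",   # the actual database server
        "PORT": "1433",
        "PROXY_HOST": "10.0.0.7",           # the extra "bump": the Pyro proxy
        "PROXY_PORT": "9099",
    }
}
```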

5) The construction of the actual text used to build an ADO data source is 
something between science, magic, and luck.  There is an entire web site 
dedicated to this arcane language ( http://connectionstrings.com ).  The 
place where one injects information (such as the USER name) into this 
jumble varies depending on which database engine will be used and which 
route is used to get there. A completely different vocabulary is used to 
open the same database depending on whether ADO or ODBC is used for 
routing. If ODBC is used, the "ODBC Data Source Administrator" tool can add 
its own information, selecting the HOST, USER, PORT, and database NAME for 
you -- all you pass is a DSN. Here is a little sample (slightly altered) 
from my test program:

    from adodbapi import is64bit

    if doMySqlTest:
        c = {'host': "25.116.170.194",
             'database': 'test',
             'user': 'adotest',
             'password': '12345678',
             'provider': '',
             'driver': "MySQL ODBC 5.2a Driver"}
        if is64bit.Python():
            c['provider'] = 'Provider=MSDASQL;'
        cs = '%(provider)sDriver={%(driver)s};Server=%(host)s;Port=3306;' + \
             'Database=%(database)s;user=%(user)s;password=%(password)s;Option=3;'
        print('    ...Testing MySql login...')
        connStrMySql = test_connect(cs % c, timeout=5)

Note that I had to write a module to detect whether I am running on 32-bit 
or 64-bit Python, because the connection string changes.  
I don't really want to put that in settings.py -- especially the part about 
which Python the proxy machine is running.
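For what it is worth, the detection itself can be tiny. A minimal sketch of such a check (not necessarily the exact code in the is64bit module) is:

```python
import struct

def python_is_64bit():
    # struct.calcsize("P") is the size of a C pointer in bytes:
    # 8 on a 64-bit Python build, 4 on a 32-bit build.
    return struct.calcsize("P") * 8 == 64
```

The point is that this is a property of the process doing the connecting, so it belongs on the proxy machine, not in the originating node's settings.py.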

6) ADO is database agnostic. Last week, just to prove the point, I wrote a 
pair of programs, one of them creates a database on an .xls spreadsheet. 
The other one reads it back. Here is the body of the writer:
        # create the sheet and the header row and set the types for the columns
        crsr.execute('create table SheetOne (Header1 text, Header2 text, Header3 text, Header4 text, Header5 text)')
        sql = "INSERT INTO SheetOne (Header1, Header2, Header3, Header4, Header5) values (?,?,?,?,?)"
        data = (1, 2, 3, 4, 5)
        crsr.execute(sql, data)  # write the first row of data
        crsr.execute(sql, (6, 7, 8, 9, 10))  # another row of data
Notice that it is just normal SQL. The reader is even easier.  I could have 
used a django model -- if I had hooks for the backend. Nine lines of the 
31-line program were used to compute the correct value for the connection 
string. After that, everything was easy.
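To illustrate what that connection-string computation has to cope with, here is a sketch of the provider choice for an .xls file. The provider strings are typical values of the kind documented at connectionstrings.com; the real program's nine lines may well differ:

```python
def excel_connection_string(filename, python_is_64bit):
    # Hypothetical sketch: a 64-bit Python process cannot load the old
    # 32-bit Jet provider, so it has to use the newer ACE provider.
    if python_is_64bit:
        provider = 'Provider=Microsoft.ACE.OLEDB.12.0;'
    else:
        provider = 'Provider=Microsoft.Jet.OLEDB.4.0;'
    props = 'Extended Properties="Excel 8.0;HDR=Yes;"'
    return '%sData Source=%s;%s' % (provider, filename, props)
```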

7) The originating node may not have all of the information needed to open 
the database on the remote node.  For example, it does not know whether the 
remote is running 64-bit or 32-bit code. I need a system of passing the 
information down the chain until it is needed, possibly adding or dropping 
off elements on the way. The 1980s-era database system I liked best had a 
database database (a logical name table) so that you could easily refer to 
"table X on database Y" as simply Y:X. The logical name table would recurse 
through the definition of Y until it found a physical location and make the 
connection. I guess my mind is stuck in that track. I want to send a 
generic-level "hook me to database Y" request to the proxy and have it 
supply the other parameters to actually make the connection.
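The sort of resolution I have in mind could be sketched like this (all names and values here are invented for illustration):

```python
# A toy logical-name table: each entry maps a logical name either to
# another logical name or to a physical connection definition (a dict).
LOGICAL_NAMES = {
    "Y": "kano_warehouse",
    "kano_warehouse": {"host": "10.0.0.7", "engine": "mssql"},
}

def resolve(name, table):
    """Follow logical names until a physical definition is found."""
    seen = set()
    while isinstance(name, str):
        if name in seen:
            raise ValueError("circular logical name: %r" % name)
        seen.add(name)
        name = table[name]
    return name
```

A request for "table X on database Y" would then call resolve('Y', ...) on the proxy, which supplies the rest of the connection parameters from the physical definition it finds.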
v  v  v  v  v  v  v  v  v  v  v  v  v  v  v  v  v  v  v
In order to meet these goals, is it reasonable to do the following? Are 
there better alternatives?

1) Assume that a proxy-based connection will be acceptable for inclusion in 
django?

2) Add more keys to the DATABASES entries as needed by different backends? 
(e.g. "CONNECTION_STRING": "xxx", "PROXY_ADDRESS": "10.0.0.7")

3) Pass the DATABASES dictionary down-chain to the proxy? (so it gets the 
USER & PORT items, etc. and can process them itself)

4) Give the proxy additional routing information so it can add its own 
intelligence to the connection attempt?

5) (At some future time) have the backend feed back more information to the 
core, such as "I am connected to an XYZ engine, switch to the XYZ dialect 
of SQL"?

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.