Dear Gurus:

At the bottom of this text there will be a question. In order for you to provide a quality answer to the question, you need more information. Perhaps a lot more information. The next few paragraphs will attempt to provide that. They will also be an introduction to a couple of concepts, and a bit of a "sales pitch." Please accept my apologies for this and bear with me. The question will be delimited by a 'v v v v v' construction...
1) My present situation (not important to the discussion here, just so you understand how I got into this): I am working for a company called eHealth Africa. We provide software and logistics support to help stop endemic diseases (such as polio) and promote improved health care. Our principal focus is sub-Saharan Africa, and our principal office is located in the heart of the last remaining reservoir of wild polio, in Kano, Nigeria. Yes, the same Kano where nine girls from a polio vaccination team were shot by Islamic extremists a few weeks ago. It is not really safe here, particularly for someone who looks like a cowboy from Wyoming (USA). But I digress. We use django and Ubuntu whenever we can. However, the present project consists of a lot of data collected by field workers using a tool ( http://wwwn.cdc.gov/epiinfo/ ) which writes Access data files. The same tool will also be used to read and analyze the data after it has been collected into an SQL Server database. I have to get the data from the field, clean and verify it, and get it into the big database.

2) In order to collect, clean and warehouse the data, I need a tool which will read and write those databases. Fortunately, I am the maintainer of just such a tool: adodbapi <http://sourceforge.net/projects/adodbapi>. Unfortunately, the work would best be done on a Linux web server ... and ADO is a Windows-only tool. My past week has been spent cleaning up the code of adodbapi and writing a remote proxy so that I can call it from a Linux box using Pyro. Last night it executed a query and returned a rowcount and description. Today I hope to return an actual data row, too.

3) An early fork of adodbapi was used as the data engine for django-mssql. I rolled the improvements from that fork back into production adodbapi a couple of releases ago.
I am working with Michael Manfre to re-integrate the two, so that the next release of django-mssql will be using the next release of adodbapi, and will, therefore, be accessible from Linux. It is hoped that the resulting product will be included in a future release of mainstream django.

4) Adding a database proxy server into the chain complicates data naming in django. The present model assumes three computers. The first is the user's, where someone types a URL to find the django server (which may manipulate that URL to make use of other web resources). The second is the django server, which uses the DATABASES dictionary from settings.py to find its data storage using the "NAME", "USER", "HOST" and "PORT" items. The database server (or cluster of servers) is the third computer in the chain. The proxy server adds an extra bump between the second and third logical nodes. I need a way to address it and pass to it the identity of the actual third node.

5) The construction of the actual text used to open an ADO data source is something between science, magic, and luck. There is an entire web site dedicated to this arcane language ( http://connectionstrings.com ). The place where one injects information (such as the USER name) into this jumble varies depending on which database engine will be used and which route is used to get there. A completely different vocabulary is used to open the same database depending on whether ADO or ODBC is used for routing. If ODBC is used, the "ODBC Data Source Administrator" tool can add its own information, selecting the HOST, USER, PORT, and database NAME for you -- all you pass is a DSN.
Here is a little sample (slightly altered) from my test program:

    from adodbapi import is64bit

    # doMySqlTest and test_connect() are defined elsewhere in the test program
    if doMySqlTest:
        c = {'host': "25.116.170.194",
             'database': 'test',
             'user': 'adotest',
             'password': '12345678',
             'provider': '',
             'driver': "MySQL ODBC 5.2a Driver"}
        # on 64-bit Python, name the MSDASQL provider explicitly
        if is64bit.Python():
            c['provider'] = 'Provider=MSDASQL;'
        cs = '%(provider)sDriver={%(driver)s};Server=%(host)s;Port=3306;' + \
             'Database=%(database)s;user=%(user)s;password=%(password)s;Option=3;'
        print(' ...Testing MySql login...')
        connStrMySql = test_connect(cs % c, timeout=5)

Note that I had to write a module to detect whether I am running on 32-bit or 64-bit Python, because the connection string changes. I don't really want to put that in settings.py -- especially the part about which Python the proxy machine is running.

6) ADO is database agnostic. Last week, just to prove the point, I wrote a pair of programs: one creates a database in an .xls spreadsheet, and the other reads it back. Here is the body of the writer:

    # create the sheet and the header row and set the types for the columns
    crsr.execute('create table SheetOne (Header1 text, Header2 text, Header3 text, Header4 text, Header5 text)')
    sql = "INSERT INTO SheetOne (Header1, Header2, Header3, Header4, Header5) values (?,?,?,?,?)"
    data = (1, 2, 3, 4, 5)
    crsr.execute(sql, data)  # write the first row of data
    crsr.execute(sql, (6, 7, 8, 9, 10))  # another row of data

Notice that it is just normal SQL. The reader is even easier. I could have used a django model -- if I had hooks for the backend. Nine lines of the 31-line program were used to compute the correct value for the connection string. After that, everything was easy.

7) The originating node may not have all of the information needed to open the database on the remote node. For example, it does not know whether the remote is running 64-bit or 32-bit code.
I need a system of passing the information down the chain until it is needed, possibly adding or dropping off elements on the way. The 1980-era database system I liked best had a database database (logical name table) so that you could easily refer to "table X on database Y" as simply Y:X. The logical name table would recurse through the definition of Y until it found a physical location, and then make the connection. I guess my mind is stuck in that track. I want to send a generic-level "hook me to database Y" request to the proxy and have it supply the other parameters to actually make the connection.

v v v v v v v v v v v v v v v v v v v

In order to meet these goals, is it reasonable to do the following? Are there better alternatives?

1) Assume that a proxy-based connection will be acceptable for inclusion in django?

2) Add more keys to the DATABASES entries as needed by different backends? (e.g. "CONNECTION_STRING": "xxx", "PROXY_ADDRESS": "10.0.0.7")

3) Pass the DATABASES dictionary down-chain to the proxy? (so it gets the USER & PORT items, etc. and can process them itself)

4) Have the proxy hold additional routing information so it can add its own intelligence to the connection attempt?

5) (In some future time) have the backend feed back more information to the core, such as "I am connected to an XYZ engine, switch to the XYZ dialect of SQL"?
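A rough sketch of how items 2) through 4) might fit together, offered only to make the question concrete. Everything named here is an assumption for illustration: the "LOGICAL_NAME" key, the LOGICAL_NAMES table, and the functions resolve() and build_connection_string() do not exist in django or adodbapi, and the Pyro transport is left out entirely. The connection string template and the rule about adding the MSDASQL provider on 64-bit Python are copied from the test sample in 5).

```python
import struct

# Hypothetical settings.py entry on the django side. "PROXY_ADDRESS" and
# "LOGICAL_NAME" are illustrative keys, not existing django settings.
DATABASES = {
    "default": {
        "NAME": "test",
        "USER": "adotest",
        "PASSWORD": "12345678",
        "PROXY_ADDRESS": "10.0.0.7",
        "LOGICAL_NAME": "Y",
    }
}

# Hypothetical logical name table kept on the proxy machine. A string value
# is an alias for another logical name; a dict is a physical definition.
LOGICAL_NAMES = {
    "Y": "mysql_test_box",
    "mysql_test_box": {
        "driver": "MySQL ODBC 5.2a Driver",
        "host": "25.116.170.194",
        "port": 3306,
        "template": "%(provider)sDriver={%(driver)s};Server=%(host)s;"
                    "Port=%(port)s;Database=%(database)s;user=%(user)s;"
                    "password=%(password)s;Option=3;",
    },
}


def proxy_is_64bit():
    """Stand-in for is64bit.Python(): checks this interpreter's pointer size."""
    return struct.calcsize("P") * 8 == 64


def resolve(name, table):
    """Follow aliases in the table until a physical definition is found."""
    seen = set()
    while isinstance(table[name], str):
        if name in seen:
            raise ValueError("circular logical name: %r" % name)
        seen.add(name)
        name = table[name]
    return table[name]


def build_connection_string(db_settings, table=LOGICAL_NAMES, is_64bit=None):
    """Merge what the django side sent with what only the proxy knows."""
    if is_64bit is None:
        is_64bit = proxy_is_64bit()
    entry = resolve(db_settings["LOGICAL_NAME"], table)
    params = {
        # only the proxy knows which Python it is running
        "provider": "Provider=MSDASQL;" if is_64bit else "",
        "driver": entry["driver"],
        "host": entry["host"],
        "port": entry["port"],
        # these items travel down-chain from the DATABASES dictionary
        "database": db_settings["NAME"],
        "user": db_settings["USER"],
        "password": db_settings["PASSWORD"],
    }
    return entry["template"] % params
```

On the proxy, build_connection_string(DATABASES["default"]) would produce the text to hand to adodbapi.connect(). The django side only names the database and supplies credentials; the driver, host, port and provider details stay on the machine that actually needs them.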
