Hi there,
My ongoing project will be centered around an SQLite db.
Not a bad way to start. There are many possible ways to access SQL DBs. I'll talk about one of my favorites, since I'm a big fan of sqlalchemy [0], which provides a broad useful toolkit for dealing with SQL DBs and an abstraction layer.
To start, often the question is why any such abstraction tool, given the additional complexity of a module, a.k.a. another layer of code?
Briefly, my main two reasons: A) abstraction of data model from SQL implementation for the Python program (allows switching from SQLite another DBAPI, e.g. postgres, later with a minimum effort) B) somebody has already implemented the tricky bits, such as ORMs (see below), failover, connection pooling (see below) and other DB-specific features
Since almost all data needed by the program will be stored in this db, my thought is that I should create a connection to this db shortly after program startup and keep this connection open until program closure.
That is one possible approach. But, consider using a "connection pooling" technique that somebody else has already implemented and tested. This saves your time for working on the logic of your program.
There are many different pooling strategies, which include things like "Use only one connection at a time." or "Connect on demand." or "Hold a bunch of connections open and let me use one when I need one, and I'll release it when I'm done." and even "When the connection fails, retry quietly in the background until a successful connection can be re-established."
I am assuming that opening and closing a db connection has enough overhead that I should only do this once. But I do not *know* that this is true. Is it? If not, then the alternative would make more sense, i.e., open and close the db as needed.
Measure, measure, measure. Profile it before coming to such a conclusion. You may be correct, but, it behooves you to measure. (My take on an old computing adage: Premature optimization can lead you down unnecessarily painful or time consuming paths.)
N.B. Only you (or your development cohort) can anticipate the load on the DB, the growth of records (i.e. data set size), the growth of the complexity of the project, or the user count. So, even if the measurements tell you one thing, be sure to consider the longer-term plan for the data and application.
Also, see Steven D'Aprano's comments about concurrency and other ACIDic concerns.
In the first iteration of my project, my intent is to create and populate the db with tables external to the program. The program will only add entries to tables, query the db, etc. That is, the structure of the db will be pre-set outside of the program, and the program will only deal with data interactions with the db.
If the structure of the DB is determined outside the program, this sounds like a great reason to use an Object Relational Modeler (ORM). An ORM which supports reflection (sqlalchemy does) can create Pythonic objects for you.
My intent is to make the overall design of the program OO, but I am wondering how to handle the db manager module. Should I go OO here as well? With each pertinent method handling a very specific means of interacting with the db? Or go a procedural route with functions similar to the aforementioned methods? It is not clear to me that OOP provides a real benefit here, but, then again, I am learning how to OOP during this project as well, so I don't have enough knowledge yet to realistically answer this question.
I'm not sure I can weigh in intelligently here (OOP v. procedural), but I'd guess that you could get that Object-Oriented feel by taking advantage of an ORM, rather than writing one yourself. Getting used to the idea of an ORM can be tricky, but if you can get reflection working [1], I think you will be surprised at how quickly your application logic (at the business layer) comes together and you can (mostly) stop worrying about things like connection logic and SQL statements executing from your Python program [2].
There probably are a few people on this list who have used sqlalchemy and are competent to answer it, but if you have questions specifically about sqlalchemy, you might find better answers on their mailing list [3].
Now, back to the beginnings...a SQLite DB is a fine place to start if you have only one thread/user/program accessing the data at any time. Don't host it on a network(ed) file system if you have the choice. If your application grows so much in usage or volume that it needs a new and different DB, consider it all a success and migrate accordingly.
Best of luck, -Martin [0] http://www.sqlalchemy.org/ [1] http://docs.sqlalchemy.org/en/rel_1_0/core/reflection.html [2] Here, naturally, I'm assuming that you know your way around SQL, since you are asserting that the DB already exists, is maintained and designed outside of the Python program. [3] https://groups.google.com/forum/#!forum/sqlalchemy -- Martin A. Brown http://linux-ip.net/ _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor