Re: What is the recommended python module for SQL database access?
Chris Angelico wrote: Broad recommendation: Single application, tiny workload, concurrency not an issue, simplicity desired? Go SQLite. Big complex job, need performance, lots of things reading and writing at once, want networked access? Go PGSQL. And don't go MySQL if PG is an option. And definitely don't go for a non-free option (MS-SQL, DB2, etc) unless you've looked into it really closely and you are absolutely thoroughly *sure* that you need that system (which probably means you need your app to integrate with someone else's, and that other app demands one particular database). I agree 100% with this. And speaking as an ex Oracle and DB2 DBA - not to mention MS-SQL (spit), with which I occasionally had to dabble, avoid them like the plague unless circumstances dictate. -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Tuesday, February 11, 2014 4:57:30 AM UTC+2, Walter Hurry wrote: Chris Angelico wrote: And definitely don't go for a non-free option (MS-SQL, DB2, etc) unless you've looked into it really closely and you are absolutely thoroughly *sure* that you need that system (which probably means you need your app to integrate with someone else's, and that other app demands one particular database). I agree 100% with this. And speaking as an ex Oracle and DB2 DBA - not to mention MS-SQL (spit), with which I occasionally had to dabble, avoid them like the plague unless circumstances dictate. What is about clustering? Do we have such option for free alternatives? Thanks -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Tue, Feb 11, 2014 at 1:57 PM, Walter Hurry walterhu...@gmail.com wrote: Chris Angelico wrote: Broad recommendation: Single application, tiny workload, concurrency not an issue, simplicity desired? Go SQLite. Big complex job, need performance, lots of things reading and writing at once, want networked access? Go PGSQL. And don't go MySQL if PG is an option. And definitely don't go for a non-free option (MS-SQL, DB2, etc) unless you've looked into it really closely and you are absolutely thoroughly *sure* that you need that system (which probably means you need your app to integrate with someone else's, and that other app demands one particular database). I agree 100% with this. And speaking as an ex Oracle and DB2 DBA - not to mention MS-SQL (spit), with which I occasionally had to dabble, avoid them like the plague unless circumstances dictate. I can't speak for Oracle as I've never used it, but DB2 is not at all a bad product. In fact, it's excellent. I'm not sorry to have spent a couple of decades using it; it's part of what taught me to assume transactions everywhere, for instance (you don't BEGIN TRANSACTION, you simply are in one - after you COMMIT, another transaction is automatically opened (at next query, I think), so you have to explicitly COMMIT everything), and its networking support is excellent. (Also, back in the 1990s, PostgreSQL wasn't nearly as easy to use as it is now.) But it's non-free, and that makes a HUGE difference when you start deploying servers. You have to count up how many boxes you're using, and then factor in the number of test machines you have too. Licenses for your OS, database, etc, etc, all add up pretty quickly. When there are no license fees whatsoever, life's easy - I can create myself a Debian Linux VM image, install all our stuff on it, and then clone it a whole bunch of times to try different things; doing that with Windows or DB2 or anything pay-for is a lot less convenient (or even straight-up illegal, depending on the license terms). That's a pretty huge downside, and I've yet to use any pay-for database engine or operating system that outdoes that. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Tue, Feb 11, 2014 at 2:02 PM, Asaf Las roeg...@gmail.com wrote: On Tuesday, February 11, 2014 4:57:30 AM UTC+2, Walter Hurry wrote: Chris Angelico wrote: And definitely don't go for a non-free option (MS-SQL, DB2, etc) unless you've looked into it really closely and you are absolutely thoroughly *sure* that you need that system (which probably means you need your app to integrate with someone else's, and that other app demands one particular database). I agree 100% with this. And speaking as an ex Oracle and DB2 DBA - not to mention MS-SQL (spit), with which I occasionally had to dabble, avoid them like the plague unless circumstances dictate. What is about clustering? Do we have such option for free alternatives? Thanks PostgreSQL has replication in-built now, which will do most forms of clustering. With some third-party software like Slony (also free), you can do even more (including replicating between different PostgreSQL versions, so you can upgrade progressively without any downtime; PG's internal replication has tight restrictions on that). I've used PG's streaming replication to fairly good effect. You do need some kind of system to decide when to promote a slave to master, though - my boss had this weird idea that each node had to be a perfect peer with no external authority [1], which led to unsolvable problems, but if you have an external system that declares which of several slaves should be promoted, it's pretty easy to do. I could whip you up a proof-of-concept in an hour, probably; just needs a heartbeat script and some way of signalling them to fail over to the new master. Clustering for performance, as opposed to reliability, is a bit trickier. You can do read-only queries on slaves (so if you have a many-readers-few-writers model, this can work nicely), but otherwise, you probably need some third-party middleware. I haven't looked into that side of things. Ultimately your biggest bottleneck is going to be locking, which fundamentally has to be done in one place... or else you have to deal with merge conflicts (the bane of true multi-master replication). So, it all depends on what you need to accomplish, and how much work you're willing to do. Postgres offers a particular set of primitives (including replication, promotion of a slave to master, etc), and lets you trigger things from scripts (execute pg_ctl promote to make this node become master). Advanced logic can be done by writing a Python script that edits config files, runs programs, sends Unix signals, whatever. There are pay-for Postgres support companies, too, if you need that sort of thing. tl;dr: Yeah, you can do that too. :) ChrisA [1] He had a weird issue with the concept of authority, actually. I think his dislike of any form of government polluted his thinking so he wouldn't accept even the IT sense of the word authority. Never mind that that's the best way to solve a lot of problems. But I digress. -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Tuesday, February 11, 2014 5:31:35 AM UTC+2, Chris Angelico wrote: On Tue, Feb 11, 2014 at 2:02 PM, Asaf Las r...@gmail.com wrote: On Tuesday, February 11, 2014 4:57:30 AM UTC+2, Walter Hurry wrote: Chris Angelico wrote: And definitely don't go for a non-free option (MS-SQL, DB2, etc) unless you've looked into it really closely and you are absolutely thoroughly *sure* that you need that system (which probably means you need your app to integrate with someone else's, and that other app demands one particular database). I agree 100% with this. And speaking as an ex Oracle and DB2 DBA - not to mention MS-SQL (spit), with which I occasionally had to dabble, avoid them like the plague unless circumstances dictate. What is about clustering? Do we have such option for free alternatives? Thanks PostgreSQL has replication in-built now, which will do most forms of clustering. With some third-party software like Slony (also free), you can do even more (including replicating between different PostgreSQL versions, so you can upgrade progressively without any downtime; PG's internal replication has tight restrictions on that). I've used PG's streaming replication to fairly good effect. You do need some kind of system to decide when to promote a slave to master, though - my boss had this weird idea that each node had to be a perfect peer with no external authority [1], which led to unsolvable problems, but if you have an external system that declares which of several slaves should be promoted, it's pretty easy to do. I could whip you up a proof-of-concept in an hour, probably; just needs a heartbeat script and some way of signalling them to fail over to the new master. Clustering for performance, as opposed to reliability, is a bit trickier. You can do read-only queries on slaves (so if you have a many-readers-few-writers model, this can work nicely), but otherwise, you probably need some third-party middleware. I haven't looked into that side of things. Ultimately your biggest bottleneck is going to be locking, which fundamentally has to be done in one place... or else you have to deal with merge conflicts (the bane of true multi-master replication). So, it all depends on what you need to accomplish, and how much work you're willing to do. Postgres offers a particular set of primitives (including replication, promotion of a slave to master, etc), and lets you trigger things from scripts (execute pg_ctl promote to make this node become master). Advanced logic can be done by writing a Python script that edits config files, runs programs, sends Unix signals, whatever. There are pay-for Postgres support companies, too, if you need that sort of thing. tl;dr: Yeah, you can do that too. :) ChrisA [1] He had a weird issue with the concept of authority, actually. I think his dislike of any form of government polluted his thinking so he wouldn't accept even the IT sense of the word authority. Never mind that that's the best way to solve a lot of problems. But I digress. Chris, Thank You very much for your detailed answer Regards /Asaf -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
As Chris said, if your needs are simple, use SQLite back-end. It's probably already installed on your computer and Python has a nice interface to it in its standard library. [1] If you decide to use MySQL back-end instead, consider using PyMySQL [2]. It's compatible with both Python 2 and Python 3. Also, being written in pure Python, it's easier to install compared to MySQLdb. [1] http://docs.python.org/3/library/sqlite3.html#module-sqlite3 [2] https://pypi.python.org/pypi/PyMySQL 2014-02-08 6:55 GMT-02:00 Sam lightai...@gmail.com: Is MySQLdb the recommended python module for SQL database access? Are there other modules? What I want in a module is to be able to write readable and maintainable code. -- https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Sun, Feb 9, 2014 at 9:20 PM, Marcel Rodrigues marcel...@gmail.com wrote: As Chris said, if your needs are simple, use SQLite back-end. It's probably already installed on your computer and Python has a nice interface to it in its standard library. Already installed? I thought the point of SQLite3 being in the Python stdlib was that Python actually included the entire engine (that's why there's no, for instance, PostgreSQL client in the stdlib - because there's no server; I disagree with the reasoning, but it is consistent and valid), so you don't need _anything_ externally installed. In any case, SQLite is ideal for really simple databasing. Back in the 1990s, I had DB2, DB2, and DB2, for all my database work. I wanted a way to query a dictionary of English words using SQL, so I created a DB2 database and threw ~60K rows into a table. Massive overkill for a one-column table. These days, I could use SQLite (or more likely, just use grep on /usr/share/dict/words - grep does everything that I wanted SQL for, if you include piping from one grep into another), cutting the overhead down enormously. The biggest downside of SQLite3 is concurrency. I haven't dug into the exact details of the pager system and such, but it seems to be fairly coarse in its locking. Also, stuff gets a bit complicated when you do a single transaction involving multiple files. So if you have lots of processes writing to the same set of SQLite tables, you'll see pretty poor performance. PostgreSQL handles that situation far better, but has a lot more overhead, so it's a poor choice for a single simple application. MySQL's locking/concurrency system is specifically optimized for a model that's common for web applications: a huge number of readers and a tiny number of writers (sometimes referred to as Data Warehousing, because you basically stuff a warehouse full of data and then everyone comes looking for it). For the write-heavy model (sometimes called OLTP or On-Line Transaction Processing), PostgreSQL will hugely outperform MySQL, thanks to its MVCC model. Broad recommendation: Single application, tiny workload, concurrency not an issue, simplicity desired? Go SQLite. Big complex job, need performance, lots of things reading and writing at once, want networked access? Go PGSQL. And don't go MySQL if PG is an option. And definitely don't go for a non-free option (MS-SQL, DB2, etc) unless you've looked into it really closely and you are absolutely thoroughly *sure* that you need that system (which probably means you need your app to integrate with someone else's, and that other app demands one particular database). ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
I just checked in the Python sources and apparently you're right about SQLite3. The Python distribution includes pysqlite which seems to be a self-contained SQLite engine. No external dependencies. Sorry for the confusion. 2014-02-09 9:00 GMT-02:00 Chris Angelico ros...@gmail.com: On Sun, Feb 9, 2014 at 9:20 PM, Marcel Rodrigues marcel...@gmail.com wrote: As Chris said, if your needs are simple, use SQLite back-end. It's probably already installed on your computer and Python has a nice interface to it in its standard library. Already installed? I thought the point of SQLite3 being in the Python stdlib was that Python actually included the entire engine (that's why there's no, for instance, PostgreSQL client in the stdlib - because there's no server; I disagree with the reasoning, but it is consistent and valid), so you don't need _anything_ externally installed. In any case, SQLite is ideal for really simple databasing. Back in the 1990s, I had DB2, DB2, and DB2, for all my database work. I wanted a way to query a dictionary of English words using SQL, so I created a DB2 database and threw ~60K rows into a table. Massive overkill for a one-column table. These days, I could use SQLite (or more likely, just use grep on /usr/share/dict/words - grep does everything that I wanted SQL for, if you include piping from one grep into another), cutting the overhead down enormously. The biggest downside of SQLite3 is concurrency. I haven't dug into the exact details of the pager system and such, but it seems to be fairly coarse in its locking. Also, stuff gets a bit complicated when you do a single transaction involving multiple files. So if you have lots of processes writing to the same set of SQLite tables, you'll see pretty poor performance. PostgreSQL handles that situation far better, but has a lot more overhead, so it's a poor choice for a single simple application. MySQL's locking/concurrency system is specifically optimized for a model that's common for web applications: a huge number of readers and a tiny number of writers (sometimes referred to as Data Warehousing, because you basically stuff a warehouse full of data and then everyone comes looking for it). For the write-heavy model (sometimes called OLTP or On-Line Transaction Processing), PostgreSQL will hugely outperform MySQL, thanks to its MVCC model. Broad recommendation: Single application, tiny workload, concurrency not an issue, simplicity desired? Go SQLite. Big complex job, need performance, lots of things reading and writing at once, want networked access? Go PGSQL. And don't go MySQL if PG is an option. And definitely don't go for a non-free option (MS-SQL, DB2, etc) unless you've looked into it really closely and you are absolutely thoroughly *sure* that you need that system (which probably means you need your app to integrate with someone else's, and that other app demands one particular database). ChrisA -- https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Sun, Feb 9, 2014 at 11:04 PM, Marcel Rodrigues marcel...@gmail.com wrote: I just checked in the Python sources and apparently you're right about SQLite3. The Python distribution includes pysqlite which seems to be a self-contained SQLite engine. No external dependencies. Sorry for the confusion. Comes to the same thing, anyhow. If anything, that strengthens your original point: it's easy, it's right there, you can give it a go. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Sunday, February 9, 2014 1:00:58 PM UTC+2, Chris Angelico wrote: The biggest downside of SQLite3 is concurrency. I haven't dug into the exact details of the pager system and such, but it seems to be fairly coarse in its locking. Also, stuff gets a bit complicated when you do a single transaction involving multiple files. So if you have lots of processes writing to the same set of SQLite tables, you'll see pretty poor performance. ChrisA Hi Chris i simply tested running 2 independent processes started at same time in parallel towards same sqlite database and never get 2 in that row though used exclusive lock on DB. might be i did something wrong. p.s. threading locks don't do anything in this example. import sqlite3 import threading #from threading import Lock # Constants CNT_DB_NAME = messageid.db class MessageId: ''' -- Following must be done in advance: - Create DB sqlite3 sqlite_msgd.db - Create table CREATE TABLE msgid (id INTEGER PRIMARY KEY, val INTEGER); - Inser def INSERT INTO msgid VALUES (0, 0); --''' def __init__(self, dflag = False, dbname = 'messageid.db'): #print(type(self)) #print(id(self)) self._debug = dflag self._lock = threading.Lock() self._dbname = dbname if self._debug: print(MessageId.__init__(dbname = {0}).format(dbname)) def get_msgid(self): ''' -- - Acquire lock - Connect to database - Select current value SELECT val FROM msgid WHERE id = 0; - Increment current value - Insert a row of data UPDATE msgid SET val = 1 WHERE id = new_val; --''' self._lock.acquire(True, 1) conn = sqlite3.connect('example.db', 10.0, True, EXCLUSIVE) c = conn.cursor() c.execute(SELECT val FROM msgid WHERE id = 0) tint = int(c.fetchone()[0]) + 1 c.execute(UPDATE msgid SET val={0} WHERE id = 0.format(tint)) conn.commit() conn.close() self._lock.release() if self._debug: print(MessageId.get_msgid() = , tint) return tint tclass = MessageId() for k in range(1): tclass.get_msgid() -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Sun, Feb 9, 2014 at 11:47 PM, Asaf Las roeg...@gmail.com wrote: i simply tested running 2 independent processes started at same time in parallel towards same sqlite database and never get 2 in that row though used exclusive lock on DB. might be i did something wrong. The threading locks aren't doing anything, because you don't have multiple threads here; what you need is SQLite locks, which you'll already have. I don't know why it wouldn't work. Unfortunately I wasn't able to test your code directly - SQLite complained that the table didn't exist, despite my having created it (at least, so I thought) in interactive Python. So probably I mucked something up there. Someone who actually knows SQLite might be better able to explain. A few points, though. c.execute(UPDATE msgid SET val={0} WHERE id = 0.format(tint)) Don't do this; instead, let c.execute() do the interpolation, by giving it extra args rather than using .format(). It makes no difference when you're working with integers, but with strings, it makes the difference between safe and not-safe (or easy and unnecessarily fiddly, if you actually take the time to get your escaping right). Also, you're connecting and disconnecting repeatedly... oh, I see why it didn't work when I tried. You're also using two completely different database names: 'messageid.db' which is named in a constant and in the default argument, and 'example.db' which is what you actually use. Should have a single source of truth, otherwise you confuse the people who might otherwise be able to test your code :) Anyway. This code is extremely inefficient: conn = sqlite3.connect('example.db', 10.0, True, EXCLUSIVE) c = conn.cursor() c.execute(SELECT val FROM msgid WHERE id = 0) tint = int(c.fetchone()[0]) + 1 c.execute(UPDATE msgid SET val={0} WHERE id = 0.format(tint)) conn.commit() conn.close() Much more common would be to retain a connection and repeatedly perform queries. Then you won't need to open it EXCLUSIVE, and you can simply query, update, and then commit (all your locks should be released at the commit). Do that, and you should see at least reasonable performance, plus everything should work correctly. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Sunday, February 9, 2014 3:14:50 PM UTC+2, Chris Angelico wrote: On Sun, Feb 9, 2014 at 11:47 PM, Asaf Las r...@gmail.com wrote: Thanks Also, you're connecting and disconnecting repeatedly... oh, I see why it didn't work when I tried. You're also using two completely different database names: 'messageid.db' which is named in a constant and in the default argument, and 'example.db' which is what you actually use. Should have a single source of truth, otherwise you confuse the people who might otherwise be able to test your code :) my apologies , it was deep night, when i got disappointed and forget to update names so left as it is and did not check when posted :-) Anyway. This code is extremely inefficient: conn = sqlite3.connect('example.db', 10.0, True, EXCLUSIVE) c = conn.cursor() c.execute(SELECT val FROM msgid WHERE id = 0) tint = int(c.fetchone()[0]) + 1 c.execute(UPDATE msgid SET val={0} WHERE id = 0.format(tint)) conn.commit() conn.close() Much more common would be to retain a connection and repeatedly perform queries. Then you won't need to open it EXCLUSIVE, and you can simply query, update, and then commit (all your locks should be released at the commit). Do that, and you should see at least reasonable performance, plus everything should work correctly. ChrisA i did it just to test sqlite3 behavior and actually test was related to simulation of unique incremental sequence number/counter for independently spawned tasks accessing counter in non deterministic manner. /Asaf -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Mon, Feb 10, 2014 at 12:27 AM, Asaf Las roeg...@gmail.com wrote: i did it just to test sqlite3 behavior and actually test was related to simulation of unique incremental sequence number/counter for independently spawned tasks accessing counter in non deterministic manner. Sure. I would expect that you'd get steadily increasing sequence IDs, but that they might be a major bottleneck. SQLite is (far as I can tell, at least - haven't personally tested it) quite solid with its locking; at the expense of performance, but carefully reliable. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On 2014-02-09 22:00, Chris Angelico wrote: On Sun, Feb 9, 2014 at 9:20 PM, Marcel Rodrigues marcel...@gmail.com wrote: As Chris said, if your needs are simple, use SQLite back-end. It's probably already installed on your computer and Python has a nice interface to it in its standard library. Already installed? I thought the point of SQLite3 being in the Python stdlib was that Python actually included the entire engine It's been a part of the stdlib since 2.5 (so those of us that maintain a bit of 2.4 code still in the wild had to add-on that module) -tkc -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Mon, Feb 10, 2014 at 2:40 AM, Dennis Lee Bieber wlfr...@ix.netcom.com wrote: Any opinion on Firebird? Just curiosity given how often the advice seems to be start with SQLite, avoid MySQL, end with PostgreSQL No, because I've never used it. Has anyone here? What are its strengths and weaknesses? ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: What is the recommended python module for SQL database access?
On Sat, Feb 8, 2014 at 7:55 PM, Sam lightai...@gmail.com wrote: Is MySQLdb the recommended python module for SQL database access? Are there other modules? What I want in a module is to be able to write readable and maintainable code. As long as you use some module that speaks the Python Database API (PEP 249, if a quick Google search has given me the right number), use whatever talks to the database back-end you want - all your code will be basically the same. Are you starting a completely new project and creating its database? Go with SQLite or PostgreSQL (the former if your needs are simple, the latter if you want a full-featured database engine); both are open source and excellent. Are you connecting to an existing database? Use that database engine, obviously :) ChrisA -- https://mail.python.org/mailman/listinfo/python-list