Re: What is the recommended python module for SQL database access?

2014-02-10 Thread Walter Hurry
Chris Angelico wrote:

 Broad recommendation: Single application, tiny workload, concurrency
 not an issue, simplicity desired? Go SQLite. Big complex job, need
 performance, lots of things reading and writing at once, want
 networked access? Go PGSQL. And don't go MySQL if PG is an option.

 And definitely don't go for a non-free option (MS-SQL, DB2, etc)
 unless you've looked into it really closely and you are absolutely
 thoroughly *sure* that you need that system (which probably means you
 need your app to integrate with someone else's, and that other app
 demands one particular database).

I agree 100% with this. And speaking as an ex Oracle and DB2 DBA - not to 
mention MS-SQL (spit), with which I occasionally had to dabble, avoid them like 
the plague unless circumstances dictate.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-10 Thread Asaf Las
On Tuesday, February 11, 2014 4:57:30 AM UTC+2, Walter Hurry wrote:
 Chris Angelico wrote:
 
  And definitely don't go for a non-free option (MS-SQL, DB2, etc)
  unless you've looked into it really closely and you are absolutely
  thoroughly *sure* that you need that system (which probably means you
  need your app to integrate with someone else's, and that other app
  demands one particular database).
 
 
 I agree 100% with this. And speaking as an ex Oracle and DB2 DBA - 
 not to mention MS-SQL (spit), with which I occasionally had to dabble, 
 avoid them like the plague unless circumstances dictate.

What is about clustering? Do we have such option for free alternatives?

Thanks
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-10 Thread Chris Angelico
On Tue, Feb 11, 2014 at 1:57 PM, Walter Hurry walterhu...@gmail.com wrote:
 Chris Angelico wrote:

 Broad recommendation: Single application, tiny workload, concurrency
 not an issue, simplicity desired? Go SQLite. Big complex job, need
 performance, lots of things reading and writing at once, want
 networked access? Go PGSQL. And don't go MySQL if PG is an option.

 And definitely don't go for a non-free option (MS-SQL, DB2, etc)
 unless you've looked into it really closely and you are absolutely
 thoroughly *sure* that you need that system (which probably means you
 need your app to integrate with someone else's, and that other app
 demands one particular database).

 I agree 100% with this. And speaking as an ex Oracle and DB2 DBA - not to 
 mention MS-SQL (spit), with which I occasionally had to dabble, avoid them 
 like the plague unless circumstances dictate.


I can't speak for Oracle as I've never used it, but DB2 is not at all
a bad product. In fact, it's excellent. I'm not sorry to have spent a
couple of decades using it; it's part of what taught me to assume
transactions everywhere, for instance (you don't BEGIN TRANSACTION,
you simply are in one - after you COMMIT, another transaction is
automatically opened (at next query, I think), so you have to
explicitly COMMIT everything), and its networking support is
excellent. (Also, back in the 1990s, PostgreSQL wasn't nearly as easy
to use as it is now.)

But it's non-free, and that makes a HUGE difference when you start
deploying servers. You have to count up how many boxes you're using,
and then factor in the number of test machines you have too. Licenses
for your OS, database, etc, etc, all add up pretty quickly. When there
are no license fees whatsoever, life's easy - I can create myself a
Debian Linux VM image, install all our stuff on it, and then clone it
a whole bunch of times to try different things; doing that with
Windows or DB2 or anything pay-for is a lot less convenient (or even
straight-up illegal, depending on the license terms). That's a pretty
huge downside, and I've yet to use any pay-for database engine or
operating system that outdoes that.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-10 Thread Chris Angelico
On Tue, Feb 11, 2014 at 2:02 PM, Asaf Las roeg...@gmail.com wrote:
 On Tuesday, February 11, 2014 4:57:30 AM UTC+2, Walter Hurry wrote:
 Chris Angelico wrote:
 
  And definitely don't go for a non-free option (MS-SQL, DB2, etc)
  unless you've looked into it really closely and you are absolutely
  thoroughly *sure* that you need that system (which probably means you
  need your app to integrate with someone else's, and that other app
  demands one particular database).
 

 I agree 100% with this. And speaking as an ex Oracle and DB2 DBA -
 not to mention MS-SQL (spit), with which I occasionally had to dabble,
 avoid them like the plague unless circumstances dictate.

 What is about clustering? Do we have such option for free alternatives?

 Thanks

PostgreSQL has replication in-built now, which will do most forms of
clustering. With some third-party software like Slony (also free), you
can do even more (including replicating between different PostgreSQL
versions, so you can upgrade progressively without any downtime; PG's
internal replication has tight restrictions on that). I've used PG's
streaming replication to fairly good effect. You do need some kind of
system to decide when to promote a slave to master, though - my boss
had this weird idea that each node had to be a perfect peer with no
external authority [1], which led to unsolvable problems, but if you
have an external system that declares which of several slaves should
be promoted, it's pretty easy to do. I could whip you up a
proof-of-concept in an hour, probably; just needs a heartbeat script
and some way of signalling them to fail over to the new master.

Clustering for performance, as opposed to reliability, is a bit
trickier. You can do read-only queries on slaves (so if you have a
many-readers-few-writers model, this can work nicely), but otherwise,
you probably need some third-party middleware. I haven't looked into
that side of things. Ultimately your biggest bottleneck is going to be
locking, which fundamentally has to be done in one place... or else
you have to deal with merge conflicts (the bane of true multi-master
replication).

So, it all depends on what you need to accomplish, and how much work
you're willing to do. Postgres offers a particular set of primitives
(including replication, promotion of a slave to master, etc), and lets
you trigger things from scripts (execute pg_ctl promote to make this
node become master). Advanced logic can be done by writing a Python
script that edits config files, runs programs, sends Unix signals,
whatever. There are pay-for Postgres support companies, too, if you
need that sort of thing.

tl;dr: Yeah, you can do that too. :)

ChrisA

[1] He had a weird issue with the concept of authority, actually. I
think his dislike of any form of government polluted his thinking so
he wouldn't accept even the IT sense of the word authority. Never
mind that that's the best way to solve a lot of problems. But I
digress.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-10 Thread Asaf Las
On Tuesday, February 11, 2014 5:31:35 AM UTC+2, Chris Angelico wrote:
 On Tue, Feb 11, 2014 at 2:02 PM, Asaf Las r...@gmail.com wrote:
 
  On Tuesday, February 11, 2014 4:57:30 AM UTC+2, Walter Hurry wrote:
  Chris Angelico wrote:
  
   And definitely don't go for a non-free option (MS-SQL, DB2, etc)
   unless you've looked into it really closely and you are absolutely
   thoroughly *sure* that you need that system (which probably means you
   need your app to integrate with someone else's, and that other app
   demands one particular database).
  
 
  I agree 100% with this. And speaking as an ex Oracle and DB2 DBA -
  not to mention MS-SQL (spit), with which I occasionally had to dabble,
  avoid them like the plague unless circumstances dictate.
 
  What is about clustering? Do we have such option for free alternatives?
 
  Thanks
 
 PostgreSQL has replication in-built now, which will do most forms of
 clustering. With some third-party software like Slony (also free), you
 can do even more (including replicating between different PostgreSQL
 versions, so you can upgrade progressively without any downtime; PG's
 internal replication has tight restrictions on that). I've used PG's
 streaming replication to fairly good effect. You do need some kind of
 system to decide when to promote a slave to master, though - my boss
 had this weird idea that each node had to be a perfect peer with no
 external authority [1], which led to unsolvable problems, but if you
 have an external system that declares which of several slaves should
 be promoted, it's pretty easy to do. I could whip you up a
 proof-of-concept in an hour, probably; just needs a heartbeat script
 and some way of signalling them to fail over to the new master.
 
 Clustering for performance, as opposed to reliability, is a bit
 trickier. You can do read-only queries on slaves (so if you have a
 many-readers-few-writers model, this can work nicely), but otherwise,
 you probably need some third-party middleware. I haven't looked into
 that side of things. Ultimately your biggest bottleneck is going to be
 locking, which fundamentally has to be done in one place... or else
 you have to deal with merge conflicts (the bane of true multi-master
 replication).
 
 So, it all depends on what you need to accomplish, and how much work
 you're willing to do. Postgres offers a particular set of primitives
 (including replication, promotion of a slave to master, etc), and lets
 you trigger things from scripts (execute pg_ctl promote to make this
 node become master). Advanced logic can be done by writing a Python
 script that edits config files, runs programs, sends Unix signals,
 whatever. There are pay-for Postgres support companies, too, if you
 need that sort of thing.
 
 tl;dr: Yeah, you can do that too. :)
 
 ChrisA
 
 [1] He had a weird issue with the concept of authority, actually. I
 think his dislike of any form of government polluted his thinking so
 he wouldn't accept even the IT sense of the word authority. Never
 mind that that's the best way to solve a lot of problems. But I
 digress.

Chris, Thank You very much for your detailed answer 

Regards

/Asaf
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-09 Thread Marcel Rodrigues
As Chris said, if your needs are simple, use SQLite back-end. It's probably
already installed on your computer and Python has a nice interface to it in
its standard library. [1]

If you decide to use MySQL back-end instead, consider using PyMySQL [2].
It's compatible with both Python 2 and Python 3. Also, being written in
pure Python, it's easier to install compared to MySQLdb.

[1] http://docs.python.org/3/library/sqlite3.html#module-sqlite3
[2] https://pypi.python.org/pypi/PyMySQL


2014-02-08 6:55 GMT-02:00 Sam lightai...@gmail.com:

 Is MySQLdb the recommended python module for SQL database access? Are
 there other modules? What I want in a module is to be able to write
 readable and maintainable code.
 --
 https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-09 Thread Chris Angelico
On Sun, Feb 9, 2014 at 9:20 PM, Marcel Rodrigues marcel...@gmail.com wrote:
 As Chris said, if your needs are simple, use SQLite back-end. It's probably
 already installed on your computer and Python has a nice interface to it in
 its standard library.

Already installed? I thought the point of SQLite3 being in the Python
stdlib was that Python actually included the entire engine (that's why
there's no, for instance, PostgreSQL client in the stdlib - because
there's no server; I disagree with the reasoning, but it is consistent
and valid), so you don't need _anything_ externally installed.

In any case, SQLite is ideal for really simple databasing. Back in the
1990s, I had DB2, DB2, and DB2, for all my database work. I wanted a
way to query a dictionary of English words using SQL, so I created a
DB2 database and threw ~60K rows into a table. Massive overkill for a
one-column table. These days, I could use SQLite (or more likely, just
use grep on /usr/share/dict/words - grep does everything that I wanted
SQL for, if you include piping from one grep into another), cutting
the overhead down enormously.

The biggest downside of SQLite3 is concurrency. I haven't dug into the
exact details of the pager system and such, but it seems to be fairly
coarse in its locking. Also, stuff gets a bit complicated when you do
a single transaction involving multiple files. So if you have lots of
processes writing to the same set of SQLite tables, you'll see pretty
poor performance. PostgreSQL handles that situation far better, but
has a lot more overhead, so it's a poor choice for a single simple
application. MySQL's locking/concurrency system is specifically
optimized for a model that's common for web applications: a huge
number of readers and a tiny number of writers (sometimes referred to
as Data Warehousing, because you basically stuff a warehouse full of
data and then everyone comes looking for it). For the write-heavy
model (sometimes called OLTP or On-Line Transaction Processing),
PostgreSQL will hugely outperform MySQL, thanks to its MVCC model.

Broad recommendation: Single application, tiny workload, concurrency
not an issue, simplicity desired? Go SQLite. Big complex job, need
performance, lots of things reading and writing at once, want
networked access? Go PGSQL. And don't go MySQL if PG is an option.

And definitely don't go for a non-free option (MS-SQL, DB2, etc)
unless you've looked into it really closely and you are absolutely
thoroughly *sure* that you need that system (which probably means you
need your app to integrate with someone else's, and that other app
demands one particular database).

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-09 Thread Marcel Rodrigues
I just checked in the Python sources and apparently you're right about
SQLite3. The Python distribution includes pysqlite which seems to be a
self-contained SQLite engine. No external dependencies. Sorry for the
confusion.


2014-02-09 9:00 GMT-02:00 Chris Angelico ros...@gmail.com:

 On Sun, Feb 9, 2014 at 9:20 PM, Marcel Rodrigues marcel...@gmail.com
 wrote:
  As Chris said, if your needs are simple, use SQLite back-end. It's
 probably
  already installed on your computer and Python has a nice interface to it
 in
  its standard library.

 Already installed? I thought the point of SQLite3 being in the Python
 stdlib was that Python actually included the entire engine (that's why
 there's no, for instance, PostgreSQL client in the stdlib - because
 there's no server; I disagree with the reasoning, but it is consistent
 and valid), so you don't need _anything_ externally installed.

 In any case, SQLite is ideal for really simple databasing. Back in the
 1990s, I had DB2, DB2, and DB2, for all my database work. I wanted a
 way to query a dictionary of English words using SQL, so I created a
 DB2 database and threw ~60K rows into a table. Massive overkill for a
 one-column table. These days, I could use SQLite (or more likely, just
 use grep on /usr/share/dict/words - grep does everything that I wanted
 SQL for, if you include piping from one grep into another), cutting
 the overhead down enormously.

 The biggest downside of SQLite3 is concurrency. I haven't dug into the
 exact details of the pager system and such, but it seems to be fairly
 coarse in its locking. Also, stuff gets a bit complicated when you do
 a single transaction involving multiple files. So if you have lots of
 processes writing to the same set of SQLite tables, you'll see pretty
 poor performance. PostgreSQL handles that situation far better, but
 has a lot more overhead, so it's a poor choice for a single simple
 application. MySQL's locking/concurrency system is specifically
 optimized for a model that's common for web applications: a huge
 number of readers and a tiny number of writers (sometimes referred to
 as Data Warehousing, because you basically stuff a warehouse full of
 data and then everyone comes looking for it). For the write-heavy
 model (sometimes called OLTP or On-Line Transaction Processing),
 PostgreSQL will hugely outperform MySQL, thanks to its MVCC model.

 Broad recommendation: Single application, tiny workload, concurrency
 not an issue, simplicity desired? Go SQLite. Big complex job, need
 performance, lots of things reading and writing at once, want
 networked access? Go PGSQL. And don't go MySQL if PG is an option.

 And definitely don't go for a non-free option (MS-SQL, DB2, etc)
 unless you've looked into it really closely and you are absolutely
 thoroughly *sure* that you need that system (which probably means you
 need your app to integrate with someone else's, and that other app
 demands one particular database).

 ChrisA
 --
 https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-09 Thread Chris Angelico
On Sun, Feb 9, 2014 at 11:04 PM, Marcel Rodrigues marcel...@gmail.com wrote:
 I just checked in the Python sources and apparently you're right about
 SQLite3. The Python distribution includes pysqlite which seems to be a
 self-contained SQLite engine. No external dependencies. Sorry for the
 confusion.

Comes to the same thing, anyhow. If anything, that strengthens your
original point: it's easy, it's right there, you can give it a go.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-09 Thread Asaf Las
On Sunday, February 9, 2014 1:00:58 PM UTC+2, Chris Angelico wrote:
 The biggest downside of SQLite3 is concurrency. I haven't dug into the
 exact details of the pager system and such, but it seems to be fairly
 coarse in its locking. Also, stuff gets a bit complicated when you do
 a single transaction involving multiple files. So if you have lots of
 processes writing to the same set of SQLite tables, you'll see pretty
 poor performance.
 ChrisA

Hi Chris 

i simply tested running 2 independent processes started at same time in 
parallel towards same sqlite database and never get 2 in that row
though used exclusive lock on DB. might be i did something wrong.

p.s. threading locks don't do anything in this example.

import sqlite3
import threading
#from threading import Lock

# Constants 
CNT_DB_NAME = messageid.db

class MessageId:
''' 
--
Following must be done in advance:
- Create DB sqlite3 sqlite_msgd.db
- Create table  CREATE TABLE msgid (id INTEGER PRIMARY KEY, val 
INTEGER);
- Inser def INSERT INTO msgid VALUES (0, 0);

--'''
def __init__(self, dflag = False, dbname = 'messageid.db'):
#print(type(self))
#print(id(self))
self._debug = dflag
self._lock = threading.Lock()
self._dbname = dbname
if self._debug:
print(MessageId.__init__(dbname = {0}).format(dbname))

def get_msgid(self):
''' 
--
- Acquire lock
- Connect to database
- Select current value  SELECT val FROM msgid WHERE id = 0;
- Increment current value
- Insert a row of data  UPDATE msgid SET val = 1 WHERE id = 
new_val; 

--'''
self._lock.acquire(True, 1)
conn = sqlite3.connect('example.db', 10.0, True, EXCLUSIVE)
c = conn.cursor()
c.execute(SELECT val FROM msgid WHERE id = 0)
tint = int(c.fetchone()[0]) + 1
c.execute(UPDATE msgid SET val={0} WHERE id = 0.format(tint))
conn.commit()
conn.close()
self._lock.release()
if self._debug:
print(MessageId.get_msgid() = , tint)
return tint

tclass = MessageId()

for k in range(1):
tclass.get_msgid()

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-09 Thread Chris Angelico
On Sun, Feb 9, 2014 at 11:47 PM, Asaf Las roeg...@gmail.com wrote:
 i simply tested running 2 independent processes started at same time in
 parallel towards same sqlite database and never get 2 in that row
 though used exclusive lock on DB. might be i did something wrong.

The threading locks aren't doing anything, because you don't have
multiple threads here; what you need is SQLite locks, which you'll
already have.

I don't know why it wouldn't work. Unfortunately I wasn't able to test
your code directly - SQLite complained that the table didn't exist,
despite my having created it (at least, so I thought) in interactive
Python. So probably I mucked something up there. Someone who actually
knows SQLite might be better able to explain.

A few points, though.

 c.execute(UPDATE msgid SET val={0} WHERE id = 0.format(tint))

Don't do this; instead, let c.execute() do the interpolation, by
giving it extra args rather than using .format(). It makes no
difference when you're working with integers, but with strings, it
makes the difference between safe and not-safe (or easy and
unnecessarily fiddly, if you actually take the time to get your
escaping right).

Also, you're connecting and disconnecting repeatedly... oh, I see why
it didn't work when I tried. You're also using two completely
different database names: 'messageid.db' which is named in a constant
and in the default argument, and 'example.db' which is what you
actually use. Should have a single source of truth, otherwise you
confuse the people who might otherwise be able to test your code :)
Anyway. This code is extremely inefficient:

conn = sqlite3.connect('example.db', 10.0, True, EXCLUSIVE)
c = conn.cursor()
c.execute(SELECT val FROM msgid WHERE id = 0)
tint = int(c.fetchone()[0]) + 1
c.execute(UPDATE msgid SET val={0} WHERE id = 0.format(tint))
conn.commit()
conn.close()

Much more common would be to retain a connection and repeatedly
perform queries. Then you won't need to open it EXCLUSIVE, and you can
simply query, update, and then commit (all your locks should be
released at the commit). Do that, and you should see at least
reasonable performance, plus everything should work correctly.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-09 Thread Asaf Las
On Sunday, February 9, 2014 3:14:50 PM UTC+2, Chris Angelico wrote:
 On Sun, Feb 9, 2014 at 11:47 PM, Asaf Las r...@gmail.com wrote:
 

Thanks

 
 Also, you're connecting and disconnecting repeatedly... oh, I see why
 it didn't work when I tried. You're also using two completely
 different database names: 'messageid.db' which is named in a constant
 and in the default argument, and 'example.db' which is what you
 actually use. Should have a single source of truth, otherwise you
 confuse the people who might otherwise be able to test your code :)

my apologies , it was deep night, when i got disappointed and forget to 
update names so left as it is and did not check when posted :-) 

 Anyway. This code is extremely inefficient:
 
 conn = sqlite3.connect('example.db', 10.0, True, EXCLUSIVE)
 c = conn.cursor()
 c.execute(SELECT val FROM msgid WHERE id = 0)
 tint = int(c.fetchone()[0]) + 1
 c.execute(UPDATE msgid SET val={0} WHERE id = 0.format(tint))
 conn.commit()
 conn.close()
 
 Much more common would be to retain a connection and repeatedly
 perform queries. Then you won't need to open it EXCLUSIVE, and you can
 simply query, update, and then commit (all your locks should be
 released at the commit). Do that, and you should see at least
 reasonable performance, plus everything should work correctly.
 ChrisA

i did it just to test sqlite3 behavior and actually test was related to
simulation of  unique incremental sequence number/counter for 
independently spawned tasks accessing counter in non deterministic manner. 

/Asaf
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-09 Thread Chris Angelico
On Mon, Feb 10, 2014 at 12:27 AM, Asaf Las roeg...@gmail.com wrote:
 i did it just to test sqlite3 behavior and actually test was related to
 simulation of  unique incremental sequence number/counter for
 independently spawned tasks accessing counter in non deterministic manner.

Sure. I would expect that you'd get steadily increasing sequence IDs,
but that they might be a major bottleneck. SQLite is (far as I can
tell, at least - haven't personally tested it) quite solid with its
locking; at the expense of performance, but carefully reliable.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-09 Thread Tim Chase
On 2014-02-09 22:00, Chris Angelico wrote:
 On Sun, Feb 9, 2014 at 9:20 PM, Marcel Rodrigues
 marcel...@gmail.com wrote:
  As Chris said, if your needs are simple, use SQLite back-end.
  It's probably already installed on your computer and Python has a
  nice interface to it in its standard library.  
 
 Already installed? I thought the point of SQLite3 being in the
 Python stdlib was that Python actually included the entire engine 

It's been a part of the stdlib since 2.5 (so those of us that
maintain a bit of 2.4 code still in the wild had to add-on that
module)

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-09 Thread Chris Angelico
On Mon, Feb 10, 2014 at 2:40 AM, Dennis Lee Bieber
wlfr...@ix.netcom.com wrote:
 Any opinion on Firebird? Just curiosity given how often the advice
 seems to be start with SQLite, avoid MySQL, end with PostgreSQL

No, because I've never used it. Has anyone here? What are its
strengths and weaknesses?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the recommended python module for SQL database access?

2014-02-08 Thread Chris Angelico
On Sat, Feb 8, 2014 at 7:55 PM, Sam lightai...@gmail.com wrote:
 Is MySQLdb the recommended python module for SQL database access? Are there 
 other modules? What I want in a module is to be able to write readable and 
 maintainable code.


As long as you use some module that speaks the Python Database API
(PEP 249, if a quick Google search has given me the right number), use
whatever talks to the database back-end you want - all your code will
be basically the same. Are you starting a completely new project and
creating its database? Go with SQLite or PostgreSQL (the former if
your needs are simple, the latter if you want a full-featured database
engine); both are open source and excellent. Are you connecting to an
existing database? Use that database engine, obviously :)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list