> Dear All,
> I'm trying to read ten 200 MB textfiles into a MySQL MyISAM database
> (Linux, ext4). The script output suddenly stops, while the Python
> process is still running (or should I say sleeping?). It doesn't show up
> in top, but it is visible in ps.
> Why is it stopping? Is there a way to make it continue, without calling
> "kill -9", deleting the processed lines and starting it again?
> Thank you in advance.
>  http://pastebin.com/CxHCA9eB
> import MySQLdb, pprint, re
> db = None
> daten = "/home/chris/temp/data/data/"
> host = "localhost"
> user = "data"
> passwd = "data"
> database = "data"
> table = "data"
> def connect_mysql():
>     global db, host, user, passwd, database
>     db = MySQLdb.connect(host, user, passwd, database)
> def read_file(srcfile):
>     lines = []
>     f = open(srcfile, 'r')
>     while True:
>         line = f.readline()
>         #print line
>         if len(line) == 0:
The read_file() function looks suspicious. It uses a roundabout way to read
the whole file into memory. Maybe your system is just swapping?
Throw read_file() away and instead iterate over the file directly (see
below).
> def write_db(anonid, query, querytime, itemrank, clickurl):
>     global db, table
>     print "write_db called."
>     cur = db.cursor()
>     cur.execute("""INSERT INTO data
> (anonid,query,querytime,itemrank,clickurl) VALUES (%s,%s,%s,%s,%s)""",
> def split_line(line):
>     print "split_line called."
>     print "line is:", line
>     searchObj = re.split(r'(\d*)\t(.*)\t([0-9: -]+)\t(\d*)\t([A-Za-z0-9._:/ -]*)',line, re.I|re.U)
> db = connect_mysql()
with open(daten + "test-07b.txt") as lines:
    for line in lines:
        result = split_line(line)
        write_db(result[1], result[2], result[3], result[4], result[5])
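
If you want the parsing and the streaming in one place, here is a minimal sketch of a generator that hands out one record at a time, so only a single line is ever held in memory. It assumes the five fields are tab-separated in the order anonid, query, querytime, itemrank, clickurl (which is what your regex and INSERT suggest, but I have not seen the data). iter_records() is a name I made up, and silently skipping malformed lines is my assumption, not something your script does.

```python
def iter_records(path):
    """Yield one 5-tuple of strings per well-formed line of the file.

    The file object is iterated lazily, so memory use stays constant
    no matter how large the file is.
    """
    with open(path) as lines:
        for line in lines:
            fields = line.rstrip("\n").split("\t")
            # Assumption: a valid line has exactly five tab-separated fields;
            # anything else is skipped.
            if len(fields) == 5:
                yield tuple(fields)
```

You would then loop over iter_records(daten + "test-07b.txt") and pass each tuple on to your insert function, e.g. write_db(*record).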
- A bare except is evil. You lose valuable information.
- A 'global' statement is only needed to rebind a module-global variable,
not to access such a variable. At first glance all your 'global'
declarations seem superfluous.
- You could change the signature of write_db() to accept result[1:6].
- Do you really need a new cursor for every write? Keep one around and
  reuse it.
- You might try cur.executemany() to speed things up a bit.
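
To make the points above a bit more concrete, here is a rough sketch, not a drop-in replacement: the connection is passed in as an argument instead of using 'global', one cursor is reused for the whole run, rows are written with executemany() in batches, and only the database error is caught. I'm using the standard-library sqlite3 module purely so the example runs without a MySQL server; with MySQLdb the placeholders would be %s instead of ?, and you would catch MySQLdb.Error. insert_records() and batch_size are hypothetical names, and the batch size of 1000 is an arbitrary guess.

```python
import sqlite3
from itertools import islice

# sqlite3 uses "?" placeholders; MySQLdb would use "%s" here.
INSERT_SQL = ("INSERT INTO data (anonid, query, querytime, itemrank, clickurl) "
              "VALUES (?, ?, ?, ?, ?)")


def insert_records(db, records, batch_size=1000):
    """Insert an iterable of 5-tuples with a single cursor.

    Returns the number of rows handed to the database.
    """
    cur = db.cursor()            # one cursor for the whole run
    records = iter(records)
    total = 0
    try:
        while True:
            # Pull at most batch_size rows without loading everything.
            batch = list(islice(records, batch_size))
            if not batch:
                break
            cur.executemany(INSERT_SQL, batch)
            db.commit()          # commit once per batch
            total += len(batch)
    except sqlite3.Error as exc:  # a specific exception, not a bare except
        db.rollback()
        print("insert failed after %d rows: %s" % (total, exc))
        raise                    # keep the traceback instead of hiding it
    finally:
        cur.close()
    return total
```

The records argument can be any iterable of 5-tuples, for instance a generator over the lines of your file, so nothing forces the whole file into memory.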