(Somewhat related to Re: [sqlite] Safe to use SQLite over a sketchy network?)

Dear SQLiters,

I am using SQLite over the GPFS distributed file system. I was told it honestly implements file locking, and I have never experienced corruption. But it is slow, in the sense that when many jobs from many compute nodes try to access the same database, things slow down considerably.

I suspect that, from the point of view of the file system, there is a lot of pressure to develop fast grabbing of a lock and slow release. I think this is because the key to a fast network file system in general is making it as independent as possible, thus distributed: avoid bottlenecks. But locking is by definition a bottleneck, on purpose.

I think code that requires file locking is a sign that the code is not intended for concurrent access from multiple compute nodes. SQLite uses file locking to ensure data integrity. This is fine for embedded systems. We use SQLite over 100 compute nodes, not as intended.

To try to speed up locking, I combined SQLite with FLoM, a distributed file lock manager. It is a client/server application. From experience, it seems that because SQLite still requests file locks, the performance increase is not that big. I wonder if there is a way to disable SQLite's internal file locking mechanism. I know this seems strange to ask, but FLoM should be able to do it faster over many compute nodes.

Or perhaps the right way is for me to combine SQLite with simple client/server code to create a light MySQL, mySQLite?

Roman

________________________________________
From: sqlite-users [sqlite-users-boun...@mailinglists.sqlite.org] on behalf of Simon Slavin [slav...@bigfraud.org]
Sent: Wednesday, September 25, 2019 12:58 AM
To: SQLite mailing list
Subject: Re: [sqlite] Safe to use SQLite over a sketchy network?

When I first learned that SQLite had problems with network file systems, I read a ton of stuff to learn why there doesn't seem to be a network file system that implements locking properly. I ended up with …

A) It slows access a lot.
Even with clever hashing to check for collisions it takes time to figure out whether your range is already locked.

B) Different use-cases have different preferences for retry-and-timeout times. It's one more thing for admins to configure and many admins don't get it right.

C) It's hard to debug. There are numerous different orders in which different clients can lock and unlock ranges. You have to run a random simulator to try them all. The logic to deal with them properly is not as simple as you'd think. Consider, for example, ranges which are not identical but do overlap.

D) It's mostly a waste of time. Most client software doesn't care how to deal with a BUSY status and either crashes – which annoys the admin and user – or retries immediately – which makes the management CPU hot. After all, most client software just wants to read a whole file or write a whole file. And if two people save the same word processing document at nearly the same time, who's to say who was first?

Still, I wonder why someone working on a Linux network file system, or APFS, or ZFS, hasn't done it.

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
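
On the question above about disabling SQLite's internal file locking: SQLite's URI filename syntax has a `nolock=1` query parameter that turns off file locking in rollback-journal modes. The following is only a minimal sketch of how that might be combined with an external lock manager. The `external_lock()` helper is a hypothetical stand-in (an in-process `threading.Lock`), not FLoM's real interface, and the function names are invented for illustration. With `nolock=1`, nothing inside SQLite protects against concurrent writers, so the database can be corrupted unless every process really does take the external lock first. Whether this is any faster on GPFS is untested here.

```python
import contextlib
import os
import pathlib
import sqlite3
import tempfile
import threading

# Hypothetical stand-in for an external lock manager. A real deployment
# would acquire a cluster-wide lock here instead of an in-process one.
_stand_in_lock = threading.Lock()


@contextlib.contextmanager
def external_lock():
    with _stand_in_lock:
        yield


def open_without_file_locks(path):
    """Open a database with SQLite's own file locking disabled (nolock=1)."""
    uri = pathlib.Path(path).resolve().as_uri() + "?nolock=1"
    return sqlite3.connect(uri, uri=True)


def write_row(conn, value):
    """Do every write while holding the external lock, never outside it."""
    with external_lock():
        with conn:  # commits on success, rolls back on error
            conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (?)", (value,))


def demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = open_without_file_locks(path)
    write_row(conn, 42)
    return conn.execute("SELECT x FROM t").fetchall()
```

Reads would need the same external lock (or a shared variant of it) whenever another node might be writing at the same time.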
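
Point D above mentions clients that either crash on a BUSY status or retry immediately. As a rough illustration of a third option, here is a minimal sketch of retrying with exponential backoff and jitter, using Python's standard `sqlite3` module. The function name, retry count, and delays are assumptions chosen for the example, not recommended values for GPFS or any other file system.

```python
import os
import random
import sqlite3
import tempfile
import time


def execute_with_backoff(conn, sql, params=(), retries=5, base_delay=0.05):
    """Run one statement, backing off and retrying while SQLite reports BUSY."""
    for attempt in range(retries + 1):
        try:
            with conn:  # commits on success, rolls back on error
                return conn.execute(sql, params).fetchall()
        except sqlite3.OperationalError as exc:
            # Python reports SQLITE_BUSY as "database is locked".
            if "locked" not in str(exc) or attempt == retries:
                raise
            # Jitter keeps many clients from retrying in lockstep.
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))


def demo():
    """Two connections to one file; the writer backs off while a lock is held."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    holder = sqlite3.connect(path, timeout=0, isolation_level=None)
    writer = sqlite3.connect(path, timeout=0)
    holder.execute("CREATE TABLE t (x INTEGER)")
    holder.execute("BEGIN EXCLUSIVE")  # simulate a competing job
    try:
        execute_with_backoff(writer, "INSERT INTO t VALUES (1)",
                             retries=2, base_delay=0.001)
        gave_up = False
    except sqlite3.OperationalError:
        gave_up = True  # still locked after all retries
    holder.execute("COMMIT")  # the competing job finishes
    execute_with_backoff(writer, "INSERT INTO t VALUES (2)")
    return gave_up, execute_with_backoff(writer, "SELECT x FROM t")
```

The `timeout=0` arguments are there only so the demo fails fast; in normal use the `timeout` argument of `sqlite3.connect` already makes SQLite wait for a lock before reporting BUSY.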