On Mon, Sep 01, 2008 at 03:10:05PM -0400, [EMAIL PROTECTED] scratched on the wall: > Does anyone have any benchmarks to share that compare common SQLite > operations running under a 32 and 64 bit versions of SQLite? This > question is OS neutral so please feel to share your experience with 32 > and 64 bit versions of Windows or Linux. > > Background: Will 64 bit versions of SQLite buy us any significant > difference in performance? I may have a chance to get our department 64 > bit AMD workstations with 8G, but I need to justify the extra cost > against a reasonable guess at what the performance improvement may be > against an equivalent 32 bit AMD 4G workstation. These workstations will > be processing very large text based log files (> 16G each). We will be > using Python 2.52 as our SQLite scripting tool.
If money is limited, the first thing I'd be looking at is I/O systems and fast storage. It doesn't matter how fast your CPU and application are if they're sitting around waiting for the drives to process data. I'd be looking at the fastest drives you can find or, if funds are sufficient, a RAID systems with lots of drives (and lots of heads) to get as much parallel I/O as you can. This is a case when eight smaller drives is better than four larger ones, assuming all else is the same. If you're processing text based log files, I assume you're just scanning those top to bottom. As a result, cache size is fairly unimportant, as you're going to read everything once and then dump it. Caches systems only provide an advantage if you come back to the data later. As the file size grows, that all comes down to raw-I/O speeds. The memory situation is less demanding as memory-mapped I/O, double-/triple-buffered asyc I/O calls, or one of the other mass-I/O techniques is very likely going to serve quite well with only 256MB or less, even for the faster RAID systems. Where memory and cache size does become very important is the database size. If you process a 20GB log file into a 1GB database, the extra address space afforded by the 64 bit system isn't going to buy you anything, since you can comfortably cache the whole database within the 32 bit address space. On the other hand, if you're talking about databases in the 3 to 6GB range and, to a lesser degree, beyond, the 64 bit system (and the extra RAM) may serve you quite well. BUT... (and there is always a "but") don't forget that writing the database doesn't get a massive boost from caching anyways, unless you play things fast and loose and turn off most of SQLite's safety nets. Assuming you're running in the normal mode where disk writes are sync'ed as much as possible, I/O speeds still play a very strong part of your total processing time any time INSERTs or UPDATEs are involved. If the processing just INSERTs records to the database, you get back into a situation much like reading the log files... pages tend to go one way (out to disk) and caching is less important. As long as you have enough cache to hold your indexes, you're fairly good to go. On the other hand, if you do stuff like record event counts, incrementing fields with each log file record that is processed, the total database cache size becomes a lot more important. That gets back to the question of database size, and if you really need the 64 bit system or not, and how much it really matters once you get much beyond ~6GB. There is also the question of what you do with these database files. If you do a lot of processing once the database files are formed, then total memory becomes important, as we can assume you're pulling a lot of pages in and out and doing all kinds of interesting things. On the other hand, if all you do is scan the data and generate a report or two and that's it, you're back to the issue of limited cache use and I/O becoming king. The "best" decision really depends on what you're trying to do with all this data. The two important points I'd make, however, are that cache doesn't matter unless you re-use the data, and that disk I/O tends to be thousands (if not millions) of times slower than CPU actions-- and that SQLite is extremely disk I/O intensive. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "'People who live in bamboo houses should not throw pandas.' Jesus said that." - "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006" _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users