On Sunday 07 January 2007 19:22, Norberto Meijome wrote:

> I could be wrong and this may only apply to generating indexes?
That's what we're doing now. By dropping the table and recreating it, all the index maintenance gets deferred until one batch at the end (which is vastly faster in practice). We also wrap the whole thing in a transaction, so SELECT queries still have access to the old data until the instant the commit finishes, assuming the import doesn't fail for some reason and get rolled back.

> Something also to consider is, do you *have* to import all that data every
> time? I.e., can you create data partitions, assign a new table to each
> day, create the appropriate rules, and then only dump from FoxPro the last
> day's info?

I wish. First, we run the import hourly (more or less). I save the output of the last hour's FoxPro->PostgreSQL conversion and run a hand-written, optimized diff against it. If nothing has changed, that table gets skipped. If a reasonably small percentage of rows has changed (which is almost always the case), then I rewrite it as a series of deletes followed by a bulk import. Basically, I patch the table. It's nice to see a twenty-million-row table update reduced to a patch file 100 lines long.

Oh, for the record, my diff algorithm is written in Python and is still I/O limited, even when several copies are running in parallel.

> If CPU is hitting too hard, you can always run nice pgsql while importing
> this - it will still take 100% CPU, but yield as soon as something else
> needs to do some work. Of course, this won't help if you are IO bound (I
> think?). *Are* you CPU bound?

{Un,}fortunately, no. The CPUs are still idle enough to get a lot of other processing done without slowing the import process.

> Let us know what solution you come up with, and good luck :)

Will do, and thanks!
-- 
Kirk Strauser
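P.S. For the curious, the keyed diff boils down to something like the sketch below. This is a simplified illustration, not the actual script: the real thing works against FoxPro exports and handles COPY formatting, and every table, column, and function name here is made up.

```python
# Illustrative sketch of a keyed row diff: compare the previous hour's
# export with the new one, then emit a "patch" of deletes plus rows to
# bulk-import, instead of reloading the whole table.

def diff_rows(old_rows, new_rows, key=lambda row: row[0]):
    """Return (deleted_keys, changed_or_new_rows) for a keyed row diff."""
    old = {key(r): r for r in old_rows}
    new = {key(r): r for r in new_rows}
    # Rows whose key vanished entirely get deleted.
    deleted = [k for k in old if k not in new]
    # New or modified rows get deleted (if present) and re-imported.
    changed = [r for k, r in new.items() if old.get(k) != r]
    return deleted, changed

def patch_table(table, key_col, deleted, changed):
    """Render the patch: DELETEs first, then data for a bulk COPY."""
    keys_to_delete = deleted + [row[0] for row in changed]
    deletes = "\n".join(
        f"DELETE FROM {table} WHERE {key_col} = {k!r};" for k in keys_to_delete
    )
    # Tab-separated rows, suitable for feeding to COPY ... FROM STDIN.
    copy_data = "\n".join("\t".join(map(str, row)) for row in changed)
    return deletes, copy_data
```

In practice the whole patch (deletes plus the bulk import) runs inside one transaction, so readers see either the old table or the new one, never a half-applied patch.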