On 8/2/07, Don Hamilton <[EMAIL PROTECTED]> wrote:
> You're welcome, Mike... geeze, you are a fast adaptor....
>
> While I have your attention though, might I ask about alternate
> approaches to pg-loader.pl. Because it keeps EVERYTHING in memory until
> it's done reading the complete input file, I get into severe swapping
> after 30000 records or so... (on a system with 1 GB of memory). I've seen
> others mention similar problems. Yes I could break my input into chunks,
> but...
>
> In the past, I've used a technique where I wrote separate files for
> each table of data, piped those to a 'sort unique', and then did a bulk
> copy to load the individual tables. Is that something that you (or
> others) would find useful? I'd still like to get to a place where I

We won't turn down an alternate script. I think the way I'd like to see
it (and can do it if you're not feeling saucy) is to specify an output
directory as a command line param, instead of a file, and write a file
per table plus one master script to pull the table scripts in, ordered
correctly, within one transaction.

Another option would be to treat those table scripts as temp files and
just cat them together at the end of the process, which I think is
approximately what you were suggesting. I personally favor the
script-per-table approach because it makes editing one table's worth of
data manageable -- 7 or 8 100M files instead of a single 1G file.

> could load a fresh evergreen d/b every week or so from a dump of my 5
> million bibliographic records. I do do that now (in 3 or 4 hours,
> elapsed) with my Simple OPAC Backup at http://library.wlu.ca/searchme,
> and don't see why I can't accomplish the same feat with evergreen.
>

I have reservations about promoting EG as an OPAC alternative. It's
far from optimized for that, and there is a ton of overhead that just
isn't needed in order to do simple searches but cannot be turned off.
It's just not meant for that purpose and there are many good projects
out there that fill that niche ...

That being said, what you want is possible, but it will require a good
bit of extra scripting (inside and outside the database) to automate.
(Using GIN indexes, removing/re-adding indexes during reloads, etc.)

Sorry if I'm being a downer. ;) I don't mean to be a wet blanket, but
I don't want to set EG up to get a reputation for "failing" at
something for which it wasn't really meant, if that makes sense.

--miker

> don
>
> >>> [EMAIL PROTECTED] 02-Aug-2007 1:37 PM >>>
>
> [snip]
>
> Hope that helps in the future, and thanks for the idea!
>
> --miker
>

--
Mike Rylander
Equinox Software, Inc
[EMAIL PROTECTED]
http://esilibrary.com/
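
For anyone who wants to try the output-directory idea discussed above, here is a rough sketch in Python (not Perl, and not how pg-loader.pl currently works). It streams each generated statement to a per-table file as it arrives, so nothing accumulates in memory, then writes a master script that includes the table files in a given order inside one transaction. The function name, the `load_all.sql` file name, and the table names in the usage note are all made up for illustration; the only real feature relied on is psql's `\i` include command.

```python
import os


def write_table_scripts(rows, out_dir, table_order):
    """Stream (table, sql_line) pairs into one .sql file per table,
    then write a master script that pulls them in within one transaction.

    All names here are hypothetical; this is a sketch, not pg-loader.pl.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}
    try:
        for table, line in rows:
            # Open each table's file lazily, the first time it is seen.
            if table not in handles:
                path = os.path.join(out_dir, table + ".sql")
                handles[table] = open(path, "w", encoding="utf-8")
            handles[table].write(line + "\n")
    finally:
        for fh in handles.values():
            fh.close()

    # Tables named in table_order come first; any others follow, sorted.
    ordered = [t for t in table_order if t in handles]
    ordered += sorted(t for t in handles if t not in table_order)

    master = os.path.join(out_dir, "load_all.sql")
    with open(master, "w", encoding="utf-8") as fh:
        fh.write("BEGIN;\n")
        for table in ordered:
            # \i resolves relative paths against psql's working directory.
            fh.write("\\i " + table + ".sql\n")
        fh.write("COMMIT;\n")
    return master
```

Usage would be something like `write_table_scripts(statements, "out", ["authors", "books"])`, followed by running `psql -f load_all.sql` from inside the `out` directory. Because `rows` can be a generator, the input file never has to be held in memory all at once. The 'sort unique' step Don mentioned could be run over each table file before loading; that is left out here.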
