Re: Fast diff command for large files?

2005-11-07 Thread Kirk Strauser
On Sunday 06 November 2005 07:39, Andrew P. wrote: Note, that the difference must be kept in RAM, so it won't work if there are multi-gig diffs, but it will work very fast if the diffs are only 10-100Mb, it will work at close to I/O speed if the diff is under 10Mb. Thanks, Andrew! My

Re: Fast diff command for large files?

2005-11-07 Thread Kirk Strauser
On Sunday 06 November 2005 22:19, Olivier Nicole wrote: if you have access to the legacy/FoxPro application, it should be modifed to add a timestamp to each reccord modification. Don't underestimate the strength of the word legacy. To be honest, if we had the manhours to rewrite it, we'd

Re: Fast diff command for large files?

2005-11-07 Thread [EMAIL PROTECTED]
Kirk Strauser writes: Our legacy application runs on FoxPro. Our web application runs on a PostgreSQL database that's a mirror of the FoxPro tables. I had the same setup a while back. A few suggestions. * Add a date/changed field in Foxpro and update. * If only recent records are updated,

Re: Fast diff command for large files?

2005-11-07 Thread Kirk Strauser
On Monday 07 November 2005 10:40, [EMAIL PROTECTED] wrote: I had the same setup a while back. A few suggestions. Thanks for the tips; unfortunately, any fix that involves touching the FoxPro code is basically impossible. It's not that we *can't*, but that the sole FoxPro programmer at our

Re: Fast diff command for large files?

2005-11-07 Thread Olivier Nicole
Don't underestimate the strength of the word legacy. To be honest, if we= had the manhours to rewrite it, we'd take the opportunity to run it=20 directly against the PostgreSQL server. What we're gaining out of this system is the ability to migrate our old=20 applications at our leisure,

Re: Fast diff command for large files?

2005-11-06 Thread Kirk Strauser
On Friday 04 November 2005 02:04 pm, you wrote: Does the overall order of lines change every time you dump the tables? No, although an arbitrary number of lines might get deleted. If it does/can, then there's a trivial solution (a few lines in perl, or a hundred lines in C) that'll make the

Re: Fast diff command for large files?

2005-11-06 Thread Andrew P.
On 11/6/05, Kirk Strauser [EMAIL PROTECTED] wrote: On Friday 04 November 2005 02:04 pm, you wrote: Does the overall order of lines change every time you dump the tables? No, although an arbitrary number of lines might get deleted. If it does/can, then there's a trivial solution (a few

Re: Fast diff command for large files?

2005-11-06 Thread Olivier Nicole
We do the mirroring by running a program that dumps the FoxPro tables out as tab-delimited files. Thus far, we'd been using PostgreSQL's copy from command to read those files into the database. In reality, though, a very, very small percentage of rows in those tables actually change. So, I

Re: Fast diff command for large files?

2005-11-05 Thread Jan Grant
On Fri, 4 Nov 2005, Kirk Strauser wrote: thinking out loud I wonder if rsync could be modified to output its patches rather than silently applying them to a target file. It seems to be pretty good at comparing large files quickly... /thinking More thinking out loud: since these are

Fast diff command for large files?

2005-11-04 Thread Kirk Strauser
I need to routinely find the diffs between two multigigabyte text files (exporting a set of FoxPro tables to a PostgreSQL database without doing a complete dump/reload each time, in case you were wondering). GNU diff from the base system and from ports chokes. The textproc/2bsd-diff works OK,

Re: Fast diff command for large files?

2005-11-04 Thread Chuck Swiger
Kirk Strauser wrote: I need to routinely find the diffs between two multigigabyte text files (exporting a set of FoxPro tables to a PostgreSQL database without doing a complete dump/reload each time, in case you were wondering). GNU diff from the base system and from ports chokes. The

Re: Fast diff command for large files?

2005-11-04 Thread Kirk Strauser
On Friday 04 November 2005 10:22, Chuck Swiger wrote: Multigigabyte? Find another approach to solving the problem, a text-base diff is going to require excessive resources and time. A 64-bit platform with 2 GB of RAM 3GB of swap requires ~1000 seconds to diff ~400MB. There really aren't

Re: Fast diff command for large files?

2005-11-04 Thread Charles Swiger
On Nov 4, 2005, at 12:29 PM, Kirk Strauser wrote: Multigigabyte? Find another approach to solving the problem, a text-base diff is going to require excessive resources and time. A 64-bit platform with 2 GB of RAM 3GB of swap requires ~1000 seconds to diff ~400MB. There really aren't

Re: Fast diff command for large files?

2005-11-04 Thread Kirk Strauser
On Friday 04 November 2005 13:39, Charles Swiger wrote: OK, but even if only one line out of 1000 changes, you still can't make either diff or Colin Percival's bsdiff run on gigabyte sized files and have it fit into MAXDSIZE on 32-bit address space. For the record, textproc/2bsd-diff works

Re: Fast diff command for large files?

2005-11-04 Thread Andrew P.
On 11/4/05, Kirk Strauser [EMAIL PROTECTED] wrote: On Friday 04 November 2005 10:22, Chuck Swiger wrote: Multigigabyte? Find another approach to solving the problem, a text-base diff is going to require excessive resources and time. A 64-bit platform with 2 GB of RAM 3GB of swap