A couple of things ...

1.  there was a time I needed to get data from files into a
database ... I think I've done it for both SQL Server and Oracle ...
for SQL Server, they've got an .exe called bcp (the Bulk Copy
Program) ... anyhow, if you google that, you should find it ... it
worked quite well ... of course, part of the reason for that is it
bypasses any triggers ... and, depending on the options, foreign-key
constraint checking ... and such ... but it's VERY fast ... wouldn't
take much to write a new program that simply launches bcp pointing at
the appropriate file
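
Since such a program would really just shell out to bcp, here's a
minimal Python sketch of the idea ... the server, database, table, and
file names are made-up placeholders, and you should verify the flags
against `bcp /?` on your own install:

```python
import subprocess  # only needed if you actually run the command

def build_bcp_command(table, data_file, server):
    # "in" = import direction; -T = trusted connection; -c = character data
    return ["bcp", table, "in", data_file, "-S", server, "-T", "-c"]

cmd = build_bcp_command("MyDb.dbo.Imports", r"C:\data\records.txt", "MYSERVER")
# subprocess.run(cmd, check=True)   # uncomment on a machine that has bcp.exe
```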

2.  I'd be extremely surprised if one of the Stream classes used for
file I/O didn't already buffer blocks ... in fact, while
double-checking, I even found a comment saying as much ... I'd also
avoid reading ALL records into RAM ... sometimes users run programs
which use a lot of it ... and they get cranky if your program
adversely affects the performance of another one (though I have no
experience with such things :-0 ) ... anyhow, if you're using a Stream
class for file I/O, chances are extremely good appropriate buffering
is already being done for you ... so no need to *cache* up n
records ... one at a time is fine
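
To make the point runnable, a quick Python sketch (the scratch file is
just throwaway demo data) ... iterating one record at a time still
hits the disk in big buffered blocks under the hood, so there's
nothing to gain by caching records yourself:

```python
import os, tempfile

# write a small scratch file of "records" (hypothetical demo data)
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    for i in range(1000):
        f.write(f"record-{i}\n")

count = 0
with open(path) as f:       # file objects are block-buffered by default
    for line in f:          # one record at a time ...
        count += 1          # ... yet the disk reads happen in big chunks

os.remove(path)
print(count)   # 1000
```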

3.  I abstract all my Db access ... part of that is any Execute is
done from a modal form (for attended programs, of course) ... this
keeps the user from attempting to mess with whatever form might
otherwise be open ... anyhow, if you're going to spawn a new thread
for such a thing, I'd highly recommend having that thread create and
show a new Form ... on this form, you can have a status label (I use
textboxes so the user can copy/paste anything they want to ask
about) ... and a progress bar ... of course, as it's a new thread,
IIRC, even if the new Form is modal, your other Form will still be
usable ... given that, you'll probably want your file-open dialog
shown from the new thread too ... along with anything else related to
that particular import operation ... if you design it right, the user
should be able to *kick off* any number of these ... all running at
once
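
The Form bits are WinForms-specific, but the threading pattern itself
can be sketched in Python ... each import runs on its own worker
thread and posts status updates a UI could display, and any number can
run at once ... the names and the queue-based status reporting here
are my stand-ins, not the original code:

```python
import threading, queue

def run_import(name, records, status_q):
    # each worker owns one import and reports its own progress
    for i, _rec in enumerate(records, 1):
        if i % 100 == 0:
            status_q.put((name, i))        # would drive the progress bar
    status_q.put((name, "done"))           # would update the status label

status_q = queue.Queue()
threads = [threading.Thread(target=run_import,
                            args=(f"import-{n}", range(250), status_q))
           for n in range(3)]              # three imports "kicked off" at once
for t in threads:
    t.start()
for t in threads:
    t.join()

finished = 0
while not status_q.empty():
    _name, status = status_q.get()
    if status == "done":
        finished += 1
print(finished)   # 3
```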

4.  oops ... almost forgot ... your database inserts should use
parameters ... do NOT embed actual values in the SQL ... this way, the
database doesn't re-optimize the SQL every time ... it's been a while,
so maybe the newer servers actually figure that out for you ... either
way, it's still a good way to go ... I usually try to keep the same
Command object alive and change its parameters for every execution ...
I'm using SQL Server 2000 or such ... and Oracle 7.3 ... and still
getting excellent performance, that way
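
A sketch of the same idea, with Python's sqlite3 standing in for ADO +
SQL Server ... the SQL text never changes, only the parameter values,
so the server can reuse its plan instead of re-optimizing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE imports (id INTEGER, name TEXT)")

sql = "INSERT INTO imports (id, name) VALUES (?, ?)"   # SQL text never changes
cur = conn.cursor()                # kept alive, like the Command object
for i in range(5):
    cur.execute(sql, (i, f"row-{i}"))   # only the parameters change
conn.commit()

n_rows = conn.execute("SELECT COUNT(*) FROM imports").fetchone()[0]
print(n_rows)   # 5
```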

5.  oh yeah ... and while I'm at it ... I usually use .BeginTrans
and .Rollback/.Commit ... however, when you're talking in excess of a
few hundred rows, you run the risk of blowing out your rollback
segment ... so I'd use the aforementioned "n % 200" or such to do
a .Commit and a new .BeginTrans ... if you don't use any at all, I
believe the default behavior is an *automatic* .BeginTrans/.Commit for
EVERY row ... I couldn't say whether that might be more or less
efficient than *batching* 'em up every few hundred ... after all, the
database still needs to .Commit n rows ... whether they're one at a
time, or not ... regardless, that's what I'd do ... again, just
because YOUR program ain't blowing up doesn't mean someone ELSE's
isn't suffering because of poor use of such resources
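
Here's what the commit-every-n-rows loop might look like, again
sketched with sqlite3 standing in for the real server ... the batch
size of 200 is the "n % 200" from above, everything else is assumed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rows (n INTEGER)")
conn.commit()

BATCH = 200                       # the "n % 200" from above
cur = conn.cursor()
for n in range(1, 1001):
    cur.execute("INSERT INTO rows (n) VALUES (?)", (n,))
    if n % BATCH == 0:
        conn.commit()             # close out this batch's transaction
conn.commit()                     # pick up any leftover partial batch

total = conn.execute("SELECT COUNT(*) FROM rows").fetchone()[0]
print(total)   # 1000
```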


On Feb 9, 2:23 pm, rbdavidson <[email protected]> wrote:
> You will get MUCH better performance if you read blocks of records
> into memory, then process all the records in that block.  This
> results in far fewer seeks and better data throughput.  For example,
> if you read 1000 records at a time, for 300,000 records you only hit
> the hard drive 300 times.  300 * .01s avg seek time = 3 seconds
> instead of 3,000 seconds.
>
> Depending upon the size of the file and available RAM, you might well
> want to read the entire file before you start processing records.
>
> -- R.B. Davidson
>
> On Feb 8, 1:02 am, "Joe C." <[email protected]> wrote:
>
