Craig Cardimon wrote:
I am working with huge ASCII text files and large text fields.
Define "huge" and "large".
As needs and wants have changed, I will be reprocessing data we have
already gone through to see if more records can be extracted.
I will need to compare strings to ensure that records I am inserting
into our SQL Server 2000 database are not duplicates of records already
there.
I have come up with two ways:
(1) use string length (number of characters a string holds):
$length = length($name);
If both strings are identical in length, I have a duplicate and would
cease processing.
How do you know that?
(2) compare strings (or 200-character substrings thereof) directly:
if ( $str1 eq $str2 )
{
....
}
If the test-for-identical-length thing is actually valid, fill an array
with 1 for each length that is in use.
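A minimal sketch of the length-array idea, assuming (and this is a big assumption) that no two distinct records ever share a length. The sample names and the helper length_seen are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data; real lengths would come from the existing records.
my @existing = ('Smith, John', 'Lee, Ann');

# $length_used[$n] is 1 if some stored record is exactly $n characters long.
my @length_used;
$length_used[ length $_ ] = 1 for @existing;

# True if a record of the same length has already been seen.
sub length_seen {
    my ($name) = @_;
    return $length_used[ length $name ] ? 1 : 0;
}
```

Note that length_seen('Jones, Mary') is also true here, because that string happens to be as long as 'Smith, John'. That is exactly why the length test needs justifying first.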
Or, if all the keys will fit into RAM, fill a hash with the
already-loaded keys and test each new record for presence in the hash.
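Roughly like this, as a sketch only. The sample keys and the name is_new_record are invented for the example; in practice the hash would be filled from whatever is already in the database.

```perl
use strict;
use warnings;

# Hypothetical keys; in practice these would be read from the database.
my %seen = map { $_ => 1 } ('Smith, John', 'Lee, Ann');

# Returns 1 if the key is new (and remembers it), 0 if it is a duplicate.
sub is_new_record {
    my ($key) = @_;
    return 0 if exists $seen{$key};
    $seen{$key} = 1;
    return 1;
}
```

Because the hash compares whole keys rather than lengths, two different strings of the same length are not confused with each other.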
If not, the best thing may simply be to make the field a UNIQUE key (if
the SQL Server of the Beast will allow it) and /try/ to insert the
record. If you can't, don't.
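A sketch of the try-it-and-see approach using DBI. The table "records", the column "name", and the helper try_insert are hypothetical, and the code assumes a handle that was connected with RaiseError => 1 so that a failed execute dies.

```perl
use strict;
use warnings;

# Assumes $dbh is a DBI handle opened with RaiseError => 1, and that the
# hypothetical table "records" has a UNIQUE constraint on column "name".
# Returns 1 if the row was inserted, 0 if the database rejected it.
sub try_insert {
    my ($dbh, $name) = @_;
    my $sth = $dbh->prepare('INSERT INTO records (name) VALUES (?)');
    my $ok  = eval { $sth->execute($name); 1 };
    return $ok ? 1 : 0;
}
```

As written, this treats every failure as a duplicate. Real code would probably want to look at $dbh->errstr to tell a constraint violation apart from, say, a dropped connection.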
--
John W. Kennedy
"The poor have sometimes objected to being governed badly; the rich have
always objected to being governed at all."
-- G. K. Chesterton. "The Man Who Was Thursday"
_______________________________________________
ActivePerl mailing list