On Sep 26, 2017, at 11:24 AM, Ron Barnes <rbar...@njdevils.net> wrote:
> 
> I have approximately 600 million records that need to be sorted

Where is the data now?

> There are 18 table entries.

You mean 18 columns per row, right?

> I also need to deduplicate the records based upon the sorted output file.

You speak of VB.NET, which means you don’t have a uniq tool as on POSIX systems:

   https://linux.die.net/man/1/uniq

If you can install Cygwin or WSL on these Windows boxes, then you’d have uniq, 
as well as a cross-platform solution.  SQLite is available for both Cygwin and 
WSL.

> I can take care of the deduplication (I think).

The basic functionality of uniq is indeed pretty simple: given sorted input, 
write as output only lines that don’t repeat the content of the previous input 
line.
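
As a rough sketch of that logic, here it is in Python, chosen only for 
illustration; the function name and the sample data are made up, and the same 
loop ports directly to VB.NET.  Like uniq, it only drops a line when it matches 
the line immediately before it, so it depends on the input already being sorted:

```python
from typing import Iterable, Iterator


def uniq(lines: Iterable[str]) -> Iterator[str]:
    """Yield each line that differs from the line immediately before it."""
    previous = None  # nothing seen yet, so the first line always passes
    for line in lines:
        if line != previous:
            yield line
        previous = line


# Hypothetical, already-sorted sample input.
sorted_input = ["A,B", "A,B", "A,C", "B,A", "B,A"]
deduplicated = list(uniq(sorted_input))
print(deduplicated)
```

Because it only remembers one previous line, memory use stays constant no 
matter how many records pass through it.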

The primary reason to mess with Cygwin or WSL on Windows is that with pre-built 
tools, you don’t have to debug and maintain the code yourself.  There’s value 
in “just run it through uniq.”  Even if you can write it in VB.NET in half an 
hour, you’re vastly over-budget compared to the half second it takes me to type 
“ | uniq”.

> Sort Field 1 Ascending
> Sort Field 2 Ascending WITHIN field 1

I’m not sure what you mean by “WITHIN”.  Are you simply saying that you want 
the data sorted first by field 1 and then by field 2, so that when two records 
have the same field 1 content, the output has that pair of records ordered 
by field 2?  E.g.

    Field 1    Field 2
    ---------- ------------
    A          B
    A          C

As opposed to:

    Field 1    Field 2
    ---------- ------------
    A          C
    A          B

If so, that’s trivial SQL, well-covered in Simon’s reply.
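
For what it’s worth, a minimal sketch of that two-key sort, using Python’s 
built-in sqlite3 module against an in-memory database.  The table name 
“records”, the column names “field1” and “field2”, and the sample rows are all 
assumptions made up for the example, not anything from your actual schema:

```python
import sqlite3

# In-memory database so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical two-column table; the real data has 18 columns per row.
conn.execute("CREATE TABLE records (field1 TEXT, field2 TEXT)")
conn.executemany(
    "INSERT INTO records (field1, field2) VALUES (?, ?)",
    [("B", "A"), ("A", "C"), ("A", "B")],
)

# Sort by field1 first; rows that tie on field1 are then ordered by field2.
rows = conn.execute(
    "SELECT field1, field2 FROM records ORDER BY field1 ASC, field2 ASC"
).fetchall()

for row in rows:
    print(row)

conn.close()
```

The order of the columns in the ORDER BY clause is what expresses “field 2 
within field 1”: the first column listed is the primary sort key.
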
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
