Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-27 Thread Donald Griggs
Hi Ron,

It may be that increasing the sqlite cache size will substantially reduce
the time for either the CREATE INDEX or the SELECT ... ORDER BY
instructions (depending upon which method you choose).

https://sqlite.org/pragma.html#pragma_cache_size



On Wed, Sep 27, 2017 at 1:58 AM, jungle Boogie 
wrote:

> Hi there,
>
> Just because I'm interested, I'm wondering if you can identify your
> hardware, and how long it takes your system to do your desired operation on
> such a large number of records.
>
> Do let us know which option you performed the query with as well.
>
> P.S. For best results, I'd recommend using the latest release of SQLite.
>
> Thanks!
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread jungle Boogie
Hi there,

Just because I'm interested, I'm wondering if you can identify your
hardware, and how long it takes your system to do your desired operation on
such a large number of records.

Do let us know which option you performed the query with as well.

P.S. For best results, I'd recommend using the latest release of SQLite.

Thanks!
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread Ron Barnes


Than you!

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Stephen Chrzanowski
Sent: Tuesday, September 26, 2017 4:31 PM
To: SQLite mailing list 
Subject: Re: [sqlite] Is there a way to perform a muti-level sort and extract 
of large data sets?

uh.. UNIQUE... DISTINCT... ,

On Tue, Sep 26, 2017 at 4:30 PM, Stephen Chrzanowski 
wrote:

>
>
> On Tue, Sep 26, 2017 at 1:36 PM, Simon Slavin 
> wrote:
>
>>
>>
>> My one concern in reading your post is how your dates are formatted.
>> When putting your date fields into your SQL table you will have to 
>> ensure that dates are saved as a day number, or as text which 
>> naturally sorts into date order, e.g. /DD/MM.  You should not 
>> expect SQL to sort text such as "19 October 16" correctly.
>>
>> Simon.
>>
>
> @OP, Simon is dead on, however, the only correction and clarification 
> to that statement is you'll want (if required) to sort by /MM/DD, 
> not /DD/MM.
>
> Also, for deduplication, if you're executing one instruction, you can 
> add a UNIQUE clause after the SELECT, unless you've got other 
> requirements that make defines what a duplicate entry is, then you'd 
> have to rely on your software for those decisions.
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread Stephen Chrzanowski
uh.. UNIQUE... DISTINCT... ,

On Tue, Sep 26, 2017 at 4:30 PM, Stephen Chrzanowski 
wrote:

>
>
> On Tue, Sep 26, 2017 at 1:36 PM, Simon Slavin 
> wrote:
>
>>
>>
>> My one concern in reading your post is how your dates are formatted.
>> When putting your date fields into your SQL table you will have to ensure
>> that dates are saved as a day number, or as text which naturally sorts into
>> date order, e.g. /DD/MM.  You should not expect SQL to sort text such
>> as "19 October 16" correctly.
>>
>> Simon.
>>
>
> @OP, Simon is dead on, however, the only correction and clarification to
> that statement is you'll want (if required) to sort by /MM/DD, not
> /DD/MM.
>
> Also, for deduplication, if you're executing one instruction, you can add
> a UNIQUE clause after the SELECT, unless you've got other requirements that
> make defines what a duplicate entry is, then you'd have to rely on your
> software for those decisions.
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread Stephen Chrzanowski
On Tue, Sep 26, 2017 at 1:36 PM, Simon Slavin  wrote:

>
>
> My one concern in reading your post is how your dates are formatted.  When
> putting your date fields into your SQL table you will have to ensure that
> dates are saved as a day number, or as text which naturally sorts into date
> order, e.g. /DD/MM.  You should not expect SQL to sort text such as "19
> October 16" correctly.
>
> Simon.
>

@OP, Simon is dead on, however, the only correction and clarification to
that statement is you'll want (if required) to sort by /MM/DD, not
/DD/MM.

Also, for deduplication, if you're executing one instruction, you can add a
UNIQUE clause after the SELECT, unless you've got other requirements that
make defines what a duplicate entry is, then you'd have to rely on your
software for those decisions.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread Ron Barnes
That is exactly what I want to do

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Keith Medcalf
Sent: Tuesday, September 26, 2017 3:04 PM
To: SQLite mailing list 
Subject: Re: [sqlite] Is there a way to perform a muti-level sort and extract 
of large data sets?


SELECT DISTINCT * FROM table ORDER BY field1,field2,field3,field4 DESC,field5;

to do the whole sorting and de-duplication in one step ... assuming you want to 
report duplicate entire rows only once ...


---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users- 
>boun...@mailinglists.sqlite.org] On Behalf Of Simon Slavin
>Sent: Tuesday, 26 September, 2017 11:37
>To: SQLite mailing list
>Subject: Re: [sqlite] Is there a way to perform a muti-level sort and 
>extract of large data sets?
>
>
>
>On 26 Sep 2017, at 6:24pm, Ron Barnes  wrote:
>
>> I need to sort them as follows...
>>
>> Sort Field 1 Ascending
>> Sort Field 2 Ascending WITHIN field 1 Sort Field 3 Ascending WITHIN 
>> field 2 WITHIN field 1 Sort Field 4 Descending WITHIN field 3 WITHIN 
>> field 2 WITHIN field
>1 <== This is a Date field and the most current (Highest) Date to float 
>up
>> Sort Field 5 Ascending WITHIN field 4 WITHIN field 3 WITHIN field 2
>WITHIN field 1
>
>Trivial in any SQL engine including SQLite.  It looks something like
>
>SELECT * FROM MyTable ORDER BY field1,field2,field3,field4
>DESC,field5
>
>SQL users do that sort of thing all the time.
>
>However, if you use just the above command, SQLite will have to perform 
>this sorting of 600M records each time you execute the command, which 
>could take quite a long time — minutes or hours depending on your 
>hardware.  So for any flavour of SQL you would probably tell it to 
>create an index …
>
>CREATE INDEX m_12345 ON MyTable (field1,field2,field3,field4
>DESC,field5)
>
>This tells SQL to perform the sorting and save the resulting order on 
>disk.  Then every time you perform the above SELECT command SQL notices 
>it already has the sort-order saved and just uses that one.
>This can change the amount of time taken to a few seconds.
>
>My one concern in reading your post is how your dates are formatted.
>When putting your date fields into your SQL table you will have to 
>ensure that dates are saved as a day number, or as text which naturally 
>sorts into date order, e.g. /DD/MM.  You should not expect SQL to 
>sort text such as "19 October 16" correctly.
>
>Simon.
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread Ron Barnes
Thank you very much! 

I will research the two suggestions below.

As for your sort assumption, you are correct.

A  A  1  2
A  A  2  1
A  B  1  2
A  B  1  3
A  C  1   1

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Warren Young
Sent: Tuesday, September 26, 2017 2:14 PM
To: SQLite mailing list 
Subject: Re: [sqlite] Is there a way to perform a muti-level sort and extract 
of large data sets?

On Sep 26, 2017, at 11:24 AM, Ron Barnes  wrote:
> 
> I have approximately 600 million records that need to be sorted

Where is the data now?

> There are 18 table entries.

You mean 18 columns per row, right?

> I also need to deduplicate the records based upon the sorted output file.

You speak of VB.NET, which means you don’t have a uniq tool as on POSIX systems:

   https://linux.die.net/man/1/uniq

If you can install Cygwin or WSL on these Windows boxes, then you’d have uniq, 
as well as a cross-platform solution.  SQLite is available for both Cygwin and 
WSL.

> I can take care of the deduplication (I think).

The basic functionality of uniq is indeed pretty simple: given sorted input, 
write as output only lines that don’t repeat the content of the previous input 
line.

The primary reason to mess with Cygwin or WSL on Windows is simply because 
using pre-built tools, you don’t have to debug and maintain it.  There’s value 
in “just run it through uniq.”  Even if you can write it in VB.net in half an 
hour, you’re vastly over-budget compared to the half second it takes me to type 
“ | uniq”.

> Sort Field 1 Ascending
> Sort Field 2 Ascending WITHIN field 1

I’m not sure what you mean by “WITHIN”.  Are you simply saying that you want 
the data sorted first by field 2 and then by field 1, so that when two records 
have the same field 1 content, that the output has that pair of records ordered 
by field 2?  E.g.

Field 1Field 2
-- 
A  B
A  C

As opposed to:

Field 1Field 2
-- 
A  C
A  B

If so, that’s trivial SQL, well-covered in Simon’s reply.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread Ron Barnes
Thank you so much - I will test this as soon as I get home!

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Simon Slavin
Sent: Tuesday, September 26, 2017 1:37 PM
To: SQLite mailing list 
Subject: Re: [sqlite] Is there a way to perform a muti-level sort and extract 
of large data sets?



On 26 Sep 2017, at 6:24pm, Ron Barnes  wrote:

> I need to sort them as follows...
> 
> Sort Field 1 Ascending
> Sort Field 2 Ascending WITHIN field 1
> Sort Field 3 Ascending WITHIN field 2 WITHIN field 1 Sort Field 4 
> Descending WITHIN field 3 WITHIN field 2 WITHIN field 1 <== This is a 
> Date field and the most current (Highest) Date to float up Sort Field 
> 5 Ascending WITHIN field 4 WITHIN field 3 WITHIN field 2 WITHIN field 
> 1

Trivial in any SQL engine including SQLite.  It looks something like

SELECT * FROM MyTable ORDER BY field1,field2,field3,field4 DESC,field5

SQL users do that sort of thing all the time.

However, if you use just the above command, SQLite will have to perform this 
sorting of 600M records each time you execute the command, which could take 
quite a long time — minutes or hours depending on your hardware.  So for any 
flavour of SQL you would probably tell it to create an index …

CREATE INDEX m_12345 ON MyTable (field1,field2,field3,field4 DESC,field5)

This tells SQL to perform the sorting and save the resulting order on disk.  
Then every time you perform the above SELECT command SQL notices it already has 
the sort-order saved and just uses that one.  This can change the amount of 
time taken to a few seconds.

My one concern in reading your post is how your dates are formatted.  When 
putting your date fields into your SQL table you will have to ensure that dates 
are saved as a day number, or as text which naturally sorts into date order, 
e.g. /DD/MM.  You should not expect SQL to sort text such as "19 October 
16" correctly.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread Keith Medcalf

SELECT DISTINCT * FROM table ORDER BY field1,field2,field3,field4 DESC,field5;

to do the whole sorting and de-duplication in one step ... assuming you want to 
report duplicate entire rows only once ...


---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Simon Slavin
>Sent: Tuesday, 26 September, 2017 11:37
>To: SQLite mailing list
>Subject: Re: [sqlite] Is there a way to perform a muti-level sort and
>extract of large data sets?
>
>
>
>On 26 Sep 2017, at 6:24pm, Ron Barnes  wrote:
>
>> I need to sort them as follows...
>>
>> Sort Field 1 Ascending
>> Sort Field 2 Ascending WITHIN field 1
>> Sort Field 3 Ascending WITHIN field 2 WITHIN field 1
>> Sort Field 4 Descending WITHIN field 3 WITHIN field 2 WITHIN field
>1 <== This is a Date field and the most current (Highest) Date to
>float up
>> Sort Field 5 Ascending WITHIN field 4 WITHIN field 3 WITHIN field 2
>WITHIN field 1
>
>Trivial in any SQL engine including SQLite.  It looks something like
>
>SELECT * FROM MyTable ORDER BY field1,field2,field3,field4
>DESC,field5
>
>SQL users do that sort of thing all the time.
>
>However, if you use just the above command, SQLite will have to
>perform this sorting of 600M records each time you execute the
>command, which could take quite a long time — minutes or hours
>depending on your hardware.  So for any flavour of SQL you would
>probably tell it to create an index …
>
>CREATE INDEX m_12345 ON MyTable (field1,field2,field3,field4
>DESC,field5)
>
>This tells SQL to perform the sorting and save the resulting order on
>disk.  Then every time you perform the above SELECT command SQL
>notices it already has the sort-order saved and just uses that one.
>This can change the amount of time taken to a few seconds.
>
>My one concern in reading your post is how your dates are formatted.
>When putting your date fields into your SQL table you will have to
>ensure that dates are saved as a day number, or as text which
>naturally sorts into date order, e.g. /DD/MM.  You should not
>expect SQL to sort text such as "19 October 16" correctly.
>
>Simon.
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread Warren Young
On Sep 26, 2017, at 11:24 AM, Ron Barnes  wrote:
> 
> I have approximately 600 million records that need to be sorted

Where is the data now?

> There are 18 table entries.

You mean 18 columns per row, right?

> I also need to deduplicate the records based upon the sorted output file.

You speak of VB.NET, which means you don’t have a uniq tool as on POSIX systems:

   https://linux.die.net/man/1/uniq

If you can install Cygwin or WSL on these Windows boxes, then you’d have uniq, 
as well as a cross-platform solution.  SQLite is available for both Cygwin and 
WSL.

> I can take care of the deduplication (I think).

The basic functionality of uniq is indeed pretty simple: given sorted input, 
write as output only lines that don’t repeat the content of the previous input 
line.

The primary reason to mess with Cygwin or WSL on Windows is simply because 
using pre-built tools, you don’t have to debug and maintain it.  There’s value 
in “just run it through uniq.”  Even if you can write it in VB.net in half an 
hour, you’re vastly over-budget compared to the half second it takes me to type 
“ | uniq”.

> Sort Field 1 Ascending
> Sort Field 2 Ascending WITHIN field 1

I’m not sure what you mean by “WITHIN”.  Are you simply saying that you want 
the data sorted first by field 2 and then by field 1, so that when two records 
have the same field 1 content, that the output has that pair of records ordered 
by field 2?  E.g.

Field 1Field 2
-- 
A  B
A  C

As opposed to:

Field 1Field 2
-- 
A  C
A  B

If so, that’s trivial SQL, well-covered in Simon’s reply.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread Simon Slavin


On 26 Sep 2017, at 6:24pm, Ron Barnes  wrote:

> I need to sort them as follows...
> 
> Sort Field 1 Ascending
> Sort Field 2 Ascending WITHIN field 1
> Sort Field 3 Ascending WITHIN field 2 WITHIN field 1
> Sort Field 4 Descending WITHIN field 3 WITHIN field 2 WITHIN field 1 <== This 
> is a Date field and the most current (Highest) Date to float up
> Sort Field 5 Ascending WITHIN field 4 WITHIN field 3 WITHIN field 2 WITHIN 
> field 1

Trivial in any SQL engine including SQLite.  It looks something like

SELECT * FROM MyTable ORDER BY field1,field2,field3,field4 DESC,field5

SQL users do that sort of thing all the time.

However, if you use just the above command, SQLite will have to perform this 
sorting of 600M records each time you execute the command, which could take 
quite a long time — minutes or hours depending on your hardware.  So for any 
flavour of SQL you would probably tell it to create an index …

CREATE INDEX m_12345 ON MyTable (field1,field2,field3,field4 DESC,field5)

This tells SQL to perform the sorting and save the resulting order on disk.  
Then every time you perform the above SELECT command SQL notices it already has 
the sort-order saved and just uses that one.  This can change the amount of 
time taken to a few seconds.

My one concern in reading your post is how your dates are formatted.  When 
putting your date fields into your SQL table you will have to ensure that dates 
are saved as a day number, or as text which naturally sorts into date order, 
e.g. /DD/MM.  You should not expect SQL to sort text such as "19 October 
16" correctly.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Is there a way to perform a muti-level sort and extract of large data sets?

2017-09-26 Thread Ron Barnes
Hello All,

I have approximately 600 million records that need to be sorted and then 
extracted to a flat file.  I am unable to code a solution using visual Basic 
.NET.  It was suggested to me that a DB engine could perform my task for me.

Is there a way to accomplish this using the multi-level sort example below?

There are 18 table entries.  I need to extract all 18 entries to create 
individual records but in a certain order.  I also need to deduplicate the 
records based upon the sorted output file.  I can take care of the 
deduplication (I think).  I just need to get the records in the right order 
first.

I need to sort them as follows...

Sort Field 1 Ascending
Sort Field 2 Ascending WITHIN field 1
Sort Field 3 Ascending WITHIN field 2 WITHIN field 1
Sort Field 4 Descending WITHIN field 3 WITHIN field 2 WITHIN field 1 <== This 
is a Date field and the most current (Highest) Date to float up
Sort Field 5 Ascending WITHIN field 4 WITHIN field 3 WITHIN field 2 WITHIN 
field 1

Thank you in advance!

-Ron

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users