Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-23 Thread Francisco Olarte
Hi Rob:

On Tue, Aug 23, 2016 at 4:52 PM, Rob Sargent  wrote:
> By 'this' I was referring to the optimizations mentioned, and am wondering
> if this holds true under user load.

For that you'll have to refer to the source, or ask someone more
versed in the arcana of the pg source.

> Much magic can happen in a custom data
> load, but do these optimizations apply to an application loading single (or
> perhaps several) records per transaction?  Does one, in that scenario, not
> suffer any consequence for continuously loading one side of the tree (the
> rightmost node)?

Not that much magic is necessary. The time I did it I just needed to
detect on every insertion whether I was at the rightmost position
(made easier because I had the minimum/maximum keys cached in the tree
object header) and have a special routine for inserting a new last
node: put it in the last page, whose pointer I had, grabbing a new one
if needed, whose pointer would be appended at the tail of the parent,
etc. It was just a pared-down version of the general insert routine,
but it made insertions run easily 20 times faster by avoiding nearly
every check, since I knew I was on the right edge. I do not know if
pg inserts several items at a time in bulk loading, but I doubt it.
Normally every btree indexing library has some optimization for these
cases, as they are common, just like every real sort routine has some
optimization for presorted input.
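
For what it's worth, the effect of this right-edge behaviour can be
observed from SQL without reading the source. A minimal check, assuming
the pgstattuple contrib extension is available and using the
i_sequential / i_random indexes from pinker's test:

CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT 'i_sequential' AS index_name, leaf_pages, avg_leaf_density,
       leaf_fragmentation
FROM pgstatindex('i_sequential')
UNION ALL
SELECT 'i_random', leaf_pages, avg_leaf_density, leaf_fragmentation
FROM pgstatindex('i_random');

A sequentially filled btree typically reports a leaf density close to
the fillfactor (90% by default), while a randomly filled one ends up
nearer the 50-70% left behind by repeated page splits.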

Francisco Olarte.




Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-23 Thread Rob Sargent


On 08/23/2016 08:34 AM, Francisco Olarte wrote:
> On Tue, Aug 23, 2016 at 4:28 PM, Rob Sargent  wrote:
>> On 08/23/2016 07:44 AM, Francisco Olarte wrote:
>>> On Tue, Aug 23, 2016 at 2:26 PM, pinker  wrote:
>>>> I am just surprised by the order of magnitude in the difference
>>>> though. 2 and 27 minutes, that's a huge difference... I did another,
>>>> simplified test, to make sure there are no duplicates and the only
>>>> difference between both sets is the order:
>>> ...
>>>> INSERT INTO t_sequential SELECT * FROM source_sequential;
>>>> 102258,949 ms
>>>> INSERT INTO t_random SELECT * FROM source_random;
>>>> 1657575,699 ms
>>>
>>> If I read correctly, you are getting 100s/10Mkeys=10us/key in
>>> sequential, and 165 us/key in random.
>>>
>>> I'm not surprised at all. I've got greater differences on a memory
>>> tree; sorted insertion can easily be optimized to be very fast. As an
>>> example, sequential insertion can easily avoid moving data while
>>> filling the pages and, with a little care, it can also avoid some of
>>> those moves when splitting. I'm not current with the postgres
>>> details, but it does not surprise me that it has big optimizations for
>>> this, especially since index-ordered insertion is quite common in
>>> things like bulk loads or timestamped log lines.
>>
>> And if each insert is in a separate transaction, does this still hold true?
>
> What are you referring to by 'this'? (BTW, bear in mind that one
> transaction needs at least one disk flush and, if done over a network,
> at least one RTT, so I doubt you can achieve 10us/transaction unless
> you have very special conditions.)
>
> Francisco Olarte.

By 'this' I was referring to the optimizations mentioned, and am
wondering if this holds true under user load.  Much magic can happen in
a custom data load, but do these optimizations apply to an application
loading single (or perhaps several) records per transaction?  Does one,
in that scenario, not suffer any consequence for continuously loading
one side of the tree (the rightmost node)?





Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-23 Thread Francisco Olarte
On Tue, Aug 23, 2016 at 4:28 PM, Rob Sargent  wrote:
> On 08/23/2016 07:44 AM, Francisco Olarte wrote:
>> On Tue, Aug 23, 2016 at 2:26 PM, pinker  wrote:
>>> I am just surprised by the order of magnitude in the difference though. 2
>>> and 27 minutes, that's a huge difference... I did another, simplified
>>> test, to make sure there are no duplicates and the only difference
>>> between both sets is the order:
>>
>> ...
>>>
>>> INSERT INTO t_sequential SELECT * FROM source_sequential;
>>> 102258,949 ms
>>> INSERT INTO t_random SELECT * FROM source_random;
>>> 1657575,699 ms
>>
>> If I read correctly, you are getting 100s/10Mkeys=10us/key in
>> sequential, and 165 us/key in random.
>>
>> I'm not surprised at all. I've got greater differences on a memory
>> tree; sorted insertion can easily be optimized to be very fast. As an
>> example, sequential insertion can easily avoid moving data while
>> filling the pages and, with a little care, it can also avoid some of
>> those moves when splitting. I'm not current with the postgres
>> details, but it does not surprise me that it has big optimizations for
>> this, especially since index-ordered insertion is quite common in
>> things like bulk loads or timestamped log lines.

> And if each insert is in a separate transaction, does this still hold true?

What are you referring to by 'this'? (BTW, bear in mind that one
transaction needs at least one disk flush and, if done over a network,
at least one RTT, so I doubt you can achieve 10us/transaction unless
you have very special conditions.)
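
To make the flush cost concrete, a rough sketch in SQL (the table name
is reused from pinker's test; any timings are illustrative and will
vary wildly with hardware):

-- One transaction for all rows pays for a single commit flush:
BEGIN;
INSERT INTO t_sequential SELECT s FROM generate_series(1, 100000) AS s;
COMMIT;

-- With one transaction per row (what autocommitting applications do),
-- every COMMIT waits for a WAL flush, so throughput is bounded by
-- flush latency rather than by btree insertion cost. Turning
-- synchronous_commit off removes that wait, at the price of a small
-- window of possible transaction loss after a crash:
SET synchronous_commit = off;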

Francisco Olarte.




Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-23 Thread Rob Sargent



On 08/23/2016 07:44 AM, Francisco Olarte wrote:
> Hi pinker:
>
> On Tue, Aug 23, 2016 at 2:26 PM, pinker  wrote:
>> I am just surprised by the order of magnitude in the difference though. 2
>> and 27 minutes, that's a huge difference... I did another, simplified
>> test, to make sure there are no duplicates and the only difference
>> between both sets is the order:
> ...
>> INSERT INTO t_sequential SELECT * FROM source_sequential;
>> 102258,949 ms
>> INSERT INTO t_random SELECT * FROM source_random;
>> 1657575,699 ms
>
> If I read correctly, you are getting 100s/10Mkeys=10us/key in
> sequential, and 165 us/key in random.
>
> I'm not surprised at all. I've got greater differences on a memory
> tree; sorted insertion can easily be optimized to be very fast. As an
> example, sequential insertion can easily avoid moving data while
> filling the pages and, with a little care, it can also avoid some of
> those moves when splitting. I'm not current with the postgres
> details, but it does not surprise me that it has big optimizations for
> this, especially since index-ordered insertion is quite common in
> things like bulk loads or timestamped log lines.
>
> Francisco Olarte.

And if each insert is in a separate transaction, does this still hold true?






Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-23 Thread Francisco Olarte
Hi pinker:

On Tue, Aug 23, 2016 at 2:26 PM, pinker  wrote:
> I am just surprised by the order of magnitude in the difference though. 2
> and 27 minutes, that's a huge difference... I did another, simplified
> test, to make sure there are no duplicates and the only difference
> between both sets is the order:
...
> INSERT INTO t_sequential SELECT * FROM source_sequential;
> 102258,949 ms
> INSERT INTO t_random SELECT * FROM source_random;
> 1657575,699 ms

If I read correctly, you are getting 100s/10Mkeys=10us/key in
sequential, and 165 us/key in random.

I'm not surprised at all. I've got greater differences on a memory
tree; sorted insertion can easily be optimized to be very fast. As an
example, sequential insertion can easily avoid moving data while
filling the pages and, with a little care, it can also avoid some of
those moves when splitting. I'm not current with the postgres
details, but it does not surprise me that it has big optimizations for
this, especially since index-ordered insertion is quite common in
things like bulk loads or timestamped log lines.

Francisco Olarte.




Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-23 Thread pinker
Francisco Olarte wrote:
> It's already been told that btrees work that way, if you find it
> strange read a bit about them, this is completely normal, but ...

I am just surprised by the order of magnitude in the difference though. 2
and 27 minutes, that's a huge difference... I did another, simplified
test, to make sure there are no duplicates and the only difference
between both sets is the order:

CREATE TABLE source_sequential AS SELECT s FROM generate_series(1,10000000) AS s;
CREATE TABLE source_random AS SELECT * FROM source_sequential ORDER BY random();
CREATE TABLE t_sequential (id bigint);
CREATE INDEX i_sequential ON t_sequential (id);
CREATE TABLE t_random (id bigint);
CREATE INDEX i_random ON t_random (id);
INSERT INTO t_sequential SELECT * FROM source_sequential;
102258,949 ms
INSERT INTO t_random SELECT * FROM source_random;
1657575,699 ms
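
To compare the resulting tree sizes directly (the page counts this
thread's subject asks about), a minimal query, assuming the default
8 kB block size:

SELECT relname, pg_relation_size(oid) / 8192 AS pages
FROM pg_class
WHERE relname IN ('i_sequential', 'i_random');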





Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-19 Thread Francisco Olarte
On Fri, Aug 19, 2016 at 3:20 PM, Daniel Verite  wrote:
> There's a simple technique that works on top of a Feistel network,
> called the cycle-walking cipher. Described for instance at:
> http://web.cs.ucdavis.edu/~rogaway/papers/subset.pdf
> I'm using the opportunity to add a wiki page:
> https://wiki.postgresql.org/wiki/Pseudo_encrypt_constrained_to_an_arbitrary_range
> with sample plpgsql code for the [0..10,000,000] range that might be useful
> for other cases.

Nice reference, nice WikiPage, nice job. Bookmarking everything.

> But for the btree fragmentation and final size issue, TBH I don't expect
> that constraining the values within that smaller range will make any
> difference
> in the tests, because it's the dispersion that matters, not the values
> themselves.

Neither do I, that is why I said it was probably good enough for tests, but ...

> I mean that, whether the values are well dispersed in the [0..1e7] range or
> equally well dispersed in the [0..2**32] range, the probability of a newly
> inserted value comparing greater or lower than any previous value should
> be the same, so shouldn't the page splits be the same, statistically
> speaking?

I know some btrees do prefix-coding of the keys, and some do it even
with long integers, and some others do delta-coding for integer keys.
I seem to recall postgres does not; that's why I do not think it will
make a difference.

But anyway, to compare two things like that, as the original poster
was doing, I normally prefer to test just one thing at a time. That's
why I would normally try to do it by writing a sorted file, shuffling
it with sort -R, and copying it, server side if possible, to eliminate
other variables so both loads differ only in key order.

Francisco Olarte.




Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-19 Thread Daniel Verite
Francisco Olarte wrote:

> I think there are some pseudo-random number generators which
> can be made to work with any range, but do not recall which ones right
> now.

There's a simple technique that works on top of a Feistel network,
called the cycle-walking cipher. Described for instance at:
http://web.cs.ucdavis.edu/~rogaway/papers/subset.pdf

I'm using the opportunity to add a wiki page:
https://wiki.postgresql.org/wiki/Pseudo_encrypt_constrained_to_an_arbitrary_range
with sample plpgsql code for the [0..10,000,000] range that might be useful
for other cases.
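
For readers who do not want to leave the thread, the idea fits in a few
lines of plpgsql. This is only a sketch of the technique, not the wiki
page's code: a tiny Feistel network gives a bijection on [0, 2^24), and
the cycle-walking loop re-applies it until the value falls back into
the target range.

-- A Feistel network on two 12-bit halves: a bijection on [0, 2^24).
-- Each round maps (l, r) to (r, l XOR F(r)), which is invertible for
-- any round function F, so the whole construction is a permutation.
CREATE OR REPLACE FUNCTION feistel24(x bigint) RETURNS bigint AS $$
DECLARE
  l bigint := (x >> 12) & 4095;
  r bigint := x & 4095;
  t bigint;
BEGIN
  FOR i IN 1..4 LOOP
    t := l # (((1366 * r + 150889) % 714025) & 4095);
    l := r;
    r := t;
  END LOOP;
  RETURN (l << 12) | r;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- Cycle-walking: re-encrypt until the result lands in [0, maxval].
-- For inputs that are themselves in [0, maxval] this is a permutation
-- of [0, maxval]; with maxval = 10000000 and a 2^24 domain the loop
-- runs fewer than two iterations on average.
CREATE OR REPLACE FUNCTION feistel24_range(x bigint, maxval bigint) RETURNS bigint AS $$
DECLARE
  y bigint := x;
BEGIN
  LOOP
    y := feistel24(y);
    EXIT WHEN y <= maxval;
  END LOOP;
  RETURN y;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- e.g. SELECT feistel24_range(s, 10000000) FROM generate_series(0, 10000000) AS s;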

But for the btree fragmentation and final size issue, TBH I don't expect
that constraining the values within that smaller range will make any
difference
in the tests, because it's the dispersion that matters, not the values
themselves.

I mean that, whether the values are well dispersed in the [0..1e7] range or
equally well dispersed in the [0..2**32] range, the probability of a newly
inserted value comparing greater or lower than any previous value should
be the same, so shouldn't the page splits be the same, statistically
speaking?

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-18 Thread Francisco Olarte
Daniel:
On Thu, Aug 18, 2016 at 5:24 PM, Daniel Verite  wrote:
>> unless you know of an easy way to generate a random permutation on the
>> fly without using a lot of memory, which I do not.
> It could be done by encrypting the stream.
> For 32 bits integers:
> https://wiki.postgresql.org/wiki/Skip32
> For 64 bits integers:
> https://wiki.postgresql.org/wiki/XTEA

Nearly, and probably good enough for tests, but it only generates a
pseudorandom permutation if you encrypt all 2**32 (or 2**64) values;
with the 1..1E7 range it will map them to 1E7 different numbers
scattered over the full 2**32 (or 2**64) range. I think there are some
pseudo-random number generators which can be made to work with any
range, but do not recall which ones right now.

Francisco Olarte.




Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-18 Thread Daniel Verite
Francisco Olarte wrote:

> unless you know of an easy way to generate a random permutation on the
> fly without using a lot of memory, which I do not.

It could be done by encrypting the stream.

For 32 bits integers:
https://wiki.postgresql.org/wiki/Skip32

For 64 bits integers:
https://wiki.postgresql.org/wiki/XTEA


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-18 Thread Francisco Olarte
Hi:

On Thu, Aug 18, 2016 at 1:32 PM, pinker  wrote:
...
> create table t01 (id bigint);
> create index i01 on t01(id);
> insert into t01 SELECT s from generate_series(1,10000000) as s;
>
> and random values:
> create table t02 (id bigint);
> create index i02 on t02(id);
> insert into t02 SELECT random()*100 from generate_series(1,10000000) as s;

It's already been told that btrees work that way, if you find it
strange read a bit about them, this is completely normal, but ...

... what I come to point out is that your test is severely flawed. It
probably does not matter in this case, but you are inserting 10M
DIFFERENT VALUES in the first case and only 100 in the second one,
which is an average of 100K DUPLICATES of each. This affects btrees
too. You could try using random()*1G, or at least 100M, for a better
test (which may have even worse behaviour; ideally I would just write
10M integers to a disk file, then shuffle it, and compare COPY FROM
times for both, unless you know of an easy way to generate a random
permutation on the fly without using a lot of memory, which I do not).
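
A sketch of that file-based test, assuming the server may read and
write files in /tmp (paths and the key count are illustrative):

-- Write the keys once, sorted (server-side COPY, needs file access
-- rights on the server):
COPY (SELECT s FROM generate_series(1, 10000000) AS s)
  TO '/tmp/keys_sorted.txt';

-- Shuffle outside the database, e.g. in a shell on the server:
--   sort -R /tmp/keys_sorted.txt > /tmp/keys_shuffled.txt

-- Now the two loads differ only in key order:
COPY t01 FROM '/tmp/keys_sorted.txt';
COPY t02 FROM '/tmp/keys_shuffled.txt';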

Francisco Olarte.




Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-18 Thread pinker


On 2016-08-18 14:19:25, Ilya Kazakevich wrote:
> >Thank you. So if that is the reason, changing the fillfactor parameter
> >should help?
> 
> Fillfactor is not about rebalancing, but about page splits. If you have many
> insertions you may decrease fillfactor to minimize page splits, but I am not
> sure it will help in your case. But you should try.
> A better approach is to create the index _after_ insertion, but that is not
> always possible.
> 
> 
> Ilya Kazakevich
> 
> JetBrains
> http://www.jetbrains.com
> The Drive to Develop
> 

From the link you have pasted:
"Both insertions and deletions are fast as long as space is available on a 
block. If an insertion won't fit on the block, then some free space on some 
nearby block must be found and the auxiliary indices adjusted. The hope is that 
enough space is nearby such that a lot of blocks do not need to be reorganized."

and from the postgres documentation:

fillfactor

The fillfactor for an index is a percentage that determines how full the 
index method will try to pack index pages. For B-trees, leaf pages are filled 
to this percentage during initial index build, and also when extending the 
index at the right (adding new largest key values).

So splitting happens when no room is left on the page. But before that,
the room can be used for further insertions...






Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-18 Thread Ilya Kazakevich
>Thank you. So if that is the reason, changing the fillfactor parameter
>should help?

Fillfactor is not about rebalancing, but about page splits. If you have many
insertions you may decrease fillfactor to minimize page splits, but I am not
sure it will help in your case. But you should try.
A better approach is to create the index _after_ insertion, but that is not
always possible.
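
For completeness, both suggestions in SQL (70 is an arbitrary example
value; the table and index names come from the test earlier in the
thread):

-- Lower fillfactor: leaf pages are left 30% empty at build time, so
-- later out-of-order insertions find free room before having to split.
CREATE INDEX i_random ON t_random (id) WITH (fillfactor = 70);

-- Or build the index after the bulk load: CREATE INDEX sorts the data
-- and writes the btree sequentially, avoiding splits entirely.
DROP INDEX IF EXISTS i_random;
INSERT INTO t_random SELECT * FROM source_random;
CREATE INDEX i_random ON t_random (id);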


Ilya Kazakevich

JetBrains
http://www.jetbrains.com
The Drive to Develop





Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-18 Thread pinker


On 2016-08-18 14:00:31, Ilya Kazakevich wrote:
> Hi, 
> 
> >What's the reason that postgres needs more index pages to store random
> >data
> >than sequential ones?
> 
> I assume that is because a B-Tree is a self-balancing tree, so it needs to be
> rebalanced after each insertion.
> Random insertions may go to the head of the index where no space is left,
> leading to a lot of data movement.
> https://en.wikipedia.org/wiki/B-tree#Insertions_and_deletions
> 
> 
> 
> Ilya Kazakevich
> 
> JetBrains
> http://www.jetbrains.com
> The Drive to Develop
> 
> 

Thank you. So if that is the reason, changing the fillfactor parameter
should help?





Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-18 Thread Ilya Kazakevich
Hi, 

>What's the reason that postgres needs more index pages to store random
>data
>than sequential ones?

I assume that is because a B-Tree is a self-balancing tree, so it needs to be
rebalanced after each insertion.
Random insertions may go to the head of the index where no space is left,
leading to a lot of data movement.
https://en.wikipedia.org/wiki/B-tree#Insertions_and_deletions



Ilya Kazakevich

JetBrains
http://www.jetbrains.com
The Drive to Develop


