Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-06 Thread Teodor Sigaev

As I understood it's because we can't move root to another page.


Actually not. Path to node could change completely, For example, for page on 
second level path is 0-234 (where 0 is a root page, 234 is a page on second 
level). After root split path will be 0-new_page-234.


If algorithm could be able to change root then new path could be looked as 
new_root-new_page-234 because old root could be splitted to old_root_page and 
new_page.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-06 Thread Alexander Korotkov
2011/5/6 Teodor Sigaev teo...@sigaev.ru

  As I understood it's because we can't move root to another page.


 Actually not. Path to node could change completely, For example, for page
 on second level path is 0-234 (where 0 is a root page, 234 is a page on
 second level). After root split path will be 0-new_page-234.

 If algorithm could be able to change root then new path could be looked as
 new_root-new_page-234 because old root could be splitted to old_root_page
 and new_page.


Ok. Thank you for explanation.


With best regards,
Alexander Korotkov.


Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-05 Thread Teodor Sigaev

gistFindCorrectParent function have branch with following comment:
/*
* awful!!, we need search tree to find parent ... , but before we
* should release all old parent
*/
Can you provide me an example of case when this branch works?


See several lines above:
if (parent-blkno == InvalidBlockNumber)

/*
 * end of chain and still didn't found parent, It's very-very
 * rare situation when root splited
 */
break;


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-05 Thread Alexander Korotkov
2011/5/5 Teodor Sigaev teo...@sigaev.ru

 See several lines above:
if (parent-blkno == InvalidBlockNumber)

/*
 * end of chain and still didn't found parent, It's
 very-very
 * rare situation when root splited
 */
break;


As I understood it's because we can't move root to another page.


With best regards,
Alexander Korotkov.


Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-04 Thread Alexander Korotkov
During studying of existing GiST code I have a question.
gistFindCorrectParent function have branch with following comment:
 /*
 * awful!!, we need search tree to find parent ... , but before we
 * should release all old parent
 */
Can you provide me an example of case when this branch works?


With best regards,
Alexander Korotkov.


Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-04-27 Thread Alexander Korotkov
On Tue, Apr 26, 2011 at 1:10 PM, Alexander Korotkov aekorot...@gmail.comwrote:

 Since algorithm is focused to reduce I/O, we should expect best
 acceleration in the case when index doesn't fitting to memory. Size of
 buffers is comparable to size of whole index. It means that if we can hold
 buffers in memory then we mostly can hold whole index in memory. That's why
 I think we should have simple on-disk buffers management for first
 representative benchmark.

Since we need to free all buffers after index built, I believe that buffers
should be stored separately. If not, index become bloat immediatly after
creation. We probably need to create a temporary relation to store buffers
in it. If my thought is right, then is there any example of using temporary
relation?


With best regards,
Alexander Korotkov.


Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-04-27 Thread Heikki Linnakangas

On 27.04.2011 09:51, Alexander Korotkov wrote:

On Tue, Apr 26, 2011 at 1:10 PM, Alexander Korotkovaekorot...@gmail.comwrote:


Since algorithm is focused to reduce I/O, we should expect best
acceleration in the case when index doesn't fitting to memory. Size of
buffers is comparable to size of whole index. It means that if we can hold
buffers in memory then we mostly can hold whole index in memory. That's why
I think we should have simple on-disk buffers management for first
representative benchmark.


Since we need to free all buffers after index built, I believe that buffers
should be stored separately. If not, index become bloat immediatly after
creation. We probably need to create a temporary relation to store buffers
in it. If my thought is right, then is there any example of using temporary
relation?


A temporary relation is a bit heavy-weight for this, a temporary file 
should be enough. See src/backend/storage/file/buffile.c, 
BufFileCreateTemp() function in particular. Or perhaps a tuplestore 
suits better, see src/backend/utils/sort/tuplestore.c, that's simpler to 
use if you're storing tuples. tuplestore only supports storing heap 
tuples at the moment, but it could easily be extended to store index 
tuples, like tuplesort.c does.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-04-26 Thread Heikki Linnakangas

Congrats on being selected, looking forward to mentor you!

On 25.04.2011 23:09, Alexander Korotkov wrote:

The first question that I would like to discuss is the node buffer storage.
During index build each index page (except leaf) should have several pages
of buffer. So my question is where to store buffers and how to operate with
them? It is somewhat similar to GIN fastupdate buffer, but have differences.
At first, we should take care about many buffers instead of only one. At
second, I belive that we shouldn't take care about concurrency so much,
because algorithm assume to perform relatively huge operations in memory
(entries relocation between several buffers). That require locking of whole
of currently operated buffers. I'm going to store buffers separetely from
index itself, because we should free all of them when index is built.


Just palloc() the buffers in memory, at least in the first phase. 
That'll work fine for index creation. Dealing with concurrent searches 
and inserts makes it a lot more complicated, it's better to make it work 
for the index creation first, and investigate something like the GIN 
fastupdate buffers later if you have time left.



I found some very simple solution about dealing with varlena keys. The
greatest buffer size and minimal level step are achived when key size is
minimal. Thereby, minimal key size is worst case. Since minimal varlena size
is 4 bytes, we can use it in initial calculations. I'm going to hold on this
assumption in first implementation.


Ok, good.

The first priority should be to have something that works enough to be 
benchmarked. The paper you referred to in the GSoC application [1] 
contained empirical results on the number of I/O operations needed with 
the algorithm, but it didn't take operating system cache into account at 
all. That makes the empiric results next to worthless; keeping some 
stuff in in-memory buffers is obviously going to reduce I/O if you don't 
take OS cache into account.


So we're going to need benchmark results that show a benefit, or there's 
no point in doing this at all. The sooner we get to benchmarking, even 
with a very limited and buggy version of the patch, the better. If the 
algorithm described in that paper doesn't give much benefit, you might 
have to switch to some other algorithm half-way through the project. 
Fortunately there's plenty of R-tree bulk loading algorithms in the 
literature, it should be possible to adapt some of them to GiST.


[1] http://dx.doi.org/10.1007/s00453-001-0107-6

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-04-26 Thread Alexander Korotkov
On Tue, Apr 26, 2011 at 10:46 AM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 Just palloc() the buffers in memory, at least in the first phase. That'll
 work fine for index creation. Dealing with concurrent searches and inserts
 makes it a lot more complicated, it's better to make it work for the index
 creation first, and investigate something like the GIN fastupdate buffers
 later if you have time left.

Since algorithm is focused to reduce I/O, we should expect best acceleration
in the case when index doesn't fitting to memory. Size of buffers is
comparable to size of whole index. It means that if we can hold buffers in
memory then we mostly can hold whole index in memory. That's why I think we
should have simple on-disk buffers management for first representative
benchmark.


 The first priority should be to have something that works enough to be
 benchmarked. The paper you referred to in the GSoC application [1] contained
 empirical results on the number of I/O operations needed with the algorithm,
 but it didn't take operating system cache into account at all. That makes
 the empiric results next to worthless; keeping some stuff in in-memory
 buffers is obviously going to reduce I/O if you don't take OS cache into
 account.

 So we're going to need benchmark results that show a benefit, or there's no
 point in doing this at all. The sooner we get to benchmarking, even with a
 very limited and buggy version of the patch, the better. If the algorithm
 described in that paper doesn't give much benefit, you might have to switch
 to some other algorithm half-way through the project. Fortunately there's
 plenty of R-tree bulk loading algorithms in the literature, it should be
 possible to adapt some of them to GiST.

 [1] http://dx.doi.org/10.1007/s00453-001-0107-6

Yes, these priority seems very reasonable. We should have first
effectiveness confirmation as soon as possible. I'll hold on this priority.


With best regards,
Alexander Korotkov.