[HACKERS] 答复: [HACKERS] 答复: GiST API Adancement

2017-06-17 Thread Yuan Dong
2017-06-16 17:06 GMT+03:00 Tom Lane <t...@sss.pgh.pa.us>:
> Yuan Dong <doff...@hotmail.com> writes:
>> ·¢¼þÈË: Andrew Borodin <boro...@octonica.com>
>>> I think there is one more aspect of development: backward
>>> compatibility: it's impossible to update all existing extensions. This
>>> is not that major feature to ignore them.
>
>> I should still maintain original API of GiST after modification.
>
> To put a little context on that: we have always felt that it's okay
> to require extension authors to make small adjustments when going to
> a new major release.


Tom's comment is very helpful, I will keep it in mind.


>Then maybe it worth to consider more general API advancement at once?
>Here's my wish list:
>1. Choose subtree function as an alternative for penalty
>2. Bulk load support
>3. Better split framework: now many extensions peek two seeds and
>spread split candidates among them, some with subtle but noisy
>mistakes
>4. Better way of key compares (this thread, actually)
>5. Split in many portions

Okay, I would try to do more advancements step by step.

I do not understand the problem of choosing subtree as an alternative for 
penalty. What do yo mean by choosing subtree function?


There will be more questions coming, hope you will be comfortable with that~


Best wishes.

--

Dong


____
发件人: Andrew Borodin <boro...@octonica.com>
发送时间: 2017年6月16日 16:48:04
收件人: Tom Lane
抄送: Yuan Dong; pgsql-hackers@postgresql.org
主题: Re: [HACKERS] 答复: GiST API Adancement

2017-06-16 17:06 GMT+03:00 Tom Lane <t...@sss.pgh.pa.us>:
> Yuan Dong <doff...@hotmail.com> writes:
>> ·¢¼þÈË: Andrew Borodin <boro...@octonica.com>
>>> I think there is one more aspect of development: backward
>>> compatibility: it's impossible to update all existing extensions. This
>>> is not that major feature to ignore them.
>
>> I should still maintain original API of GiST after modification.
>
> To put a little context on that: we have always felt that it's okay
> to require extension authors to make small adjustments when going to
> a new major release.

Then maybe it worth to consider more general API advancement at once?
Here's my wish list:
1. Choose subtree function as an alternative for penalty
2. Bulk load support
3. Better split framework: now many extensions peek two seeds and
spread split candidates among them, some with subtle but noisy
mistakes
4. Better way of key compares (this thread, actually)
5. Split in many portions

Best regards, Andrey Borodin.


[HACKERS] 答复: GiST API Adancement

2017-06-16 Thread Yuan Dong
Hi Andrey,


Thank you for your suggestion.



I should still maintain original API of GiST after modification.


--

Dong




发件人: Andrew Borodin <boro...@octonica.com>
发送时间: 2017年6月16日 6:24:03
收件人: Yuan Dong
抄送: pgsql-hackers@postgresql.org
主题: Re: GiST API Adancement

Hi, Dong!

2017-06-15 21:19 GMT+05:00 Yuan Dong <doff...@hotmail.com>:
> I'm going to hack on my own. With the help of Andrew Borodin, I want to
> start the project with adding a third state to collision check. The third
> state is that:
>  subtree is totally within the query. In this case, GiST scan can scan down
> subtree without key checks.

That's fine!

> After reading some code, I get this plan to modify the code:
>
> 1. Modify the consistent function of datatypes supported by GiST.
>
> 1.1 Start with cube, g_cube_consistent should return a third state when
> calling g_cube_internal_consistent. Inspired by Andrew Borodin, I have two
> solutions, 1)modify return value, 2) modify a reference type parameter.
>
> 2. Modify the gistindex_keytest in gistget.c to return multiple states
>
> 2.1 Need declare a new enum type(or define values) in gist_private.h or
> somewhere
>
> 3. Modify the gitsScanPage in gistget.c to deal with the extra state
>
> 4. Add a state to mark the nodes under this GISTSearchItem are all with in
> the query
>
> 4.1 Two ways to do this: 1) add a state to GISTSearchItem (prefered) 2)
> Somewhere else to record all this kind of items
>
> 4.2 We may use the block number as the key
>
> 4.3 Next time when the gistScanPage met this item, just find all the leaves
> directly(need a new function)
>
> After this, I shall start to benchmark the improvement and edit the code of
> other datatypes.
>
> Hope you hackers can give me some suggestions~

I think there is one more aspect of development: backward
compatibility: it's impossible to update all existing extensions. This
is not that major feature to ignore them.

Though for benchmarking purposes backward compatibility can be omitted.

Best regards, Andrey Borodin


[HACKERS] GiST API Adancement

2017-06-15 Thread Yuan Dong
Hi hackers,

I want to hack a little. I applied for GSoC project 
(https://wiki.postgresql.org/wiki/GSoC_2017#GiST_API_advancement), but now I'm 
going to hack on my own. With the help of Andrew Borodin, I want to start the 
project with adding a third state to collision check. The third state is that:
 subtree is totally within the query. In this case, GiST scan can scan down 
subtree without key checks.

After reading some code, I get this plan to modify the code:

1. Modify the consistent function of datatypes supported by GiST.

1.1 Start with cube, g_cube_consistent should return a third state when calling 
g_cube_internal_consistent. Inspired by Andrew Borodin, I have two solutions, 
1)modify return value, 2) modify a reference type parameter.

2. Modify the gistindex_keytest in gistget.c to return multiple states

2.1 Need declare a new enum type(or define values) in gist_private.h or 
somewhere

3. Modify the gitsScanPage in gistget.c to deal with the extra state

4. Add a state to mark the nodes under this GISTSearchItem are all with in the 
query

4.1 Two ways to do this: 1) add a state to GISTSearchItem (prefered) 2) 
Somewhere else to record all this kind of items

4.2 We may use the block number as the key

4.3 Next time when the gistScanPage met this item, just find all the leaves 
directly(need a new function)

After this, I shall start to benchmark the improvement and edit the code of 
other datatypes.

Hope you hackers can give me some suggestions~

--
Dong