Re: [HACKERS] GSoC 2017 : Patch for predicate locking in Gist index

2017-10-02 Thread Andrew Borodin
On Mon, Oct 2, 2017 at 8:00 PM, Alexander Korotkov wrote: > What happen if exactly this "continue" fires? > >> if (GistFollowRight(stack->page)) >> { >> if (!xlocked) >> { >> LockBuffer(stack->buffer, GIST_UNLOCK); >> LockBuffer(stack->buffer, GIST_EXCLUSIVE); >> xlocked = true; >> /* someone migh

Re: [HACKERS] GSoC 2017 : Patch for predicate locking in Gist index

2017-10-02 Thread Andrew Borodin
Hi, Alexander! Thanks for looking into the patch! On Thu, Sep 28, 2017 at 3:59 PM, Alexander Korotkov < a.korot...@postgrespro.ru> wrote: > > > In gistdoinsert() you do CheckForSerializableConflictIn() only if page > wasn't exclusively locked before (xlocked is false). > > if (!xlocked) >> { >>

[HACKERS] Re: [HACKERS] 答复: GiST API Adancement

2017-06-16 Thread Andrew Borodin
2017-06-16 17:06 GMT+03:00 Tom Lane : > Yuan Dong writes: >> ·¢¼þÈË: Andrew Borodin >>> I think there is one more aspect of development: backward >>> compatibility: it's impossible to update all existing extensions. This >>> is not that major feature to

Re: [HACKERS] GiST API Adancement

2017-06-15 Thread Andrew Borodin
Hi, Dong! 2017-06-15 21:19 GMT+05:00 Yuan Dong : > I'm going to hack on my own. With the help of Andrew Borodin, I want to > start the project with adding a third state to collision check. The third > state is that: > subtree is totally within the query. In this case, GiST s

Re: [HACKERS] GSoC 2017 weekly progress reports (week 2)

2017-06-13 Thread Andrew Borodin
2017-06-13 18:00 GMT+05:00 Shubham Barai : > > Project: Explicitly support predicate locks in index AMs besides b-tree > Hi, Shubham Good job! So, in current HEAD test predicate_gist_2.spec generate false positives, but with your patch, it does not? I'd suggest keeping spec tests with your code in

Re: [HACKERS] Range Merge Join v1

2017-06-02 Thread Andrew Borodin
2017-06-02 19:42 GMT+05:00 Jeff Davis : > On Tue, May 30, 2017 at 11:44 PM, Andrew Borodin wrote: >> 1. Are there any types, which could benefit from Range Merge and are >> not covered by this patch? > > I thought about this for a while, and the only thing I can think of &

Re: [HACKERS] GSoC 2017 : Proposal for predicate locking in gist index

2017-05-31 Thread Andrew Borodin
Hi, hackers! 2017-06-01 1:50 GMT+05:00 Kevin Grittner : > > The main difference between b-tree and gist index while searching for a > > target tuple is that in gist index, we can determine if there is a match or > > not at any level of the index. > > Agreed. A gist scan can fail at any level, but

Re: [HACKERS] Range Merge Join v1

2017-05-30 Thread Andrew Borodin
Hi, Jeff! Sorry for being late. Actually, I had several unsuccessful attempts to find something wrong with the patch. Here's my review. in pathkey.c ecs = (EquivalenceClass **) palloc(nClauses * sizeof(EquivalenceClass *)); scores = (int *) palloc(nClauses * sizeof(int)); range_ecs = palloc(nCla

Re: [HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-05-29 Thread Andrew Borodin
2017-05-29 0:00 GMT+05:00 Tom Lane : > But the opclass's consistent function might expect to be always given an > untoasted input, in which case you'd need the decompress function to fix > that up. In cube data is detoasted at least 4 times before going to g_cube_internal_consistent(), including o

Re: [HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-05-28 Thread Andrew Borodin
28 мая 2017 г. 11:15 PM пользователь "Tom Lane" написал: Andrew Borodin writes: > 2017-05-28 22:22 GMT+05:00 Tom Lane : >> 2. If compress/decompress are omitted, then we could support index-only >> scans all the time, that is a no-op fetch function would work. The &

Re: [HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-05-28 Thread Andrew Borodin
Thanks, Tom! 2017-05-28 22:22 GMT+05:00 Tom Lane : > Andrew Borodin writes: >> Maybe we should make compress\decompress functions optional? > > 1. You'll need to adjust the documentation (gist.sgml) not just the code. Sure, I'll check out where compression is mentioned

[HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-05-25 Thread Andrew Borodin
Hi, hackers! Currently, GiST stores each attribute in a compressed form. That is, each time attribute is written it's calling compress function, and when the attribute is accessed the decompress functions is called. Some types can't get any advantage out of this technique since the context of one

[HACKERS] Index Only Scan support for cube

2017-05-23 Thread Andrew Borodin
Hi, hackers! Here's a small patch that implements fetch function necessary for Index Only Scans that use cube data type. I reuse function g_cube_decompress() instead of creating new function g_cube_fetch(). Essentially, they both have to detoast data. How do you think, is it better to create a sh

Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-30 Thread Andrew Borodin
Hi, Kang and everyone in this thread. I'm planning to present the online course "Hacking PostgreSQL: data access methods in action and under the hood" on edX on June 1st. It's not announced yet, links will be available later. This course I'm describing information that was crucial for me to start

Re: [HACKERS] Merge join for GiST

2017-04-27 Thread Andrew Borodin
Hi, hackers! I've adapted crossmatch join from pgSphere to cube for performance tests. I've placed spatial join code here https://github.com/Octonica/postgres/blob/spatialjoin/contrib/cube/spatialjoin.c and node code here https://github.com/Octonica/postgres/blob/spatialjoin/contrib/cube/joinnode.

Re: [HACKERS] PG 10 release notes

2017-04-26 Thread Andrew Borodin
Hi, Bruce! 2017-04-25 6:31 GMT+05:00 Bruce Momjian : > The only unusual thing is that this release has ~180 items while most > recent release have had ~220. The pattern I see that there are more > large features in this release than previous ones. I'm not sure, but, probably, it worth mentioning

Re: [HACKERS] Range Merge Join v1

2017-04-21 Thread Andrew Borodin
Hi, Jeff! I'm planning to provide the review for this patch in future. I've done some performance tesing (see attachment). All in all, nothing important, everything works fine. Tests were executed under current HEAD on Ubuntu 16 over MacBook Air. I observe that: 1. Patch brings performance improv

Re: [HACKERS] Merge join for GiST

2017-04-13 Thread Andrew Borodin
2017-04-13 11:30 GMT+05:00 Jeff Davis : > I don't quite follow. I don't think any of these proposals uses btree, > right? Range merge join doesn't need any index, your proposal uses > gist, and PgSphere's crossmatch uses gist. Merge join will use presorted data, B-tree provides sorted data. Merge

Re: [HACKERS] Merge join for GiST

2017-04-12 Thread Andrew Borodin
2017-04-13 7:01 GMT+05:00 Jeff Davis : > On Tue, Apr 11, 2017 at 8:35 AM, Alexander Korotkov > wrote: >> On Tue, Apr 11, 2017 at 5:46 PM, Jeff Davis wrote: >>> Do you have a sense of how this might compare with range merge join? >> >> >> If you have GiST indexes over ranges for both sides of join

Re: [HACKERS] Merge join for GiST

2017-04-11 Thread Andrew Borodin
2017-04-11 14:17 GMT+05:00 Alexander Korotkov : > FYI, I've implemented this algorithm for pgsphere. See following branch. > https://github.com/akorotkov/pgsphere/tree/experimental > It's implemented as crossmatch() function which takes as arguments names of > two indexes over spoint and maximum d

Re: [HACKERS] Merge join for GiST

2017-04-11 Thread Andrew Borodin
2017-04-10 20:38 GMT+05:00 Robert Haas : > On Mon, Apr 10, 2017 at 7:53 AM, Andrew Borodin wrote: >> I think this idea is somewhat related to this patch [2], but as for >> now cannot describe how exactly GiST merge and Range Merge features >> relate. > > It also seem

[HACKERS] Merge join for GiST

2017-04-10 Thread Andrew Borodin
Hello, hackers! ==Spatial joins== Scientific papers from the dawn of R-trees and multidimensional indexes feature a lot of algorithms for spatial joins. I.e. you have two sets of geometries s1 and s2, you need to produce all colliding pairs (p1,p2) where p1 in s1 and p2 in s2. For 2 R-trees of equ

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-03-22 Thread Andrew Borodin
2017-03-22 22:48 GMT+05:00 Teodor Sigaev : > hasEmptyChild? and hasNonEmptyChild (BTW, isAnyNonempy has missed 't') Yes, I think this naming is good. It's clear what's in common in these flags and what's different. > And if the whole posting tree is empty,then we could mark root page as leaf > an

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-03-21 Thread Andrew Borodin
Hi, Teodor! 2017-03-21 20:32 GMT+05:00 Teodor Sigaev : > I had a look on patch That's great, thanks! > > /* > * All subtree is empty - just return TRUE to indicate that parent > must > * do a cleanup. Unless we are ROOT an there is way to go upper. > */ > >

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-03-16 Thread Andrew Borodin
2017-03-16 23:55 GMT+05:00 Peter Geoghegan : > On Thu, Mar 16, 2017 at 11:53 AM, Andrew Borodin wrote: >> 2. Thus, L&S fully concurrent vacuum is possible, indeed, and >> furthermore Theodor suggested that I should implement not only page >> eviction, but also pag

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-03-16 Thread Andrew Borodin
2017-03-16 21:27 GMT+05:00 David Steele : > This patch applies cleanly and compiles at cccbdde. > > Jeff, any thoughts on Andrew's responses? Hi, David! I've got some updates on the matter of this patch, since the understanding of the B-tree bothered me much. Currently, I'm at PgConf.Russia, wher

Re: [HACKERS] [GSoC] Self introduction and question to mentors

2017-03-09 Thread Andrew Borodin
Hi, Kirill! 2017-03-06 12:41 GMT+05:00 Кирилл Бороздин : > My name is Kirill Borozdin. I am a student of Ural Federal University from > Yekaterinburg, Russia. That's fine. I'm the associate professor at Ural Federal University. But not in your institution, though. > I understand that this task wi

Re: [HACKERS] [GSoC] Personal presentation and request for clarification

2017-03-03 Thread Andrew Borodin
Hi! It's great that you are interested in PostgreSQL! I'll answer the question on the matter of GSoC project proposed by me. I hope someone else will handle questions on your primary objective. 2017-03-03 4:15 GMT+05:00 João Miguel Afonso : > My second choice would be the "Sorting algorithms be

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-02-05 Thread Andrew Borodin
Hi, Jeff! 2017-02-05 3:45 GMT+05:00 Jeff Davis : > On Sun, Jan 22, 2017 at 10:32 PM, Jeff Davis wrote: >> On Sat, Jan 21, 2017 at 4:25 AM, Andrew Borodin wrote: >> One idea I had that might be simpler is to use a two-stage page >> delete. The first stage would remove the

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-01-29 Thread Andrew Borodin
2017-01-25 12:09 GMT+05:00 Andrew Borodin : > 2017-01-24 22:29 GMT+05:00 Jeff Davis : >> By the way, can you show me where the Lehman and Yao paper addresses >> page recycling? > > Here J. Hellerstein comments L&Y paper [1] saying that effectively > there is no deletio

Re: [HACKERS] pg_background contrib module proposal

2017-01-27 Thread Andrew Borodin
2017-01-27 19:14 GMT+05:00 Peter Eisentraut : > I suppose we should decide first whether we want pg_background as a > separate extension or rather pursue extending dblink as proposed elsewhere. > > I don't know if pg_background allows any use case that dblink can't > handle (yet). pg_background in

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-01-24 Thread Andrew Borodin
2017-01-24 22:29 GMT+05:00 Jeff Davis : > On Tue, Jan 24, 2017 at 3:18 AM, Andrew Borodin wrote: >> Technically, approach of locking a subtree is not novel. Lehman and >> Yao focused on "that any process for manipulating the tree uses only a >> small (constant) number

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-01-24 Thread Andrew Borodin
Hello! I see your point on alternatives. I will do my best to generate more ideas on how vacuum can be done withing existing index structure, this will take a day or two. In this message I'll answer some questions. 2017-01-23 22:45 GMT+05:00 Jeff Davis : > 1. I haven't seen the approach before of

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-01-23 Thread Andrew Borodin
Hi! 2017-01-23 11:32 GMT+05:00 Jeff Davis : > I have some reservations about the complexity and novelty of the > approach. I don't think I've seen an approach quite like this one > before. It seems like the main reason you are using this approach is > because the btree structure was too simple (no

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-01-21 Thread Andrew Borodin
Hello Jeff! >Review of the code itself: How do you think, is there anything else to improve in that patch or we can mark it as 'Ready for committer'? I've checked the concurrency algorithm with original work of Lehman and Yao on B-link-tree. For me everything looks fine (safe, deadlock free and n

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-01-19 Thread Andrew Borodin
Hi, Jeff! Thanks for review! Here's a patch corrected according to your suggestions. 2017-01-19 11:48 GMT+05:00 Jeff Davis : > https://www.postgresql.org/message-id/CAJEAwVE4UAmm8fr%2BNW8XTnKV6M--ACoNhL3ES8yoKL2sKhbaiw%40mail.gmail.com > > Let me re-summarize what's been done here to make sure I

Re: [HACKERS] background sessions

2017-01-11 Thread Andrew Borodin
2017-01-12 9:01 GMT+05:00 Peter Eisentraut : > On 1/10/17 10:58 AM, Robert Haas wrote: >> On Thu, Dec 29, 2016 at 5:18 PM, Peter Eisentraut >> wrote: >>> For additional entertainment, I include patches that integrate >>> background sessions into dblink. So dblink can open a connection to a >>> ba

Re: [HACKERS] GSoC 2017

2017-01-10 Thread Andrew Borodin
2017-01-10 14:53 GMT+05:00 Alexander Korotkov : > 1. What project ideas we have? I have one more project of interest which I can mentor. Topic. GiST API advancement ===Idea=== GiST API was designed at the beginning of 90th to reduce boilerplate code around data access methods over balanced tree.

Re: [HACKERS] GSoC 2017

2017-01-10 Thread Andrew Borodin
2017-01-10 14:53 GMT+05:00 Alexander Korotkov : > 1. What project ideas we have? Hi! I'd like to propose project on sorting algorithm research. I’m ready to be a mentor on this project. ===Topic=== Sorting algorithms benchmark and implementation. ===Idea=== Currently the PostgreSQL uses Hoare’s

Re: [HACKERS] pg_background contrib module proposal

2017-01-07 Thread Andrew Borodin
Hi! 2017-01-07 11:44 GMT+05:00 amul sul : > Changes: > 1. pg_background_launch renamed to pg_background_start > 2. pg_background_detach renamed to pg_background_close > 3. Added new api to display previously launch background worker & its > result count waiting to be read. Great work! > Note th

Re: [HACKERS] background sessions

2017-01-04 Thread Andrew Borodin
2017-01-04 10:23 GMT+05:00 amul sul : > One more query, can we modify > BackgroundSessionStart()/BackgroundSession struct to get background > worker PID as well? I think since session always has a PID it's absoultley reasonable to return PID. > I can understand this requirement could be sound usel

Re: [HACKERS] background sessions

2017-01-03 Thread Andrew Borodin
2017-01-03 19:39 GMT+05:00 Peter Eisentraut : > On 1/3/17 1:26 AM, amul sul wrote: > > One more requirement for pg_background is session, command_qh, > > response_qh and worker_handle should be last longer than current > > memory context, for that we might need to allocate these in > > TopMemoryCo

Re: [HACKERS] pg_background contrib module proposal

2016-12-21 Thread Andrew Borodin
2016-12-21 20:42 GMT+05:00 Robert Haas : > This whole subthread seems like a distraction to me. I find it hard > to believe that this test case would be stable enough to survive the > buildfarm where, don't forget, we have things like > CLOBBER_CACHE_ALWAYS machines where queries take 100x longer

Re: [HACKERS] pg_background contrib module proposal

2016-12-20 Thread Andrew Borodin
2016-12-21 4:55 GMT+05:00 David Fetter : > I see. > > I find the following a little easier to follow, and the sleeps still > work even when very short. Now I know a little more SQL :) Thank you. I'm not sure every platform supports microsecond sleeps. But it will work anyway. This test is determi

Re: [HACKERS] pg_background contrib module proposal

2016-12-19 Thread Andrew Borodin
2016-12-19 4:21 GMT+05:00 David Fetter : > Couldn't it sleep in increments smaller than a second? Like maybe > milliseconds? Also, it's probably cleaner (or at least more > comprehensible) to write something using format() and dollar quoting > than the line with the double 's. Right. Here's vers

Re: [HACKERS] background sessions

2016-12-16 Thread Andrew Borodin
2016-12-16 20:17 GMT+05:00 Peter Eisentraut : >> And one more thing... Can we have BackgroundSessionExecute() splitted >> into two parts: start query and wait for results? >> It would allow pg_background to reuse bgsession's code. > > Yes, I will look into that. Thank you. I'm marking both patches

Re: [HACKERS] background sessions

2016-12-14 Thread Andrew Borodin
2016-12-15 0:30 GMT+05:00 Peter Eisentraut : TryBeginSession()? >>> >>> What exactly would that do? >> Return status (success\failure) and session object, if a function succeeded. >> >> If there is max_connections exceeded, then (false,null). >> >> I'm not sure whether this idiom is common for

Re: [HACKERS] background sessions

2016-12-14 Thread Andrew Borodin
2016-12-14 20:45 GMT+05:00 Peter Eisentraut : > On 12/11/16 5:38 AM, Andrew Borodin wrote: >> 2. From my point of view the interface of the feature is the strong >> point here (compared to pg_background). But it does not help >> programmer to follow good practice: bgwor

Re: [HACKERS] pg_background contrib module proposal

2016-12-13 Thread Andrew Borodin
2016-12-13 12:55 GMT+05:00 amul sul : > I think background-session code is not that much deviated from > pg_background code, It is not derived, though it is not much deviated. background sessions code do not have detailed exceptions and code comments, but it is doing somewhat similar things. >IIUC

Re: [HACKERS] pg_background contrib module proposal

2016-12-12 Thread Andrew Borodin
Hi! Just in case you'd like to include sleepsort as a test, here it is wrapped as a regression test(see attachment). But it has serious downside: it runs no less than 5 seconds. Also I'll list here every parallelism feature I managed to imagine. It is not to say that I suggest having some of thes

Re: [HACKERS] background sessions

2016-12-11 Thread Andrew Borodin
Hi! I'm planning to review this patch. I have some questions, maybe answers could help me understand patch better. 1. As far as I can see, we connot use COPY FROM STDIN in bg session? Since one of purposes is to orchestrate transactions, may be that would be valuable. 2. From my point of view the

Re: [HACKERS] pg_background contrib module proposal

2016-12-11 Thread Andrew Borodin
2016-12-09 18:00 GMT+05:00 Robert Haas : > It looks like this could be reworked as a client of Peter Eisentraut's > background sessions code, which I think is also derived from > pg_background: > > http://archives.postgresql.org/message-id/e1c2d331-ee6a-432d-e9f5-dcf85cffa...@2ndquadrant.com > > Th

Re: [HACKERS] pg_background contrib module proposal

2016-12-09 Thread Andrew Borodin
I've read the code and now I have more suggestions. 1. Easy one. The case of different natts of expected and acutal result throws same errmsg as the case of wrong types. I think we should assist the user more here if (natts != tupdesc->natts) ereport(ERROR, (errcode(ERRCODE_DATATYPE_M

Re: [HACKERS] pg_background contrib module proposal

2016-12-07 Thread Andrew Borodin
This is simply wonderful! Finaly, I can implement my favorite sleepsort in postgres: create table input as select round(random()*20) x from generate_series(1,5,1); create table output(place int,value int); create sequence s start 1; create table handles as select pg_background_launch('select pg_sl

Re: [HACKERS] GIN non-intrusive vacuum of posting tree

2016-11-30 Thread Andrew Borodin
Here is v1 of the patch. Now it has changes for README and contains more comments clarifying changes of locking model. Also I will elaborate some more on what is patched. Main portion of changes is made to function ginVacuumPostingTreeLeaves(). Before the patch it was traversing tree in depth-firs

[HACKERS] GIN non-intrusive vacuum of posting tree

2016-11-28 Thread Andrew Borodin
Hi hackers! Here is a patch with more concurrency-friendly vacuum of GIN. ===Short problem statement=== Currently GIN vacuum during cleaning posting tree can lock this tree for a long time without real need. ===Problem statement=== Vacuum of posting tree (function ginVacuumPostingTree() in ginva

Re: [HACKERS] Fractal tree indexing

2016-11-16 Thread Andrew Borodin
Hi hackers! Here is prototype of procrastinating GiST. Or lazy GiST. Or buffered GiST. Indeed, there is nothing fractal about it. The concept is leaf tuples stall on internal pages. From time to time they are pushed down in batches. This code is far from perfect, I've coded this only for the rese

Re: [HACKERS] GiST penalty functions [PoC]

2016-10-02 Thread Andrew Borodin
> This thread has basically died, so marking as returned with feedback. Well, not exactly died, but it's kind of in blind lead. I'll summarize a bit for future references: 1. cube extension for indexing lowered dimensionality data (2d in 3d) is broken. Queries effectively will degrade to full inde

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-21 Thread Andrew Borodin
Hi Heikki! > Got a link for a description of the RR*-tree? Would be good to reference it > in the patch comments, too. Well, as for now, the best way to reach the paper is DOI 10.1145/1559845.1559929 http://sci-hub.cc/ Authors in conversations clearly stated that they endorse (not sure in correct

Re: [HACKERS] GiST: interpretation of NaN from penalty function

2016-09-15 Thread Andrew Borodin
> Certainly, "NaN means infinity", as your patch proposes, has no more basis to > it than "NaN means zero". You are absolutley right. Now I see that best interpretation is "NaN means NaN". Seems like we need only drop a check. Nodes with NaN penalties will be considered even worser than those with

[HACKERS] GiST: interpretation of NaN from penalty function

2016-09-14 Thread Andrew Borodin
Hi hackers! Currently GiST treats NaN penalty as zero penalty, in terms of generalized tree this means "perfect fit". I think that this situation should be considered "worst fit" instead. Here is a patch to highlight place in the code. I could not construct test to generate bad tree, which would b

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-14 Thread Andrew Borodin
Here is v6 of cube penalty patch. There was a warning about unused edge function under systems without __STDC_IEC_559__ defined, this patch fixes it. Regards, Andrey Borodin. diff --git a/contrib/cube/cube.c b/contrib/cube/cube.c index 3feddef..ad868ac 100644 --- a/contrib/cube/cube.c +++ b/contr

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-13 Thread Andrew Borodin
Hi hackers! Here is gist cube penaly patch v5. Changes: 1. union version of pack_float() is used, without warings and with minimum asm commands emitted 2. test for __STDC_IEC_559__ is added, this is a standard way of C99 to determine float compliance with IEEE754. ANSI-only compiler+libs, along wi

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-10 Thread Andrew Borodin
>Personally I wouldn't be very happy about an IEEE754 assumption. Ok, so let's avoid autoconf man. There is no real explanations of the ground for this assumption, just a reference to paper of David Goldberg (and there is no metion about IEEE754 is absoulte everywhere). BTW, may be we can ask David

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-08 Thread Andrew Borodin
>BTW, would someone explain to me why using a float here will not fail >catastrophically for inputs outside the range of float? Indeed, it will. And that's one of the reason I'm considering advancing GiST API instead of advancing extensions. It won't crash, but will produce terrible index, not s

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-08 Thread Andrew Borodin
>autoconf check for IEEE 754 floats Autoconf man says folowing: >it is safe to assume IEEE-754 in most portable code these days https://www.gnu.org/software/autoconf/manual/autoconf.html#Floating-Point-Portability > A union might be more readable Here is union version of the patch. It's slower 10%

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-07 Thread Andrew Borodin
Well, arithmetics is too fragile. This version of float packing with arithmetical packaging static float pack_float(float actualValue, int realm) { double max,min; max = FLT_MAX / ( 8 >> realm ); min = FLT_MAX / ( 16 >> realm ); if( realm == 0 ) min = 0; /* squeeze the actual value between min and

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-07 Thread Andrew Borodin
Oh, sorry, made one more attemp and now I see your algorithm differently. So you propose to use oE and oV as a markers of borders for what I call Realm. But there may be very little significant bits in one of this ranges. pg_sphere and PostGiS extensions tried to use 1 as a marker, with alike form

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-06 Thread Andrew Borodin
Hi Heikki! Thank you for reviewing the code, it's always inspiring when a work is noted (: >Looking at the code, by "margin", you mean the sum of all "edges", i.e. of each dimension, of the cube. Indeed. As far as I remember, this is a terminology of old R*-tree paper. Now they use the word "perim

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Andrew Borodin
Here is new version of the patch. Now function pack_float is commented better. All inline keywords are removed. I haven't found any performance loss for that. Remove of static's showed 1%-7% performance loss in SELECT computation (3 test runs), so I left statics where they are. Actually, to avoid

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Andrew Borodin
Thank you for your coments, Tom. > Modern compilers are generally able to make their own decisions about it, and > trying to put your thumb on the scales this heavily is not likely to improve > the code. Well, I have tested that combination of "static inline" affets performance of index build on

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Andrew Borodin
Hi hackers! Here is the new patch version. With the help of Mikhail Bakhterev I've optimized subroutines of the penalty function. Index build time now is roughly equivalent to build time before patch (test attached to thread start). Time of SELECT statement execution is reduced by 40%. Changes in

[HACKERS] GiST penalty functions [PoC]

2016-08-28 Thread Andrew Borodin
Hi, hackers! Some time ago I put a patch to commitfest that optimizes slightly GiST inserts and updates. It's effect in general is quite modest (15% perf improved), but for sorted data inserts are 3 times quicker. This effect I attribute to mitigation of deficiencies of old split algorithm. Alexa

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-08-18 Thread Andrew Borodin
n, Octonica & Ural Federal University. 2016-08-16 20:23 GMT+05:00 Anastasia Lubennikova : > 09.08.2016 19:45, Andrew Borodin: >> >> Here is new version of the patch, now it includes recommendations from >> Anastasia Lubennikova. >> >> >> I've invest

Re: [HACKERS] WIP: Covering + unique indexes.

2016-08-17 Thread Andrew Borodin
> That was a bug caused by high key truncation, that occurs when index has more > than 3 levels. Fixed. Affirmative. I've tested index construction with 100M rows and subsequent execution of select queries using index, works fine. Best regards, Andrey Borodin, Octonica & Ural Federal University.

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-15 Thread Andrew Borodin
>So on average in a large randomly filled index, pages spend more time nearer 50% full than 100% full. I think we can make this number more...controllable. Before split we can check whether left and right pages both are in shared buffer and if they are seriously under certain fillfactor, say under

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-08-09 Thread Andrew Borodin
Here is new version of the patch, now it includes recommendations from Anastasia Lubennikova. I've investigated anamalous index size decrease. Most probable version appeared to be true. Cube extension, as some others, use Guttman's polynomial time split algorithm. It is known to generate "needle-

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-08-06 Thread Andrew Borodin
Anastasia , thank you for your attentive code examine. 2016-08-05 21:19 GMT+05:00 Anastasia Lubennikova : > First of all, shouldn't we use MAXALIGN(oldsize) instead of oldsize? > Although, I'm quite sure that it was already aligned somewhere before. > I doubt that the check (size_diff != MAXALIGN(

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-30 Thread Andrew Borodin
Here is BRIN-compatible version of patch. Now function PageIndexTupleOverwrite consumes tuple size as a parameter, this behavior is coherent with PageAddItem. Also, i had to remove asserstion that item has a storage in the loop that adjusts line pointers. It would be great if someone could check th

Re: [HACKERS] Optimizing numeric SUM() aggregate

2016-07-29 Thread Andrew Borodin
I've tested patch with this query https://gist.github.com/x4m/fee16ed1a55217528f036983d88771b4 Test specs were: Ubuntu 14 LTS VM, dynamic RAM, hypervisor Windows Server 2016, SSD disk, core i5-2500. Configuration: out of the box master make. On 10 test runs timing of select statement: AVG 3739.8 m

Re: [HACKERS] Optimizing numeric SUM() aggregate

2016-07-27 Thread Andrew Borodin
that value 1 is hardcoded somewhere else, any other value is not recommended and a bunch of code is left for historical reasons. Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-27 13:57 GMT+05:00 Dean Rasheed : > On 27 July 2016 at 07:33, Andrew Borodin wrote: &

Re: [HACKERS] Optimizing numeric SUM() aggregate

2016-07-26 Thread Andrew Borodin
>I think we could do carry every 0x7FF / 1 accumulation, couldn't we? I feel that I have to elaborate a bit. Probably my calculations are wrong. Lets assume we already have accumulated INT_MAX of digits in previous-place accumulator. That's almost overflow, but that's not overflow. C

Re: [HACKERS] Optimizing numeric SUM() aggregate

2016-07-26 Thread Andrew Borodin
Hi! I like the idea and implementation, but I have one suggestion. > Instead of propagating carry after each new value, it's done only every > values (or at the end). I think we could do carry every 0x7FF / 1 accumulation, couldn't we? Best regards, Andrey Borodin, Octonica & Ural

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-26 Thread Andrew Borodin
>Can you please patch BRIN to use this new function too? On my machine replacement of both BRIN update cases (see https://github.com/x4m/pggistopt/commit/a6d301ff79104b977619339d53aebf748045418a ) showed no performance changes on folowing update query (6 seconds of updates avg): create table dataT

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-25 Thread Andrew Borodin
>Can you please patch BRIN to use this new function too? AFAIK Sergey Mirvoda was going to implement and test it. It is easy to do this optimization for brin_samepage_update (see patch draft in attachment, make check passes), but WAL of regular BRIN update is not clear for me. Best regards, Andrey

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-24 Thread Andrew Borodin
>If a feature changes the shape of WAL records, XLOG_PAGE_MAGIC is bumped to >prevent any problems. I've attached patch with a bump, but, obviously, it'll be irrelevant after any other commit changing WAL shape. Here is a patch with updated GiST README. I haven't found apropriate place to describ

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-07 Thread Andrew Borodin
One more thing came to my mind: WAL records done by the GiST before patch will corrupt GiST after patch if replayed. Is it a problem? May be it's prevented on some higher level? Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-06 22:11 GMT+05:00 Andrew Borodin

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-06 Thread Andrew Borodin
ad. > > Also I suspect that PageIndexTupleOverwrite() should make a call to > ItemIdSetNormal() to pretend it is generic function and not just a > private part of GiST. > > Best regards, Andrey Borodin, Octonica & Ural Federal University. > > 2016-07-04 22:59 GMT+05:00 Andrew Bor

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-05 Thread Andrew Borodin
write() should make a call to ItemIdSetNormal() to pretend it is generic function and not just a private part of GiST. Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-04 22:59 GMT+05:00 Andrew Borodin : > Thank you, Alvaro. > > Yes, now I see. I'll update g

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-04 Thread Andrew Borodin
5:00 Alvaro Herrera : > Andrew Borodin wrote: >> Thank you, Amit. >> >> Currently, if WAL reconstruct page in another order it won't be a problem. > > We require that replay of WAL leads to identical bytes than the > original. Among other things, this enables use of a too

Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-04 Thread Andrew Borodin
th missing spaces later. I also found some more missing spaces in PageIndexTupleOverwrite call and typo in comments ("on-place"). Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-04 18:58 GMT+05:00 Amit Kapila : > On Mon, Jul 4, 2016 at 4:52 PM, Andrew Boro

[HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-04 Thread Andrew Borodin
Here is a patch with corrections from Alexander Korotkov. My tests, check and check-world show no problems against Ubuntu LTS 14 x64 VM. Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-04 10:41 GMT+05:00 Andrew Borodin : > Hi! >>I think you sho

[HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-03 Thread Andrew Borodin
me sanity checks copied from PageIndexTupleDelete). Next weekend I'll do more testing, and then add it to commitfest. Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-03 15:21 GMT+05:00 Alexander Korotkov : > Hi! > > On Sun, Jul 3, 2016 at 12:24 PM, Andr

[HACKERS] GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-03 Thread Andrew Borodin
Hi, hackers! I think there is some room for improving GiST inserts. Following is the description of what I think is performance problem. Function gistplacetopage in file /src/backend/access/gist/gist.c is responsible for updating or appending new tuple on GiST page. Currently, after checking nece

[HACKERS] [Proposal] Improvement of GiST page layout

2016-02-08 Thread Andrew Borodin
Hi, hackers! I want to propose improvement of GiST page layout. GiST is optimized to reduce disk operations on search queries, for example, windows search queries in case of R-tree. I expect that more complicated page layout will help to tradeoff some of CPU efficiency for disk efficiency. GiST