Re: [HACKERS] WIP: Fast GiST index build

2011-08-22 Thread Alexander Korotkov
On Wed, Aug 17, 2011 at 11:11 AM, Alexander Korotkov wrote: > On Tue, Aug 16, 2011 at 11:15 PM, Heikki Linnakangas < > heikki.linnakan...@enterprisedb.com> wrote: > >> On 16.08.2011 22:10, Heikki Linnakangas wrote: >> >>> Here's an version of the patch wit

Re: [HACKERS] WIP: Fast GiST index build

2011-08-17 Thread Alexander Korotkov
an attachment... Thanks a lot. I'm going to start rerunning the tests now. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-08-16 Thread Alexander Korotkov
u might as well pull tuples directly from the > heap. > Yes, seems reasonable. Buffer on the root node was in the paper. But now I don't see the need of it too. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-08-16 Thread Alexander Korotkov
Thus, we can calculate: levelstep = min(log_r(1 + effective_cache_size_in_pages*(r - 1)) - 1, log_r(maintenance_work_mem_in_pages - 1)) and get more precise result. But also we need at least very rough estimate of memory occupied by node buffers hash tab and path items. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-08-16 Thread Alexander Korotkov
I found that I forgot to remove levelstep and buffersize from reloptions.c. Updated patch is attached. -- With best regards, Alexander Korotkov. gist_fast_build-0.14.1.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To

Re: [HACKERS] WIP: Fast GiST index build

2011-08-14 Thread Alexander Korotkov
Hi! Thank you for your notes. On Fri, Aug 12, 2011 at 7:04 PM, Robert Haas wrote: > On Thu, Aug 11, 2011 at 6:21 AM, Alexander Korotkov > wrote: > > [ new patch ] > > Some random comments: > > - It appears that the "noFollowFight" flag is really supp

Re: [HACKERS] WIP: Fast GiST index build

2011-08-12 Thread Alexander Korotkov
-level > buffer. Wouldn't it be fairly simple to detect that case explicitly, and > stop the emptying process only if one of the lower-level buffers really > fills up? That should be more efficient, as you would have "swap" between > different subtrees less often. Yes, it seems reasonable to me. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-08-11 Thread Alexander Korotkov
On Thu, Aug 11, 2011 at 2:28 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > On 10.08.2011 22:44, Alexander Korotkov wrote: > >> Manual and readme updates. >> > > Thanks, I'm reviewing these now. > > Do we want to expose the leve

Re: [HACKERS] WIP: Fast GiST index build

2011-08-11 Thread Alexander Korotkov
build(). That's why I avoided deletion from lists. Now I've added additional check for root node in loop over list. > The calculations in initBuffering() desperately need comments. As does the > rest of the code too, but the heuristics in that function are particularly > hard t

Re: [HACKERS] WIP: Fast GiST index build

2011-08-11 Thread Alexander Korotkov
ptional. We can try it as add-on patch. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-08-10 Thread Alexander Korotkov
Manual and readme updates. -- With best regards, Alexander Korotkov. gist_fast_build-0.12.0.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql

Re: [HACKERS] WIP: Fast GiST index build

2011-08-10 Thread Alexander Korotkov
effective_cache_size. 4) Some renames. In particular GISTLoadedPartItem to GISTBufferingInsertStack. 5) Some comments were corrected and some were added. 6) pgindent 7) rebased with head Readme update and user documentation coming soon. -- With best regards, Alexander Korotkov. gist_fast_build-0.11.0

Re: [HACKERS] Compiler warnings with stringRelOpts (was WIP: Fast GiST index build)

2011-08-08 Thread Alexander Korotkov
it also seems that both were added by your commit of table-based parser: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ba748f7a11ef884277b61d1708a17a44acfd1736 -- With best regards, Alexander Korotkov.

Re: [HACKERS] Compiler warnings with stringRelOpts (was WIP: Fast GiST index build)

2011-08-08 Thread Alexander Korotkov
is possible? -- With best regards, Alexander Korotkov.

[HACKERS] Compiler warnings with stringRelOpts (was WIP: Fast GiST index build)

2011-08-08 Thread Alexander Korotkov
pts[0].default_val’**) It is caused by definition of default field of relopt_string structure as 1-length character array. This seems to be a design flaw in the reloptions.c code. Any thoughts? -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-08-08 Thread Alexander Korotkov
t; * Any user-visible options need to be documented in the user manual. > * And of course, make sure comments and the readme are up-to-date. > * Compiler warning: > > reloptions.c:259: warning: initializer-string for array of chars is too > long > reloptions.c:259: warning: (near initialization for > ‘stringRelOpts[0].default_val’**) > > I don't think there's a way to add an entry to stringRelOpts in a way that > works correctly. That's a design flaw in the reloptions.c code that has > never come up before, as there hasn't been any string-formatted relopts > before (actually buffering option might be better served by an enum > reloption too, if we had that). Please start a new thread on that on > pgsql-hackers. Ok. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-08-07 Thread Alexander Korotkov
items. If we imagine extremely large tree with levelstep = 1 size of this datastructures will be singnificant. And it's hard to predict that size without knowing of tree size. -- With best regards, Alexander Korotkov. gist_fast_build-0.10.0.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: Fast GiST index build

2011-08-04 Thread Alexander Korotkov
Uhh, my bad, really stupid bug. Many thanks. -- With best regards, Alexander Korotkov. On Wed, Aug 3, 2011 at 8:31 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > On 03.08.2011 11:18, Alexander Korotkov wrote: > >> I found that in previous version

Re: [HACKERS] WIP: Fast GiST index build

2011-08-03 Thread Alexander Korotkov
only to 44/9C518750" appears. Seems that there is some totally wrong use of WAL if even optimization level does matter... -- With best regards, Alexander Korotkov. gist_fast_build-0.9.1.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgs

Re: [HACKERS] WIP: Fast GiST index build

2011-08-02 Thread Alexander Korotkov
ffset - (0001004000D2,10982288) (1 row) May be you have any ideas about it? -- With best regards, Alexander Korotkov. gist_fast_build-0.9.0.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.or

Re: Hot standby and GiST page splits (was Re: [HACKERS] WIP: Fast GiST index build)

2011-08-01 Thread Alexander Korotkov
quot;Concurrency and Recovery for GiSTs". http://www.sai.msu.su/~megera/postgres/gist/papers/concurrency/access-methods-for-next-generation.pdf.gz -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-28 Thread Alexander Korotkov
Also I need to verify final emptying, because IO guarantees of original paper is based on strict descending final emptying. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-28 Thread Alexander Korotkov
buffers. But those paths can become invalid on page splits. It seems to me that approximately same volume of code for maintaining parent links should be added to this version of patch in order to get it working correctly. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-27 Thread Alexander Korotkov
2) blk: 564 numTuple: 109 free: 3352b(58.92%) rightlink:4294967295 (InvalidBlockNumber) > 7(l:2) blk: 165 numTuple: 108 free: 3396b(58.38%) rightlink:4294967295 (InvalidBlockNumber) . Isn't it a bug? -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-27 Thread Alexander Korotkov
think modification of WAL record structure is possible or I have to insert downlink one by one in buffering build too? -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-26 Thread Alexander Korotkov
ry to evade keeping loaded-tree hash. This might help me a lot. Thanks. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-22 Thread Alexander Korotkov
ome researches about distribution law for such cases (while I didn't find anything relevant in google scholar)? As an alternative I can propose take into account actual average IO operations per tuple rather then an estimate. -- With best regards, Alexander Korotkov. On Mon, Jul 18, 20

Re: [HACKERS] WIP: Fast GiST index build

2011-07-18 Thread Alexander Korotkov
tomatic switching. 2) Take care about concurrent backends in automatic switching. -- With best regards, Alexander Korotkov. gist_fast_build-0.7.0.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Fast GiST index build

2011-07-15 Thread Alexander Korotkov
Fri, Jul 15, 2011 at 12:53 AM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > On 14.07.2011 23:41, Alexander Korotkov wrote: > >> Do you think using "rightlink" as pointer to parent page is possible >> during >> index build? It wo

Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-07-15 Thread Alexander Korotkov
g subtly different in > different versions, if we need to backpatch bug fixes that use that field. Yes, it seems very reasonable. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Patch Review: Collect frequency statistics and selectivity estimation for arrays

2011-07-15 Thread Alexander Korotkov
standard use of histogram slot should be avoided. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Patch Review: Collect frequency statistics and selectivity estimation for arrays

2011-07-15 Thread Alexander Korotkov
tree and frequency/length statistics for arrays? > 1) In calc_distr you go to some lengths to avoid round off errors. Since it > is > certainly just the order of the estimate that matters, why not just > perform the calculation in log space? > It seems to me that I didn't anything to avoid round off errors there... -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Alexander Korotkov
fd.c directly, ie. > OpenTemporaryFile/FileRead/**FileWrite. BufFile is distributing temporary data through several files. AFAICS postgres avoids working with files larger than 1GB. Size of tree buffers can easily be greater. Without BufFile I need to maintain set of files manually. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Alexander Korotkov
Do you think using "rightlink" as pointer to parent page is possible during index build? It would allow to simplify code significantly, because of no more need to maintain in-memory structures for parents memorizing. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-07-14 Thread Alexander Korotkov
lidOffsetNumber; >ptr->parent = top->parent; >ptr->next = top->next; > top->next = ptr; >if (tail == top); >tail = ptr; } > -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Alexander Korotkov
od. But concurent backends makes this method unsafe. There are other difficulties with concurrent backends: it would be nice estimate usage of effective cache by other backeds before switching to buffering method. If don't take care about it then we can don't switch to buffering method wh

Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-07-13 Thread Alexander Korotkov
Thank you very much for detail explanation. But this line of modified patch seems strange for me: *newchildoffnum = blkno; I believe it should be: *newchildoffnum = i; -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-13 Thread Alexander Korotkov
On Wed, Jul 13, 2011 at 12:40 PM, Alexander Korotkov wrote: > On Wed, Jul 13, 2011 at 12:33 PM, Heikki Linnakangas < > heikki.linnakan...@enterprisedb.com> wrote: > >> Is it possible to switch to the new buffering method in the middle of an >> index build? We could use

Re: [HACKERS] WIP: Fast GiST index build

2011-07-13 Thread Alexander Korotkov
tive_cache_size?), and switch to the buffering > method after that. Yes, it seems to be possible. It also would be great to somehow detect case of ordered data when regular index build performs well. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-12 Thread Alexander Korotkov
New version of patch with a little more refactoring and comments. -- With best regards, Alexander Korotkov. gist_fast_build-0.6.0.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http

Re: [HACKERS] WIP: Fast GiST index build

2011-07-12 Thread Alexander Korotkov
On Fri, Jul 8, 2011 at 6:18 PM, Tom Lane wrote: > For test purposes, you could turn off synchronize_seqscans to prevent > that. Thanks, it helps. I'm rerunning tests now. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-08 Thread Alexander Korotkov
tab1. But actually, it isn't always so. In aggregate with only few used test cases it can cause significant error. I'm going to make use some more thought-out testing method. Probably, some more precise index quality measure exists (even for R-tree). -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-07 Thread Alexander Korotkov
much (in my expectation decrease of buffer size should make all build parameters closer to regular build). I'm going to recheck my experiments, probably I'm missing something. -- With best regards, Alexander Korotkov. gist_fast_build-0.5.0.patch.gz Description: GNU Zip compressed dat

Re: [HACKERS] WIP: Fast GiST index build

2011-07-01 Thread Alexander Korotkov
uld allow me to undertand tradeoffs better. -- With best regards, Alexander Korotkov. gist_fast_build-0.4.0.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-06-30 Thread Alexander Korotkov
situation > which I can't induce at will. > Yes, it also seems pretty hard to get this code section executed for me. I'm going to ask Teodor and Oleg about it. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-28 Thread Alexander Korotkov
New version of patch. Bug which caused falldown on trees with high number of levels was fixed. Also some more comments and refactoring. -- With best regards, Alexander Korotkov. gist_fast_build-0.3.0.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list

Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-06-28 Thread Alexander Korotkov
t and fastbuild insert. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-28 Thread Alexander Korotkov
On Mon, Jun 27, 2011 at 10:32 PM, Alexander Korotkov wrote: > I didn't have an estimate yet, but I'm working on it. > Now, it seems that I have an estimate. N - total number of itups B - avg. number of itups in page H - height of tree K - avg. number of itups fitting in node buf

Re: [HACKERS] WIP: Fast GiST index build

2011-06-27 Thread Alexander Korotkov
depend on concurrent load. When there are no concurrent load (as in my experiments) fraction of tree which fits to effective cache is reasonable for estimating benefit of IO economy. But with high concurrent load part of cache occupied by tree should be considerable smaller than whole effective cache. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-27 Thread Alexander Korotkov
of this dataset. In this case results was similar to random generated data. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-25 Thread Alexander Korotkov
#x27;t give much benefit. But anyway, the value of this work can be in producing better index in some cases and SSD drive lifetime economy due to less IO operations. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-24 Thread Alexander Korotkov
On Fri, Jun 24, 2011 at 12:40 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > On 21.06.2011 13:08, Alexander Korotkov wrote: > >> I've created section about testing in project wiki page: >> http://wiki.postgresql.org/**wiki/Fast_GiS

Re: [HACKERS] hstore - Implementation and performance issues around its operators

2011-06-21 Thread Alexander Korotkov
? > I believe it's an architecture problem. You actually need not '->' operator to be supported by GiST but "column -> 'field_name' = value" expression. Probably, I'm missing something, but I think supporting of this require significant catalog changes. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-21 Thread Alexander Korotkov
relatively expensive penalty method in that opclass. But, probably index build can be still faster when index doesn't fit cache even for gist_trgm_ops. Also with that opclass index quality is slightly worse but the difference is not dramatic. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-20 Thread Alexander Korotkov
- With best regards, Alexander Korotkov. On Thu, Jun 16, 2011 at 10:35 PM, Alexander Korotkov wrote: > Oh, actually it's so easy. Thanks. > > -- > With best regards, > Alexander Korotkov. > > On Thu, Jun 16, 2011 at 10:26 PM, Heikki Linnakangas < > heikki.li

Re: [HACKERS] WIP: Fast GiST index build

2011-06-16 Thread Alexander Korotkov
Oh, actually it's so easy. Thanks. -- With best regards, Alexander Korotkov. On Thu, Jun 16, 2011 at 10:26 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > On 16.06.2011 21:13, Alexander Korotkov wrote: > >> My current idea is to measure n

Re: [HACKERS] WIP: Fast GiST index build

2011-06-16 Thread Alexander Korotkov
My current idea is to measure number of IO accesses by pg_stat_statements and measure CPU usage by /proc/PID/stat. Any thoughts? -- With best regards, Alexander Korotkov. On Thu, Jun 16, 2011 at 1:33 PM, Alexander Korotkov wrote: > Actually, I would like to measure CPU and IO l

Re: [HACKERS] WIP: Fast GiST index build

2011-06-16 Thread Alexander Korotkov
Actually, I would like to measure CPU and IO load independently for more comprehensive benchmarks. Can you advice me some appropriate tools for it? -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-15 Thread Alexander Korotkov
don't know now to do that on Linux. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-15 Thread Alexander Korotkov
On Wed, Jun 15, 2011 at 11:21 AM, Alexander Korotkov wrote: > I've tried index tuples sorting on penalty function before buffer > relocation on split. But it was without any success. Index quality becomes > even worse than without sorting. > The next thing I've tried is buf

Re: [HACKERS] WIP: Fast GiST index build

2011-06-15 Thread Alexander Korotkov
16 rows=855 loops=1) Index Cond: (v <@ '(0.503,0.503),(0.5,0.5)'::box) Buffers: shared hit=240 Total runtime: 3.043 ms (7 rows) -- With best regards, Alexander Korotkov. gist_fast_build-0.1.0.patch.gz Description: GNU Zip compressed data --

Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-06-14 Thread Alexander Korotkov
Version of patch with few more comments and some fixes. -- With best regards, Alexander Korotkov. arrayanalyze-0.4.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http

Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-06-13 Thread Alexander Korotkov
/ > number-of-elements-in-const) and consider at most that many MCVs. > When this limit kicks in you'll get a less-accurate selectivity > estimate, but that's a reasonable price to pay for not blowing out > planning time. Good option. I'm going to add such condition to my patch. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-06-12 Thread Alexander Korotkov
be clarified more, please, specify them. > And of course, a few lines for the docs also. > I found that in statistics patch for tsvector only article about pg_stats view was corrected. I've corrected this article a little bit too. -- With best regards, Alexander Korotkov. ar

Re: [HACKERS] WIP: Fast GiST index build

2011-06-06 Thread Alexander Korotkov
On Mon, Jun 6, 2011 at 4:14 PM, Alexander Korotkov wrote: > If I succeed then it will invoke even more such calls. > I meant here that if I succeed in enhancements which improve index quality then fast build algorithm will invoke even more such calls. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-06 Thread Alexander Korotkov
now gist fast build algorithm invokes more penalty calls than repeatable insert algorithm. If I succeed then it will invoke even more such calls. So, if penalty function is very slow then gist fast build will be slover then repeatable insert. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-06 Thread Alexander Korotkov
the wiki page, just > to see what kind of workloads have been taken into consideration and tested > already. Do you think there's some worst-case data distributions where this > algorithm would perform particularly badly? I don't expect some bad cases in terms in IO. My most worrying is about index quality which is strongly related to data distribution. -- With best regards, Alexander Korotkov.

[HACKERS] WIP: Fast GiST index build

2011-06-03 Thread Alexander Korotkov
- With best regards, Alexander Korotkov. gist_fast_build-0.0.3.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cube Index Size

2011-06-01 Thread Alexander Korotkov
m each time entry with maximal difference of inserion cost is inserted. Quadratic algorithm runs slowly than sorting one, but on my tests it shows slightly better results. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Cube Index Size

2011-06-01 Thread Alexander Korotkov
adratic one. Picksplit will take more time, but it should give more stable and predictable result. 3) I had some experiments with my own picksplit algorithm, which showed pretty good results on tests which I've run. But current implementation is dirty and it's require more testing. ---

Re: [HACKERS] pgsql: Protect GIST logic that assumes penalty values can't be negative

2011-06-01 Thread Alexander Korotkov
(A,B) = 3 penalty(B,A) = 0 -- With best regards, Alexander Korotkov.

Re: [HACKERS] Fix for GiST penalty

2011-05-31 Thread Alexander Korotkov
48 bits virtual power management: ------ With best regards, Alexander Korotkov. *** a/doc/src/sgml/gist.sgml --- b/doc/src/sgml/gist.sgml *** *** 377,383 my_decompress(PG_FUNCTION_ARGS) Returns a value indicating the cost of inserting the new entry into a particular

[HACKERS] Fix for GiST penalty

2011-05-30 Thread Alexander Korotkov
(point <@ '(0.505,0.505),(0.5,0.5)'::box) Buffers: shared hit=44 -> Bitmap Index Scan on test_idx (cost=0.00..44.10 rows=1000 width=0) (actual time=0.966..0.966 rows=24 loops=1) Index Cond: (point <@ '(0.505,0.505),(0.5,0.5)'::box) Buffers: sh

[HACKERS] Small patch for GiST: move childoffnum to child

2011-05-24 Thread Alexander Korotkov
ifies existing code a bit. Heikki advice me that since this change simplifies existing code it can be considered as a separate patch. -- With best regards, Alexander Korotkov. gist_childoffnum.path Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.or

Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-05-23 Thread Alexander Korotkov
I forgot to commit before diff. Here is correct version. -- With best regards, Alexander Korotkov. arrayanalyze-0.2.1.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http

Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-05-23 Thread Alexander Korotkov
x27;m warrying that there could be too expensibe calculations for estimation. Also, likely current comments don't clearify anything... -- With best regards, Alexander Korotkov. arrayanalyze-0.2.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (p

Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-06 Thread Alexander Korotkov
After root split path will be 0-new_page-234. > > If algorithm could be able to change root then new path could be looked as > new_root-new_page-234 because old root could be splitted to old_root_page > and new_page. Ok. Thank you for explanation. With best regards, Alexander Korotkov.

Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-05 Thread Alexander Korotkov
* rare situation when root splited > */ >break; As I understood it's because we can't move root to another page. With best regards, Alexander Korotkov.

Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-04 Thread Alexander Korotkov
? With best regards, Alexander Korotkov.

Re: [HACKERS] Extreme bloating of intarray GiST indexes

2011-05-04 Thread Alexander Korotkov
ldn't be many inserts into such pages in future then they will be stay bloat. With best regards, Alexander Korotkov.

Re: [HACKERS] Extreme bloating of intarray GiST indexes

2011-04-28 Thread Alexander Korotkov
hreshold achived array is converted to bitmap). If this problem is urgent, I can write a patch with opclass that would seem more suitable to be default to me, when I'll have a time for it. With best regards, Alexander Korotkov.

Re: [HACKERS] Extreme bloating of intarray GiST indexes

2011-04-28 Thread Alexander Korotkov
like this? What opclass is used for GiST index: gist__int_ops or gist__intbig_ops? Do you take into account that gist__int_ops is very inefficient for large datasets? With best regards, Alexander Korotkov.

Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-04-26 Thread Alexander Korotkov
On Tue, Apr 26, 2011 at 1:10 PM, Alexander Korotkov wrote: > Since algorithm is focused to reduce I/O, we should expect best > acceleration in the case when index doesn't fitting to memory. Size of > buffers is comparable to size of whole index. It means that if we can hold >

Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-04-26 Thread Alexander Korotkov
y of R-tree bulk loading algorithms in the literature, it should be > possible to adapt some of them to GiST. > > [1] http://dx.doi.org/10.1007/s00453-001-0107-6 Yes, these priority seems very reasonable. We should have first effectiveness confirmation as soon as possible. I'll hold on this priority. With best regards, Alexander Korotkov.

[HACKERS] GSoC 2011: Fast GiST index build

2011-04-25 Thread Alexander Korotkov
an use it in initial calculations. I'm going to hold on this assumption in first implementation. With best regards, Alexander Korotkov.

Re: [HACKERS] GSoC proposal: Fast GiST index build

2011-04-05 Thread Alexander Korotkov
unction. With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-05 Thread Alexander Korotkov
g and 3-grams 'aaa' and 'aab' from second string (for simplicity, there is no padding here). GIN index will contain structure, which can be represented so: 'aaa' => 1, 2 'aab' => 2 We can see, that there are 2 unique 3-grams, but 3 links to the rows. With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-05 Thread Alexander Korotkov
(about 120 millions of links), while size of q-grams itself will be almost ignorable in comparison with it. With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-05 Thread Alexander Korotkov
ins btree on gin keys, i.e. q-grams), while data pages number (which contains links to rows in lists or btrees) will be similar. In dependence on data volume index size can be 10x larger (on small datasets) or few percents larger (on large datasets). With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-05 Thread Alexander Korotkov
On Mon, Apr 4, 2011 at 9:01 PM, Robert Haas wrote: > On Mon, Apr 4, 2011 at 12:38 PM, Alexander Korotkov > wrote: > > relatively small when q <= 5. Accordingly, I think we should expect > indexes > > to be usable with at least with q = 5. > > I defer to your opinio

Re: [HACKERS] GSoC proposal: Fast GiST index build

2011-04-04 Thread Alexander Korotkov
On Mon, Apr 4, 2011 at 7:04 PM, Robert Haas wrote: > On Mon, Apr 4, 2011 at 7:16 AM, Alexander Korotkov > wrote: > > Project name > > Fast GiST index build > > Would/could/should this be implemented in a manner similar to the > existing "GIN fast update" fe

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-04 Thread Alexander Korotkov
On Mon, Apr 4, 2011 at 6:56 PM, Robert Haas wrote: > On Mon, Apr 4, 2011 at 7:35 AM, Alexander Korotkov > wrote: > > I would like to propose a q-gram module which would have following > > differences in comparison with pg_trgm: > > 1) Focus on acceleration of edit di

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-04 Thread Alexander Korotkov
vastava. Approximate String Joins in a Database (Almost) for Free. VLDB '01 Proceedings of the 27th International Conference on Very Large Data Bases With best regards, Alexander Korotkov.

[HACKERS] GSoC proposal: Fast GiST index build

2011-04-04 Thread Alexander Korotkov
ers from the International Workshop on Algorithm Engineering and Experimentation With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-03-28 Thread Alexander Korotkov
nt on > such stuff. (It sounds reasonable to me, but I wouldn't know if there > are problems in the idea.) They may be too busy right at the moment. > Thank you for reply. I'm going to ask Oleg and Teodor for their feedback. With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-03-25 Thread Alexander Korotkov
On Fri, Mar 25, 2011 at 8:32 PM, Alexander Korotkov wrote: > I would like to ask you about currency of the work above. This seems to be a mess of words. Sorry for my bad english. Actually, I meant that I need a appraisal of my proposal. With best regards, Alexander Korotkov. -- Sent

[HACKERS] Proposal: q-gram GIN and GiST indexes

2011-03-25 Thread Alexander Korotkov
, where do you like to see it: separate project, contrib module, core (of course, in the case when code have sufficient quality)? I have stong confidence level about implementability of this project in few month. That's why I could propose this as an GSoC project. With best regards, Alex

Re: [HACKERS] GSoC 2011 - Mentors? Projects?

2011-03-08 Thread Alexander Korotkov
ect related to indexing. With best regards, Alexander Korotkov. On Tue, Mar 8, 2011 at 9:44 AM, Selena Deckelmann wrote: > Hi! > > PostgreSQL is applying for GSoC again this year. We're looking for: > > * Mentors > * Project ideas > > Would you like to mentor?

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

2011-03-06 Thread Alexander Korotkov
gm indexes is not mentioned here. With best regards, Alexander Korotkov.

[HACKERS] WIP: collect frequency statistics for arrays

2011-02-23 Thread Alexander Korotkov
ume something about data distribution in tsvector. But we don't know almost nothing about data in arrays because it can hold any data (tsvector also can holds any data, but it using for non nutural language data is out of purpose). -- With best regards, Alexander Korotkov. arrayanalyze-0

<    5   6   7   8   9   10   11   >