Re: [HACKERS] WIP: Fast GiST index build

2011-08-14 Thread Alexander Korotkov
Hi! Thank you for your notes. On Fri, Aug 12, 2011 at 7:04 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Aug 11, 2011 at 6:21 AM, Alexander Korotkov aekorot...@gmail.com wrote: [ new patch ] Some random comments: - It appears that the noFollowFight flag is really supposed

Re: [HACKERS] WIP: Fast GiST index build

2011-08-12 Thread Alexander Korotkov
explicitly, and stop the emptying process only if one of the lower-level buffers really fills up? That should be more efficient, as you would have swap between different subtrees less often. Yes, it seems reasonable to me. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-08-11 Thread Alexander Korotkov
, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-08-11 Thread Alexander Korotkov
to understand without some explanation. Some comments were added. I'm working on more of them. -- With best regards, Alexander Korotkov. gist_fast_build-0.13.0.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes

Re: [HACKERS] WIP: Fast GiST index build

2011-08-11 Thread Alexander Korotkov
On Thu, Aug 11, 2011 at 2:28 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 10.08.2011 22:44, Alexander Korotkov wrote: Manual and readme updates. Thanks, I'm reviewing these now. Do we want to expose the level-step and buffersize parameters to users? They've been

Re: [HACKERS] WIP: Fast GiST index build

2011-08-10 Thread Alexander Korotkov
and effective_cache_size. 4) Some renames. In particular GISTLoadedPartItem to GISTBufferingInsertStack. 5) Some comments were corrected and some were added. 6) pgindent 7) rebased with head Readme update and user documentation coming soon. -- With best regards, Alexander Korotkov. gist_fast_build-0.11.0

Re: [HACKERS] WIP: Fast GiST index build

2011-08-10 Thread Alexander Korotkov
Manual and readme updates. -- With best regards, Alexander Korotkov. gist_fast_build-0.12.0.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: Fast GiST index build

2011-08-08 Thread Alexander Korotkov
reloption too, if we had that). Please start a new thread on that on pgsql-hackers. Ok. -- With best regards, Alexander Korotkov.

[HACKERS] Compiler warnings with stringRelOpts (was WIP: Fast GiST index build)

2011-08-08 Thread Alexander Korotkov
].default_val’**) It is caused by definition of default field of relopt_string structure as 1-length character array. This seems to be a design flaw in the reloptions.c code. Any thoughts? -- With best regards, Alexander Korotkov.

Re: [HACKERS] Compiler warnings with stringRelOpts (was WIP: Fast GiST index build)

2011-08-08 Thread Alexander Korotkov
*default_val; is possible? -- With best regards, Alexander Korotkov.

Re: [HACKERS] Compiler warnings with stringRelOpts (was WIP: Fast GiST index build)

2011-08-08 Thread Alexander Korotkov
that both were added by your commit of table-based parser: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ba748f7a11ef884277b61d1708a17a44acfd1736 -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-08-07 Thread Alexander Korotkov
large tree with levelstep = 1 size of these data structures will be significant. And it's hard to predict that size without knowing the tree size. -- With best regards, Alexander Korotkov. gist_fast_build-0.10.0.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: Fast GiST index build

2011-08-04 Thread Alexander Korotkov
Uhh, my bad, really stupid bug. Many thanks. -- With best regards, Alexander Korotkov. On Wed, Aug 3, 2011 at 8:31 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 03.08.2011 11:18, Alexander Korotkov wrote: I found that in previous version of patch I missed

Re: [HACKERS] WIP: Fast GiST index build

2011-08-03 Thread Alexander Korotkov
to 44/9C518750 appears. Seems that there is some totally wrong use of WAL if even optimization level does matter... -- With best regards, Alexander Korotkov. gist_fast_build-0.9.1.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: Fast GiST index build

2011-08-02 Thread Alexander Korotkov
- (0001004000D2,10982288) (1 row) Maybe you have some ideas about it? -- With best regards, Alexander Korotkov. gist_fast_build-0.9.0.patch.gz Description: GNU Zip compressed data

Re: Hot standby and GiST page splits (was Re: [HACKERS] WIP: Fast GiST index build)

2011-08-01 Thread Alexander Korotkov
://www.sai.msu.su/~megera/postgres/gist/papers/concurrency/access-methods-for-next-generation.pdf.gz -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-28 Thread Alexander Korotkov
paths can become invalid on page splits. It seems to me that approximately same volume of code for maintaining parent links should be added to this version of patch in order to get it working correctly. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-28 Thread Alexander Korotkov
to verify final emptying, because IO guarantees of original paper is based on strict descending final emptying. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-27 Thread Alexander Korotkov
modification of WAL record structure is possible or I have to insert downlink one by one in buffering build too? -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-27 Thread Alexander Korotkov
:4294967295 (InvalidBlockNumber) . Isn't it a bug? -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-26 Thread Alexander Korotkov
. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-22 Thread Alexander Korotkov
distribution law for such cases (while I didn't find anything relevant in google scholar)? As an alternative I can propose taking into account actual average IO operations per tuple rather than an estimate. -- With best regards, Alexander Korotkov. On Mon, Jul 18, 2011 at 10:00 PM, Alexander

Re: [HACKERS] WIP: Fast GiST index build

2011-07-18 Thread Alexander Korotkov
. -- With best regards, Alexander Korotkov. gist_fast_build-0.7.0.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] Patch Review: Collect frequency statistics and selectivity estimation for arrays

2011-07-15 Thread Alexander Korotkov
lengths to avoid round off errors. Since it is certainly just the order of the estimate that matters, why not just perform the calculation in log space? It seems to me that I didn't do anything to avoid round off errors there... -- With best regards, Alexander Korotkov.
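The log-space suggestion quoted above can be illustrated with a small sketch. This is not code from the patch under review: the function names and the sample probabilities are hypothetical, and the only point is that a long product of small factors underflows in double precision while the sum of logarithms stays usable for comparing the order of two estimates.

```python
import math

def product_direct(probabilities):
    """Multiply the factors directly; may underflow to 0.0 for long inputs."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result

def log_product(probabilities):
    """Return log(p1 * p2 * ... * pn) as a sum of logs (every p must be > 0)."""
    return math.fsum(math.log(p) for p in probabilities)

# Hypothetical inputs: 100 factors of 1e-5, so the true product is 1e-500,
# far below the smallest positive double.
probs = [1e-5] * 100
direct = product_direct(probs)    # underflows to 0.0
log_value = log_product(probs)    # stays finite

# Two estimates can still be ordered in log space.
smaller = log_product([1e-6] * 100)
```

Only the relative order of `log_value` and `smaller` is needed for ranking, so the result never has to be exponentiated back.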

Re: [HACKERS] Patch Review: Collect frequency statistics and selectivity estimation for arrays

2011-07-15 Thread Alexander Korotkov
should be avoided. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-07-15 Thread Alexander Korotkov
, if we need to backpatch bug fixes that use that field. Yes, it seems very reasonable. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-15 Thread Alexander Korotkov
Fri, Jul 15, 2011 at 12:53 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 14.07.2011 23:41, Alexander Korotkov wrote: Do you think using rightlink as pointer to parent page is possible during index build? It would allow to simplify code significantly, because

Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Alexander Korotkov
difficulties with concurrent backends: it would be nice to estimate usage of effective cache by other backends before switching to buffering method. If we don't take care of it, then we may fail to switch to the buffering method where it could give significant benefit. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-07-14 Thread Alexander Korotkov
); tail = ptr; } -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Alexander Korotkov
Do you think using rightlink as pointer to parent page is possible during index build? It would allow to simplify code significantly, because of no more need to maintain in-memory structures for parents memorizing. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Alexander Korotkov
temporary data through several files. AFAICS postgres avoids working with files larger than 1GB. Size of tree buffers can easily be greater. Without BufFile I need to maintain set of files manually. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-13 Thread Alexander Korotkov
?), and switch to the buffering method after that. Yes, it seems to be possible. It also would be great to somehow detect case of ordered data when regular index build performs well. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-13 Thread Alexander Korotkov
On Wed, Jul 13, 2011 at 12:40 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Wed, Jul 13, 2011 at 12:33 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Is it possible to switch to the new buffering method in the middle of an index build? We could use the plain

Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-07-13 Thread Alexander Korotkov
Thank you very much for detail explanation. But this line of modified patch seems strange for me: *newchildoffnum = blkno; I believe it should be: *newchildoffnum = i; -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-12 Thread Alexander Korotkov
On Fri, Jul 8, 2011 at 6:18 PM, Tom Lane t...@sss.pgh.pa.us wrote: For test purposes, you could turn off synchronize_seqscans to prevent that. Thanks, it helps. I'm rerunning tests now. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-12 Thread Alexander Korotkov
New version of patch with a little more refactoring and comments. -- With best regards, Alexander Korotkov. gist_fast_build-0.6.0.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: Fast GiST index build

2011-07-08 Thread Alexander Korotkov
as tab1. But actually, it isn't always so. In aggregate with only few used test cases it can cause significant error. I'm going to use a more thought-out testing method. Probably, some more precise index quality measure exists (even for R-tree). -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-07-07 Thread Alexander Korotkov
(in my expectation decrease of buffer size should make all build parameters closer to regular build). I'm going to recheck my experiments, probably I'm missing something. -- With best regards, Alexander Korotkov. gist_fast_build-0.5.0.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: Fast GiST index build

2011-07-01 Thread Alexander Korotkov
to understand tradeoffs better. -- With best regards, Alexander Korotkov. gist_fast_build-0.4.0.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-06-30 Thread Alexander Korotkov
obscure situation which I can't induce at will. Yes, it also seems pretty hard to get this code section executed for me. I'm going to ask Teodor and Oleg about it. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-28 Thread Alexander Korotkov
On Mon, Jun 27, 2011 at 10:32 PM, Alexander Korotkov aekorot...@gmail.com wrote: I didn't have an estimate yet, but I'm working on it. Now, it seems that I have an estimate. N - total number of itups B - avg. number of itups in page H - height of tree K - avg. number of itups fitting in node

Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-06-28 Thread Alexander Korotkov
and fastbuild insert. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-28 Thread Alexander Korotkov
New version of patch. Bug which caused falldown on trees with high number of levels was fixed. Also some more comments and refactoring. -- With best regards, Alexander Korotkov. gist_fast_build-0.3.0.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: Fast GiST index build

2011-06-27 Thread Alexander Korotkov
was similar to random generated data. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-27 Thread Alexander Korotkov
experiments) fraction of tree which fits to effective cache is reasonable for estimating benefit of IO economy. But with high concurrent load part of cache occupied by tree should be considerable smaller than whole effective cache. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-25 Thread Alexander Korotkov
be in producing better index in some cases and SSD drive lifetime economy due to less IO operations. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-24 Thread Alexander Korotkov
On Fri, Jun 24, 2011 at 12:40 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 21.06.2011 13:08, Alexander Korotkov wrote: I've created section about testing in project wiki page: http://wiki.postgresql.org/wiki/Fast_GiST_index_build_GSoC_2011#Testing_results

Re: [HACKERS] WIP: Fast GiST index build

2011-06-21 Thread Alexander Korotkov
expensive penalty method in that opclass. But, probably index build can be still faster when index doesn't fit cache even for gist_trgm_ops. Also with that opclass index quality is slightly worse but the difference is not dramatic. -- With best regards, Alexander Korotkov.

Re: [HACKERS] hstore - Implementation and performance issues around its operators

2011-06-21 Thread Alexander Korotkov
. You actually need not '-' operator to be supported by GiST but column - 'field_name' = value expression. Probably, I'm missing something, but I think supporting of this require significant catalog changes. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-20 Thread Alexander Korotkov
regards, Alexander Korotkov. On Thu, Jun 16, 2011 at 10:35 PM, Alexander Korotkov aekorot...@gmail.com wrote: Oh, actually it's so easy. Thanks. -- With best regards, Alexander Korotkov. On Thu, Jun 16, 2011 at 10:26 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote

Re: [HACKERS] WIP: Fast GiST index build

2011-06-16 Thread Alexander Korotkov
Actually, I would like to measure CPU and IO load independently for more comprehensive benchmarks. Can you advise me on some appropriate tools for it? -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-16 Thread Alexander Korotkov
My current idea is to measure number of IO accesses by pg_stat_statements and measure CPU usage by /proc/PID/stat. Any thoughts? -- With best regards, Alexander Korotkov. On Thu, Jun 16, 2011 at 1:33 PM, Alexander Korotkov aekorot...@gmail.com wrote: Actually, I would like to measure CPU
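The /proc/PID/stat half of that idea might look roughly like the sketch below. It is an illustration, not the script used for the benchmarks in this thread: the helper names are hypothetical, it is Linux-only, and it relies on the layout documented in proc(5), where utime and stime are fields 14 and 15 and are counted in clock ticks.

```python
import os

def parse_cpu_ticks(stat_line):
    """Extract (utime, stime) in clock ticks from one /proc/PID/stat line.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces, so split after the last ')' instead of on whitespace.
    """
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]), int(rest[12])

def cpu_seconds(stat_line, ticks_per_second):
    """Convert the tick counters to (user, system) seconds."""
    utime, stime = parse_cpu_ticks(stat_line)
    return utime / ticks_per_second, stime / ticks_per_second

def cpu_seconds_for_pid(pid):
    """Read /proc/<pid>/stat for a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        line = f.read()
    return cpu_seconds(line, os.sysconf("SC_CLK_TCK"))

# Made-up sample line for a backend process; only the layout matters.
sample = "1234 (postgres: writer) S 1 1234 1234 0 -1 4194560 100 0 0 0 250 50 0 0 20 0 1 0"
user_s, system_s = cpu_seconds(sample, 100)
```

Calling `cpu_seconds_for_pid` on the backend PID before and after the index build and subtracting the two readings would give the CPU time spent in the build.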

Re: [HACKERS] WIP: Fast GiST index build

2011-06-16 Thread Alexander Korotkov
Oh, actually it's so easy. Thanks. -- With best regards, Alexander Korotkov. On Thu, Jun 16, 2011 at 10:26 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 16.06.2011 21:13, Alexander Korotkov wrote: My current idea is to measure number of IO accesses

Re: [HACKERS] WIP: Fast GiST index build

2011-06-15 Thread Alexander Korotkov
hit=240 Total runtime: 3.043 ms (7 rows) -- With best regards, Alexander Korotkov. gist_fast_build-0.1.0.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: Fast GiST index build

2011-06-15 Thread Alexander Korotkov
On Wed, Jun 15, 2011 at 11:21 AM, Alexander Korotkov aekorot...@gmail.com wrote: I've tried index tuples sorting on penalty function before buffer relocation on split. But it was without any success. Index quality becomes even worse than without sorting. The next thing I've tried is buffer

Re: [HACKERS] WIP: Fast GiST index build

2011-06-15 Thread Alexander Korotkov
. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-06-14 Thread Alexander Korotkov
Version of patch with few more comments and some fixes. -- With best regards, Alexander Korotkov. arrayanalyze-0.4.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-06-13 Thread Alexander Korotkov
at most that many MCVs. When this limit kicks in you'll get a less-accurate selectivity estimate, but that's a reasonable price to pay for not blowing out planning time. Good option. I'm going to add such condition to my patch. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-06-12 Thread Alexander Korotkov
for tsvector only article about pg_stats view was corrected. I've corrected this article a little bit too. -- With best regards, Alexander Korotkov. arrayanalyze-0.3.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: Fast GiST index build

2011-06-06 Thread Alexander Korotkov
distribution. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-06 Thread Alexander Korotkov
algorithm invokes more penalty calls than repeatable insert algorithm. If I succeed then it will invoke even more such calls. So, if penalty function is very slow then gist fast build will be slower than repeatable insert. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: Fast GiST index build

2011-06-06 Thread Alexander Korotkov
On Mon, Jun 6, 2011 at 4:14 PM, Alexander Korotkov aekorot...@gmail.com wrote: If I succeed then it will invoke even more such calls. I meant here that if I succeed in enhancements which improve index quality then fast build algorithm will invoke even more such calls. -- With best regards

[HACKERS] WIP: Fast GiST index build

2011-06-03 Thread Alexander Korotkov
regards, Alexander Korotkov. gist_fast_build-0.0.3.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] pgsql: Protect GIST logic that assumes penalty values can't be negative

2011-06-01 Thread Alexander Korotkov
penalty(B,A) = 0 -- With best regards, Alexander Korotkov.

Re: [HACKERS] Cube Index Size

2011-06-01 Thread Alexander Korotkov
will take more time, but it should give more stable and predictable result. 3) I had some experiments with my own picksplit algorithm, which showed pretty good results on tests which I've run. But current implementation is dirty and it requires more testing. -- With best regards, Alexander

Re: [HACKERS] Cube Index Size

2011-06-01 Thread Alexander Korotkov
with maximal difference of insertion cost is inserted. Quadratic algorithm runs slower than the sorting one, but on my tests it shows slightly better results. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Fix for GiST penalty

2011-05-31 Thread Alexander Korotkov
: -- With best regards, Alexander Korotkov. *** a/doc/src/sgml/gist.sgml --- b/doc/src/sgml/gist.sgml *** *** 377,383 my_decompress(PG_FUNCTION_ARGS) para Returns a value indicating the quotecost/quote of inserting the new entry into a particular branch of the tree

[HACKERS] Fix for GiST penalty

2011-05-30 Thread Alexander Korotkov
: shared hit=44 - Bitmap Index Scan on test_idx (cost=0.00..44.10 rows=1000 width=0) (actual time=0.966..0.966 rows=24 loops=1) Index Cond: (point @ '(0.505,0.505),(0.5,0.5)'::box) Buffers: shared hit=20 Total runtime: 1.313 ms (7 rows) -- With best regards, Alexander Korotkov

[HACKERS] Small patch for GiST: move childoffnum to child

2011-05-24 Thread Alexander Korotkov
existing code a bit. Heikki advised me that since this change simplifies existing code it can be considered as a separate patch. -- With best regards, Alexander Korotkov. gist_childoffnum.path Description: Binary data

Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-05-23 Thread Alexander Korotkov
be too expensive calculations for estimation. Also, likely current comments don't clarify anything... -- With best regards, Alexander Korotkov. arrayanalyze-0.2.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-05-23 Thread Alexander Korotkov
I forgot to commit before diff. Here is correct version. -- With best regards, Alexander Korotkov. arrayanalyze-0.2.1.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-06 Thread Alexander Korotkov
will be 0-new_page-234. If algorithm could be able to change root then new path could be looked as new_root-new_page-234 because old root could be split to old_root_page and new_page. Ok. Thank you for explanation. With best regards, Alexander Korotkov.

Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-05 Thread Alexander Korotkov
*/ break; As I understood it's because we can't move root to another page. With best regards, Alexander Korotkov.

Re: [HACKERS] Extreme bloating of intarray GiST indexes

2011-05-04 Thread Alexander Korotkov
be many inserts into such pages in future then they will stay bloated. With best regards, Alexander Korotkov.

Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-05-04 Thread Alexander Korotkov
? With best regards, Alexander Korotkov.

Re: [HACKERS] Extreme bloating of intarray GiST indexes

2011-04-28 Thread Alexander Korotkov
for GiST index: gist__int_ops or gist__intbig_ops? Do you take into account that gist__int_ops is very inefficient for large datasets? With best regards, Alexander Korotkov.

Re: [HACKERS] Extreme bloating of intarray GiST indexes

2011-04-28 Thread Alexander Korotkov
to bitmap). If this problem is urgent, I can write a patch with opclass that would seem more suitable to be default to me, when I'll have a time for it. With best regards, Alexander Korotkov.

Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-04-27 Thread Alexander Korotkov
On Tue, Apr 26, 2011 at 1:10 PM, Alexander Korotkov aekorot...@gmail.com wrote: Since algorithm is focused to reduce I/O, we should expect best acceleration in the case when index doesn't fit in memory. Size of buffers is comparable to size of whole index. It means that if we can hold

Re: [HACKERS] GSoC 2011: Fast GiST index build

2011-04-26 Thread Alexander Korotkov
/s00453-001-0107-6 Yes, these priorities seem very reasonable. We should have first effectiveness confirmation as soon as possible. I'll stick to these priorities. With best regards, Alexander Korotkov.

[HACKERS] GSoC 2011: Fast GiST index build

2011-04-25 Thread Alexander Korotkov
calculations. I'm going to hold on this assumption in first implementation. With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-05 Thread Alexander Korotkov
On Mon, Apr 4, 2011 at 9:01 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Apr 4, 2011 at 12:38 PM, Alexander Korotkov aekorot...@gmail.com wrote: relatively small when q = 5. Accordingly, I think we should expect indexes to be usable with at least with q = 5. I defer to your

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-05 Thread Alexander Korotkov
contains btree on gin keys, i.e. q-grams), while data pages number (which contains links to rows in lists or btrees) will be similar. In dependence on data volume index size can be 10x larger (on small datasets) or few percents larger (on large datasets). With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-05 Thread Alexander Korotkov
(about 120 millions of links), while size of q-grams itself will be almost ignorable in comparison with it. With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-05 Thread Alexander Korotkov
' and 'aab' from second string (for simplicity, there is no padding here). GIN index will contain structure, which can be represented so: 'aaa' = 1, 2 'aab' = 2 We can see, that there are 2 unique 3-grams, but 3 links to the rows. With best regards, Alexander Korotkov.
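The posting-list structure described above can be reproduced with a minimal sketch. The snippet is truncated, so the two input strings are an assumption: row 1 is taken to be 'aaa' and row 2 'aaab', which yields exactly the 3-grams named in the message. The helper names are hypothetical and this models only the key-to-rows mapping, not the actual GIN on-disk layout.

```python
from collections import defaultdict

def qgrams(text, q=3):
    """Return the distinct q-grams of text, without padding (as in the message)."""
    return {text[i:i + q] for i in range(len(text) - q + 1)}

def build_postings(rows, q=3):
    """Map each q-gram to the sorted list of row ids whose text contains it."""
    postings = defaultdict(set)
    for row_id, text in rows.items():
        for gram in qgrams(text, q):
            postings[gram].add(row_id)
    return {gram: sorted(ids) for gram, ids in postings.items()}

# Assumed inputs: chosen so that row 1 contributes 'aaa' and
# row 2 contributes 'aaa' and 'aab'.
rows = {1: "aaa", 2: "aaab"}
postings = build_postings(rows)

unique_grams = len(postings)                               # distinct index keys
total_links = sum(len(ids) for ids in postings.values())   # key-to-row links
```

With these inputs the number of keys and the number of links differ, which is the distinction the message draws between unique q-grams and links to rows.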

Re: [HACKERS] GSoC proposal: Fast GiST index build

2011-04-05 Thread Alexander Korotkov
function. With best regards, Alexander Korotkov.

[HACKERS] GSoC proposal: Fast GiST index build

2011-04-04 Thread Alexander Korotkov
and Experimentation With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-04 Thread Alexander Korotkov
Proceedings of the 27th International Conference on Very Large Data Bases With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-04-04 Thread Alexander Korotkov
On Mon, Apr 4, 2011 at 6:56 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Apr 4, 2011 at 7:35 AM, Alexander Korotkov aekorot...@gmail.com wrote: I would like to propose a q-gram module which would have following differences in comparison with pg_trgm: 1) Focus on acceleration

Re: [HACKERS] GSoC proposal: Fast GiST index build

2011-04-04 Thread Alexander Korotkov
On Mon, Apr 4, 2011 at 7:04 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Apr 4, 2011 at 7:16 AM, Alexander Korotkov aekorot...@gmail.com wrote: Project name Fast GiST index build Would/could/should this be implemented in a manner similar to the existing GIN fast update feature

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-03-28 Thread Alexander Korotkov
on such stuff. (It sounds reasonable to me, but I wouldn't know if there are problems in the idea.) They may be too busy right at the moment. Thank you for reply. I'm going to ask Oleg and Teodor for their feedback. With best regards, Alexander Korotkov.

Re: [HACKERS] Proposal: q-gram GIN and GiST indexes

2011-03-26 Thread Alexander Korotkov
On Fri, Mar 25, 2011 at 8:32 PM, Alexander Korotkov aekorot...@gmail.com wrote: I would like to ask you about currency of the work above. This seems to be a mess of words. Sorry for my bad English. Actually, I meant that I need an appraisal of my proposal. With best regards, Alexander

[HACKERS] Proposal: q-gram GIN and GiST indexes

2011-03-25 Thread Alexander Korotkov
, where would you like to see it: separate project, contrib module, core (of course, in the case when code has sufficient quality)? I have a strong confidence level about implementability of this project in a few months. That's why I could propose this as a GSoC project. With best regards, Alexander

Re: [HACKERS] GSoC 2011 - Mentors? Projects?

2011-03-08 Thread Alexander Korotkov
, Alexander Korotkov. On Tue, Mar 8, 2011 at 9:44 AM, Selena Deckelmann sel...@chesnok.com wrote: Hi! PostgreSQL is applying for GSoC again this year. We're looking for: * Mentors * Project ideas Would you like to mentor? Please let me know! Our application closes on Friday, so please contact me

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

2011-03-06 Thread Alexander Korotkov
indexes is not mentioned here. With best regards, Alexander Korotkov.

[HACKERS] WIP: collect frequency statistics for arrays

2011-02-23 Thread Alexander Korotkov
, but using it for non-natural-language data is beyond its purpose). -- With best regards, Alexander Korotkov. arrayanalyze-0.1.patch.gz Description: GNU Zip compressed data

[HACKERS] Fix for fuzzystrmatch

2011-02-19 Thread Alexander Korotkov
, Alexander Korotkov. *** a/contrib/fuzzystrmatch/Makefile --- b/contrib/fuzzystrmatch/Makefile *** *** 16,18 top_builddir = ../.. --- 16,21 include $(top_builddir)/src/Makefile.global include $(top_srcdir)/contrib/contrib-global.mk endif + + fuzzystrmatch.o: fuzzystrmatch.c

Re: [HACKERS] Proposal: collect frequency statistics for arrays

2011-02-19 Thread Alexander Korotkov
Thanks for feedback on my proposal. Ok, I'll write it as a separate function. After that I'm going to look if there is a way to unite them without a kluge. If I don't find such a way then I'll propose patch with separate function. -- With best regards, Alexander Korotkov.

[HACKERS] Proposal: collect frequency statistics for arrays

2011-02-18 Thread Alexander Korotkov
. ts_typanalyze internally uses lexeme comparison and hashing. I'm going to use functions from default btree and hash opclasses of array element type in this capacity. Collected frequency statistics for arrays can be used for and @ operators selectivity estimation. -- With best regards, Alexander

Re: [HACKERS] wildcard search support for pg_trgm

2011-02-01 Thread Alexander Korotkov
unrelated feature patch. Ok. Actually, I don't think of just an increase of SIGLENINT as a solution. I believe that we need to have it as index parameter. I'll try to provide more tests in order to motivate this. With best regards, Alexander Korotkov.

Re: [HACKERS] wildcard search support for pg_trgm

2011-01-30 Thread Alexander Korotkov
Hi! On Mon, Jan 31, 2011 at 12:52 AM, Jan Urbański wulc...@wulczer.org wrote: I saw that the code tries to handle ILIKE searches, but apparently it's failing somewhere. It was just a typo. Corrected version attached. With best regards, Alexander Korotkov. *** a/contrib/pg_trgm
