[HACKERS] multibyte charater set in levenshtein function

2010-05-12 Thread Alexander Korotkov
sets. test=# select levenshtein('фыва','аыва'); levenshtein - 1 (1 row) Also it avoids text_to_cstring call. Regards, Alexander Korotkov. fuzzystrmatch.diff.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org
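The difference between character-level and byte-level distance in the example above can be illustrated with a minimal sketch. It is written in Python rather than the C of the patch, and the function below is a textbook two-row Levenshtein, not the code from fuzzystrmatch; it only assumes the strings are compared as UTF-8.

```python
def levenshtein(a, b):
    """Textbook two-row dynamic-programming edit distance over any sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


source, target = "фыва", "аыва"
# Comparing characters: one substitution.
char_distance = levenshtein(source, target)
# Comparing raw UTF-8 bytes: both bytes of the first character differ.
byte_distance = levenshtein(source.encode("utf-8"), target.encode("utf-8"))
print(char_distance, byte_distance)
```

Comparing the encoded bytes counts each differing byte of a two-byte character separately, which is why a byte-wise implementation overstates the distance for multibyte encodings.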

Re: [HACKERS] multibyte charater set in levenshtein function

2010-05-13 Thread Alexander Korotkov
On Thu, May 13, 2010 at 6:03 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Well, since it's only used in one place, why are you defining a macro at all? In order to structure the code better. My question was about something else: is the memcmp function a good choice for comparing very short sequences of

Re: [HACKERS] multibyte charater set in levenshtein function

2010-05-13 Thread Alexander Korotkov
On Wed, May 12, 2010 at 11:04 PM, Alvaro Herrera alvhe...@alvh.no-ip.org wrote: On a quick look, I didn't like the way you separated the pg_database_encoding_max_length() 1 cases. There seems to be too much common code. Can that be refactored a bit better? I did a little refactoring in order

Re: [HACKERS] multibyte charater set in levenshtein function

2010-06-07 Thread Alexander Korotkov
runtime: 42.292 ms (6 rows) In the example above levenshtein_less_equal works about 5 times faster. With best regards, Alexander Korotkov. fuzzystrmatch-0.3.diff.gz Description: GNU Zip compressed data
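One reason a bounded variant can be faster is that it may stop as soon as the distance is known to exceed the limit. The sketch below, in Python, shows that idea only; `bounded_levenshtein` is a hypothetical name, and the early-exit rule used here (give up when every cell of the current row exceeds max_d) is an assumption for illustration, not the algorithm of the attached patch.

```python
def bounded_levenshtein(a, b, max_d):
    """Return the exact distance if it is <= max_d, otherwise max_d + 1."""
    # The length difference is a lower bound on the distance.
    if abs(len(a) - len(b)) > max_d:
        return max_d + 1
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        # Values never decrease along a path through the matrix, so once the
        # whole row is above the limit the final result must be as well.
        if min(curr) > max_d:
            return max_d + 1
        prev = curr
    return min(prev[-1], max_d + 1)


print(bounded_levenshtein("kitten", "sitting", 3))
print(bounded_levenshtein("kitten", "sitting", 2))
```

With a small limit, most non-matching pairs are rejected after a few rows or by the length check alone, which is where the saving comes from in this sketch.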

[HACKERS] Parameters of GiST indexes

2010-06-07 Thread Alexander Korotkov
Hi hackers, I found that some parameters of the GiST implementation are built into the code. For example, the following can be found in backend/utils/adt/tsgistidx.c: #define SIGLENINT 31 /* 121 = key will toast, so it will not work * !!! */ #define

[HACKERS] Using multidimensional indexes in ordinal queries

2010-06-07 Thread Alexander Korotkov
cube index for original query. With best regards, Alexander Korotkov. cube.gz Description: GNU Zip compressed data

Re: [HACKERS] Using multidimensional indexes in ordinal queries

2010-06-21 Thread Alexander Korotkov
On Mon, Jun 21, 2010 at 5:42 PM, Robert Haas robertmh...@gmail.com wrote: It seems like you can get more or less the same benefit from a multicolumn btree index. On my system, with the individual btree indices, the query ran in 7625 ms; with an additional index on (v1, v2, v3), it ran in 94

Re: [HACKERS] Using multidimensional indexes in ordinal queries

2010-06-22 Thread Alexander Korotkov
On Tue, Jun 22, 2010 at 1:58 AM, Robert Haas robertmh...@gmail.com wrote: It doesn't? I didn't think it was making any assumptions about the ordering data type beyond the fact that it had a default btree opclass. Actually, the return type of consistent method was replaced by float8.

Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-13 Thread Alexander Korotkov
, VARDATA_ANY, and VARSIZE_ANY_EXHDR? Unpacking versions make the core a bit faster. Fixed. With best regards, Alexander Korotkov. fuzzystrmatch-0.4.diff.gz Description: GNU Zip compressed data

Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-21 Thread Alexander Korotkov
, then please add my name, too, because I've already patched this code once... In that case I think we can leave original acknowledgments section. With best regards, Alexander Korotkov.

Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-21 Thread Alexander Korotkov
)) from words; sum - 1074376 (1 row) Time: 254,819 ms The function with negative value of max_d didn't become faster than with just big value of max_d. With best regards, Alexander Korotkov.

Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-22 Thread Alexander Korotkov
- 1]; /* * Because the final value was swapped from the previous row to the * current row, that's where we'll find it. */ return d; } What do you think about it? With best regards, Alexander Korotkov.

Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-22 Thread Alexander Korotkov
and includes (like in 'backend/utils/adt/like.c'). Do you think it is acceptable in this case? With best regards, Alexander Korotkov.

Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-27 Thread Alexander Korotkov
--+-+- 4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither | 70983 (1 row) Time: 2983,244 ms With best regards, Alexander Korotkov. fuzzystrmatch-0.5.tar.gz Description: GNU Zip compressed data

Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-29 Thread Alexander Korotkov
I forgot attribution in levenshtein.c file. With best regards, Alexander Korotkov. fuzzystrmatch-0.5.1.tar.gz Description: GNU Zip compressed data

Re: [HACKERS] knngist - 0.8

2010-07-30 Thread Alexander Korotkov
to combine different similarity levels in one query. For example: select * from test_trgm order by t - 'asdf' 0.5 or t - 'qwer' 0.4; Is there any chance to handle this syntax also? With best regards, Alexander Korotkov.

Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-04 Thread Alexander Korotkov
) { while (len--) { if (*s++ != *d++) return false; } return true; } Code becomes much faster: test=# select sum(levenshtein(word, 'фывафыва')) from test; sum - 1675281 (1 row) Time: 241,272 ms With best regards, Alexander Korotkov.

Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-04 Thread Alexander Korotkov
Now I think patch is as good as can be. :) I'm going to prepare less-or-equal function in same manner as this patch. With best regards, Alexander Korotkov.

Re: [HACKERS] knngist - 0.8

2010-08-09 Thread Alexander Korotkov
is global for session. That's why we can't create complex queries which contain similarity filtering with different threshold. With best regards, Alexander Korotkov. On Mon, Aug 2, 2010 at 8:14 PM, Robert Haas robertmh...@gmail.com wrote: 2010/7/29 Alexander Korotkov aekorot...@gmail.com

Re: [HACKERS] knngist - 0.8

2010-08-10 Thread Alexander Korotkov
-baked patch that I'm planning to submit to some of the CFs. Unfortunately, I don't have time for it ATM. User-defined parameters for GiST would be a great feature. I'm performing some experiments with GiST and I'm really feeling the need of it. With best regards, Alexander Korotkov.

Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-28 Thread Alexander Korotkov
Here is the patch which adds levenshtein_less_equal function. I'm going to add it to current commitfest. With best regards, Alexander Korotkov. On Tue, Aug 3, 2010 at 3:23 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Aug 2, 2010 at 5:07 PM, Alexander Korotkov aekorot...@gmail.com

Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-28 Thread Alexander Korotkov
; sum - 1091878 (1 row) Time: 673,515 ms With best regards, Alexander Korotkov.

Re: [HACKERS] Real-life range datasets

2011-12-23 Thread Alexander Korotkov
regards, Alexander Korotkov.

Re: [HACKERS] Collect frequency statistics for arrays

2012-01-03 Thread Alexander Korotkov
.). -- With best regards, Alexander Korotkov. arrayanalyze-0.8.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] Collect frequency statistics for arrays

2012-01-03 Thread Alexander Korotkov
On Wed, Jan 4, 2012 at 12:33 AM, Noah Misch n...@leadboat.com wrote: On Wed, Jan 04, 2012 at 12:09:16AM +0400, Alexander Korotkov wrote: Thanks for your great work on reviewing this patch. Now I'm trying to find the memory corruption bug. Unfortunately it doesn't appear on my system. Can you

Re: [HACKERS] Collect frequency statistics for arrays

2012-01-07 Thread Alexander Korotkov
of elements. Subtracting mcelem frequencies from avg_length, we have the sum of frequencies of non-mcelem elements. -- With best regards, Alexander Korotkov. arrayanalyze-0.9.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] Collect frequency statistics for arrays

2012-01-17 Thread Alexander Korotkov
that one different? Oh, I didn't update all array types in 2 tries :) Fixed. -- With best regards, Alexander Korotkov. arrayanalyze-0.11.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: index support for regexp search

2012-01-19 Thread Alexander Korotkov
function in pg_wchar_tbl which converts pg_wchar back to multibyte character is possible solution? -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: index support for regexp search

2012-01-19 Thread Alexander Korotkov
to pg_wchar is possible from these encodings? -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: index support for regexp search

2012-01-19 Thread Alexander Korotkov
On Fri, Jan 20, 2012 at 1:07 AM, Alexander Korotkov aekorot...@gmail.com wrote: What do the last 7 zeros in the first column mean? No conversion to pg_wchar is possible from these encodings? Uh, I see. These encodings are not supported as server encodings. -- With best regards, Alexander

Re: [HACKERS] WIP: index support for regexp search

2012-01-20 Thread Alexander Korotkov
bug, I think, when running with 'set enable_seqscan=off' in combination with a too-large regex: Thanks for pointing that out. It will be fixed. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: index support for regexp search

2012-01-20 Thread Alexander Korotkov
slow (because index scan itself is breakable). So, it just shouldn't work so long. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: index support for regexp search

2012-01-20 Thread Alexander Korotkov
On Fri, Jan 20, 2012 at 12:54 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Jan 20, 2012 at 12:30 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Apart from that, the multibyte issue seems like the big one. Any way around that? Conversion of pg_wchar

Re: [HACKERS] Collect frequency statistics for arrays

2012-01-22 Thread Alexander Korotkov
of elements. -- With best regards, Alexander Korotkov. arrayanalyze-0.12.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] Collect frequency statistics for arrays

2012-01-23 Thread Alexander Korotkov
, please post another update. Changes look reasonable to me. Thanks! -- With best regards, Alexander Korotkov.

Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2012-01-24 Thread Alexander Korotkov
, we'd need to make sure it terminated at some point, but splitting the common entries does seem like a smaller version of the original problem. Thoughts? That was a bug. Actually, no abs is needed. Indeed it doesn't affect result significantly. - With best regards, Alexander Korotkov

Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2012-01-29 Thread Alexander Korotkov
fixes at your discretion. Patch with your comment fixes is attached. - With best regards, Alexander Korotkov. rangetypegist-0.7.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] spgist text_ops and LIKE

2012-02-02 Thread Alexander Korotkov
that. - With best regards, Alexander Korotkov.

Re: [HACKERS] Fast GiST index build - further improvements

2012-02-02 Thread Alexander Korotkov
. So, users could just tune effective_cache_size for gist index build on high concurrency. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Bugs/slowness inserting and indexing cubes

2012-02-08 Thread Alexander Korotkov
with the following setting: effective_cache_size = 3734MB because buffering GiST index build just shouldn't turn on in this case, when the index fits in cache. I'm going to take a detailed look at this. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Bugs/slowness inserting and indexing cubes

2012-02-14 Thread Alexander Korotkov
return from the function instead of triggering error. Patch is attached. -- With best regards, Alexander Korotkov. *** a/src/backend/access/gist/gistbuildbuffers.c --- b/src/backend/access/gist/gistbuildbuffers.c *** *** 607,617 gistRelocateBuildBuffersOnSplit(GISTBuildBuffers

Re: [HACKERS] Bugs/slowness inserting and indexing cubes

2012-02-15 Thread Alexander Korotkov
On Wed, Feb 15, 2012 at 2:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: Alexander Korotkov aekorot...@gmail.com writes: ISTM, I found the problem. This piece of code is triggering an error. It assumes each page of corresponding to have initialized buffer. That should be true because we're

Re: [HACKERS] Bugs/slowness inserting and indexing cubes

2012-02-15 Thread Alexander Korotkov
buffers for the new sibling pages. In the final emptying phase, that's a waste of time, the buffers we create will never be used, and even before that I think it's better to create the buffers lazily. I agree. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Designing an extension for feature-space similarity search

2012-02-16 Thread Alexander Korotkov
similarity search using GiST (for example, sets, strings etc.). -- With best regards, Alexander Korotkov.

Re: [HACKERS] Designing an extension for feature-space similarity search

2012-02-17 Thread Alexander Korotkov
empty flag was introduced. This contain empty flag indicates that underlying value can be empty. So, this flag is set when union with empty range or other range with this flag set. It's likely you need similar flag for each dimension. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Designing an extension for feature-space similarity search

2012-02-17 Thread Alexander Korotkov
On Fri, Feb 17, 2012 at 11:32 PM, Jay Levitt jay.lev...@gmail.com wrote: Alexander Korotkov wrote: On Fri, Feb 17, 2012 at 11:00 PM, Jay Levitt jay.lev...@gmail.com wrote: At first I thought this posed a challenge for union; if I have these points

Re: [HACKERS] Incorrect behaviour when using a GiST index on points

2012-02-20 Thread Alexander Korotkov
On Thu, Feb 16, 2012 at 11:43 AM, Alexander Korotkov aekorot...@gmail.com wrote: The described differences lead to incorrect behaviour of the GiST index. The question is: what is the correct way to fix it? Should on_pb also use FP*, or should the consistent method behave like on_pb? Any comments

Re: [HACKERS] Google Summer of Code? Call for mentors.

2012-02-20 Thread Alexander Korotkov
I'm spending my time on the right things). FYI, I found myself to be eligible for this year. So, if PostgreSQL participates this year, I'll make some proposals on indexing. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Bugs/slowness inserting and indexing cubes

2012-02-20 Thread Alexander Korotkov
On Wed, Feb 15, 2012 at 7:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alexander Korotkov aekorot...@gmail.com writes: On Wed, Feb 15, 2012 at 4:26 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: So, I think we should go with your original fix and simply do nothing

Re: [HACKERS] Incorrect behaviour when using a GiST index on points

2012-02-20 Thread Alexander Korotkov
On Mon, Feb 20, 2012 at 7:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alexander Korotkov aekorot...@gmail.com writes: On Thu, Feb 16, 2012 at 11:43 AM, Alexander Korotkov aekorot...@gmail.com wrote: The described differences lead to incorrect behaviour of the GiST index. The question is: what

Re: [HACKERS] Incorrect behaviour when using a GiST index on points

2012-02-22 Thread Alexander Korotkov
Attached patch fixes GiST behaviour without altering operators behaviour. -- With best regards, Alexander Korotkov. *** a/src/backend/access/gist/gistproc.c --- b/src/backend/access/gist/gistproc.c *** *** 836,842 gist_box_picksplit(PG_FUNCTION_ARGS) } /* ! * Equality

Re: [HACKERS] Collect frequency statistics for arrays

2012-02-29 Thread Alexander Korotkov
/msg00780.php Probably, btree statistics really do matter for some sorts of arrays? For example, arrays representing paths in a tree. We could request a subtree in a range query on such arrays. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Collect frequency statistics for arrays

2012-02-29 Thread Alexander Korotkov
On Thu, Mar 1, 2012 at 1:09 AM, Tom Lane t...@sss.pgh.pa.us wrote: Alexander Korotkov aekorot...@gmail.com writes: On Thu, Mar 1, 2012 at 12:39 AM, Tom Lane t...@sss.pgh.pa.us wrote: I am starting to look at this patch now. I'm wondering exactly why the decision was made to continue

Re: [HACKERS] Collect frequency statistics for arrays

2012-03-01 Thread Alexander Korotkov
On Thu, Mar 1, 2012 at 1:19 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Thu, Mar 1, 2012 at 1:09 AM, Tom Lane t...@sss.pgh.pa.us wrote: That seems like a pretty narrow, uncommon use-case. Also, to get accurate stats for such queries that way, you'd need really enormous histograms

Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Alexander Korotkov
array of unique DEC and it's frequency. -- With best regards, Alexander Korotkov. *** a/src/backend/utils/adt/array_typanalyze.c --- b/src/backend/utils/adt/array_typanalyze.c *** *** 581,587 compute_array_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc

Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Alexander Korotkov
case the rough estimate is quite accurate. But in most cases it behaves really badly. That is why I started to invent calc_distr etc. So, I think returning DEFAULT_CONTAIN_SEL is OK unless we have some better ideas. -- With best regards, Alexander Korotkov.

Re: [HACKERS] Collect frequency statistics for arrays

2012-03-05 Thread Alexander Korotkov
statkind and statistics slot. Probably, you've better ideas? -- With best regards, Alexander Korotkov.

Re: [HACKERS] Collect frequency statistics for arrays

2012-03-12 Thread Alexander Korotkov
On Thu, Mar 8, 2012 at 4:51 AM, Tom Lane t...@sss.pgh.pa.us wrote: Alexander Korotkov aekorot...@gmail.com writes: True. If (max count - min count + 1) is small, enumerating of frequencies is both more compact and more precise representation. Simultaneously, if (max count - min count + 1

Re: [HACKERS] Incorrect behaviour when using a GiST index on points

2012-03-12 Thread Alexander Korotkov
I believe that the attached version of the patch can be backpatched. It fixes this problem without altering the index building procedure. It just makes the checks in internal pages soft enough to compensate for the effect of the gist_box_same implementation. -- With best regards, Alexander Korotkov. *** a/src

[HACKERS] Wildcard search support for pg_trgm

2010-12-11 Thread Alexander Korotkov
. But on longer strings greater values of SIGLENINT may be required (probably even SIGLENINT 122 will give benefit in some cases in spite of TOAST). With best regards, Alexander Korotkov. *** a/contrib/pg_trgm/pg_trgm.sql.in --- b/contrib/pg_trgm/pg_trgm.sql.in *** *** 113,118

Re: [HACKERS] Wildcard search support for pg_trgm

2010-12-12 Thread Alexander Korotkov
I found another problem. The GIN index suffers from "GIN indexes do not support whole-index scans" when no trigram can be extracted from the pattern. With best regards, Alexander Korotkov.
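Why some patterns give the index nothing to search for can be seen in a deliberately simplified sketch. The Python below is not pg_trgm's extraction code: `like_pattern_trigrams` is a hypothetical helper that ignores pg_trgm's blank padding at word boundaries, case folding, and escape characters, and simply takes three-character windows from the literal fragments between wildcards.

```python
import re


def like_pattern_trigrams(pattern):
    """Collect trigrams from the literal fragments of a LIKE pattern.

    Simplifying assumptions: '%' and '_' are always wildcards (no escapes),
    and only fragments of three or more characters contribute trigrams.
    """
    trigrams = set()
    for fragment in re.split(r"[%_]", pattern):
        for i in range(len(fragment) - 2):
            trigrams.add(fragment[i:i + 3])
    return trigrams


print(sorted(like_pattern_trigrams("%hello%")))
print(sorted(like_pattern_trigrams("%aa%")))
```

Under these assumptions a pattern such as '%aa%' produces an empty set, so there are no keys to look up in the index.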

Re: [HACKERS] Wildcard search support for pg_trgm

2010-12-12 Thread Alexander Korotkov
with it. Because testing on dictionaries is good, but obviously not enough. With best regards, Alexander Korotkov.

Re: [HACKERS] Wildcard search support for pg_trgm

2011-01-08 Thread Alexander Korotkov
#27 0x0822be12 in PostmasterMain (argc=3, argv=0x99acc58) at postmaster.c:1109 #28 0x081ce3b7 in main (argc=3, argv=0x99acc58) at main.c:199 With best regards, Alexander Korotkov. *** a/contrib/pg_trgm/pg_trgm.sql.in --- b/contrib/pg_trgm/pg_trgm.sql.in *** *** 113,118

Re: [HACKERS] Wildcard search support for pg_trgm

2011-01-17 Thread Alexander Korotkov
Hi, Here is updated version of this patch. With best regards, Alexander Korotkov. *** a/contrib/pg_trgm/pg_trgm.sql.in --- b/contrib/pg_trgm/pg_trgm.sql.in *** *** 113,118 FOR TYPE text USING gist --- 113,120 AS OPERATOR1 % (text, text

Re: [HACKERS] wildcard search support for pg_trgm

2011-01-24 Thread Alexander Korotkov
. After that new index can be created and it will support like strategy. Although actually there is no need of index recreation, I don't see easier way to do this. With best regards, Alexander Korotkov.

Re: [HACKERS] wildcard search support for pg_trgm

2011-01-29 Thread Alexander Korotkov
these functions no longer do the same. New arguments were added to the SQL description of the GIN interface functions in order to make it conform to the new GIN interface. See docs of development version: http://developer.postgresql.org/pgdocs/postgres/gin-extensibility.html. With best regards, Alexander

Re: [HACKERS] wildcard search support for pg_trgm

2011-01-30 Thread Alexander Korotkov
Hi! On Mon, Jan 31, 2011 at 12:52 AM, Jan Urbański wulc...@wulczer.org wrote: I saw that the code tries to handle ILIKE searches, but apparently it's failing somewhere. It was just a typo. Corrected version attached. With best regards, Alexander Korotkov. *** a/contrib/pg_trgm

Re: [HACKERS] wildcard search support for pg_trgm

2011-02-01 Thread Alexander Korotkov
unrelated feature patch. Ok. Actually, I don't think that just increasing SIGLENINT is a solution. I believe that we need to have it as an index parameter. I'll try to provide more tests in order to motivate this. With best regards, Alexander Korotkov.

[HACKERS] WIP: store additional info in GIN index

2012-11-18 Thread Alexander Korotkov
and OffsetNumber. BlockNumbers are stored incrementally within a page. Additionally, one bit of OffsetNumber is reserved for the additional information NULL flag. To be able to find a position in a leaf data page quickly, the patch introduces a small index at the end of the page. -- With best regards, Alexander Korotkov. ginaddinfo

Re: [HACKERS] pg_trgm partial-match

2012-11-18 Thread Alexander Korotkov
; SET test=# SELECT * FROM test WHERE val LIKE '%aa%'; val - aa aaa (2 rows) I think we can use partial match only for singlebyte encodings. Or, at most, in cases when all alpha-numeric characters are singlebyte (have no idea how to check this). -- With best regards, Alexander Korotkov.

Re: [HACKERS] pg_trgm partial-match

2012-11-19 Thread Alexander Korotkov
On Mon, Nov 19, 2012 at 10:05 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Thu, Nov 15, 2012 at 11:39 PM, Fujii Masao masao.fu...@gmail.com wrote: Note that we cannot do a partial-match if KEEPONLYALNUM is disabled, i.e., if query key contains multibyte characters. In this case, byte

Re: [HACKERS] WIP: index support for regexp search

2012-11-19 Thread Alexander Korotkov
Hi! New version of patch is attached. Changes are following: 1) Right way to convert from pg_wchar to multibyte. 2) Optimization of producing CFNA-like graph on trigrams (produce smaller, but equivalent, graphs in less time). 3) Comments and refactoring. -- With best regards, Alexander

Re: [HACKERS] WIP: index support for regexp search

2012-11-19 Thread Alexander Korotkov
WIP status. -- With best regards, Alexander Korotkov. trgm-regexp-0.3.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: index support for regexp search

2012-11-20 Thread Alexander Korotkov
: MAX_RESULT_STATES, MAX_RESULT_ARCS, MAX_RESULT_PATHS. They limit resource usage during regex processing. -- With best regards, Alexander Korotkov. trgm-regexp-0.4.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: index support for regexp search

2012-11-25 Thread Alexander Korotkov
the information from that presentation included in the patch. New version of patch is attached. The changes are following: 1) A big comment with high-level description of what is going on. 2) Regression tests. 3) Documentation update. 4) Some more refactoring. -- With best regards, Alexander

Re: [HACKERS] WIP: index support for regexp search

2012-11-25 Thread Alexander Korotkov
about another opclass for GiST focusing on regex and LIKE/ILIKE search? However, I can create an additional patch for the current GiST opclass anyway. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: index support for regexp search

2012-11-26 Thread Alexander Korotkov
, Alexander Korotkov. trgm-regexp-0.6.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] WIP: index support for regexp search

2012-11-30 Thread Alexander Korotkov
unprocessed branch to immediately finish with matching (this can give us more false positives but no false negatives). For overflow of matrix collection, it's safe to do just OR between all the trigrams. New version of patch is attached. -- With best regards, Alexander Korotkov. trgm-regexp-0.7

Re: [HACKERS] WIP: index support for regexp search

2012-11-30 Thread Alexander Korotkov
Hi! On Thu, Nov 29, 2012 at 12:58 PM, er e...@xs4all.nl wrote: On Mon, November 26, 2012 20:49, Alexander Korotkov wrote: trgm-regexp-0.6.patch.gz I ran the simple-minded tests against generated data (similar to the ones I did in January 2012). The problems of that older version seem

Re: [HACKERS] WIP: index support for regexp search

2012-11-30 Thread Alexander Korotkov
On Fri, Nov 30, 2012 at 3:20 PM, Alexander Korotkov aekorot...@gmail.com wrote: For depth-first it's not. Oh, I didn't explain it. In order to stop graph processing we need to be sure that we have put all outgoing arcs from a state, or assume that state to be final. In DFS we can be in the final part

Re: [HACKERS] WIP: index support for regexp search

2012-12-02 Thread Alexander Korotkov
On Fri, Nov 30, 2012 at 6:23 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 30.11.2012 13:20, Alexander Korotkov wrote: On Thu, Nov 29, 2012 at 5:25 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Would it be safe to simply stop short the depth

Re: [HACKERS] WIP: index support for regexp search

2012-12-02 Thread Alexander Korotkov
On Sat, Dec 1, 2012 at 3:22 PM, Erik Rijkers e...@xs4all.nl wrote: On Fri, November 30, 2012 12:22, Alexander Korotkov wrote: Hi! On Thu, Nov 29, 2012 at 12:58 PM, er e...@xs4all.nl wrote: On Mon, November 26, 2012 20:49, Alexander Korotkov wrote: I ran the simple-minded tests

Re: [HACKERS] WIP: index support for regexp search

2012-12-03 Thread Alexander Korotkov
into trigrams. Simultaneously, we shouldn't allow path from initial state to the final by unexpanded trigrams. It seems much harder to do with graph than with matrix. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: store additional info in GIN index

2012-12-04 Thread Alexander Korotkov
applied on HEAD (because of an assert added to ginRedoCreatePTree), but that shouldn't be a problem. Thanks for testing! Patch is rebased with HEAD. The bug you reported was fixed. -- With best regards, Alexander Korotkov. ginaddinfo.2.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] Patch for removng unused targets

2012-12-04 Thread Alexander Korotkov
? If it's so or there are some other cases which are easy to determine then I'll remove resorderbyonly flag. -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: store additional info in GIN index

2012-12-04 Thread Alexander Korotkov
On Tue, Dec 4, 2012 at 9:34 PM, Robert Haas robertmh...@gmail.com wrote: On Sun, Nov 18, 2012 at 4:54 PM, Alexander Korotkov aekorot...@gmail.com wrote: Patch completely changes storage in posting lists and leaf pages of posting trees. It uses varbyte encoding for BlockNumber
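The general idea of storing increments with a variable-length byte code can be sketched as follows. This Python example shows the generic technique only: the 7-bits-per-byte layout with a continuation flag, and all function names, are assumptions for illustration and not the on-disk format defined by the patch.

```python
def varbyte_encode(value):
    """Encode a non-negative integer, 7 bits per byte, low bits first.

    The high bit of each byte is set when more bytes follow.
    """
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_block_numbers(blocks):
    """Encode an ascending list of block numbers as varbyte-coded deltas."""
    out = bytearray()
    prev = 0
    for block in blocks:
        out += varbyte_encode(block - prev)
        prev = block
    return bytes(out)


def decode_block_numbers(data):
    """Invert encode_block_numbers by accumulating the decoded deltas."""
    blocks, prev, value, shift = [], 0, 0, 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += value
            blocks.append(prev)
            value, shift = 0, 0
    return blocks


blocks = [3, 5, 130, 131, 70000]
encoded = encode_block_numbers(blocks)
print(len(encoded), decode_block_numbers(encoded) == blocks)
```

Because neighbouring entries in a sorted list tend to differ by small amounts, most deltas fit in a single byte, which is the saving this kind of encoding aims for.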

Re: [HACKERS] Patch for removng unused targets

2012-12-04 Thread Alexander Korotkov
On Tue, Dec 4, 2012 at 11:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alexander Korotkov aekorot...@gmail.com writes: On Mon, Dec 3, 2012 at 8:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: But having said that, I'm wondering (without having read the patch) why you need anything more than

Re: [HACKERS] WIP: store additional info in GIN index

2012-12-05 Thread Alexander Korotkov
On Wed, Dec 5, 2012 at 1:56 AM, Tomas Vondra t...@fuzzy.cz wrote: On 4.12.2012 20:12, Alexander Korotkov wrote: Hi! On Sun, Dec 2, 2012 at 5:02 AM, Tomas Vondra t...@fuzzy.cz wrote: I've tried to apply the patch with the current HEAD, but I'm getting

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-12-13 Thread Alexander Korotkov
be fixes in the attached version of patch. However, it require significant rethinking of comments. Will update comments and address your questions in a couple of days. Could you recheck if attached patch really fixes problem you reported? -- With best regards, Alexander Korotkov. range_stat-0.9

Re: [HACKERS] gistchoose vs. bloat

2012-12-13 Thread Alexander Korotkov
Hi! On Sat, Dec 8, 2012 at 7:05 PM, Andres Freund and...@2ndquadrant.com wrote: I notice there's no documentation about the new reloption at all? Thanks for noticing! I've added a small description to the docs in the attached patch. -- With best regards, Alexander Korotkov. gist_choose_bloat

Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree

2012-12-13 Thread Alexander Korotkov
Hi! On Sun, Nov 4, 2012 at 11:41 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2012-11-02 at 12:47 +0400, Alexander Korotkov wrote: Right version of patch is attached. * In bounds_adjacent, there's no reason to flip the labels back. Fixed. * Comment should indicate more explicitly

Re: [HACKERS] WIP: index support for regexp search

2012-12-13 Thread Alexander Korotkov
On Mon, Dec 3, 2012 at 4:31 PM, Alexander Korotkov aekorot...@gmail.com wrote: Actually, I generally dislike path matrix for same reasons. But: 1) Output graphs could contain trigrams which are completely useless for search. For example, for regex /(abcdefgh)*ijk/ we need only ijk trigram

Re: [HACKERS] gistchoose vs. bloat

2012-12-14 Thread Alexander Korotkov
, I fixed a compiler warning. Thanks! -- With best regards, Alexander Korotkov.

Re: [HACKERS] WIP: index support for regexp search

2012-12-16 Thread Alexander Korotkov
On Fri, Dec 14, 2012 at 1:34 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Mon, Dec 3, 2012 at 4:31 PM, Alexander Korotkov aekorot...@gmail.com wrote: Actually, I generally dislike path matrix for same reasons. But: 1) Output graphs could contain trigrams which are completely useless

Re: [HACKERS] WIP: index support for regexp search

2012-12-17 Thread Alexander Korotkov
Hi! On Mon, Dec 17, 2012 at 12:54 PM, Erik Rijkers e...@xs4all.nl wrote: On Sun, December 16, 2012 22:25, Alexander Korotkov wrote: trgm-regexp-0.8.patch.gz 22 k Hi Alexander, I gave this a quick try; the patch works when compiled for DEBUG, but crashes as a 'speed'-compiled binary

Re: [HACKERS] WIP: index support for regexp search

2012-12-17 Thread Alexander Korotkov
On Mon, Dec 17, 2012 at 1:16 PM, Alexander Korotkov aekorot...@gmail.com wrote: Didn't reproduce it yet. Can you retry it with this line uncommented: #define TRGM_REGEXP_DEBUG Then we can see which stage it fails. Bug is found and fixed in attached patch. -- With best regards, Alexander

Re: [HACKERS] WIP: index support for regexp search

2012-12-18 Thread Alexander Korotkov
On Tue, Dec 18, 2012 at 11:45 AM, Erik Rijkers e...@xs4all.nl wrote: On Tue, December 18, 2012 08:04, Alexander Korotkov wrote: I ran the same test again: HEAD versus trgm_regex v6, 7 and 9. In v9 there is some gain but also some regression. It remains a difficult problem... If I get

Re: [HACKERS] WIP: index support for regexp search

2012-12-18 Thread Alexander Korotkov
On Tue, Dec 18, 2012 at 12:51 PM, Erik Rijkers e...@xs4all.nl wrote: On Tue, December 18, 2012 09:45, Alexander Korotkov wrote: You should use {0,n} to express from 0 to n occurrences. Thanks, but I know that of course. It's a testing program; and in the end robustness with unexpected

Re: [HACKERS] WIP: store additional info in GIN index

2012-12-22 Thread Alexander Korotkov
testcases. Could you share both database and benchmarking script? -- With best regards, Alexander Korotkov.

Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-01-04 Thread Alexander Korotkov
be better? Yes. Fixed. I also renamed get_length_hist_frac to get_length_hist_summ and rewrote comments about it. Hope it becomes more understandable. -- With best regards, Alexander Korotkov. range_stat-0.10.patch.gz Description: GNU Zip compressed data
