Re: [HACKERS] GIN improvements part2: fast scan

2014-09-26 Thread Heikki Linnakangas
On 09/25/2014 09:05 PM, Thom Brown wrote: On 12 March 2014 16:29, Heikki Linnakangas hlinnakan...@vmware.com wrote: Good point. We have done two major changes to GIN in this release cycle: changed the data page format and made it possible to skip items without fetching all the keys (fast scan).

Re: [HACKERS] GIN improvements part2: fast scan

2014-03-13 Thread Heikki Linnakangas
On 03/12/2014 07:52 PM, Alexander Korotkov wrote: * I just noticed that the dummy trueTriConsistentFn returns GIN_MAYBE, rather than GIN_TRUE. The equivalent boolean version returns 'true' without recheck. Is that a typo, or was there some reason for the discrepancy? Actually, there is not

Re: [HACKERS] GIN improvements part2: fast scan

2014-03-13 Thread Alexander Korotkov
On Thu, Mar 13, 2014 at 8:58 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 03/12/2014 07:52 PM, Alexander Korotkov wrote: * I just noticed that the dummy trueTriConsistentFn returns GIN_MAYBE, rather than GIN_TRUE. The equivalent boolean version returns 'true' without recheck.

Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Heikki Linnakangas
On 02/26/2014 11:25 PM, Alexander Korotkov wrote: On Thu, Feb 27, 2014 at 1:07 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Thu, Feb 20, 2014 at 1:48 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/09/2014 12:11 PM, Alexander Korotkov wrote: I've rebased catalog

Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Heikki Linnakangas
On 03/12/2014 12:09 AM, Tomas Vondra wrote: Hi all, a quick question that just occurred to me - do you plan to tweak the cost estimation for GIN indexes, in this patch? IMHO it would be appropriate, given the improvements and gains, but it seems to me gincostestimate() was not touched by this

Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Alexander Korotkov
On Wed, Mar 12, 2014 at 8:29 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 03/12/2014 12:09 AM, Tomas Vondra wrote: Hi all, a quick question that just occurred to me - do you plan to tweak the cost estimation for GIN indexes, in this patch? IMHO it would be appropriate, given

Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Alexander Korotkov
On Wed, Mar 12, 2014 at 8:02 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/26/2014 11:25 PM, Alexander Korotkov wrote: On Thu, Feb 27, 2014 at 1:07 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Thu, Feb 20, 2014 at 1:48 PM, Heikki Linnakangas

Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 1:52 PM, Alexander Korotkov aekorot...@gmail.com wrote: * This patch added a triConsistent function for array and tsvector opclasses. Were you planning to submit a patch to do that for the rest of the opclasses, like pg_trgm? (it's getting awfully late for that...)

Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Heikki Linnakangas
On 03/12/2014 07:42 PM, Alexander Korotkov wrote: Preparation we do in startScanKey requires knowledge of estimate size of posting lists/trees. We do this estimate by traversal to leaf pages. I think gincostestimate is expected to be way more cheap. So, we probably need so more rough estimate

Re: [HACKERS] GIN improvements part2: fast scan

2014-03-11 Thread Tomas Vondra
Hi all, a quick question that just occurred to me - do you plan to tweak the cost estimation for GIN indexes, in this patch? IMHO it would be appropriate, given the improvements and gains, but it seems to me gincostestimate() was not touched by this patch. I just ran into this while testing some

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-26 Thread Alexander Korotkov
On Thu, Feb 20, 2014 at 1:48 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/09/2014 12:11 PM, Alexander Korotkov wrote: I've rebased catalog changes with last master. Patch is attached. I've rerun my test suite with both last master ('committed') and attached patch

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-26 Thread Alexander Korotkov
On Thu, Feb 27, 2014 at 1:07 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Thu, Feb 20, 2014 at 1:48 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/09/2014 12:11 PM, Alexander Korotkov wrote: I've rebased catalog changes with last master. Patch is attached. I've rerun

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-20 Thread Heikki Linnakangas
On 02/09/2014 12:11 PM, Alexander Korotkov wrote: I've rebased catalog changes with last master. Patch is attached. I've rerun my test suite with both last master ('committed') and attached patch ('ternary-consistent'). Thanks! method | sum

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-09 Thread Alexander Korotkov
On Fri, Feb 7, 2014 at 5:33 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/06/2014 01:22 PM, Alexander Korotkov wrote: Difference is very small. For me, it looks ready for commit. Great, committed! Now, to review the catalog changes... I've rebased catalog changes with last

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-09 Thread Tomas Vondra
On 3.2.2014 07:53, Oleg Bartunov wrote: Tomasa, it'd be nice if you use real data in your testing. One very good application of gin fast-scan is dramatic performance improvement of hstore/jsonb @> operator, see slides 57, 58

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-09 Thread Erik Rijkers
On Sun, February 9, 2014 22:35, Tomas Vondra wrote: On 3.2.2014 07:53, Oleg Bartunov wrote: PS. I used data delicious-rss-1250k.gz from http://randomwalker.info/data/delicious/ I'm working on extending the GIN testing to include this test (and I'll use it to test both for GIN and hstore-v2

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-09 Thread Tomas Vondra
On 9.2.2014 22:51, Erik Rijkers wrote: On Sun, February 9, 2014 22:35, Tomas Vondra wrote: On 3.2.2014 07:53, Oleg Bartunov wrote: PS. I used data delicious-rss-1250k.gz from http://randomwalker.info/data/delicious/ I'm working on extending the GIN testing to include this test (and I'll use

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-07 Thread Heikki Linnakangas
On 02/06/2014 01:22 PM, Alexander Korotkov wrote: Difference is very small. For me, it looks ready for commit. Great, committed! Now, to review the catalog changes... - Heikki

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-06 Thread Heikki Linnakangas
On 02/05/2014 12:42 PM, Alexander Korotkov wrote: Attached patch is light version of fast scan. It does extra consistent function calls only on startScanKey, no extra calls during scan of the index. It finds subset of rarest entries absence of which guarantee false consistent function result.
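
A rough illustration of the idea in the snippet above ("finds subset of rarest entries absence of which guarantee false consistent function result"). This is a minimal Python sketch, not the patch's code: the names `split_entries`, `tri_and`, `tri_or` and the estimated sizes are hypothetical, and the real logic lives in C inside startScanKey. Entries are sorted by estimated posting-list size, and the shortest prefix of rare entries whose absence forces a FALSE result becomes the "required" set; the remaining entries only need to be probed at candidate items.

```python
# Hypothetical three-valued constants; the thread's GIN_FALSE/GIN_TRUE/GIN_MAYBE
# are C-level values, the numbers here are only for the sketch.
GIN_FALSE, GIN_TRUE, GIN_MAYBE = 0, 1, 2


def tri_and(checks):
    """Three-valued AND over all entries (stand-in for an opclass function)."""
    if any(c == GIN_FALSE for c in checks):
        return GIN_FALSE
    if all(c == GIN_TRUE for c in checks):
        return GIN_TRUE
    return GIN_MAYBE


def tri_or(checks):
    """Three-valued OR over all entries (stand-in for an opclass function)."""
    if any(c == GIN_TRUE for c in checks):
        return GIN_TRUE
    if all(c == GIN_FALSE for c in checks):
        return GIN_FALSE
    return GIN_MAYBE


def split_entries(est_sizes, tri_consistent):
    """Split entry indexes into (required, additional).

    Entries are tried rarest first. The required set is the shortest prefix
    such that, with those entries FALSE and every other entry unknown, the
    consistent function already answers FALSE.
    """
    order = sorted(range(len(est_sizes)), key=lambda i: est_sizes[i])
    for k in range(1, len(order) + 1):
        checks = [GIN_MAYBE] * len(order)
        for i in order[:k]:
            checks[i] = GIN_FALSE
        if tri_consistent(checks) == GIN_FALSE:
            return order[:k], order[k:]
    return order, []


# Assumed posting-list size estimates for three query terms.
sizes = [50000, 3, 1200]
and_required, and_additional = split_entries(sizes, tri_and)
or_required, or_additional = split_entries(sizes, tri_or)
```

With an AND query only the rarest entry ends up required, so the scan can be driven by its short posting list; with an OR query every entry is required and nothing can be skipped.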

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-06 Thread Alexander Korotkov
On Thu, Feb 6, 2014 at 2:21 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/05/2014 12:42 PM, Alexander Korotkov wrote: Attached patch is light version of fast scan. It does extra consistent function calls only on startScanKey, no extra calls during scan of the index. It finds

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-05 Thread Alexander Korotkov
On Wed, Feb 5, 2014 at 1:23 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Mon, Feb 3, 2014 at 6:31 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Mon, Jan 27, 2014 at 7:30 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Mon, Jan 27, 2014 at 2:32 PM, Alexander Korotkov

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-04 Thread Alexander Korotkov
On Mon, Feb 3, 2014 at 6:31 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Mon, Jan 27, 2014 at 7:30 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Mon, Jan 27, 2014 at 2:32 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Sun, Jan 26, 2014 at 8:14 PM, Heikki Linnakangas

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-03 Thread Tomas Vondra
Hi Oleg, On 3 February 2014, 7:53, Oleg Bartunov wrote: Tomasa, it'd be nice if you use real data in your testing. I'm using a mailing list archive (the benchmark is essentially a simple search engine on top of the archive, implemented using built-in full-text). So I think this is a quite 'real'

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-03 Thread Jesper Krogh
On 03/02/14 02:44, Tomas Vondra wrote: (2) The question is whether the new patch works fine on rare words. See this for comparison of the patches against HEAD: http://www.fuzzy.cz/tmp/gin/3-rare-words.png http://www.fuzzy.cz/tmp/gin/3-rare-words-new.png and this is the

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-03 Thread Alexander Korotkov
On Mon, Jan 27, 2014 at 7:30 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Mon, Jan 27, 2014 at 2:32 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Sun, Jan 26, 2014 at 8:14 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01/26/2014 08:24 AM, Tomas Vondra wrote:

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-03 Thread Tomas Vondra
Hi Alexander, On 3 February 2014, 15:31, Alexander Korotkov wrote: I found my patch 0005-Ternary-consistent-implementation.patch to be completely wrong. It introduces ternary consistent function to opclass, but doesn't use it, because I forgot to include ginlogic.c change into patch. So, it

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-03 Thread Alexander Korotkov
On Thu, Jan 30, 2014 at 8:38 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01/30/2014 01:53 AM, Tomas Vondra wrote: (3) A file with explain plans for 4 queries suffering ~2x slowdown, and explain plans with 9.4 master and Heikki's patches is available here:

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-03 Thread Alexander Korotkov
On Mon, Feb 3, 2014 at 7:24 PM, Tomas Vondra t...@fuzzy.cz wrote: On 3 February 2014, 15:31, Alexander Korotkov wrote: I found my patch 0005-Ternary-consistent-implementation.patch to be completely wrong. It introduces ternary consistent function to opclass, but doesn't use it, because I

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-03 Thread Tomas Vondra
On 3 February 2014, 17:08, Alexander Korotkov wrote: On Mon, Feb 3, 2014 at 7:24 PM, Tomas Vondra t...@fuzzy.cz wrote: On 3 February 2014, 15:31, Alexander Korotkov wrote: I found my patch 0005-Ternary-consistent-implementation.patch to be completely wrong. It introduces ternary consistent

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-03 Thread Alexander Korotkov
On Mon, Feb 3, 2014 at 8:19 PM, Tomas Vondra t...@fuzzy.cz wrote: Sometimes test cases are not what we expect. For example: =# explain SELECT id FROM messages WHERE body_tsvector @@ to_tsquery('english','(5alpha1-initdb''d)'); QUERY

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-03 Thread Tomas Vondra
On 3 February 2014, 19:18, Alexander Korotkov wrote: On Mon, Feb 3, 2014 at 8:19 PM, Tomas Vondra t...@fuzzy.cz wrote: Sometimes test cases are not what we expect. For example: =# explain SELECT id FROM messages WHERE body_tsvector @@ to_tsquery('english','(5alpha1-initdb''d)');

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-02 Thread Heikki Linnakangas
On 01/30/2014 01:53 AM, Tomas Vondra wrote: (3) A file with explain plans for 4 queries suffering ~2x slowdown, and explain plans with 9.4 master and Heikki's patches is available here: http://www.fuzzy.cz/tmp/gin/queries.txt All the queries have 6 common words, and the

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-02 Thread Tomas Vondra
On 2.2.2014 11:45, Heikki Linnakangas wrote: On 01/30/2014 01:53 AM, Tomas Vondra wrote: (3) A file with explain plans for 4 queries suffering ~2x slowdown, and explain plans with 9.4 master and Heikki's patches is available here: http://www.fuzzy.cz/tmp/gin/queries.txt

Re: [HACKERS] GIN improvements part2: fast scan

2014-02-02 Thread Oleg Bartunov
Tomasa, it'd be nice if you use real data in your testing. One very good application of gin fast-scan is dramatic performance improvement of hstore/jsonb @> operator, see slides 57, 58 http://www.sai.msu.su/~megera/postgres/talks/hstore-dublin-2013.pdf. I'd like not to lose this benefit :) Oleg

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-30 Thread Heikki Linnakangas
On 01/30/2014 01:53 AM, Tomas Vondra wrote: (3) A file with explain plans for 4 queries suffering ~2x slowdown, and explain plans with 9.4 master and Heikki's patches is available here: http://www.fuzzy.cz/tmp/gin/queries.txt All the queries have 6 common words, and the

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-29 Thread Tomas Vondra
On 28.1.2014 08:29, Heikki Linnakangas wrote: On 01/28/2014 05:54 AM, Tomas Vondra wrote: Then I ran those scripts on: * 9.3 * 9.4 with Heikki's patches (9.4-heikki) * 9.4 with Heikki's and first patch (9.4-alex-1) * 9.4 with Heikki's and both patches (9.4-alex-2) It would be

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-27 Thread Alexander Korotkov
On Sun, Jan 26, 2014 at 8:14 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01/26/2014 08:24 AM, Tomas Vondra wrote: Hi! On 25.1.2014 22:21, Heikki Linnakangas wrote: Attached is a new version of the patch set, with those bugs fixed. I've done a bunch of tests with all the 4

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-27 Thread Alexander Korotkov
On Sun, Jan 26, 2014 at 8:14 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: In addition to that, I'm using the ternary consistent function to check if minItem is a match, even if we haven't loaded all the entries yet. That's less important, but I think for something like rare1 | (rare2

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-27 Thread Heikki Linnakangas
On 01/28/2014 05:54 AM, Tomas Vondra wrote: Then I ran those scripts on: * 9.3 * 9.4 with Heikki's patches (9.4-heikki) * 9.4 with Heikki's and first patch (9.4-alex-1) * 9.4 with Heikki's and both patches (9.4-alex-2) It would be good to also test with unpatched 9.4 (ie. git

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-26 Thread Andres Freund
On 2014-01-26 07:24:58 +0100, Tomas Vondra wrote: Not sure how to interpret that, though. For example where did the ginCompareItemPointers go? I suspect it's thanks to inlining, and that it might be related to the performance decrease. Or maybe not. Try recompiling with

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-26 Thread Heikki Linnakangas
On 01/26/2014 08:24 AM, Tomas Vondra wrote: Hi! On 25.1.2014 22:21, Heikki Linnakangas wrote: Attached is a new version of the patch set, with those bugs fixed. I've done a bunch of tests with all the 4 patches applied, and it seems to work now. I've done tests with various conditions

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-26 Thread Tomas Vondra
On 26.1.2014 17:14, Heikki Linnakangas wrote: I would actually expect it to be fairly effective for that query, so that's a bit surprising. I added counters to see where the calls are coming from, and it seems that about 80% of the calls are actually coming from this little the feature I

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-25 Thread Tomas Vondra
On 23.1.2014 17:22, Heikki Linnakangas wrote: I measured the time that query takes, and the number of pages hit, using explain (analyze, buffers true)
patches      time (ms)  buffers
---
unpatched    650        1316
patch 1      0.52       1316
patches 1+2  0.50       1316

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-25 Thread Tomas Vondra
Hi! On 25.1.2014 22:21, Heikki Linnakangas wrote: Attached is a new version of the patch set, with those bugs fixed. I've done a bunch of tests with all the 4 patches applied, and it seems to work now. I've done tests with various conditions (AND/OR, number of words, number of conditions) and I

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-24 Thread Alexander Korotkov
On Thu, Jan 23, 2014 at 8:22 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01/14/2014 05:35 PM, Alexander Korotkov wrote: Attached version is rebased against last version of packed posting lists. Thanks! I think we're missing a trick with multi-key queries. We know that when

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-24 Thread Heikki Linnakangas
On 01/24/2014 01:58 PM, Alexander Korotkov wrote: On Thu, Jan 23, 2014 at 8:22 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: In summary, these are fairly small patches, and useful on their own, so I think these should be committed now. But please take a look and see if the logic in

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-23 Thread Heikki Linnakangas
On 01/14/2014 05:35 PM, Alexander Korotkov wrote: Attached version is rebased against last version of packed posting lists. Thanks! I think we're missing a trick with multi-key queries. We know that when multiple scan keys are used, they are ANDed together, so we can do the skip

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-23 Thread Tomas Vondra
On 23.1.2014 17:22, Heikki Linnakangas wrote: On 01/14/2014 05:35 PM, Alexander Korotkov wrote: Attached version is rebased against last version of packed posting lists. Thanks! I think we're missing a trick with multi-key queries. We know that when multiple scan keys are used, they are

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-23 Thread Alexander Korotkov
On Fri, Jan 24, 2014 at 6:48 AM, Tomas Vondra t...@fuzzy.cz wrote: I plan to do more thorough testing over the weekend, but I'd like to make sure I understand what to expect. My understanding is that this patch should: - give the same results as the current code (e.g. the fulltext should

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-14 Thread Alexander Korotkov
On Thu, Nov 21, 2013 at 12:14 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Wed, Nov 20, 2013 at 3:06 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Nov 15, 2013 at 11:19 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Nov 15, 2013 at 12:34 AM, Heikki

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-14 Thread Heikki Linnakangas
On 01/14/2014 05:35 PM, Alexander Korotkov wrote: On Thu, Nov 21, 2013 at 12:14 AM, Alexander Korotkov aekorot...@gmail.com wrote: Revised version of patch is attached. Changes are so: 1) Support for GinFuzzySearchLimit. 2) Some documentation. Question about GinFuzzySearchLimit is still

Re: [HACKERS] GIN improvements part2: fast scan

2014-01-14 Thread Alexander Korotkov
On Tue, Jan 14, 2014 at 11:07 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01/14/2014 05:35 PM, Alexander Korotkov wrote: On Thu, Nov 21, 2013 at 12:14 AM, Alexander Korotkov aekorot...@gmail.com wrote: Revised version of patch is attached. Changes are so: 1) Support for

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-20 Thread Alexander Korotkov
On Wed, Nov 20, 2013 at 3:06 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Nov 15, 2013 at 11:19 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Nov 15, 2013 at 12:34 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 14.11.2013 19:26, Alexander Korotkov

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-19 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 11:19 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Nov 15, 2013 at 12:34 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 14.11.2013 19:26, Alexander Korotkov wrote: On Sun, Jun 30, 2013 at 3:00 PM, Heikki Linnakangas hlinnakan...@vmware.com

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-18 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 11:42 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Nov 15, 2013 at 11:39 PM, Rod Taylor rod.tay...@gmail.com wrote: On Fri, Nov 15, 2013 at 2:26 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-18 Thread Rod Taylor
On Fri, Nov 15, 2013 at 2:42 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Nov 15, 2013 at 11:39 PM, Rod Taylor rod.tay...@gmail.com wrote: The patched index is 58% of the 9.4 master size. 212 MB instead of 365 MB. Good. That's meet my expectations :) You mention that both

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-18 Thread Rod Taylor
I checked out master and put together a test case using a small percentage of production data for a known problem we have with Pg 9.2 and text search scans. A small percentage in this case means 10 million records randomly selected; has a few billion records. Tests ran for master successfully

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-18 Thread Rod Taylor
I tried again this morning using gin-packed-postinglists-16.patch and gin-fast-scan.6.patch. No crashes. It is about a 0.1% random sample of production data (10,000,000 records) with the below structure. Pg was compiled with debug enabled in both cases. Table public.kp Column | Type |

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
I tried again this morning using gin-packed-postinglists-16.patch and gin-fast-scan.6.patch. No crashes during index building. Pg was compiled with debug enabled in both cases. The data is a ~0.1% random sample of production data (10,000,000 records for the test) with the below structure.

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 6:57 PM, Rod Taylor p...@rbt.ca wrote: I tried again this morning using gin-packed-postinglists-16.patch and gin-fast-scan.6.patch. No crashes. It is about a 0.1% random sample of production data (10,000,000 records) with the below structure. Pg was compiled with

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
2%. It's essentially sentence fragments from 1 to 5 words in length. I wasn't expecting it to be much smaller. 10 recent value selections: white vinegar reduce color running vinegar cure uti cane vinegar acidity depends parameter how remedy fir clogged shower use vinegar sensitive skin

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor rod.tay...@gmail.com wrote: 2%. It's essentially sentence fragments from 1 to 5 words in length. I wasn't expecting it to be much smaller. 10 recent value selections: white vinegar reduce color running vinegar cure uti cane vinegar acidity

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
On Fri, Nov 15, 2013 at 2:26 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor rod.tay...@gmail.com wrote: 2%. It's essentially sentence fragments from 1 to 5 words in length. I wasn't expecting it to be much smaller. 10 recent value

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 11:39 PM, Rod Taylor rod.tay...@gmail.com wrote: On Fri, Nov 15, 2013 at 2:26 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor rod.tay...@gmail.com wrote: 2%. It's essentially sentence fragments from 1 to 5 words in

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 12:26 PM, Alexander Korotkov wrote: Revised version of patch is attached. This doesn't build: ginget.c: In function ‘scanPage’: ginget.c:1108:2: warning: implicit declaration of function ‘GinDataLeafPageGetPostingListEnd’ [-Wimplicit-function-declaration] ginget.c:1108:9: warning:

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Sat, Nov 16, 2013 at 12:10 AM, Peter Eisentraut pete...@gmx.net wrote: On 11/14/13, 12:26 PM, Alexander Korotkov wrote: Revised version of patch is attached. This doesn't build: ginget.c: In function ‘scanPage’: ginget.c:1108:2: warning: implicit declaration of function

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-14 Thread Alexander Korotkov
On Sun, Jun 30, 2013 at 3:00 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 28.06.2013 22:31, Alexander Korotkov wrote: Now, I got the point of three state consistent: we can keep only one consistent in opclasses that support new interface. exact true and exact false values will

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-14 Thread Heikki Linnakangas
On 14.11.2013 19:26, Alexander Korotkov wrote: On Sun, Jun 30, 2013 at 3:00 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 28.06.2013 22:31, Alexander Korotkov wrote: Now, I got the point of three state consistent: we can keep only one consistent in opclasses that support new

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-14 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 3:25 AM, Rod Taylor r...@simple-knowledge.com wrote: I checked out master and put together a test case using a small percentage of production data for a known problem we have with Pg 9.2 and text search scans. A small percentage in this case means 10 million records

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-14 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 12:34 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 14.11.2013 19:26, Alexander Korotkov wrote: On Sun, Jun 30, 2013 at 3:00 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 28.06.2013 22:31, Alexander Korotkov wrote: Now, I got the point

Re: [HACKERS] GIN improvements part2: fast scan

2013-07-06 Thread Tomas Vondra
Hi, this is a follow-up to the message I posted to the thread about additional info in GIN. I've applied both ginaddinfo.7.patch and gin_fast_scan.4.patch on commit b8fd1a09, but I'm observing a lot of failures like this: STATEMENT: SELECT id FROM messages WHERE body_tsvector @@

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-30 Thread Heikki Linnakangas
On 28.06.2013 22:31, Alexander Korotkov wrote: Now, I got the point of three state consistent: we can keep only one consistent in opclasses that support new interface. exact true and exact false values will be passed in the case of current patch consistent; exact false and unknown will be passed
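
To make the three-state consistent idea in the snippet above concrete, here is a minimal Python sketch of one way a ternary answer can be derived from an ordinary boolean consistent function: try every true/false assignment of the unknown inputs and see whether the results agree. The function name `shim_tri_consistent`, the constants, and the example query are assumptions made for illustration; this is not presented as the code from the patch.

```python
from itertools import product

# Hypothetical three-valued constants used only in this sketch.
GIN_FALSE, GIN_TRUE, GIN_MAYBE = 0, 1, 2


def shim_tri_consistent(bool_consistent, checks):
    """Derive a ternary result from a boolean consistent function.

    `checks` holds GIN_FALSE / GIN_TRUE / GIN_MAYBE per search term.
    The cost grows as 2**(number of MAYBE inputs), so this only makes
    sense when few inputs are unknown.
    """
    maybe_idx = [i for i, c in enumerate(checks) if c == GIN_MAYBE]
    results = set()
    for combo in product([False, True], repeat=len(maybe_idx)):
        # Start from the known inputs, then fill in one guess per unknown.
        trial = [c == GIN_TRUE for c in checks]
        for i, value in zip(maybe_idx, combo):
            trial[i] = value
        results.add(bool(bool_consistent(trial)))
    if results == {True}:
        return GIN_TRUE
    if results == {False}:
        return GIN_FALSE
    return GIN_MAYBE


def example_query(check):
    """Boolean consistent function for the made-up query: a AND (b OR c)."""
    return check[0] and (check[1] or check[2])


r_false = shim_tri_consistent(example_query, [GIN_FALSE, GIN_MAYBE, GIN_MAYBE])
r_true = shim_tri_consistent(example_query, [GIN_TRUE, GIN_MAYBE, GIN_TRUE])
r_maybe = shim_tri_consistent(example_query, [GIN_TRUE, GIN_MAYBE, GIN_FALSE])
```

An opclass that implements the ternary function natively can avoid the exponential enumeration, which is presumably part of why the thread discusses adding it per opclass.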

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-28 Thread Alexander Korotkov
On Tue, Jun 25, 2013 at 2:20 AM, Alexander Korotkov aekorot...@gmail.com wrote: 4. If we do go with a new function, I'd like to just call it consistent (or consistent2 or something, to keep it separate from the old consistent function), and pass it a tri-state input for each search term. It

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-24 Thread Alexander Korotkov
On Fri, Jun 21, 2013 at 11:43 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 19.06.2013 11:56, Alexander Korotkov wrote: On Wed, Jun 19, 2013 at 12:49 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 19.06.2013 11:30, Alexander Korotkov wrote: On Wed, Jun 19, 2013 at

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-21 Thread Heikki Linnakangas
On 19.06.2013 11:56, Alexander Korotkov wrote: On Wed, Jun 19, 2013 at 12:49 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 19.06.2013 11:30, Alexander Korotkov wrote: On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 18.06.2013 23:59,

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-19 Thread Heikki Linnakangas
On 18.06.2013 23:59, Alexander Korotkov wrote: I would like to illustrate that on example. Imagine you have fulltext query rare_term & frequent_term. Frequent term has large posting tree while rare term has only small posting list containing iptr1, iptr2 and iptr3. At first we get iptr1 from
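
The example in the snippet above (a small posting list for the rare term, a large posting tree for the frequent term) can be sketched as follows. This is a simplified Python illustration that assumes both terms must match and that item pointers are sortable (block, offset) pairs held in plain sorted lists; the function name and the data are made up, and a real posting tree would be descended page by page rather than binary-searched in memory.

```python
from bisect import bisect_left


def intersect_with_skip(rare, frequent):
    """Return (matches, probes) for two sorted lists of item pointers.

    For each item of the short list, jump directly to the first item of the
    long list that is >= it, instead of stepping through the long list.
    """
    matches = []
    probes = 0
    pos = 0
    for item in rare:
        # Skip everything in `frequent` that sorts before `item`.
        pos = bisect_left(frequent, item, pos)
        probes += 1
        if pos == len(frequent):
            break  # the long list is exhausted, no further matches possible
        if frequent[pos] == item:
            matches.append(item)
    return matches, probes


# Three pointers for the rare term; 10,000 for the frequent one.
rare = [(1, 5), (40, 20), (900, 7)]
frequent = [(blk, off) for blk in range(1000) for off in range(1, 11)]
matches, probes = intersect_with_skip(rare, frequent)
```

Here the long list is probed three times rather than walked item by item, which is the effect the fast scan technique is after for queries that combine rare and frequent terms.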

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-19 Thread Alexander Korotkov
On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 18.06.2013 23:59, Alexander Korotkov wrote: I would like to illustrate that on example. Imagine you have fulltext query rare_term & frequent_term. Frequent term has large posting tree while rare term has

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-19 Thread Alexander Korotkov
On Wed, Jun 19, 2013 at 12:30 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 18.06.2013 23:59, Alexander Korotkov wrote: I would like to illustrate that on example. Imagine you have fulltext query

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-19 Thread Heikki Linnakangas
On 19.06.2013 11:30, Alexander Korotkov wrote: On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 18.06.2013 23:59, Alexander Korotkov wrote: I would like to illustrate that on example. Imagine you have fulltext query rare_term & frequent_term. Frequent

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-19 Thread Alexander Korotkov
On Wed, Jun 19, 2013 at 12:49 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 19.06.2013 11:30, Alexander Korotkov wrote: On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 18.06.2013 23:59, Alexander Korotkov wrote: I would like to

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-18 Thread Alexander Korotkov
On Mon, Jun 17, 2013 at 5:09 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 17.06.2013 15:55, Alexander Korotkov wrote: On Sat, Jun 15, 2013 at 2:55 AM, Alexander Korotkov aekorot...@gmail.com wrote: attached patch implementing fast scan technique for GIN. This is second patch

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-17 Thread Alexander Korotkov
On Sat, Jun 15, 2013 at 2:55 AM, Alexander Korotkov aekorot...@gmail.com wrote: attached patch implementing fast scan technique for GIN. This is second patch of GIN improvements, see the 1st one here:

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-17 Thread Heikki Linnakangas
On 17.06.2013 15:55, Alexander Korotkov wrote: On Sat, Jun 15, 2013 at 2:55 AM, Alexander Korotkov aekorot...@gmail.com wrote: attached patch implementing fast scan technique for GIN. This is second patch of GIN improvements, see the 1st one here: