Re: I: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-07-06 Thread Leonardo F
> I reviewed > your patch. It seems to be in good shape, and worked as > expected. I > suppressed a compiler warning in the patch and cleaned up > whitespaces in it. > Patch attached. Thanks for the review! I saw that you also changed the writing: LogicalTapeWrite(state->tapeset, tapenum,

Re: [HACKERS] bitmap indexes - performance

2010-07-01 Thread Leonardo F
> In > principle a bitmap index scan should be significantly faster if the > index can > return the bitmap more or less "natively" rather than having > to construct > it. The problem I'm seeing is that even on a 20M rows table, doing a select * from t where c1=10 and c2=1 where c1 and c2 ar

[HACKERS] bitmap indexes - performance

2010-07-01 Thread Leonardo F
Using as a starting point the old bitmap patch in: http://archives.postgresql.org/message-id/20081101000154.go27...@fune I re-applied and re-worked the patch to see what kind of improvements over btrees bitmaps actually provided. Using a 20M rows table of 10/100/1000 random values, I've found t

[HACKERS] access method: are disk pages mandatory?

2010-06-23 Thread Leonardo F
in bufpage.h: "all blocks written out by an access method must be disk pages" but in http://www.postgresql.org/docs/8.4/static/storage-page-layout.html "Actually, index access methods need not use this page format. All the existing index methods do use this basic format, but the data kept on i

Re: [HACKERS] Error with GIT Repository

2010-06-11 Thread Leonardo F
> Why are you cloning over http? Me too I've used http, since I'm behind a proxy and I couldn't find a "simple" way of having the git:// method working behind a proxy... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.pos

Re: [HACKERS] Git: Unable to get pack file

2010-06-09 Thread Leonardo F
> I've re-run git repack on > it, please try again. At least that file is > accessible from here > now.. It worked, thank you very much -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hack

[HACKERS] Git: Unable to get pack file

2010-06-08 Thread Leonardo F
Hi, I tried getting the source using: git clone http://git.postgresql.org/git/postgresql.git postgresql-git but after a while (252MB) I always get: [...] Getting pack 61e1395a5bdacda95de5432123a0f8124fff05e6 which contains 476418893d3a2f366f47dbe4ce6d7522cc427545 error: Unable to get pack file

Re: [HACKERS] [FWD] About "Our CLUSTER implementation is pessimal" patch

2010-02-15 Thread Leonardo F
> Yes. There's not going to be any more commitfests for this release, so > the next commitfest is for 9.1. Perfect! Where could I find such information? I mean: how could I know it? > (don't worry about the lack of enthusiasm for the patch, people are just > very busy with 9.0 and don't have the

Re: [HACKERS] [FWD] About "Our CLUSTER implementation is pessimal" patch

2010-02-15 Thread Leonardo F
> As outlined in the "Submission timing" section, you're > asking about something during the wrong time to be doing so--that's why > you're > not getting any real feedback. Add your patch to the next CommitFest by > linking > to your message at https://commitfest.postgresql.org/ Ok! But th

[HACKERS] [FWD] About "Our CLUSTER implementation is pessimal" patch

2010-02-15 Thread Leonardo F
I really thought this would have caused some interest, since - this item is in the TODO list - the improvement for CLUSTER in some scenarios is 800%, and maybe more (if I didn't do anything wrong, of course...) Could at least the message: http://archives.postgresql.org/pgsql-hackers/2010-02/msg00

Re: I: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-02-10 Thread Leonardo F
>Perhaps you could supply a .sql file containing a testcase > illustrating the performance benefits you tested with your patch Sure. Attached the updated patch (should solve a bug) and a script. The sql scripts generates a 2M rows table ("orig"); then the table is copied and the copy clustered

Re: I: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-02-10 Thread Leonardo F
I think I've found the problem: tuple->t_data wasn't at HEAPTUPLESIZE distance from tuple. I guess the code makes that assumption somewhere, so I had to do: tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE); Now that

Re: I: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-02-10 Thread Leonardo F
> I think you're confusing HeapTuple and HeapTupleHeader. SortTuple->tuple > field should point to a HeapTupleHeader, not a HeapTuple. Mmh, I don't get it: that would mean I might be using more info than required, but I still don't understand why it's not working... at the end, CLUSTER is going

Re: I: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-02-10 Thread Leonardo F
Hi all, while testing the seq scan + sort CLUSTER implementation, I've found a bug in write/readtup_rawheap. The functions are needed by the sort part. The code in http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php didn't copy the whole tuple, but only the HeapTuple "header": t

I: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-02-09 Thread Leonardo F
Not even a comment? As I said, performance results on my system were very good > I know you're all very busy getting 9.0 out, but I think the results in > heap scanning + sort instead of index scanning for CLUSTER are > very good... I would like to know if I did something wrong/I should > impr

[HACKERS] IndexBuildHeapScan and RIDs order

2010-02-08 Thread Leonardo F
Hi, I was looking at the code for bitmap index: http://archives.postgresql.org/pgsql-hackers/2008-10/msg01691.php and I couldn't understand why during "bmbuild" (the ambuild call for bitmap indexes) the code checks for not-ordered ItemPointerData(s). In which cases the ItemPointerData(s) given

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-02-01 Thread Leonardo F
I know you're all very busy getting 9.0 out, but I think the results in heap scanning + sort instead of index scanning for CLUSTER are very good... I would like to know if I did something wrong/I should improve something in the patch... I haven't tested it with index expressions yet (but the tests

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-28 Thread Leonardo F
Hi all, attached a patch to do seq scan + sorting instead of index scan on CLUSTER (when that's supposed to be faster). As I've already said, the patch is based on: http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php Of course, the code isn't supposed to be ready to be merged: I

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-27 Thread Leonardo F
> Consider multi-column indexes, ie: > CREATE INDEX i_foo ON foo (length(a), length(b)); Ok, I've never thought of expression indexes that way (in the (expr1,expr2,exprN) form): that is a good example. > Maybe you're confusing expression indexes with partial indexes? No no, that was exactly w

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-26 Thread Leonardo F
> If we ever get another index type that supports ordered > scans, it'll be time enough to worry about cases like this. Ok > BTW, I think you could use tuplesort_begin_index_btree() rather than > touching _bt_mkscankey_nodata directly. well I created my own tuplesort_begin_rawheap method (co

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-25 Thread Leonardo F
> Rule it out. Note you should be looking at pg_am.amcanorder, not > hardwiring knowledge of particular index types. Sorry, I replied "ok" too fast... I can look at pg_am.amcanorder, but I would still need the ScanKey to be used by tuplesort; and I can't find any other way of doing it than ca

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-22 Thread Leonardo F
> Note you should be looking at pg_am.amcanorder, not > hardwiring knowledge of particular index types. Ok. Would it make sense to use FormIndexDatum to get the index value to be used by tuplesort? I'm having trouble avoiding the call to _bt_mkscankey_nodata to get the scanKeys... that relat

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-22 Thread Leonardo F
So, if I'm not mistaken: hash indexes -> can't be used in CLUSTER gin indexes -> can't be used in CLUSTER that leaves: btree -> ok expression btree -> I have to find a way to compute the expression for each tuple: hints? gist -> how can I get something "comparable" by tuplesort? Or should I r

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-22 Thread Leonardo F
> > So my proposal would be: do the test seq_scan vs sort/index_scan only for > > regular btree index, and integrate that test in the planner. > > Keep in mind that this patch was after the deadline for 9.0, so there > is probably not a huge rush to get this done. That's true; I'll try to get th

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-21 Thread Leonardo F
> Well, the expression cases would be more likely to cost more if > implemented as a sort, but that doesn't mean that a sort couldn't be a > win. Besides, even if you blow off the expression case, what about > nulls first/last, nondefault opclasses, etc? Ok, let's split the problem in 2 parts:

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-21 Thread Leonardo F
> > I meant to add only ASC/DESC; I would leave all other cases > > (non-btrees, custom expression btrees) to use the old index-scan method. > > That hardly seems acceptable. Well I brought up that in an earlier post: http://old.nabble.com/Re%3A-About-%22Our-CLUSTER-implementation-is-pessimal%2

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-21 Thread Leonardo F
> By the time you make this actually work in all cases, it's probably > going to be more of a mess than the other way; I meant to add only ASC/DESC; I would leave all other cases (non-btrees, custom expression btrees) to use the old index-scan method. > not to mention that it > doesn't work *at

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-21 Thread Leonardo F
>one idea could be to actually prepare a query using SPI for "select * from >table order by " and then peek inside > to see which plan was generated. I like that!!! Here's a first attempt, it looks like it's working... (I still have to skip non-btree indexes and expression indexes, plus add a AS

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-21 Thread Leonardo F
> * Do we need to disable sort-path for tables clustered on a gist index? Yes; as I said in a previous mail, only plain btree indexes (that is, not custom expression indexes) would have that option (at least in a first version...) > * I'd prefer to separate cost calculation routines from create_i

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-21 Thread Leonardo F
Anyone? I'd like some feedback before moving on to do the seq scan + sort in those CLUSTER cases where "use_index_scan" returns false... - Messaggio originale - > Da: Leonardo F > A: pgsql-hackers@postgresql.org > Inviato: Mer 20 gennaio 2010, 18:48:00 > O

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-20 Thread Leonardo F
> I read the thread "Our CLUSTER implementation is pessimal" > http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php . > > I would like to try/integrate that patch as we use CLUSTER a lot on our > system. > > I was going to try to add the proper cost_index/cost_sort calls to decide

Re: [HACKERS] Review: Patch: Allow substring/replace() to get/set bit values

2010-01-20 Thread Leonardo F
> All issues addressed, with one tiny nit-pick -- the get_bit and > set_bit methods are not part of the SQL standard. Damn! I completely forgot to mention that I had no idea if what I wrote in the docs made any sense... Well thank you for your thorough review. -- Sent via pgsql-hackers mai

Re: [HACKERS] Review: Patch: Allow substring/replace() to get/set bit values

2010-01-20 Thread Leonardo F
New version of the patch, let me know if I can fix/change something else. Leonardo getsetbit.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Review: Patch: Allow substring/replace() to get/set bit values

2010-01-19 Thread Leonardo F
> In the > documentation, the get_bit and set_bit methods are added to a list > where we state "The following SQL-standard functions work on bit > strings as well as character strings"; however they are not > SQL-standard functions and are implemented on binary strings, not > character strings. O

Re: [HACKERS] Review: Patch: Allow substring/replace() to get/set bit values

2010-01-18 Thread Leonardo F
> This patch no longer applies. Could you rebase it? Done (I think). Added a couple of simple tests for bit overlay. I didn't include the catversion.h changes: obviously the CATALOG_VERSION_NO has to be changed. Leonardo Index: src/backend/utils/adt/varbit.c =

Re: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-15 Thread Leonardo F
> Yeah, I think you could do that, I agree it feels better that way. > You'll still need new copytup and comparetup functions, though, to deal > with HeapTupleHeaders instead of MinimalTuples, or modify the existing > ones to handle both. You meant HeapTuple, not HeapTupleHeaders, right? Mmh, di

[HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-01-15 Thread Leonardo F
Hi, I read the thread "Our CLUSTER implementation is pessimal" http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php . I would like to try/integrate that patch as we use CLUSTER a lot on our system. I was going to try to add the proper cost_index/cost_sort calls to decide which "pa

Re: [HACKERS] Patch: Allow substring/replace() to get/set bit values

2010-01-08 Thread Leonardo F
> What we can do in the back branches is make the code treat any > negative value as meaning two-arg form. To throw an error we'd > need to refactor the pg_proc representation ... I was going to fix that myself, but I think it has just been done. How can I keep up with "who's doing what"? -

Re: [HACKERS] Patch: Allow substring/replace() to get/set bit values

2010-01-07 Thread Leonardo F
> Thanks! Please add your patch here: > > https://commitfest.postgresql.org/action/commitfest_view/open > Ok; but what about what I said about the difference between bit/string substring? That affects overlay behaviour for bit... I've even got "ERROR: invalid memory alloc request size 42

[HACKERS] Patch: Allow substring/replace() to get/set bit values

2010-01-07 Thread Leonardo F
Hi all, attached a patch that adds the following functions for bit string: - overlay - get_bit - set_bit Some info: 1) overlay is implemented as calls to substring; given the different way substring behaves when used with strings vs bit strings: test=# SELECT substring(B'0001' f

Re: I: [HACKERS] TODO: Allow substring/replace() to get/set bit values

2010-01-07 Thread Leonardo F
> > Is anybody interested? Otherwise the entry could be removed from the TODO > list... > > Even if not, you can still submit a patch. There are a lot more users > of PG than there are people who read -hackers. Ok, I'll try and submit a patch. Thank you very much. -- Sent via pgsql-hac

Re: I: [HACKERS] TODO: Allow substring/replace() to get/set bit values

2010-01-06 Thread Leonardo F
> > To sum up: > > > > 1) a new function, "get_bit", that calls substring > > 2) a new function, "overlay", that replaces bits (starting at a certain > position) > > 3) a new function, "set_bit", that calls overlay > > That seems reasonable to me. Not sure what others think. Is anybody interes

Re: I: [HACKERS] TODO: Allow substring/replace() to get/set bit values

2010-01-05 Thread Leonardo F
> As you say, there's really no point in changing the internal > representation, and if you don't find replace() useful either, then > why are you even working on this at all? I would like a get_bit / set_bit for bit strings, as I find them useful. get_bit could be a simple call to substring, bu

Re: I: [HACKERS] TODO: Allow substring/replace() to get/set bit values

2010-01-05 Thread Leonardo F
> You might want to search the archives (or the wiki history, or the CVS > history if it's been there since before we moved the TODO list to the > wiki) for discussion of why that item was added to the TODO in the > first place. I read the thread: http://archives.postgresql.org/pgsql-hackers/20

I: [HACKERS] TODO: Allow substring/replace() to get/set bit values

2010-01-04 Thread Leonardo F
Re-reading the docs it looks like the only thing missing is get/set_bit for bit string. Substring is already implemented for bit string, and I don't really know if replace is useful at all. (sorry if the other mail came with a different sender name) Leonardo > I would like to work on "Al