Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-30 Thread Fabien COELHO


[...] Why?  I don't know exactly why, but I am sure that pgbench isn't 
doing anything weird.  It's either libpq acting funny, or the OS.


My guess is the OS. PQfinish and select are system calls that present 
opportunities to switch context. I think that the OS is spending time on 
other processes on the same host, especially postgres backends, when it is 
not with the client. In order to test that, pgbench should run on a 
dedicated box with fewer threads than the number of available cores, or 
user time could be measured in addition to elapsed time. Also, testing 
with many clients per thread means that if any client is stuck all other 
clients incur an artificial latency: the measurements are intrinsically fragile.
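
As an illustration of the user-time measurement suggested above, a minimal
standalone sketch (not pgbench code) using getrusage(2); a large gap between
wall-clock time and consumed CPU time would point at the scheduler running
other processes:

#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

int main(void)
{
	struct timeval t0, t1;
	struct rusage ru;
	double wall, cpu;

	gettimeofday(&t0, NULL);
	/* ... run the client loop being measured here ... */
	gettimeofday(&t1, NULL);
	getrusage(RUSAGE_SELF, &ru);

	wall = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	cpu = ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6
		+ ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
	printf("wall %.3fs, user+sys %.3fs\n", wall, cpu);
	return 0;
}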


I need to catch up with revisions done to this feature since I started 
instrumenting my copy more heavily.  I hope I can get this ready for 
commit by Monday.


Ok, thanks!

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New regression test time

2013-06-30 Thread Fabien COELHO



If we had a different set of tests, that would be a valid argument.  But
we don't, so it's not.  And nobody has offered to write a feature to
split our tests either.


I have done a POC. See:

https://commitfest.postgresql.org/action/patch_view?id=1170

What I have not done is decide how to split the tests into two buckets.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor inheritance/check bug: Inconsistent behavior

2013-06-30 Thread Amit kapila

On Saturday, June 29, 2013 4:58 AM Bruce Momjian wrote:
On Sat, Jan 26, 2013 at 11:31:49AM +0530, Amit Kapila wrote:
 On Friday, January 25, 2013 8:36 PM Bruce Momjian wrote:
  On Fri, Sep 14, 2012 at 02:04:51PM +, Amit kapila wrote:
   On Thu, 6 Sep 2012 14:50:05 -0400 Robert Hass wrote:
  
   On Tue, Aug 28, 2012 at 6:40 AM, Amit Kapila
  amit(dot)kapila(at)huawei(dot)
   com wrote:
  AFAICT during Update also, it doesn't contain anything useful. The only
  chance it would have to contain something useful is when it goes for
  EvalPlanQual and then again comes back to check the constraints. However,
  these attributes get filled in heap_update much later.
    
  So now should the fix be that it returns an error for a system column
  reference, except for the OID case?
  
+1.
  
  
  
   1. I think in this scenario the error for a system column, except for
   tableOID, should be thrown at Create/Alter time.
  
   2. After setting the OID in ExecInsert/ExecUpdate, maybe the setting of
   the same inside the heap functions can be removed.
  
  But for now I have kept them as it is.
  
  
  
   Please find the Patch for bug-fix.
  
   If this is okay, I shall send you the test cases for same.
 
  Did we ever make any progress on this?

 I have done the initial analysis and prepared a patch; I don't know if
 there is anything more I can do until someone gives suggestions on how to
 proceed further with this bug.

So, I guess we never figured this out.

I can submit this bug fix for the next commitfest if there is no objection to 
doing so.
What is your opinion?


With Regards,
Amit Kapila.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Move unused buffers to freelist

2013-06-30 Thread Amit kapila
On Friday, June 28, 2013 6:20 PM Robert Haas wrote:
On Fri, Jun 28, 2013 at 12:52 AM, Amit Kapila amit.kap...@huawei.com wrote:
 Currently it wakes up based on the bgwriter_delay config parameter, which is
 200ms by default, so you mean we should think of waking up the bgwriter based
 on allocations and the number of elements left in the freelist?

 I think that's what Andres and I are proposing, yes.

 As per my understanding, a summary of the points raised by you and Andres
 which this patch should address to have a bigger win:

 1. Bgwriter needs to be improved so that it can help in reducing usage count
 and finding next victim buffer
(run the clock sweep and add buffers to the free list).

 Check.

 2. SetLatch for the bgwriter (wake up the bgwriter) when few elements are
 left in the freelist.

Check.  The way to do this is to keep a variable in shared memory in
the same cache line as the spinlock protecting the freelist, and
update it when you update the free list.
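
To illustrate the cache-line point, a rough standalone sketch (hypothetical
names and layout, not the actual patch or PostgreSQL's buffer manager
structures; a pthread spinlock stands in for the freelist spinlock): the
counter a backend tests lives next to the lock that protects the free list,
so checking it after an allocation touches no additional cache line:

#include <pthread.h>

typedef struct FreelistControl
{
	pthread_spinlock_t lock;	/* stands in for the freelist spinlock */
	int		first_free_buffer;	/* head of the free list */
	int		num_free_buffers;	/* updated whenever the list is updated */
	int		low_watermark;		/* wake the bgwriter below this */
} __attribute__((aligned(64))) FreelistControl;	/* kept in one cache line */

/* Called by a backend right after taking a buffer off the free list,
 * while it still holds (or has just released) the spinlock. */
static inline int
need_bgwriter_wakeup(const FreelistControl *ctl)
{
	return ctl->num_free_buffers < ctl->low_watermark;
}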

 3. Split the global lock (BufFreelistLock) taken in StrategyGetBuffer
(a spinlock for the freelist, and an LWLock for the clock sweep).

Check.

 4. Separate processes for writing dirty buffers and moving buffers to
 freelist

 I think this part might be best pushed to a separate patch, although I
 agree we probably need it.

 5. The bgwriter needs to be more aggressive; the logic by which it calculates
 how many buffers it needs to process needs to be improved.

 This is basically overlapping with points already made.  I suspect we
 could just get rid of bgwriter_delay, bgwriter_lru_maxpages, and
 bgwriter_lru_multiplier altogether.  The background writer would just
 have a high and a low watermark.  When the number of buffers on the
 freelist drops below the low watermark, the allocating backend sets
 the latch and bgwriter wakes up and begins adding buffers to the
 freelist.  When the number of buffers on the free list reaches the
 high watermark, the background writer goes back to sleep.  Some
 experimentation might be needed to figure out what values are
 appropriate for those watermarks.  In theory this could be a
 configuration knob, but I suspect it's better to just make the system
 tune it right automatically.
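
A rough sketch of the watermark behaviour described above (names, helper
functions and watermark values are hypothetical placeholders, not code from
any patch; the helpers are declared only so the sketch is a complete unit):

/* Hypothetical hooks standing in for buffer-manager internals. */
extern int	freelist_length(void);			/* buffers currently on the free list */
extern int	clock_sweep_next_victim(void);	/* runs the clock sweep, returns a buffer */
extern int	buffer_is_dirty(int buf);
extern void	flush_buffer(int buf);
extern void	freelist_push(int buf);
extern void	wait_for_wakeup_latch(void);	/* set by a backend below the low watermark */

#define LOW_WATERMARK	64		/* example values only; backends set the latch below this */
#define HIGH_WATERMARK	1024

void
bgwriter_refill_loop(void)
{
	for (;;)
	{
		wait_for_wakeup_latch();
		while (freelist_length() < HIGH_WATERMARK)
		{
			int		victim = clock_sweep_next_victim();

			if (buffer_is_dirty(victim))
				flush_buffer(victim);
			freelist_push(victim);
		}
		/* back to sleep until the list drops below LOW_WATERMARK again */
	}
}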

Do you think it will be sufficient to just wake the bgwriter when the number of 
buffers in the freelist drops below the low watermark? What about its current 
job of flushing dirty buffers?

I mean to ask: in a scenario where there are sufficient buffers in the 
freelist but most other buffers are dirty, is it okay to delay the flush 
until the number of free buffers falls below the low watermark?

 6. There can be contention around the buffer mapping locks, but we can focus
 on it later.
 7. Cacheline bouncing around the buffer header spinlocks -- is there anything
 we can do to reduce this?

 I think these are points that we should leave for the future.

with Regards,
Amit Kapila.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Move unused buffers to freelist

2013-06-30 Thread Amit kapila

On Friday, June 28, 2013 6:38 PM Robert Haas wrote:
On Fri, Jun 28, 2013 at 8:50 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jun 28, 2013 at 12:52 AM, Amit Kapila amit.kap...@huawei.com wrote:
 Currently it wakes up based on bgwriterdelay config parameter which is by
 default 200ms, so you means we should
 think of waking up bgwriter based on allocations and number of elements left
 in freelist?

 I think that's what Andres and I are proposing, yes.

 Incidentally, I'm going to mark this patch Returned with Feedback in
the CF application.  

Many thanks to you and Andres for providing valuable suggestions.

I think this line of inquiry has potential, but
clearly there's a lot more work to do here before we commit anything,
and I don't think that's going to happen in the next few weeks.  But
let's keep discussing.

Sure.

With Regards,
Amit Kapila.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New regression test time

2013-06-30 Thread Amit kapila

On Sunday, June 30, 2013 11:37 AM Fabien COELHO wrote:
 If we had a different set of tests, that would be a valid argument.  But
 we don't, so it's not.  And nobody has offered to write a feature to
 split our tests either.

I have done a POC. See:

 https://commitfest.postgresql.org/action/patch_view?id=1170

I think it is better to submit it for the next commitfest, which is at the below link:

https://commitfest.postgresql.org/action/commitfest_view?id=19


With Regards,
Amit Kapila.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-30 Thread Dean Rasheed
On 29 June 2013 17:30, Jeff Davis pg...@j-davis.com wrote:

 On Mon, 2013-06-24 at 18:01 +0100, Nicholas White wrote:
 Good catch - I've attached a patch to address your point 1. It now
 returns the below (i.e. correctly doesn't fill in the saved value if
 the index is out of the window. However, I'm not sure whether (e.g.)
 lead-2-ignore-nulls means count forwards two rows, and if that's null
 use the last one you've seen (the current implementation) or count
 forwards two non-null rows (as you suggest). The behaviour isn't
 specified in a (free) draft of the 2003 standard
 (http://www.wiscorp.com/sql_2003_standard.zip), and I don't have
 access to the (non-free) final version. Could someone who does have
 access to it clarify this? I've also added your example to the
 regression test cases.

 Reading a later version of the draft, it is specified, but is still
 slightly unclear.

 As I see it, the standard describes the behavior in terms of eliminating
 the NULL rows entirely before applying the offset. This matches Troels's
 interpretation. Are you aware of any implementations that do something
 different?

 I didn't include this functionality for the first / last value window
 functions as their implementation is currently a bit different; they
 just call WinGetFuncArgInFrame to pick out a single value. Making
 these functions respect nulls would involve changing the single lookup
 to a walk through the tuples to find the first non-null version, and
 keeping track of this index in a struct in the context. As this change
 is reasonably orthogonal I was going to submit it as a separate patch.

 Sounds good.


I took a quick look at this and I think there are still a few problems:

1). The ignore/respect nulls flag needs to be per-window-function
data, not a window frame option, because the same window may be shared
by multiple window function calls. For example, the following test
causes a crash:

SELECT val,
   lead(val, 2) IGNORE NULLS OVER w,
   lead(val, 2) RESPECT NULLS OVER w
  FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
WINDOW w as ();

The connection to the server was lost. Attempting reset: Failed.

2). As Troels Nielsen said up-thread, I think this should throw a
FEATURE_NOT_SUPPORTED error if it is used for window functions that
don't support it, rather than silently ignoring the flag.

3). Similarly, the parser accepts ignore/respect nulls for arbitrary
aggregate functions over a window, so maybe this should also throw a
FEATURE_NOT_SUPPORTED error. Alternatively, it might be trivial to
make all aggregate functions work with ignore nulls in a window
context, simply by using the existing code for strict aggregate
transition functions. That might be quite handy to support things like
array_agg(val) IGNORE NULLS OVER(...).

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2013-06-30 Thread Heikki Linnakangas

On 28.06.2013 22:31, Alexander Korotkov wrote:

Now I get the point of the three-state consistent: we can keep only one
consistent function in opclasses that support the new interface. Exact true and
exact false values will be passed in the case of the current patch's
consistent; exact false and unknown will be passed in the case of the current
patch's preConsistent. That's reasonable.
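
For reference, a rough illustration of what such a three-state interface could
look like (type and function names here are hypothetical sketches, not the
actual GIN API change under discussion):

typedef enum GinTernaryValue
{
	GIN_FALSE,		/* the key is definitely not present */
	GIN_TRUE,		/* the key is definitely present */
	GIN_MAYBE		/* not known yet (entry not checked) */
} GinTernaryValue;

/*
 * A single consistent function serves both roles: called with only
 * GIN_TRUE/GIN_FALSE entries it behaves like today's consistent; called
 * with GIN_MAYBE for not-yet-fetched entries it answers the question the
 * patch's preConsistent answers (can this item still possibly match?).
 */
typedef GinTernaryValue (*gin_tri_consistent_fn) (GinTernaryValue *check,
												  int nkeys,
												  void *extra_data);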


I'm going to mark this as returned with feedback. For the next 
version, I'd like to see the API changed per above. Also, I'd like us to 
do something about the tidbitmap overhead, as a separate patch before 
this, so that we can assess the actual benefit of this patch. And a new 
test case that demonstrates the I/O benefits.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part 3: ordering in index

2013-06-30 Thread Heikki Linnakangas

On 25.06.2013 21:18, Alexander Korotkov wrote:

On Tue, Jun 25, 2013 at 7:31 PM, Heikki Linnakangashlinnakan...@vmware.com

wrote:



In summary: The test case you presented as motivation for this patch is a
bit of a worst-case scenario for the current tidbitmap implementation. The
speedup from your patch comes from avoiding the tidbitmap. However, it
would be fairly easy to optimize the tidbitmap to handle this scenario
better, which would benefit all kinds of queries that use bitmap scans.
There is really no reason to complicate the GIN API for this. Let's just
optimize tidbitmap.

I'm not sure if I fullly understand your patch, though. Is there some
other test scenario where it performs significantly better, which can not
be attributed to a tidbitmap overhead? I'm assuming 'no' for now, and
marking this patch as rejected in the commitfest app, but feel free to
reopen if there is.


So, it's likely I've positioned this patch wrong from the beginning, because
my examples were focused on CPU time improvement. But the initial purpose of
this patch was to decrease I/O.


Ok. Storing the additional information bloats the index considerably, so 
it's clearly not going to be a win in all cases. So whether you store 
the additional information or not needs to be configurable somehow.


I'm marking this as returned with feedback, as we need new performance 
testing from the I/O point of view. The comparison should be with the base 
additional-information patch, or at least the part of it that packs 
the item pointers more tightly. Also, this depends on the additional 
information patch, so we need to get that committed before this one, and 
I just returned that patch.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] plpython implementation

2013-06-30 Thread Szymon Guz
I'm reading through the plperl and plpython implementations and I don't
understand the way they work.

The comments for plperl say that there are two interpreters (trusted and
untrusted) for each user session, and they are stored in a hash.

The plpython version looks quite different: there is no such global hash with
interpreters, there is just a pointer to an interpreter and one global
function, _PG_init, which runs once (but per session, per user, or what?).

I'm just wondering what a plpython implementation should look like. We need
another interpreter, but the _PG_init function is run once; should it then
create two interpreters on init, or should we let this function do nothing
and create the proper interpreter on the first call of a plpython(u) function
for the current session?

thanks,
Szymon


Re: [HACKERS] plpython implementation

2013-06-30 Thread Martijn van Oosterhout
On Sun, Jun 30, 2013 at 01:49:53PM +0200, Szymon Guz wrote:
 I'm reading through plperl and plpython implementations and I don't
 understand the way they work.
 
 Comments for plperl say that there are two interpreters (trusted and
 untrusted) for each user session, and they are stored in a hash.

The point is that python has no version for untrusted users, since it's
been accepted that there's no way to build a python sandbox for
untrusted code. There was actually a small competition to make one, but
it failed; since then they haven't bothered.

Perl does provide a sandbox, hence you can have two interpreters in a
single backend.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] plpython implementation

2013-06-30 Thread Andrew Dunstan


On 06/30/2013 07:49 AM, Szymon Guz wrote:
I'm reading through plperl and plpython implementations and I don't 
understand the way they work.


Comments for plperl say that there are two interpreters (trusted and 
untrusted) for each user session, and they are stored in a hash.


Plpython version looks quite different, there is no such global hash 
with interpreters, there is just a pointer to an interpreter and one 
global function _PG_init, which runs once (but per session, user, or 
what?).


I'm just wondering how a plpython implementation should look like. We 
need another interpreter, but PG_init function is run once, should it 
then create two interpreters on init, or should we let this function 
do nothing and create a proper interpreter in the first call of 
plpython(u) function for current session?






python does not have any sort of reliable sandbox, so there is no 
plpython, only plpythonu - hence only one interpreter per backend is needed.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpython implementation

2013-06-30 Thread Szymon Guz
On 30 June 2013 14:13, Andrew Dunstan and...@dunslane.net wrote:


 On 06/30/2013 07:49 AM, Szymon Guz wrote:

 I'm reading through plperl and plpython implementations and I don't
 understand the way they work.

 Comments for plperl say that there are two interpreters (trusted and
 untrusted) for each user session, and they are stored in a hash.

 Plpython version looks quite different, there is no such global hash with
 interpreters, there is just a pointer to an interpreter and one global
 function _PG_init, which runs once (but per session, user, or what?).

 I'm just wondering how a plpython implementation should look like. We
 need another interpreter, but PG_init function is run once, should it then
 create two interpreters on init, or should we let this function do nothing
 and create a proper interpreter in the first call of plpython(u) function
 for current session?




 python does not any any sort of reliable sandbox, so there is no plpython,
 only plpythonu - hence only one interpreter per backend is needed.


Is there any record of the discussion concluding that there is no way to make
the sandbox? I managed to create some kind of sandbox, a simple modification
which totally disables importing modules, so I'm just wondering why it
cannot be done.

Szymon


[HACKERS] Randomisation for ensuring nlogn complexity in quicksort

2013-06-30 Thread Atri Sharma
Hi all,

I have been reading the recent discussion and was researching a bit, and I 
think that we should really go with the idea of randomising the input data (if 
it is not completely presorted), to ensure that we do not get quadratic 
complexity.

One easy way to do that could be to take a sample of the data set and take a 
pivot out of it. A still better way could be to take multiple samples spread 
across the data set, select a value from each of them, and then take a 
cumulative pivot (the median, maybe).
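
A minimal sketch of that sampling idea (purely illustrative C with hypothetical
names; this is not tuplesort code):

#include <stdlib.h>

static int
cmp_int(const void *a, const void *b)
{
	int		ia = *(const int *) a;
	int		ib = *(const int *) b;

	return (ia > ib) - (ia < ib);
}

/*
 * Pick one random element from each of nsamples evenly spread slices of
 * data[0..n-1] and return the median of those picks as the pivot value.
 */
static int
choose_pivot(const int *data, size_t n, int nsamples)
{
	int		picks[64];			/* assumes nsamples <= 64 */
	size_t	slice = n / nsamples;
	int		i;

	if (slice == 0)
		return data[n / 2];		/* too few elements to sample; fall back */

	for (i = 0; i < nsamples; i++)
		picks[i] = data[i * slice + (size_t) rand() % slice];

	qsort(picks, nsamples, sizeof(int), cmp_int);
	return picks[nsamples / 2];	/* "cumulative pivot" = median of the samples */
}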

Anyway, I really think that if we do not go with the above ideas, then we 
should somehow factor in the degree of randomness of the input data when 
making the decision between quicksort and external merge sort for a set of rows.

This shouldn't be too complex, and should give us a fixed n log n complexity even 
for wild data sets, without affecting the existing normal data sets that are 
present in everyday transactions. I even believe that those data sets will 
also benefit from the above optimisation.

Thoughts/Comments?

Regards,
Atri

Sent from my iPad

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpython implementation

2013-06-30 Thread Martijn van Oosterhout
On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
  python does not any any sort of reliable sandbox, so there is no plpython,
  only plpythonu - hence only one interpreter per backend is needed.
 
 Is there any track of the discussion that there is no way to make the
 sandbox? I managed to create some kind of sandbox, a simple modification
 which totally disables importing modules, so I'm just wondering why it
 cannot be done.

http://wiki.python.org/moin/SandboxedPython

This is the thread I was thinking of:
http://mail.python.org/pipermail/python-dev/2009-February/086401.html

If you read through it I think you will understand the difficulties.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] plpython implementation

2013-06-30 Thread Andrew Dunstan


On 06/30/2013 08:18 AM, Szymon Guz wrote:




python does not any any sort of reliable sandbox, so there is no
plpython, only plpythonu - hence only one interpreter per backend
is needed.


Is there any track of the discussion that there is no way to make the 
sandbox? I managed to create some kind of sandbox, a simple 
modification which totally disables importing modules, so I'm just 
wondering why it cannot be done.





If your sandbox is simple it's almost certainly going to be broken. I 
suggest you use Google to research the topic. Our discussions should be 
in the mailing list archives.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpython implementation

2013-06-30 Thread Szymon Guz
On 30 June 2013 14:31, Martijn van Oosterhout klep...@svana.org wrote:

 On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
   python does not any any sort of reliable sandbox, so there is no
 plpython,
   only plpythonu - hence only one interpreter per backend is needed.
  
  Is there any track of the discussion that there is no way to make the
  sandbox? I managed to create some kind of sandbox, a simple modification
  which totally disables importing modules, so I'm just wondering why it
  cannot be done.

 http://wiki.python.org/moin/SandboxedPython

 This is the thread I was thinking of:
 http://mail.python.org/pipermail/python-dev/2009-February/086401.html

 If you read through it I think you will understand the difficulties.


Hi Martijn,
thanks for the links. I was thinking about something else. In fact we don't
need a full sandbox; I think it would be enough to have a safe python if it
couldn't import any outside module. Wouldn't that be enough?

It seems like the sandbox modules want to limit many external operations;
I'm thinking about not being able to import any module, even standard ones -
wouldn't that be enough?

Szymon


Re: [HACKERS] plpython implementation

2013-06-30 Thread Andres Freund
On 2013-06-30 14:42:24 +0200, Szymon Guz wrote:
 On 30 June 2013 14:31, Martijn van Oosterhout klep...@svana.org wrote:
 
  On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
python does not any any sort of reliable sandbox, so there is no
  plpython,
only plpythonu - hence only one interpreter per backend is needed.
   
   Is there any track of the discussion that there is no way to make the
   sandbox? I managed to create some kind of sandbox, a simple modification
   which totally disables importing modules, so I'm just wondering why it
   cannot be done.
 
  http://wiki.python.org/moin/SandboxedPython
 
  This is the thread I was thinking of:
  http://mail.python.org/pipermail/python-dev/2009-February/086401.html
 
  If you read through it I think you will understand the difficulties.
 
 thanks for links. I was thinking about something else. In fact we don't
 need full sandbox, I think it would be enough to have safe python, if it
 couldn't import any outside module. Wouldn't be enough?
 
 It seems like the sandbox modules want to limit many external operations,
 I'm thinking about not being able to import any module, even standard ones,
 wouldn't be enough?

python
>>> open('/etc/passwd', 'r').readlines()

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpython implementation

2013-06-30 Thread Szymon Guz
On 30 June 2013 14:45, Andres Freund and...@2ndquadrant.com wrote:

 On 2013-06-30 14:42:24 +0200, Szymon Guz wrote:
  On 30 June 2013 14:31, Martijn van Oosterhout klep...@svana.org wrote:
 
   On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
 python does not any any sort of reliable sandbox, so there is no
   plpython,
 only plpythonu - hence only one interpreter per backend is needed.

Is there any track of the discussion that there is no way to make the
sandbox? I managed to create some kind of sandbox, a simple
 modification
which totally disables importing modules, so I'm just wondering why
 it
cannot be done.
  
   http://wiki.python.org/moin/SandboxedPython
  
   This is the thread I was thinking of:
   http://mail.python.org/pipermail/python-dev/2009-February/086401.html
  
   If you read through it I think you will understand the difficulties.
  
  thanks for links. I was thinking about something else. In fact we don't
  need full sandbox, I think it would be enough to have safe python, if it
  couldn't import any outside module. Wouldn't be enough?
 
  It seems like the sandbox modules want to limit many external operations,
  I'm thinking about not being able to import any module, even standard
 ones,
  wouldn't be enough?

 python
  open('/etc/passwd', 'r').readlines()


thanks :)


Re: [HACKERS] review: Non-recursive processing of AND/OR lists

2013-06-30 Thread Gurjeet Singh
On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule pavel.steh...@gmail.comwrote:


 related to

 https://commitfest.postgresql.org/action/patch_view?id=1130

 http://www.postgresql.org/message-id/cabwtf4v9rsjibwe+87pk83mmm7acdrg7sz08rq-4qyme8jv...@mail.gmail.com


 * motivation: remove recursive processing of AND/OR lists (hangs with
 10062 and more subexpressions)

 * patch is short, clean and respects postgresql source code requirements
 * patch was applied cleanly without warnings
 * all regression tests passed
 * I successfully evaluated an expression with 10 subexpressions
 * there is no significant slowdown

 possible improvements

 a = (A_Expr*) list_nth(pending, 0);

 a = (A_Expr*) linitial(pending);


I made that change, hesitantly. The comments above the definition of the
linitial() macro describe the confusion that API causes. I wanted to avoid that
confusion for new code, so I used the newer API, which makes the intention
quite clear. But looking at that code closely, list_nth() causes at least 2
function calls, and that's pretty heavy compared to the linitial() macro.



 not well commented

 should be -- If the right branch is also a SAME condition, append it to
 the


I moved that comment above the outer block, so that the intention of the
whole do-while code block is described in one place.

I don't see any other issues, so after fixing comments this patch is
 ready for commit


Thanks for the review Pavel.

Attached is the updated patch, v4. It has the above edits, and a few code
improvements, like not repeating the (root_kind == AEXPR_AND ? .. : ..)
ternary expression.

Best regards,
-- 
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.


non_recursive_and_or_transformation_v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New regression test time

2013-06-30 Thread Fabien COELHO



https://commitfest.postgresql.org/action/patch_view?id=1170


I think it is better to submit for next commit fest which is at below link:

https://commitfest.postgresql.org/action/commitfest_view?id=19


I put it there because the discussion about whether to accept Robins' patches, 
given their possible impact on the regression test run time, is happening right 
now, so it may make sense to look at it now, and it is a rather small patch. 
Otherwise, the next commitfest is fine.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: Non-recursive processing of AND/OR lists

2013-06-30 Thread Pavel Stehule
Hello

just one small notice

I dislike the name root_bool_expr, because it is not an expression,
but an expression type. Can you use root_bool_expr_type instead? It is
a little bit longer, but more correct. Another not-so-good name is
root_char; maybe root_bool_op_name

or root_expr_type and root_op_name ???

I have no other comments

Regards

Pavel

2013/6/30 Gurjeet Singh gurj...@singh.im:
 On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:


 related to

 https://commitfest.postgresql.org/action/patch_view?id=1130

 http://www.postgresql.org/message-id/cabwtf4v9rsjibwe+87pk83mmm7acdrg7sz08rq-4qyme8jv...@mail.gmail.com


 * motivation: remove recursive procession of AND/OR list (hangs with
 10062 and more subexpressions)

 * patch is short, clean and respect postgresql source code requirements
 * patch was applied cleanly without warnings
 * all regression tests was passed
 * I successfully evaluated expression with 10 subexpressions
 * there is no significant slowdown

 possible improvements

 a = (A_Expr*) list_nth(pending, 0);

 a = (A_Expr*) linitial(pending);


 I made that change, hesitantly. The comments above definition of linitial()
 macro describe the confusion that API causes. I wanted to avoid that
 confusion for new code, so I used the newer API which makes the intention
 quite clear. But looking at that code closely, list_nth() causes at least 2
 function calls, and that's pretty heavy compared to the linitiali() macro.



 not well comment

 should be -- If the right branch is also an SAME condition, append it to
 the


 I moved that comment above the outer bock, so that the intention of the
 whole do-while code block is described in one place.

 I don't see any other issues, so after fixing comments this patch is
 ready for commit


 Thanks for the review Pavel.

 Attached is the updated patch, v4. It has the above edits, and a few code
 improvements, like not repeating the (root_kind == AEPR_AND ? .. :  ..)
 ternary expression.

 Best regards,
 --
 Gurjeet Singh

 http://gurjeet.singh.im/

 EnterpriseDB Inc.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: Non-recursive processing of AND/OR lists

2013-06-30 Thread Gurjeet Singh
On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule pavel.steh...@gmail.comwrote:

 Hello

 just one small notices

 I dislike a name root_bool_expr, because, there is not a expression,
 but expression type. Can you use root_bool_expr_type instead? It is
 little bit longer, but more correct. Same not best name is
 root_char, maybe root_bool_op_name

 or root_expr_type and root_op_name ???


How about naming those 3 variables as follows:

root_expr_kind
root_expr_name
root_bool_expr_type


-- 
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.


Re: [HACKERS] New regression test time

2013-06-30 Thread Robins Tharakan
On 30 June 2013 02:33, Amit kapila amit.kap...@huawei.com wrote:


 On Sunday, June 30, 2013 11:37 AM Fabien COELHO wrote:
  If we had a different set of tests, that would be a valid argument.  But
  we don't, so it's not.  And nobody has offered to write a feature to
  split our tests either.

 I have done a POC. See:

  https://commitfest.postgresql.org/action/patch_view?id=1170

 I think it is better to submit for next commit fest which is at below link:

 https://commitfest.postgresql.org/action/commitfest_view?id=19


Hi,

- There is certain value in having separate tests; it's just that for the
big tests to be meaningful, the buildfarm would need to run them on a periodic
(daily?) basis and send some kind of automated bug report. Without
automatic feedback, most people may not be inclined to run all tests before
submitting a patch, and there'd be a big pile-up near a release.

- For now, the new tests that I submit for review (for next CF) would be
for 'make check', until a 'make bigcheck' or whatever is up and running.

--
Robins Tharakan


Re: [HACKERS] review: Non-recursive processing of AND/OR lists

2013-06-30 Thread Pavel Stehule
2013/6/30 Gurjeet Singh gurj...@singh.im:
 On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:

 Hello

 just one small notices

 I dislike a name root_bool_expr, because, there is not a expression,
 but expression type. Can you use root_bool_expr_type instead? It is
 little bit longer, but more correct. Same not best name is
 root_char, maybe root_bool_op_name

 or root_expr_type and root_op_name ???


 How about naming those 3 variables as follows:

 root_expr_kind
 root_expr_name
 root_bool_expr_type

+1

Pavel



 --
 Gurjeet Singh

 http://gurjeet.singh.im/

 EnterpriseDB Inc.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: Non-recursive processing of AND/OR lists

2013-06-30 Thread Gurjeet Singh
On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule pavel.steh...@gmail.comwrote:

 2013/6/30 Gurjeet Singh gurj...@singh.im:
  On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule pavel.steh...@gmail.com
 
  wrote:
 
  How about naming those 3 variables as follows:
 
  root_expr_kind
  root_expr_name
  root_bool_expr_type

 +1


Thanks. Attached is the patch with that change. I'll update the commitfest
entry with a link to this email.

-- 
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.


[HACKERS] Fwd: review: Non-recursive processing of AND/OR lists

2013-06-30 Thread Gurjeet Singh
On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule pavel.steh...@gmail.comwrote:

 2013/6/30 Gurjeet Singh gurj...@singh.im:
  On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule pavel.steh...@gmail.com
 
  wrote:
 
  How about naming those 3 variables as follows:
 
  root_expr_kind
  root_expr_name
  root_bool_expr_type

 +1


Thanks. Attached is the patch with that change. I'll update the commitfest
entry with a link to this email.

-- 
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.



-- 
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.


non_recursive_and_or_transformation_v5.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: Non-recursive processing of AND/OR lists

2013-06-30 Thread Pavel Stehule
2013/6/30 Gurjeet Singh gurj...@singh.im:
 On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:

 2013/6/30 Gurjeet Singh gurj...@singh.im:
  On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule
  pavel.steh...@gmail.com
  wrote:
 
  How about naming those 3 variables as follows:
 
  root_expr_kind
  root_expr_name
  root_bool_expr_type

 +1


 Thanks. Attached is the patch with that change. I'll update the commitfest
 entry with a link to this email.

ok

I checked it - it patched without warnings, all tests passed

It is ready for commit

Regards

Pavel


 --
 Gurjeet Singh

 http://gurjeet.singh.im/

 EnterpriseDB Inc.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-30 Thread Jeff Davis
On Tue, 2013-05-28 at 22:10 -0400, Greg Smith wrote:
 I was just thinking of something to run in your test program, not 
 another build time check.  Just run the new allocation sequence, and 
 then check the resulting WAL file for a) correct length, and b) 16K of 
 zero bytes.  I would like to build some confidence that posix_fallocate 
 is operating correctly in this context on at least one platform.  My 
 experience with Linux handling this class of functions correctly has 
 left me skeptical of them working until that's proven to be the case.

As I understand it, you are basically asking if posix_fallocate() works
at all anywhere.

Simple test program attached, which creates two files and fills them:
one by 2048 8KB writes; and another by 1 posix_fallocate of 16MB. Then,
I just cmp the resulting files (and also ls them, to make sure they
are 16MB).

Passes on my workstation:
$ uname -a
Linux jdavis 3.5.0-34-generic #55-Ubuntu SMP Thu Jun 6 20:18:19 UTC 2013
x86_64 x86_64 x86_64 GNU/Linux

Regards,
Jeff Davis


#include <fcntl.h>
#include <unistd.h>

char buf[8192] = {0};

int main()
{
	int i;
	int fda = open("/tmp/afile", O_CREAT | O_EXCL | O_WRONLY, 0600);
	int fdb = open("/tmp/bfile", O_CREAT | O_EXCL | O_WRONLY, 0600);

	for(i = 0; i < 2048; i++)
		{
			write(fda, buf, 8192);
		}

	posix_fallocate(fdb, 0, 16*1024*1024);

	close(fda);
	close(fdb);

	return 0;
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-30 Thread Jeff Davis
On Fri, 2013-06-28 at 11:38 -0700, Josh Berkus wrote:
 Since Greg seems to be busy, what needs to be done to test this?

As I understand it, he was mainly asking if posix_fallocate works at
all. I tried to address that question with a simple test, which behaves
as I expected it to:

http://www.postgresql.org/message-id/1372615313.19747.13.camel@jdavis

Unless something surprising comes up, or someone thinks an objection
has been missed, I am going to commit this soon.

(Of course, to avoid your wrath, I'll get rid of the GUC which was added
for testing.)

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] big test separation POC

2013-06-30 Thread Fabien COELHO


Note about the POC patch limitations/questions:

 - is deriving a schedule with a piece of shell okay?
   or should perl/python/whatever scripting be better?

 - the big_schedule is assumed sequential, i.e. one test per line.
   maybe it could/should be parallel?

 - I'm not sure whether the parallel_schedule and big_schedule
   file names are the best possible choices.

 - I'm really not sure about VPATH stuff.

 - I do not understand why the makefile specifies $(srcdir) before
   local files in some places.

 - should the bigcheck target be accessible from the project root?
   that is should make bigcheck from ../../.. work?

 - the documentation is not updated, I guess something should be done
   somewhere.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] big test separation POC

2013-06-30 Thread Andrew Dunstan


On 06/30/2013 02:54 PM, Fabien COELHO wrote:


Note about the POC patch limitations/questions:

 - is deriving a schedule with a piece of shell okay?
   or should perl/python/whatever scripting be better?



I would think all we need are the results, i.e. the schedule files, plus 
some Makefile entries for them.





 - the big_schedule is assumed sequential, i.e. one test per line.
   maybe it could/should be parallel?

 - I'm not sure of the parallel_schedule and big_schedule
   file names are the best possible choices.

 - I'm really not sure about VPATH stuff.



This should be totally transparent to VPATH builds.




 - I do not understand why the makefile specifies $(srcdir) before
   local files in some places.



For VPATH builds :-)


 - should the bigcheck target be accessible from the project root?
   that is should make bigcheck from ../../.. work?




Yes, possibly, but it's not terribly important (for example, the 
buildfarm does cd src/test/regress && make testname)


cheers

andrew





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New regression test time

2013-06-30 Thread Jeff Janes
On Sat, Jun 29, 2013 at 3:43 PM, Andrew Dunstan and...@dunslane.net wrote:


 On 06/29/2013 05:59 PM, Josh Berkus wrote:

  Maybe there is a good case for these last two in a different set of tests.

 If we had a different set of tests, that would be a valid argument.  But
 we don't, so it's not.  And nobody has offered to write a feature to
 split our tests either.

 I have to say, I'm really surprised at the level of resistance people on
 this list are showing to the idea of increasing test coverage. I thought
 that Postgres was all about reliability?   For a project as mature as we
 are, our test coverage is abysmal, and I think I'm starting to see why.



 Dividing the tests into different sections is as simple as creating one
 schedule file per section.

 I'm not at all resistant to it. In fact, of someone wants to set up
 separate sections and add new tests to the different sections I'll be more
 than happy to provide buildfarm support for it. Obvious candidates could
 include:

  * code coverage
  * bugs
  * tests too big to run in everyday developer use



I don't really see a difference in the first two.  If we were sure the
uncovered code had no bugs, we wouldn't need to cover it.  At least if you
consider unintended behavior changes to be bugs.  I think it would make
more sense to split them up by what computers it makes sense to run them on.

Tests that take too much RAM to be run by everyone.
Tests that take too many CPUs (in order to be meaningful) to run by
everyone most of the time.
Tests that take too much disk space...
Tests that take too much wall-clock time
And maybe that tests that take too much wall-clock time specifically
under CLOBBER_CACHE_ALWAYS...

Some of these sets would probably be empty currently, because candidates
that belong in them were not committed at all, since they were not wanted in
the default set and there was no other place to add them.

If we are very worried about how long the tests take, we should probably
also spend some time trying to make the existing ones faster.
 Parallelization does not cut the test time very much (~20% with 8 CPUs),
because the tests are poorly packed.  In a parallel group all the tests
finish fast except one, and the whole group is then dominated by that one
test.  (The main goal of parallelization is probably not to make the test
faster, but to make them more realistic from a concurrency perspective, but
if there is little actual parallelism, it doesn't achieve that very well,
either).  I don't know how much freedom there is to re-order the tests
without breaking dependencies, though.  I think prepared_xacts and stats
could be usefully run together, as both take a long time sleeping but
impose little real load that would interfere with each other.  Perhaps
prepared_xacts could be re-written to get what it needs without the long
statement_timeouts.  Testing the timeout itself doesn't seem to be the goal.

Cheers,

Jeff


Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-30 Thread Jeff Davis
On Sun, 2013-06-30 at 11:11 -0700, Jeff Davis wrote:
 Unless something surprising comes up, or someone thinks and objection
 has been missed, I am going to commit this soon.

Quick question to anyone who happens to know:

What is the standard procedure for changes to pg_config.h.win32? I
looked at an old patch of mine that Tom (CC'd) committed:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b966dd6c4228d696b291c1cdcb5ab8c8475fefa8

and I see that it modifies pg_config.h.win32. Should I modify it in a
similar way for this fallocate patch?

Right now, pg_config.h.win32 seems a little inconsistent because it has
an entry for HAVE_POSIX_SIGNALS but not HAVE_POSIX_FADVISE. It says it's
a generated file, but I don't have a Windows environment or MinGW.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-30 Thread Andrew Dunstan


On 06/30/2013 03:50 PM, Jeff Davis wrote:

On Sun, 2013-06-30 at 11:11 -0700, Jeff Davis wrote:

Unless something surprising comes up, or someone thinks and objection
has been missed, I am going to commit this soon.

Quick question to anyone who happens to know:

What is the standard procedure for changes to pg_config.h.win32? I
looked at an old patch of mine that Tom (CC'd) committed:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b966dd6c4228d696b291c1cdcb5ab8c8475fefa8

and I see that it modifies pg_config.h.win32. Should I modify it in a
similar way for this fallocate patch?

Right now, pg_config.in.win32 seems a little inconsistent because it has
an entry for HAVE_POSIX_SIGNALS but not HAVE_POSIX_FADVISE. It says it's
a generated file, but I don't have a windows environment or MingW.




It was originally generated. Since then it's been maintained by hand.

If you need a Windows environment to test on, see the Amazon recipe at 
http://wiki.postgresql.org/wiki/Building_With_MinGW


cheers

andrew









--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-30 Thread Nicholas White
 this should throw a FEATURE_NOT_SUPPORTED error if it is used for window
functions that don't support it
 arbitrary aggregate functions over a window ... should also throw a
FEATURE_NOT_SUPPORTED error.

Fixed (with test cases) in the attached patch.

 because the same window may be shared by multiple window function calls.

Ah, your example gives the stack trace below. As the respect / ignore nulls
frame option is part of the window definition your example should cause two
windows to be created (both based on w, but one with the respect-nulls flag
set), but instead it fails an assert as one window definition can't have
two sets of frame options. It might take me a day or two to solve this -
let me know if this approach (making the parser create two window objects)
seems wrong.

#2  0x000100cdb68b in ExceptionalCondition (conditionName=Could not
find the frame base for ExceptionalCondition.
) at /Users/xxx/postgresql/src/backend/utils/error/assert.c:54
#3  0x0001009a3c03 in transformWindowFuncCall (pstate=0x7f88228362c8,
wfunc=0x7f8822948ec0, windef=0x7f88228353a8) at
/Users/xxx/postgresql/src/backend/parser/parse_agg.c:573

Thanks -

Nick


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-30 Thread Greg Smith

On 6/30/13 2:01 PM, Jeff Davis wrote:

Simple test program attached, which creates two files and fills them:
one by 2048 8KB writes; and another by 1 posix_fallocate of 16MB. Then,
I just cmp the resulting files (and also ls them, to make sure they
are 16MB).


This makes platform level testing a lot easier, thanks.  Attached is an 
updated copy of that program with some error checking.  If the files it 
creates already existed, the code didn't notice, and a series of write 
errors happened.  If you set the test up right it's not a problem, but 
it's better if a bad setup is caught.  I wrapped the whole test with a 
shell script, also attached, which insures the right test sequence and 
checks.


Your C test program compiles and passes on RHEL5/6 here, doesn't on OS X 
Darwin.  No surprises there, there's a long list of platforms that don't 
support this call at 
https://www.gnu.org/software/gnulib/manual/html_node/posix_005ffallocate.html 
and the Mac is on it.  Many other platforms I was worried about don't 
support it too--older FreeBSD, HP-UX 11, Solaris 10, mingw, MSVC--so 
that cuts down on testing quite a bit.  If it runs faster on Linux, 
that's the main target here, just like the existing 
effective_io_concurrency fadvise code.


The specific thing I was worried about is that this interface might have 
a stub that doesn't work perfectly in older Linux kernels.  After being 
surprised to find this interface worked on RHEL5 with your test program, 
I dug into this more.  It works there, but it may actually be slower.


posix_fallocate is actually implemented by glibc on Linux.  Been there 
since 2.1.94 according to the Linux man pages.  But Linux itself didn't 
add the feature until kernel 2.6.20:  http://lwn.net/Articles/226436/ 
The biggest thing I was worried about--the call might be there in early 
kernels but with a non-functional implementation--that's not the case. 
Looking at the diff, before that patch there's no fallocate at all.


So what happened in earlier kernels, where there was no kernel level 
fallocate available?  According to 
https://www.redhat.com/archives/fedora-devel-list/2009-April/msg00110.html 
what glibc does is check for kernel fallocate(), and if it's not there 
it writes a bunch of zeros to create the file instead.  What is actually 
happening on a RHEL5 system (with kernel 2.6.18) is that calling 
posix_fallocate does this fallback behavior, where it basically does the 
same thing the existing WAL clearing code does.


I can even prove that's the case.  On RHEL5, if you run strace -o out 
./fallocate the main write loop looks like this:


write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192


But when you call posix_fallocate, you still get a bunch of writes, but 
4 bytes at a time:


pwrite(4, "\0", 1, 16769023) = 1
pwrite(4, "\0", 1, 16773119) = 1
pwrite(4, "\0", 1, 16777215) = 1

That's glibc helpfully converting your call to posix_fallocate into 
small writes, because the OS doesn't provide a better way in that 
kernel.  It's not hard to imagine this being slower than what the WAL 
code is doing right now.  I'm not worried about correctness issues 
anymore, but my gut paranoia about this not working as expected on older 
systems was justified.  Everyone who thought I was just whining owes me 
a cookie.


This is what I plan to benchmark specifically next.  If the 
posix_fallocate approach is actually slower than what's done now when 
it's not getting kernel acceleration, which is the case on RHEL5 era 
kernels, we might need to make the configure time test more complicated. 
 Whether posix_fallocate is defined isn't sensitive enough; on Linux it 
may be the case that this only is usable when fallocate() is also there.
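
One possible shape for such a check, as a sketch only (not a proposed patch):
probe the Linux-specific fallocate(2) at runtime and keep the existing
zero-fill path if the kernel or filesystem lacks it:

#define _GNU_SOURCE
#include <fcntl.h>

/*
 * Returns 1 if the kernel-level fallocate(2) works on this fd, 0 otherwise
 * (e.g. ENOSYS on pre-2.6.20 kernels, EOPNOTSUPP on unsupported filesystems).
 * Only in the working case is posix_fallocate() guaranteed not to degrade
 * into glibc's write-based fallback.
 */
static int
kernel_fallocate_works(int fd, off_t len)
{
	return fallocate(fd, 0, 0, len) == 0;
}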


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
#!/bin/sh
rm -f fallocate /tmp/afile /tmp/bfile
gcc fallocate.c -o fallocate
if [ ! -x fallocate ] ; then
  echo "Test program did not compile, posix_fallocate may not be supported"
  exit
fi

./fallocate
if [ -f /tmp/afile ] ; then
  sizea=`du /tmp/afile | cut -f 1`
  sizeb=`du /tmp/bfile | cut -f 1`
  if [ $sizea -eq $sizeb ] ; then
    cmp /tmp/afile /tmp/bfile
    if [ $? -ne 0 ] ; then
      echo "Test failed, files do not match"
    else
      echo "Test passed"
    fi
  else
    echo "Test failed, sizes do not match"
  fi
fi
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

char buf[8192] = {0};

int main()
{
	int i;
	int written;
	int fda = open("/tmp/afile", O_CREAT | O_EXCL | O_WRONLY, 0600);
	int fdb = open("/tmp/bfile", O_CREAT | O_EXCL | O_WRONLY, 0600);
	if (fda < 0 || fdb < 0)
	{
		printf("Opening files failed\n");
		return(1);
	}
	for(i = 0; i < 2048; i++)
		{
			written=write(fda, buf, 8192);
			if (written < 8192)
			{
				printf("Write to file failed");
				return(2);
			}
		}

	posix_fallocate(fdb, 0, 16*1024*1024);

	close(fda);
	close(fdb);

	return 0;
}

-- 

Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-30 Thread Jon Nelson
On Sun, Jun 30, 2013 at 5:55 PM, Greg Smith g...@2ndquadrant.com wrote:


 pwrite(4, \0, 1, 16769023)= 1
 pwrite(4, \0, 1, 16773119)= 1
 pwrite(4, \0, 1, 16777215)= 1

 That's glibc helpfully converting your call to posix_fallocate into small
 writes, because the OS doesn't provide a better way in that kernel.  It's
 not hard to imagine this being slower than what the WAL code is doing right
 now.  I'm not worried about correctness issues anymore, but my gut paranoia
 about this not working as expected on older systems was justified.  Everyone
 who thought I was just whining owes me a cookie.

I had noted in the very early part of the thread that glibc emulates
posix_fallocate when the (Linux-specific) 'fallocate' system call
fails. In this case, it's writing 4 bytes of zeros and then
essentially seeking forward 4092 (4096-4) bytes. This prevents files
with holes in them because the holes have to be at least 4kiB in size,
if I recall properly. It's *not* writing out 16MiB in 4 byte
increments.
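
A rough sketch of that fallback behaviour, written to match the strace output
shown earlier (one byte touched per 4 KiB block; illustrative only, not glibc
source, and it skips glibc's read-before-write handling):

#include <unistd.h>
#include <sys/types.h>

static int
emulate_fallocate(int fd, off_t offset, off_t len)
{
	const off_t blocksize = 4096;
	off_t		pos;

	/* touch the last byte of every 4 KiB block in the requested range */
	for (pos = offset + blocksize - 1; pos < offset + len; pos += blocksize)
	{
		if (pwrite(fd, "", 1, pos) != 1)
			return -1;
	}
	return 0;
}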

--
Jon


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-30 Thread Greg Smith

On 5/28/13 10:00 PM, Jon Nelson wrote:


A note: The attached test program uses *fsync* instead of *fdatasync*
after calling fallocate (or writing out 16MB of zeroes), per an
earlier suggestion.


I tried this out on the RHEL5 platform I'm worried about now.  There's 
something weird about the test program there.  If I run it once it shows 
posix_fallocate running much faster:


without posix_fallocate: 1 open/close iterations, 1 rewrite in 23.0169s
with posix_fallocate: 1 open/close iterations, 1 rewrite in 11.1904s

The problem is that I'm seeing the gap between the two get smaller the 
more iterations I run, which makes me wonder if the test is completely fair:


without posix_fallocate: 2 open/close iterations, 2 rewrite in 34.3281s
with posix_fallocate: 2 open/close iterations, 2 rewrite in 23.1798s

without posix_fallocate: 3 open/close iterations, 3 rewrite in 44.4791s
with posix_fallocate: 3 open/close iterations, 3 rewrite in 33.6102s

without posix_fallocate: 5 open/close iterations, 5 rewrite in 65.6244s
with posix_fallocate: 5 open/close iterations, 5 rewrite in 61.0991s

You didn't show any output from the latest program on your system, so 
I'm not sure how it behaved for you here.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-30 Thread David Fetter
On Fri, Jun 28, 2013 at 09:22:52PM +0100, Dean Rasheed wrote:
 On 21 June 2013 06:16, David Fetter da...@fetter.org wrote:
  Please find attached a patch which allows subqueries in the FILTER
  clause and adds regression testing for same.
 
 
 This needs re-basing/merging following Robert's recent commit to make
 OVER unreserved.

Please find attached.  Thanks, Andrew Gierth!  In this one, FILTER is
no longer a reserved word.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
diff --git a/doc/src/sgml/keywords.sgml b/doc/src/sgml/keywords.sgml
index 5e3b33a..ecfde99 100644
--- a/doc/src/sgml/keywords.sgml
+++ b/doc/src/sgml/keywords.sgml
@@ -1786,7 +1786,7 @@
     </row>
     <row>
      <entry><token>FILTER</token></entry>
-     <entry></entry>
+     <entry>non-reserved</entry>
      <entry>reserved</entry>
      <entry>reserved</entry>
      <entry></entry>
@@ -3200,7 +3200,7 @@
     </row>
     <row>
      <entry><token>OVER</token></entry>
-     <entry>reserved (can be function or type)</entry>
+     <entry>non-reserved</entry>
      <entry>reserved</entry>
      <entry>reserved</entry>
      <entry></entry>
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 68309ba..b289a3a 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -594,10 +594,13 @@ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...]
    </para>
 
    <para>
-    Aggregate functions, if any are used, are computed across all rows
+    In the absence of a <literal>FILTER</literal> clause,
+    aggregate functions, if any are used, are computed across all rows
     making up each group, producing a separate value for each group
     (whereas without <literal>GROUP BY</literal>, an aggregate
     produces a single value computed across all the selected rows).
+    When a <literal>FILTER</literal> clause is present, only those
+    rows matching the FILTER clause are included.
     When <literal>GROUP BY</literal> is present, it is not valid for
     the <command>SELECT</command> list expressions to refer to
     ungrouped columns except within aggregate functions or if the
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index b139212..c4d5f33 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1562,24 +1562,26 @@ sqrt(2)
     syntax of an aggregate expression is one of the following:
 
 <synopsis>
-<replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] )
-<replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] )
-<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] )
-<replaceable>aggregate_name</replaceable> ( * )
+<replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
+<replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
+<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
+<replaceable>aggregate_name</replaceable> ( * ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
 </synopsis>
 
     where <replaceable>aggregate_name</replaceable> is a previously
     defined aggregate (possibly qualified with a schema name),
-    <replaceable>expression</replaceable> is
-    any value expression that does not itself contain an aggregate
-    expression or a window function call, and
-    <replaceable>order_by_clause</replaceable> is a optional
-    <literal>ORDER BY</> clause as described below.
+    <replaceable>expression</replaceable> is any value expression that
+    does not itself contain an aggregate expression or a window
+    function call, <replaceable>order_by_clause</replaceable> is a
+    optional <literal>ORDER BY</> clause as described below.  The
+    <replaceable>aggregate_name</replaceable> can also be suffixed
+    with <literal>FILTER</literal> as described below.
    </para>
 
    <para>
-    The first form of aggregate expression invokes the aggregate
-    once for each input row.
+    The first form of aggregate expression invokes the aggregate once
+    for each input row, or when a FILTER clause is present, each row
+    matching same.
     The second form is the same as the first, since
     <literal>ALL</literal> is the default.
     The third form invokes the aggregate once for each distinct value
@@ -1607,6 +1609,21 @@ sqrt(2)
    </para>
 

Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division

2013-06-30 Thread David Fetter
On Fri, Jun 28, 2013 at 01:28:35PM -0400, Peter Eisentraut wrote:
 On 6/28/13 11:30 AM, Robert Haas wrote:
  On Fri, Jun 28, 2013 at 10:31 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  David Fetter da...@fetter.org writes:
  Please find attached the latest patch.
 
  I remain of the opinion that this is simply a bad idea.  It is unlike
  our habits for constructing other types of nodes, and makes it harder
  not easier to find all the places that need to be updated when adding
  another field to FuncCall.
  
  I think it's a nice code cleanup.  I don't understand your objection.
 
 Yeah, I was reading the patch thinking, yes, finally someone cleans that up.

Please find enclosed a patch reflecting the changes that de-reserved
OVER as a keyword.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index c487db9..245aef2 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -508,3 +508,28 @@ makeDefElemExtended(char *nameSpace, char *name, Node *arg,
 
return res;
 }
+
+/*
+ * makeFuncCall -
+ *
+ * Initialize a FuncCall struct with the information every caller must
+ * supply.  Any non-default parameters have to be handled by the
+ * caller.
+ *
+ */
+
+FuncCall *
+makeFuncCall(List *name, List *args, int location)
+{
+	FuncCall *n = makeNode(FuncCall);
+	n->funcname = name;
+	n->args = args;
+	n->location = location;
+	n->agg_order = NIL;
+	n->agg_star = FALSE;
+	n->agg_distinct = FALSE;
+	n->func_variadic = FALSE;
+	n->over = NULL;
+   return n;
+}
+
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0fc5b13..f67ef0c 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10503,16 +10503,9 @@ a_expr:	c_expr									{ $$ = $1; }
 				}
 			| a_expr AT TIME ZONE a_expr			%prec AT
 				{
-					FuncCall *n = makeNode(FuncCall);
-					n->funcname = SystemFuncName("timezone");
-					n->args = list_make2($5, $1);
-					n->agg_order = NIL;
-					n->agg_star = FALSE;
-					n->agg_distinct = FALSE;
-					n->func_variadic = FALSE;
-					n->over = NULL;
-					n->location = @2;
-					$$ = (Node *) n;
+					$$ = (Node *) makeFuncCall(SystemFuncName("timezone"),
+											   list_make2($5, $1),
+											   @2);
 				}
 		/*
 		 * These operators must be called out explicitly in order to make use
@@ -10564,113 +10557,65 @@ a_expr:	c_expr									{ $$ = $1; }
 				{ $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, $3, @2); }
 			| a_expr LIKE a_expr ESCAPE a_expr
 				{
-					FuncCall *n = makeNode(FuncCall);
-					n->funcname = SystemFuncName("like_escape");
-					n->args = list_make2($3, $5);
-					n->agg_order = NIL;
-					n->agg_star = FALSE;
-					n->agg_distinct = FALSE;
-					n->func_variadic = FALSE;
-					n->over = NULL;
-					n->location = @2;
+					FuncCall *n = makeFuncCall(SystemFuncName("like_escape"),
+											   list_make2($3, $5),
+											   @2);
 					$$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, (Node *) n, @2);
 				}
 			| a_expr NOT LIKE a_expr
 				{ $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "!~~", $1, $4, @2); }
 			| a_expr NOT LIKE a_expr ESCAPE 

Re: [HACKERS] Review: query result history in psql

2013-06-30 Thread ian link
Not sure about all of your suggestions. Let me see if I can clarify what
you're looking for.


  * a simple decision whether content should be stored in history or not,

Do you mean that the user should use a flag to place the result of a query
into the history?
like:
--ans SELECT * FROM cities...
Not sure if that's what you mean, but it seems kind of unnecessary. They can
just hit the \ans flag beforehand.

* a simple way to remove the last entry (table) from the history

That could be useful. What do you think Maciej?

 * queries should be joined to their content, the name alone is not enough

Don't know what you mean. Could you try re-wording that?

Ian



On Fri, Jun 28, 2013 at 8:49 AM, Pavel Stehule pavel.steh...@gmail.comwrote:

 Hello

 I am not sure this interface is really user friendly.

 It is not possible to search the history, and not every query pushes
 interesting content to the history.

 It requires:

 * a simple decision whether content should be stored in history or not,
 * a simple way to remove the last entry (table) from the history
 * queries should be joined to their content, the name alone is not enough

 Regards

 Pavel

 2013/6/28 Maciej Gajewski maciej.gajews...@gmail.com:
  Thanks for checking the patch!
 
  So what's left to fix?
  * Moving the escaping-related functions to separate module,
  * applying your corrections.
 
  Did I miss anything?
 
  I'll submit corrected patch after the weekend.
 
  M
 



Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-30 Thread Jon Nelson
On Sun, Jun 30, 2013 at 6:49 PM, Greg Smith g...@2ndquadrant.com wrote:
 On 5/28/13 10:00 PM, Jon Nelson wrote:

 A note: The attached test program uses *fsync* instead of *fdatasync*
 after calling fallocate (or writing out 16MB of zeroes), per an
 earlier suggestion.


 I tried this out on the RHEL5 platform I'm worried about now.  There's
 something weird about the test program there.  If I run it once it shows
 posix_fallocate running much faster:

 without posix_fallocate: 1 open/close iterations, 1 rewrite in 23.0169s
 with posix_fallocate: 1 open/close iterations, 1 rewrite in 11.1904s

Assuming the platform chosen is using the glibc approach of pwrite(4
bytes) every 4KiB, then the results ought to be similar, and I'm at a
loss to explain why it's performing better (unless - grasping at
straws - simply the *volume* of data transferred from userspace to the
kernel is at play, in which case posix_fallocate will result in 4096
calls to pwrite but at 4 bytes each versus 2048 calls to write at 8KiB
each.) Ultimately the same amount of data gets written to disk (one
would imagine), but otherwise I can't really think of much.

I have also found several errors in the test_fallocate.c program I posted;
they are corrected below.
One of them is: it is missing two pairs of parentheses around two #defines:

#define SIXTEENMB 1024*1024*16
#define EIGHTKB 1024*8

should be:

#define SIXTEENMB (1024*1024*16)
#define EIGHTKB (1024*8)

Otherwise the program will end up writing (131072) 8KiB blocks instead of 2048.
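
Spelled out, the expansion works out like this:

/*
 * Without the parentheses, the division binds too early:
 *
 *   SIXTEENMB / EIGHTKB  ->  1024*1024*16 / 1024*8
 *                        ->  (16777216 / 1024) * 8  =  131072 iterations
 *
 * With the parentheses:
 *
 *   (1024*1024*16) / (1024*8)  =  16777216 / 8192  =  2048 iterations
 */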

This actually makes the comparison between writing 8KiB blocks and
using posix_fallocate favor the latter more strongly in the results
(also seen below).

 The problem is that I'm seeing the gap between the two get smaller the more
 iterations I run, which makes me wonder if the test is completely fair:

 without posix_fallocate: 2 open/close iterations, 2 rewrite in 34.3281s
 with posix_fallocate: 2 open/close iterations, 2 rewrite in 23.1798s


 without posix_fallocate: 3 open/close iterations, 3 rewrite in 44.4791s
 with posix_fallocate: 3 open/close iterations, 3 rewrite in 33.6102s

 without posix_fallocate: 5 open/close iterations, 5 rewrite in 65.6244s
 with posix_fallocate: 5 open/close iterations, 5 rewrite in 61.0991s

 You didn't show any output from the latest program on your system, so I'm
 not sure how it behaved for you here.

On the platform I use - openSUSE (12.3, x86_64, kernel 3.9.7
currently) - I never see posix_fallocate perform worse. Typically better,
sometimes much better.


To set the number of times the file is overwritten to just 1 (one):

for i in 1 2 5 10 100; do ./test_fallocate foo $i 1; done

I am including a revised version of test_fallocate.c that corrects the
above-noted error and one typo (from when I changed fdatasync to fsync)
that did not alter program behavior, corrects a misplaced
gettimeofday call (which does change the results), and includes a new test
that aims (perhaps poorly) to emulate the glibc style of pwrite(4
bytes) for every 4KiB, checking that the resulting file is 16MiB in
size.

The performance of the latter (new) test sometimes seems to perform
worse and sometimes seems to perform better (usually worse) than
either of the other two. In all cases, posix_fallocate performs
better, but I don't have a sufficiently old kernel to test with.

The new results on one machine are below.

With 0 (zero) rewrites (testing *just*
open/some_type_of_allocation/fsync/close):

method: classic. 100 open/close iterations, 0 rewrite in 29.6060s
method: posix_fallocate. 100 open/close iterations, 0 rewrite in 2.1054s
method: glibc emulation. 100 open/close iterations, 0 rewrite in 31.7445s

And with the same number of rewrites as open/close cycles:

method: classic. 1 open/close iterations, 1 rewrite in 0.6297s
method: posix_fallocate. 1 open/close iterations, 1 rewrite in 0.3028s
method: glibc emulation. 1 open/close iterations, 1 rewrite in 0.5521s

method: classic. 2 open/close iterations, 2 rewrite in 1.6455s
method: posix_fallocate. 2 open/close iterations, 2 rewrite in 1.0409s
method: glibc emulation. 2 open/close iterations, 2 rewrite in 1.5604s

method: classic. 5 open/close iterations, 5 rewrite in 7.5916s
method: posix_fallocate. 5 open/close iterations, 5 rewrite in 6.9177s
method: glibc emulation. 5 open/close iterations, 5 rewrite in 8.1137s

method: classic. 10 open/close iterations, 10 rewrite in 29.2816s
method: posix_fallocate. 10 open/close iterations, 10 rewrite in 28.4400s
method: glibc emulation. 10 open/close iterations, 10 rewrite in 31.2693s



--
Jon
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/stat.h>

#define SIXTEENMB (1024*1024*16)
#define FOURKB (1024*4)
#define EIGHTKB (1024*8)

void writeout(int fd, char *buf)
{
	int i;
	for (i = 0; i < SIXTEENMB / EIGHTKB; ++i) {
		if (write(fd, buf, EIGHTKB) != EIGHTKB) {
			fprintf(stderr, "Error in write: %m!\n");
			

Re: [HACKERS] plpython implementation

2013-06-30 Thread Claudio Freire
On Sun, Jun 30, 2013 at 9:45 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-06-30 14:42:24 +0200, Szymon Guz wrote:
 On 30 June 2013 14:31, Martijn van Oosterhout klep...@svana.org wrote:

  On Sun, Jun 30, 2013 at 02:18:07PM +0200, Szymon Guz wrote:
    python does not have any sort of reliable sandbox, so there is no
  plpython,
only plpythonu - hence only one interpreter per backend is needed.
   
   Is there any track of the discussion that there is no way to make the
   sandbox? I managed to create some kind of sandbox, a simple modification
   which totally disables importing modules, so I'm just wondering why it
   cannot be done.
 
  http://wiki.python.org/moin/SandboxedPython
 
  This is the thread I was thinking of:
  http://mail.python.org/pipermail/python-dev/2009-February/086401.html
 
  If you read through it I think you will understand the difficulties.
 
 thanks for the links. I was thinking about something else. In fact we don't
 need a full sandbox; I think it would be enough to have a safe Python that
 couldn't import any outside module. Wouldn't that be enough?

 It seems like the sandbox modules want to limit many external operations.
 I'm thinking about not being able to import any module, even standard ones -
 wouldn't that be enough?

 python
 open('/etc/passwd', 'r').readlines()

Not only that, the CPython interpreter is rather fuzzy about the
division between interpreters. You can initialize multiple
interpreters, but they share a lot of state, so you can never fully
separate them. You'd have some state from the untrusted interpreter
spill over into the trusted one within the same session, which is not
ideal at all (and in fact can be exploited).
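
For illustration, the embedding API involved looks roughly like this (a
minimal sketch of plain CPython embedding, not anything taken from
plpython itself):

/*
 * Create a second, "separate" interpreter in the same process.  It gets
 * its own module dictionaries, but it still shares the GIL, low-level
 * runtime state and any already-loaded C extension modules with the
 * first interpreter, which is why the separation is not a security
 * boundary.
 */
#include <Python.h>

int
main(void)
{
	PyThreadState *main_ts;
	PyThreadState *sub_ts;

	Py_Initialize();
	main_ts = PyThreadState_Get();

	sub_ts = Py_NewInterpreter();	/* the "untrusted" interpreter */
	PyRun_SimpleString("print('running in the sub-interpreter')");
	Py_EndInterpreter(sub_ts);

	PyThreadState_Swap(main_ts);	/* back to the original interpreter */
	Py_Finalize();
	return 0;
}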

In essence, you'd have to use another implementation. CPython guys
have left it very clear they don't intend to fix that, as they don't
consider it a bug. It's just how it is.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-30 Thread Nicholas White
I've attached another iteration of the patch that fixes the multiple-window
bug and adds (& uses) a function to create a Bitmapset using a custom
allocator. I don't think there are any outstanding problems with it now.

 Alternatively, it might be trivial to make all aggregate functions work
with ignore nulls in a window context

This is a good idea, but I'd like to keep the scope of this patch limited
for the time being - I'll look at doing this (along with the first / last /
nth value window functions) for a later release.

Thanks -

Nick


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] build postgresql-9.3beta2 on xubuntu 12.04 without failure

2013-06-30 Thread Boris Skegin
Hi.

Name: postgresql-9.3 ,
ftp.postgresql.org/pub/source/v9.3beta2/postgresql-9.3beta2.tar.gz   as  of
June 24, 2013, 7:03 p.m.

Release: beta2

Test Type: build

Platform: xubuntu 12.04

Installation Method: building from source, gmake install-world

Platform Detail: 2 core , 3 GB RAM

Test Procedure:
configure  PREFIX=build  --with-libxml --with-libxslt
gmake world
gmake install-world
gmake installcheck-parallel

Failure? : no

Comments: did not know if I should post to this mailing list,
but as the testers' one is inactive ..

Simple pgbench tests were also ok.

Regards,
Boris


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Eliminating PD_ALL_VISIBLE, take 2

2013-06-30 Thread Robert Haas
On Sat, Jun 29, 2013 at 11:24 AM, Robins rob...@pobox.com wrote:
 On 10 June 2013 00:17, Jeff Davis pg...@j-davis.com wrote:

 On Thu, 2013-05-30 at 10:07 -0700, Jeff Davis wrote:
   Come to think of it, even without the torn page & checksum issue, do
   we
   really want to actively clear the all-visible flags after upgrade?

 Removed that from the patch and rebased. I think the best approach is to
 remove the bit opportunistically when we're already dirtying the page
 for something else.

 However, right now, there is enough skepticism of the general approach
 in this patch (and enough related proposals) that I'll leave this to be
 resolved if and when there is more agreement that my approach is a good
 one.


 Did some basic checks on this patch. List-wise feedback below.

 - Cleanly applies to Git-Head: Yes (some offsets, but that's probably because
 of the delay in review)
 - Documentation Updated: No. (Required?)
 - Tests Updated: No. (Required?)
 - All tests pass: Yes
 - Does it Work : ???

 - Any visible issues: No
 - Any compiler warnings: No

 - Others:
 Number of uncovered lines: Reduced by 167 lines

I thought that Jeff withdrew this patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New regression test time

2013-06-30 Thread Josh Berkus
On 06/30/2013 12:33 AM, Amit kapila wrote:
 
 On Sunday, June 30, 2013 11:37 AM Fabien COELHO wrote:
 If we had a different set of tests, that would be a valid argument.  But
 we don't, so it's not.  And nobody has offered to write a feature to
 split our tests either.
 
 I have done a POC. See:
 
 https://commitfest.postgresql.org/action/patch_view?id=1170
 
 I think it is better to submit for next commit fest which is at below link:
 
 https://commitfest.postgresql.org/action/commitfest_view?id=19

I would argue for doing this in this CF, just so that we can have the
benefit of the extra tests for the next 3 months, and so that Andrew can
work on the buildfarm additions.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-30 Thread ian link
I found some time and I think I am up to speed now. I finally figured out
how to add new operator strategies and made a little test operator for
myself.

It seems pretty clear that assuming '+' and '-' are addition and
subtraction is a bad idea. I don't think it would be too tricky to add
support for new operator strategies. Andrew Gierth suggested calling these
new strategies "offset -" and "offset +", which I think describes it pretty
well. I assigned the operator itself to be "@+" and "@-" but that can
obviously be changed. If this sounds like a good path to you guys, I will
go ahead and implement the operators for the appropriate types. Please let
me know if I am misunderstanding something - I am still figuring stuff out
:)

Aside from the opclass stuff, there were some other important issues
mentioned with the original RANGE support. I think I will address those
after the opclass stuff is done.

Thanks!
Ian


On Sat, Jun 22, 2013 at 4:38 PM, ian link i...@ilink.io wrote:

 Thanks Craig! That definitely does help. I probably still have some
 questions but I think I will read through the rest of the code before
 asking. Thanks again!

 Ian

  Craig Ringer
  Friday, June 21, 2013 8:41 PM

 
  On 06/22/2013 03:30 AM, ian link wrote:
 
  Forgive my ignorance, but I don't entirely understand the problem. What
  does '+' and '-' refer to exactly?
 
  Consider "RANGE 4.5 PRECEDING".
 
  You need to be able to test whether, for the current row 'b', any given
  row 'a' is within the range (b - 4.5) < a <= b . Not 100% sure about the
  < vs <= boundaries, but that's irrelevant for the example.
 
  To test that, you have to be able to do two things: you have to be able
  to test whether one value is greater than another, and you have to be
  able to add or subtract a constant from one of the values.
 
  Right now, the b-tree access method provides information on the ordering
  operators < <= = >= >, which provides half the answer. But these
  don't give any concept of *distance* - you can test ordinality but not
  cardinality.
 
  To implement the "different by 4.5" part, you have to be able to add 4.5
  to one value or subtract it from the other.
 
  The obvious way to do that is to look up the function that implements
  the '+' or '-' operator, and do:
 
  ((OPERATOR(+))(a, 4.5)) > b AND (a <= b)
 
  or
 
  ((OPERATOR(-))(b, 4.5)) < a AND (a <= b);
 
  The problem outlined by Tom in prior discussion about this is that
  PostgreSQL tries really hard not to assume that particular operator
  names mean particular things. Rather than knowing that "+" is always
  an operator that adds two values together; is transitive, symmetric and
  reflexive, PostgreSQL requires that you define an *operator class* that
  names the operator that has those properties.
 
  Or at least, it does for less-than, less-than-or-equals, equals,
  greater-than-or-equals, greater-than, and not-equals as part of the
  b-tree operator class, which *usually* defines these operators as < <= =
  >= > <>, but you could use any operator names you wanted if you really
  liked.
 
  Right now (as far as I know) there's no operator class that lets you
  identify operators for addition and subtraction in a similar way. So
  it's necessary to either add such an operator class (in which case
  support has to be added for it for every type), extend the existing
  b-tree operator class to provide the info, or blindly assume that "+"
  and "-" are always addition and subtraction.
 
  For an example of why such assumptions are a bad idea, consider matrix
  multiplication. Normally, a * b = b * a, but this isn't true for
  multiplication of matrices. Similarly, if someone defined a + operator
  as an alias for string concatenation (||), we'd be totally wrong to
  assume we could use that for doing range-offset windowing.
 
  So. Yeah. Operator classes required, unless we're going to change the
  rules and make certain operator names special in PostgreSQL, so that
  if you implement them they *must* have certain properties. This seems
  like a pretty poor reason to add such a big change.
 
  I hope this explanation (a) is actually correct and (b) is helpful.
 
  ian link
  Friday, June 21, 2013 12:30 PM

  Forgive my ignorance, but I don't entirely understand the problem. What
 does '+' and '-' refer to exactly?
  Thanks!
 
 
 
  Hitoshi Harada
  Friday, June 21, 2013 4:35 AM
 
 
 

 On 06/22/2013 03:30 AM, ian link wrote:
  Forgive my ignorance, but I don't entirely understand the problem. What
  does '+' and '-' refer to exactly?

 Consider "RANGE 4.5 PRECEDING".

 You need to be able to test whether, for the current row 'b', any given
 row 'a' is within the range (b - 4.5) < a <= b . Not 100% sure about the
 < vs <= boundaries, but that's irrelevant for the example.

 To test that, you have to be able to do two things: you have to be able
 to test whether one value is greater than another, and you have to be
 able to add or subtract a constant from one of the values.

 Right now, the b-tree 

Re: [HACKERS] Eliminating PD_ALL_VISIBLE, take 2

2013-06-30 Thread Josh Berkus

 I thought that Jeff withdrew this patch.
 

He did, but nobody removed it from the commitfest --- partly because of
a change of subject line breaking the thread.

Bounced to "returned with feedback" now.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-30 Thread Greg Smith

On 6/30/13 9:28 PM, Jon Nelson wrote:

The performance of the latter (new) test sometimes seems to perform
worse and sometimes seems to perform better (usually worse) than
either of the other two. In all cases, posix_fallocate performs
better, but I don't have a sufficiently old kernel to test with.


This updated test program looks reliable now.  The numbers are very 
tight when I'd expect them to be, and there's nowhere with the huge 
differences I saw in the earlier test program.


Here's results from a few sets of popular older platforms:

RHEL5, ext3

method: classic. 10 open/close iterations, 10 rewrite in 22.6949s
method: posix_fallocate. 10 open/close iterations, 10 rewrite in 23.0113s
method: glibc emulation. 10 open/close iterations, 10 rewrite in 22.4921s

method: classic. 10 open/close iterations, 10 rewrite in 23.2808s
method: posix_fallocate. 10 open/close iterations, 10 rewrite in 22.4736s
method: glibc emulation. 10 open/close iterations, 10 rewrite in 23.9871s

method: classic. 10 open/close iterations, 10 rewrite in 22.4812s
method: posix_fallocate. 10 open/close iterations, 10 rewrite in 22.2393s
method: glibc emulation. 10 open/close iterations, 10 rewrite in 23.6940s

RHEL6, ext4

method: classic. 10 open/close iterations, 10 rewrite in 56.0483s
method: posix_fallocate. 10 open/close iterations, 10 rewrite in 61.5092s
method: glibc emulation. 10 open/close iterations, 10 rewrite in 53.8569s

method: classic. 10 open/close iterations, 10 rewrite in 57.0361s
method: posix_fallocate. 10 open/close iterations, 10 rewrite in 55.9840s
method: glibc emulation. 10 open/close iterations, 10 rewrite in 64.9437s


RHEL6, ext3

method: classic. 10 open/close iterations, 10 rewrite in 14.4080s
method: posix_fallocate. 10 open/close iterations, 10 rewrite in 16.1395s
method: glibc emulation. 10 open/close iterations, 10 rewrite in 16.9657s

method: classic. 10 open/close iterations, 10 rewrite in 15.2825s
method: posix_fallocate. 10 open/close iterations, 10 rewrite in 16.5315s
method: glibc emulation. 10 open/close iterations, 10 rewrite in 14.8115s

The win for posix_fallocate is there in most cases, but it's pretty hard 
to see in these older systems.  That could be OK.  As long as the 
difference is no more than noise, and that is the case, this could be 
good enough to commit.  If there are significantly better results on the 
new platforms, the old ones need to just not get worse.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-30 Thread Josh Berkus
On 06/30/2013 08:54 PM, ian link wrote:
 I found some time and I think I am up to speed now. I finally figured out
 how to add new operator strategies and made a little test operator for
 myself.
 
 It seems pretty clear that assuming '+' and '-' are addition and
 subtraction is a bad idea. I don't think it would be too tricky to add
 support for new operator strategies. Andrew Gierth suggested calling these
 new strategies offset - and offset +, which I think describes it pretty
 well. I assigned the operator itself to be @+ and @- but that can
 obviously be changed. If this sounds like a good path to you guys, I will
 go ahead and implement the operators for the appropriate types. Please let
 me know if I am misunderstanding something - I am still figuring stuff out
 :)
 
 Aside from the opclass stuff, there were some other important issues
 mentioned with the original RANGE support. I think I will address those
 after the opclass stuff is done.

Are these things you plan to get done this week, or for next CommitFest?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] big test separation POC

2013-06-30 Thread Fabien COELHO



Note about the POC patch limitations/questions:

 - is deriving a schedule with a piece of shell okay?
   or should perl/python/whatever scripting be better?


I would think all we need are the results, i.e. the schedule files, plus 
some Makefile entries for them.


You can replicate data, but maintaining a set of files consistently looks 
like a bad idea to me, because it means that you have to update all 
replicated data for all changes. The current status is that there are two 
files, parallel  sequential, so it is not too bad. With big tests that 
could be 4, so it seems reasonnable to have at least some automatic 
derivation.



 - I'm really not sure about VPATH stuff.


This should be totally transparent to VPATH builds.


Sure:-) It means that I have not tested that, so it may or may not work.


 - I do not understand why the makefile specifies $(srcdir) before
   local files in some places.


For VPATH builds :-)


Hmmm. That is not what I call transparent:-) So I understand that 
derived files should not have them, because they would be put in the build 
tree instead of the source tree.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] big test separation POC

2013-06-30 Thread Fabien COELHO



 - I do not understand why the makefile specifies $(srcdir) before
   local files in some places.


For VPATH builds :-)


Here is a v2 which is more likely to work under VPATH.

--
Fabien.diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index 7309b00..5a6d0f9 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -86,7 +86,7 @@ regress_data_files = \
 	$(wildcard $(srcdir)/output/*.source) \
 	$(filter-out $(addprefix $(srcdir)/,$(input_files)),$(wildcard $(srcdir)/sql/*.sql)) \
 	$(wildcard $(srcdir)/data/*.data) \
-	$(srcdir)/parallel_schedule $(srcdir)/serial_schedule $(srcdir)/resultmap
+	$(srcdir)/parallel_schedule $(srcdir)/big_schedule $(srcdir)/resultmap
 
 install-tests: all install install-lib installdirs-tests
 	$(MAKE) -C $(top_builddir)/contrib/spi install
@@ -132,13 +132,33 @@ tablespace-setup:
 ## Run tests
 ##
 
+# derive schedules
+derived_schedules = serial_schedule parallel_big_schedule serial_big_schedule
+
+serial_schedule: parallel_schedule
+	echo '# this file is automatically generated, do not edit!' > $@
+	egrep '^(test|ignore):' $< | \
+	while read op list ; do \
+	  for test in $$list ; do \
+	    echo $$op $$test ; \
+	  done ; \
+	done >> $@
+
+parallel_big_schedule: parallel_schedule big_schedule
+	echo '# this file is automatically generated, do not edit!' > $@
+	cat $^ >> $@
+
+serial_big_schedule: serial_schedule big_schedule
+	echo '# this file is automatically generated, do not edit!' > $@
+	cat $^ >> $@
+
 REGRESS_OPTS = --dlpath=. $(EXTRA_REGRESS_OPTS)
 
 check: all tablespace-setup
 	$(pg_regress_check) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(TEMP_CONF) $(EXTRA_TESTS)
 
 installcheck: all tablespace-setup
-	$(pg_regress_installcheck) $(REGRESS_OPTS) --schedule=$(srcdir)/serial_schedule $(EXTRA_TESTS)
+	$(pg_regress_installcheck) $(REGRESS_OPTS) --schedule=./serial_schedule $(EXTRA_TESTS)
 
 installcheck-parallel: all tablespace-setup
 	$(pg_regress_installcheck) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(EXTRA_TESTS)
@@ -152,11 +172,11 @@ runcheck: check
 runtest: installcheck
 runtest-parallel: installcheck-parallel
 
-bigtest: all tablespace-setup
-	$(pg_regress_installcheck) $(REGRESS_OPTS) --schedule=$(srcdir)/serial_schedule numeric_big
+bigtest: all tablespace-setup serial_big_schedule
+	$(pg_regress_installcheck) $(REGRESS_OPTS) --schedule=./serial_big_schedule
 
-bigcheck: all tablespace-setup
-	$(pg_regress_check) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) numeric_big
+bigcheck: all tablespace-setup parallel_big_schedule
+	$(pg_regress_check) $(REGRESS_OPTS) --schedule=./parallel_big_schedule $(MAXCONNOPT)
 
 
 ##
@@ -166,7 +186,7 @@ bigcheck: all tablespace-setup
 clean distclean maintainer-clean: clean-lib
 # things built by `all' target
 	rm -f $(OBJS) refint$(DLSUFFIX) autoinc$(DLSUFFIX) dummy_seclabel$(DLSUFFIX)
-	rm -f pg_regress_main.o pg_regress.o pg_regress$(X)
+	rm -f pg_regress_main.o pg_regress.o pg_regress$(X) $(derived_schedules)
 # things created by various check targets
 	rm -f $(output_files) $(input_files)
 	rm -rf testtablespace
diff --git a/src/test/regress/big_schedule b/src/test/regress/big_schedule
new file mode 100644
index 000..4058499
--- /dev/null
+++ b/src/test/regress/big_schedule
@@ -0,0 +1,3 @@
+# these are big tests not run by default
+# these test are expected serial, only put one test per line
+test: numeric_big
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
deleted file mode 100644
index d6eaa7a..000
--- a/src/test/regress/serial_schedule
+++ /dev/null
@@ -1,139 +0,0 @@
-# src/test/regress/serial_schedule
-# This should probably be in an order similar to parallel_schedule.
-test: tablespace
-test: boolean
-test: char
-test: name
-test: varchar
-test: text
-test: int2
-test: int4
-test: int8
-test: oid
-test: float4
-test: float8
-test: bit
-test: numeric
-test: txid
-test: uuid
-test: enum
-test: money
-test: rangetypes
-test: strings
-test: numerology
-test: point
-test: lseg
-test: box
-test: path
-test: polygon
-test: circle
-test: date
-test: time
-test: timetz
-test: timestamp
-test: timestamptz
-test: interval
-test: abstime
-test: reltime
-test: tinterval
-test: inet
-test: macaddr
-test: tstypes
-test: comments
-test: geometry
-test: horology
-test: regex
-test: oidjoins
-test: type_sanity
-test: opr_sanity
-test: insert
-test: create_function_1
-test: create_type
-test: create_table
-test: create_function_2
-test: copy
-test: copyselect
-test: create_misc
-test: create_operator
-test: create_index
-test: create_view
-test: create_aggregate
-test: create_function_3
-test: create_cast
-test: constraints
-test: triggers
-test: inherit
-test: create_table_like
-test: typed_table
-test: vacuum
-test: drop_if_exists
-test: updatable_views
-test: sanity_check
-test: errors
-test: select
-test: select_into
-test: select_distinct
-test: select_distinct_on

Re: [HACKERS] plpython implementation

2013-06-30 Thread james

On 01/07/2013 02:43, Claudio Freire wrote:

In essence, you'd have to use another implementation. CPython guys
have left it very clear they don't intend to fix that, as they don't
consider it a bug. It's just how it is.

Given how useful it is to have a scripting language that can be used outside
of the database as well as inside it, would it be reasonable to consider
'promoting' pllua?

My understanding is that it (lua) is much cleaner under the hood (than 
CPython).

Although I do recognise that Python as a whole has always had more traction.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpython implementation

2013-06-30 Thread Claudio Freire
On Mon, Jul 1, 2013 at 2:29 AM, james ja...@mansionfamily.plus.com wrote:
 On 01/07/2013 02:43, Claudio Freire wrote:

 In essence, you'd have to use another implementation. CPython guys
 have left it very clear they don't intend to fix that, as they don't
 consider it a bug. It's just how it is.

 Given how useful it is to have a scripting language that can be used outside
 of the database as well as inside it, would it be reasonable to consider
 'promoting' pllua?

 My understanding is that it (lua) is much cleaner under the hood (than
 CPython).
 Although I do recognise that Python as a whole has always had more traction.

Well, that, or you can use another implementation. There are many, and
PyPy should be seriously considered given its JIT and how much faster
it is for raw computation power, which is what a DB is most likely
going to care about. I bet PyPy's sandboxing is a lot better as well.

Making a postgres-interfacing PyPy fork I guess would be a nice
project, it's as simple as implementing all of plpy's API in RPython
and translating a C module out of it.

No, I'm not volunteering ;-)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers