Re: dsa_allocate() failure

2018-05-22 Thread Sand Stone
>>dsa_allocate could not find 7 free pages
I just saw this error message again on all of my worker nodes (I am using
Citus 7.4 rel). The PG core is my own build of release_10_stable
(10.4) from GitHub on Ubuntu.

What's the best way to debug this? I am running pre-production tests
for the next few days, so I could gather info if necessary (I cannot
pinpoint a query to reproduce this yet, as we have 10K queries running
concurrently).




On Mon, Jan 29, 2018 at 1:35 PM, Rick Otten  wrote:
> If I do a "set max_parallel_workers_per_gather=0;" before I run the query in
> that session, it runs just fine.
> If I set it to 2, the query dies with the dsa_allocate error.
>
> I'll use that as a work around until 10.2 comes out.  Thanks!  I have
> something that will help.
>
>
> On Mon, Jan 29, 2018 at 3:52 PM, Thomas Munro
>  wrote:
>>
>> On Tue, Jan 30, 2018 at 5:37 AM, Tom Lane  wrote:
>> > Rick Otten  writes:
>> >> I'm wondering if there is anything I can tune in my PG 10.1 database to
>> >> avoid these errors:
>> >
>> >> $  psql -f failing_query.sql
>> >> psql:failing_query.sql:46: ERROR:  dsa_allocate could not find 7 free
>> >> pages
>> >> CONTEXT:  parallel worker
>> >
>> > Hmm.  There's only one place in the source code that emits that message
>> > text:
>> >
>> > /*
>> >  * Ask the free page manager for a run of pages.  This should
>> > always
>> >  * succeed, since both get_best_segment and make_new_segment
>> > should
>> >  * only return a non-NULL pointer if it actually contains enough
>> >  * contiguous freespace.  If it does fail, something in our
>> > backend
>> >  * private state is out of whack, so use FATAL to kill the
>> > process.
>> >  */
>> > if (!FreePageManagerGet(segment_map->fpm, npages, _page))
>> > elog(FATAL,
>> >  "dsa_allocate could not find %zu free pages", npages);
>> >
>> > Now maybe that comment is being unreasonably optimistic, but it sure
>> > appears that this is supposed to be a can't-happen case, in which case
>> > you've found a bug.
>>
>> This is probably the bug fixed here:
>>
>>
>> https://www.postgresql.org/message-id/E1eQzIl-0004wM-K3%40gemulon.postgresql.org
>>
>> That was back patched, so 10.2 will contain the fix.  The bug was not
>> in dsa.c itself, but in the parallel query code that mixed up DSA
>> areas, corrupting them.  The problem comes up when the query plan has
>> multiple Gather nodes (and a particular execution pattern) -- is that
>> the case here, in the EXPLAIN output?  That seems plausible given the
>> description of a 50-branch UNION.  The only workaround until 10.2
>> would be to reduce max_parallel_workers_per_gather to 0 to prevent
>> parallelism completely for this query.
>>
>> --
>> Thomas Munro
>> http://www.enterprisedb.com
>
>
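For anyone hitting this before upgrading to the fixed point release, the workaround discussed above can be sketched as a session-level setting (a `SET LOCAL` inside a transaction would scope it more tightly; 2 is the stock default being restored here):

```sql
-- Stopgap until the fix in 10.2 is installed: disable parallel query
-- for this session only, so no Gather nodes are planned.
SET max_parallel_workers_per_gather = 0;

-- ... run the previously failing query here ...

-- Restore the default behaviour afterwards.
RESET max_parallel_workers_per_gather;
```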



Re: Help me in reducing the CPU cost for the high cost query below, as it is hitting production seriously!!

2018-05-22 Thread Pavan Teja
Thanks a lot!  I will have a look

On Tue, May 22, 2018, 11:53 PM Justin Pryzby  wrote:

> On Tue, May 22, 2018 at 03:51:44AM -0700, pavan95 wrote:
> > Please find the output of explain(analyze,buffers) for the whole query in
> > the below link.
>
> > Seq Scan on res_users users (cost=750.92..1,836.69 rows=249 width=15)
> (actual time=3.962..17.544 rows=67 loops=1)
>
> Not sure but would you try creating an index on:
> res_users.res_employee_id
>
> > Seq Scan on res_users user1 (cost=0.00..58.03 rows=1,303 width=15)
> (actual time=0.002..0.002 rows=1 loops=1)
>
> Also the planner's estimate for table:res_users is off by 1300x..so you
> should
> probably vacuum analyze it then recheck.  I don't think we know what
> version
> postgres you have, but last week's patch releases include a fix which may
> be
> relevant (reltuples including dead tuples).
>
> Also I don't know the definition of this table or its indices:
> tms_workflow_history
>
> ..but it looks like an additional or modified index or maybe clustering the
> table on existing index might help (active? is_final_approver?)
> Or maybe this should be 3 separate indices rather than composite index?
> Perhaps some of those could be BRIN indices, depending on postgres version
>
> Justin
>


Re: Help me in reducing the CPU cost for the high cost query below, as it is hitting production seriously!!

2018-05-22 Thread Justin Pryzby
On Tue, May 22, 2018 at 03:51:44AM -0700, pavan95 wrote:
> Please find the output of explain(analyze,buffers) for the whole query in
> the below link.

> Seq Scan on res_users users (cost=750.92..1,836.69 rows=249 width=15) (actual 
> time=3.962..17.544 rows=67 loops=1) 

Not sure but would you try creating an index on:
res_users.res_employee_id

> Seq Scan on res_users user1 (cost=0.00..58.03 rows=1,303 width=15) (actual 
> time=0.002..0.002 rows=1 loops=1)

Also, the planner's estimate for table res_users is off by ~1300x, so you
should probably VACUUM ANALYZE it and then recheck.  I don't think we know
what version of Postgres you have, but last week's patch releases include a
fix which may be relevant (reltuples including dead tuples).

Also I don't know the definition of this table or its indices:
tms_workflow_history

..but it looks like an additional or modified index, or maybe clustering the
table on an existing index, might help (active? is_final_approver?).
Or maybe this should be 3 separate indices rather than a composite index?
Perhaps some of those could be BRIN indices, depending on your Postgres version.

Justin
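A sketch of the two concrete suggestions above, for reference. The index name is made up, and whether an index on `res_employee_id` actually helps depends on the plan after statistics are refreshed:

```sql
-- Hypothetical index on the join/filter column Justin mentions:
CREATE INDEX res_users_res_employee_id_idx ON res_users (res_employee_id);

-- Refresh the planner's statistics, since the row estimate was off ~1300x:
VACUUM ANALYZE res_users;
```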



Re: Help me in reducing the CPU cost for the high cost query below, as it is hitting production seriously!!

2018-05-22 Thread pavan95
Hi Justin,

Please find the output of explain(analyze,buffers) for the whole query in
the below link.

Link:  https://explain.depesz.com/s/dNkb   

Thanks in Advance!


Regards,
Pavan



--
Sent from: 
http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html



Re: Help me in reducing the CPU cost for the high cost query below, as it is hitting production seriously!!

2018-05-22 Thread pavan95
Hi all,
Thank you so much for your valuable responses. I tried everything you
suggested for my sub-query.
I hoped for a larger cost reduction in my main query; it did decrease,
but not to a great extent.
I thought it best to provide the main query and the associated table
definitions. Please help me tune the following big query.
select res.id id,
       row_number() OVER () as sno,
       res.header_id,
       res.emp_id,
       res.alias alias,
       res.name as name,
       res.billed_hrs billed_hrs,
       res.unbilled_hrs unbilled_hrs,
       res.paid_time_off paid_time_off,
       res.unpaid_leave unpaid_leave,
       res.breavement_time breavement_time,
       res.leave leave,
       res.state,
       count(*) OVER() AS full_count,
       res.header_emp_id,
       res.header_status
  from (
    select history.id as id,
           0 as header_id,
           '0' as emp_id,
           row_number() OVER () as sno,
           user1.alias_id as alias,
           partner.name as name,
           (select SUM( floor(effort_hours) + ( (effort_hours - floor(effort_hours)) / 0.6 ))
              from tms_timesheet_details
             where timesheet_header_id=header.id
               and work_order_no != 'CORPORATE') billed_hrs,
           (select SUM( floor(effort_hours) + ( (effort_hours - floor(effort_hours)) / 0.6 ))
              from tms_timesheet_details
             where timesheet_header_id=header.id
               and release_no = 'unbillable_time') as unbilled_hrs,
           (select SUM( floor(effort_hours) + ( (effort_hours - floor(effort_hours)) / 0.6 ))
              from tms_timesheet_details
             where timesheet_header_id=header.id
               and release_no = 'paid_time_off') as paid_time_off,
           (select SUM( floor(effort_hours) + ( (effort_hours - floor(effort_hours)) / 0.6 ))
              from tms_timesheet_details
             where timesheet_header_id=header.id
               and release_no = 'unpaid_leave') as unpaid_leave,
           (select SUM( floor(effort_hours) + ( (effort_hours - floor(effort_hours)) / 0.6 ))
              from tms_timesheet_details
             where timesheet_header_id=header.id
               and release_no = 'bereavement_time') as breavement_time,
           (select SUM( floor(effort_hours) + ( (effort_hours - floor(effort_hours)) / 0.6 ))
              from tms_timesheet_details
             where timesheet_header_id=header.id
               and date >= '2018-04-16' and date <= '2018-04-30'
               and release_no in ('sick_leave','casual_leave','privilege_leave','optional_holiday')) as leave,
           (case when tl_status.state = '' then 'Waiting for approval'
                 else tl_status.state end) as state,
           header.res_employee_id as header_emp_id,
           status.name as header_status
      from tms_workflow_history history,
           res_users users,
           res_users user1,
           res_partner partner,
           tms_timesheet_status status,
           tms_timesheet_header header
           left join tms_workflow_history tl_status
                  on tl_status.timesheet_id=header.id
                 and tl_status.active=True
                 and tl_status.group_id=13
     where history.timesheet_id=header.id
       and header.res_employee_id=user1.res_employee_id
       and status.id=header.status_id
       and history.user_id=users.id
       and user1.partner_id=partner.id
       and header.timesheet_period_id = 127
       and (history.state = 'Approved')
       and history.current_activity='N'
       and history.is_final_approver=True
       and history.active = True
    union
    select 0 as id,
           header.id as header_id,
           '0' as emp_id,
           0 as sno,
           users.alias_id as alias,
           partner.name as name,
           (select SUM( floor(effort_hours) + ( (effort_hours - floor(effort_hours)) / 0.6 ))
              from tms_timesheet_details
             where work_order_no != 'CORPORATE'
               and res_employee_id=users.res_employee_id
               and date in (select date::date from
                            generate_series((select start_date from tms_timesheet_period where id=127),(select
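[message truncated by the archive]

One direction worth trying for the six correlated subqueries above: they all scan tms_timesheet_details per header row, which a single grouped scan with conditional aggregation (the `FILTER` clause, PostgreSQL 9.4+) could replace, joined back on `timesheet_header_id`. A hedged sketch only, not a drop-in replacement -- three of the buckets are shown, and the date-ranged `leave` bucket is omitted for brevity:

```sql
-- One pass over the details table, bucketing hours by condition.
SELECT timesheet_header_id,
       SUM(floor(effort_hours) + (effort_hours - floor(effort_hours)) / 0.6)
           FILTER (WHERE work_order_no != 'CORPORATE')   AS billed_hrs,
       SUM(floor(effort_hours) + (effort_hours - floor(effort_hours)) / 0.6)
           FILTER (WHERE release_no = 'unbillable_time') AS unbilled_hrs,
       SUM(floor(effort_hours) + (effort_hours - floor(effort_hours)) / 0.6)
           FILTER (WHERE release_no = 'paid_time_off')   AS paid_time_off
  FROM tms_timesheet_details
 GROUP BY timesheet_header_id;
```

The result set could then be joined to tms_timesheet_header once, instead of re-scanning the details table six times per output row.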
