[HACKERS] Performance issue with postgres9.6

2017-04-07 Thread Prakash Itnal
Hello,

We currently use postgres 9.3 in our products. Recently we upgraded to
postgres 9.6, but with 9.6 we have seen a drastic reduction in throughput.
After analyzing carefully I found that the "planning time" in 9.6 is very high.
Below are the details:

Scenario:
1. Create a table with 10 rows.
2. Execute a simple query: select * from subscriber where s_id = 100;
3. No updates/deletes/inserts; tried VACUUM and VACUUM FULL; auto-vacuum is
enabled by default.

9.3: Avg of "Total runtime": *0.24 ms* [actual throughput: *650 TPS*]
9.6: Avg of total time: *0.56 ms* (Avg of "Planning time": 0.38 ms + Avg of
"Execution time": 0.18 ms) [actual throughput: *80 TPS*]

Check the attachments for more details.

Below is the configuration setting. Full configuration can be found in
attachment.
shared_buffers = 128MB
effective_cache_size = 256MB

Note that we use a master-slave setup (one master, one slave). I could see
no difference even when I took the slave out.

I tried all possibilities: increasing shared memory, maintenance_work_mem,
asynchronous commit, etc., but nothing showed any major improvement. Kindly
help to identify what is missing!
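
For reference, planning cost can be taken out of the per-execution path with
a prepared statement; a minimal sketch against the same table (statement
name is illustrative):

    -- Planned once; after a few executions 9.6 reuses a cached generic
    -- plan, so the ~0.38 ms planning cost should disappear per call.
    PREPARE get_subscriber (integer) AS
        SELECT * FROM subscriber WHERE s_id = $1;
    EXECUTE get_subscriber(100);

If throughput recovers with EXECUTE, the regression is confined to planning
rather than execution.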

PS: We use postgres at a small scale, so the values are low. The size of the
DB is also just around 180 MB.

-- 
Cheers,
Prakash
psql (9.3.14)
Type "help" for help.

perftestdb=# select count(*) from subscriber ;
 count 
-------
    10
(1 row)

perftestdb=# \d subscriber
Table "public.subscriber"
Column| Type  | Modifiers 
--+---+---
 s_id | integer   | not null
 sub_nbr  | character varying(15) | not null
 bit_1| smallint  | 
 bit_2| smallint  | 
 bit_3| smallint  | 
 bit_4| smallint  | 
 bit_5| smallint  | 
 bit_6| smallint  | 
 bit_7| smallint  | 
 bit_8| smallint  | 
 bit_9| smallint  | 
 bit_10   | smallint  | 
 hex_1| smallint  | 
 hex_2| smallint  | 
 hex_3| smallint  | 
 hex_4| smallint  | 
 hex_5| smallint  | 
 hex_6| smallint  | 
 hex_7| smallint  | 
 hex_8| smallint  | 
 hex_9| smallint  | 
 hex_10   | smallint  | 
 byte2_1  | smallint  | 
 byte2_2  | smallint  | 
 byte2_3  | smallint  | 
 byte2_4  | smallint  | 
 byte2_5  | smallint  | 
 byte2_6  | smallint  | 
 byte2_7  | smallint  | 
 byte2_8  | smallint  | 
 byte2_9  | smallint  | 
 byte2_10 | smallint  | 
 msc_location | integer   | 
 vlr_location | integer   | 
Indexes:
"subscriber_pkey" PRIMARY KEY, btree (s_id)
"subscriber_by_sub_nbr" UNIQUE, btree (sub_nbr)
Referenced by:
TABLE "access_info" CONSTRAINT "access_info_s_id_fkey" FOREIGN KEY (s_id) 
REFERENCES subscriber(s_id)
TABLE "special_facility" CONSTRAINT "special_facility_s_id_fkey" FOREIGN 
KEY (s_id) REFERENCES subscriber(s_id)

perftestdb=# explain analyze select * from subscriber where s_id = 100;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Index Scan using subscriber_pkey on subscriber  (cost=0.29..8.31 rows=1 width=88) (actual time=0.049..0.055 rows=1 loops=1)
   Index Cond: (s_id = 100)
 Total runtime: 0.231 ms
(3 rows)

perftestdb=# explain analyze select * from subscriber where s_id = 100;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Index Scan using subscriber_pkey on subscriber  (cost=0.29..8.31 rows=1 width=88) (actual time=0.059..0.066 rows=1 loops=1)
   Index Cond: (s_id = 100)
 Total runtime: 0.246 ms
(3 rows)

perftestdb=# explain analyze select * from subscriber where s_id = 100;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Index Scan using subscriber_pkey on subscriber  (cost=0.29..8.31 rows=1 width=88) (actual time=0.059..0.066 rows=1 loops=1)
   Index Cond: (s_id = 100)
 Total runtime: 0.249 ms
(3 rows)

Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-22 Thread Prakash Itnal
Hi Tom/Alvaro,

Kindly let us know whether the correction provided in the previous mail is
fine! The current code anyway handles scenario-1, whereas it is still
vulnerable to scenario-2.

From previous mail:
*Scenario-1:* current_time (2015) -> changed_to_past (1995) ->
stays-here-for-half-day -> corrected to current_time (2015)
*Scenario-2:* current_time (2015) -> changed_to_future (2020) ->
stays-here-for-half-day -> corrected to current_time (2015)

We are waiting for your response.

On Sun, Jun 21, 2015 at 2:56 PM, Prakash Itnal prakash...@gmail.com wrote:

 Hi,

 To my understanding it will probably not open the door to worse situations!
 Please correct me if my understanding below is wrong.

 The latch will wake up in the three situations below:
 a) socket error (=> result is set to a negative number)
 b) timeout (=> result is set to TIMEOUT)
 c) some event arrived on the socket (=> result is set to a non-zero value
 if the caller registered for the arrived events; otherwise no value is set)

 Given the above conditions, the result can be zero only if an unregistered
 event breaks the latch (*). In that case, the current implementation
 recomputes the remaining sleep time. This calculation makes the situation
 worse if time goes backwards.

 The time difference between cur_time (current time) and start_time (time
 when the latch started) should always be positive, because cur_time is
 always greater than start_time under all normal conditions.

 delta_timeout = cur_time - start_time;

 The difference can be negative only if time shifts to the past, so such a
 shift can be detected. And if it can be detected, can it be corrected? I
 think we can correct it and prevent long sleeps due to time shifts.

 Currently I treat it as TIMEOUT, though conceptually it is not. The ideal
 solution would be to leave this decision to the caller of WaitLatch().
 With my little knowledge of the postgres code, I think TIMEOUT would be
 fine!


 (*) The above description holds only for a timed wait. If the latch is
 started with a blocking wait (no timeout), the above logic does not apply.

 On Sat, Jun 20, 2015 at 10:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Prakash Itnal prakash...@gmail.com writes:
  Sorry for the late response. The current patch only fixes scenario-1
  listed below. It will not address scenario-2. Also, we need a fix in
  unix_latch.c where the remaining sleep time is evaluated when the latch
  is woken by other events (or result=0). Here too it is possible the latch
  might go into a long sleep if time shifts to a past time.

 Forcing WL_TIMEOUT if the clock goes backwards seems like quite a bad
 idea to me.  That seems like a great way to make a bad situation worse,
 ie it induces failures where there were none before.

 regards, tom lane




 --
 Cheers,
 Prakash




-- 
Cheers,
Prakash


Re: [HACKERS] Is Postgres database server works fine if there is a change in system time?

2015-06-21 Thread Prakash Itnal
Hi,

The more detailed discussion is in another mail chain, with the subject
line "Auto-vacuum is not running in 9.1.12". I agree there is no issue if
time shifts by a small amount. The issue is serious only for big time
shifts.

In our product the database is deployed on remote network elements, and
there will be thousands of such network elements. All these network
elements are synced with one or more centralized NTP servers. If, due to
some issue, NTP shifts back in time by mistake and then corrects itself,
most of our network elements go for a toss. Since we use limited disk space
and our DB model is write-heavy (UPDATE SQLs), the disk starts getting full
and soon all services get interrupted. Restarting the service is costly
from the customer's point of view, so our suggestion is to prevent a
restart in such situations.

With my limited understanding and analysis I have shared the patch in the
other mail chain. Please check and share your inputs.



For quick reference please find the scenarios below:



On Mon, Jun 22, 2015 at 8:02 AM, Craig Ringer cr...@2ndquadrant.com wrote:

 On 17 June 2015 at 15:36, Prakash Itnal prakash...@gmail.com wrote:
  Hi,
 
  We have observed that certain postgres child processes, e.g. the
  autovacuum worker, are not working as expected if there is a system time
  change. So I wanted to know whether postgres already supports system time
  changes.

 In general, small time slews will have no effect on operations. If
 they do then that's likely a PostgreSQL bug.

 A large time discontinuity, where time jumps backwards or forwards by
 a large amount, could well cause issues. I would personally be
 inclined to restart the database server in these cases, or kill the
 autovacuum scheduler to force it to restart.

 In what way are the autovacuum workers not working as expected after
 a system time change? What system time change was made? How was the
 change made? On what operating system and version?

 BTW, this seems to be posted on Stack Overflow too:

 http://stackoverflow.com/questions/30839163/system-time-change-impact-to-postgresql-database-processes


 --
  Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




-- 
Cheers,
Prakash


Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-21 Thread Prakash Itnal
Hi,

To my understanding it will probably not open the door to worse situations!
Please correct me if my understanding below is wrong.

The latch will wake up in the three situations below:
a) socket error (=> result is set to a negative number)
b) timeout (=> result is set to TIMEOUT)
c) some event arrived on the socket (=> result is set to a non-zero value
if the caller registered for the arrived events; otherwise no value is set)

Given the above conditions, the result can be zero only if an unregistered
event breaks the latch (*). In that case, the current implementation
recomputes the remaining sleep time. This calculation makes the situation
worse if time goes backwards.

The time difference between cur_time (current time) and start_time (time
when the latch started) should always be positive, because cur_time is
always greater than start_time under all normal conditions.

delta_timeout = cur_time - start_time;

The difference can be negative only if time shifts to the past, so such a
shift can be detected. And if it can be detected, can it be corrected? I
think we can correct it and prevent long sleeps due to time shifts.

Currently I treat it as TIMEOUT, though conceptually it is not. The ideal
solution would be to leave this decision to the caller of WaitLatch().
With my little knowledge of the postgres code, I think TIMEOUT would be
fine!


(*) The above description holds only for a timed wait. If the latch is
started with a blocking wait (no timeout), the above logic does not apply.

On Sat, Jun 20, 2015 at 10:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Prakash Itnal prakash...@gmail.com writes:
  Sorry for the late response. The current patch only fixes scenario-1
  listed below. It will not address scenario-2. Also, we need a fix in
  unix_latch.c where the remaining sleep time is evaluated when the latch
  is woken by other events (or result=0). Here too it is possible the latch
  might go into a long sleep if time shifts to a past time.

 Forcing WL_TIMEOUT if the clock goes backwards seems like quite a bad
 idea to me.  That seems like a great way to make a bad situation worse,
 ie it induces failures where there were none before.

 regards, tom lane




-- 
Cheers,
Prakash


Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-20 Thread Prakash Itnal
Hi,

Sorry for the late response. The current patch only fixes scenario-1 listed
below. It will not address scenario-2. Also, we need a fix in unix_latch.c
where the remaining sleep time is evaluated when the latch is woken by
other events (or result=0). Here too it is possible the latch might go into
a long sleep if time shifts to a past time.

*Scenario-1:* current_time (2015) -> changed_to_past (1995) ->
stays-here-for-half-day -> corrected to current_time (2015)
*Scenario-2:* current_time (2015) -> changed_to_future (2020) ->
stays-here-for-half-day -> corrected to current_time (2015)

*Results:*
Scenario-1: Auto-vacuuming is not done from the time the system time changed
to 1995 until it is corrected back to the current time; in the current
context, half a day.
Scenario-2: Auto-vacuuming keeps running while the system time is shifted to
the future. However, after correcting the time back (from 2020 to 2015),
auto-vacuuming goes into a 5-year sleep. Although the current patch fixes
waking up from the sleep, it will not allow launching an autovacuum worker,
as the dblist still holds the previously set time, i.e. 2020.

*Proposed Fixes:*
*autovacuum.c:* I call rebuild_database_list() if a time shift is detected.
A time shift is detected if the evaluated sleep time is zero or greater than
autovacuum_naptime. Currently the list is rebuilt only if time shifts to the
future; I added a check to rebuild it also when the sleep time is greater
than autovacuum_naptime. Secondly, I included the patch from Alvaro and
changed the default 300-second value to autovacuum_naptime. This avoids
multiple wakeups when autovacuum_naptime is set greater than 300 seconds.

*unix_latch.c:* The current implementation evaluates the remaining sleep
time as cur_timeout = timeout - (cur_time - start_time). If the time is
shifted back to the past, cur_timeout evaluates to a very long time (e.g.
with start_time=2015 and cur_time=1995, cur_timeout = timeout - (-20 years)
= timeout + 20 years). To avoid this wrong calculation I added a check and
treat it as a timeout.

With the above fixes, auto-vacuuming will be robust enough to handle any
system time change. We tested the scenarios in our setup and they seem to
work fine. I hope these are valid fixes and that they do not affect any
other flows.

Please review and share your review comments/suggestions.

PS: In our product the database is used in update-heavy mode with limited
disk space, so we need to handle such time changes robustly to avoid system
failures due to a full disk.

On Fri, Jun 19, 2015 at 10:28 PM, Andres Freund and...@anarazel.de wrote:

 On 2015-06-17 18:10:42 -0300, Alvaro Herrera wrote:
  Yeah, the case is pretty weird and I'm not really sure that the server
  ought to be expected to behave.  But if this is actually the only part
  of the server that misbehaves because of sudden gigantic time jumps, I
  think it's fair to patch it.  Here's a proposed patch.

 We probably should go through the server and look at the various sleeps
 and make sure they all have an upper limit. I doubt this is the only
 location without one.

 Greetings,

 Andres Freund




-- 
Cheers,
Prakash


time_shift_fixes_in_autovacuum.patch
Description: Binary data



[HACKERS] Is Postgres database server works fine if there is a change in system time?

2015-06-17 Thread Prakash Itnal
Hi,

We have observed that certain postgres child processes, e.g. the autovacuum
worker, are not working as expected if there is a system time change. So I
wanted to know whether postgres already supports system time changes.

Please confirm whether postgres already handles system time changes.

-- 
Cheers,
Prakash


Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-16 Thread Prakash Itnal
Hi,

Currently the issue is easily reproducible. Steps to reproduce:
* Set some aggressive values for auto-vacuuming.
* Run heavy database update/delete/insert queries. This leads to
auto-vacuum being invoked in quick succession.
* Change the system time to an older date, e.g. 1995-01-01.

Suddenly auto-vacuuming stops working. Even after changing the system time
back to the current time, auto-vacuuming did not resume.

So the question is: does postgres support system time changes?
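
For reference, a minimal sketch of the kind of setup behind the steps above
(table name and values are hypothetical; the real workload is
product-specific):

    -- Aggressive per-table autovacuum settings plus a churn-heavy table,
    -- so autovacuum fires in quick succession.
    CREATE TABLE av_test (id int PRIMARY KEY, payload text);
    INSERT INTO av_test SELECT g, repeat('x', 100)
    FROM generate_series(1, 100000) g;
    ALTER TABLE av_test SET (autovacuum_vacuum_threshold = 50,
                             autovacuum_vacuum_scale_factor = 0.01);
    -- Repeat to keep generating dead tuples:
    UPDATE av_test SET payload = payload || 'y' WHERE id % 10 = 0;

After the clock is set back, pg_stat_user_tables.last_autovacuum stops
advancing even though dead tuples keep accumulating.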


On Tue, Jun 16, 2015 at 10:12 AM, Prakash Itnal prakash...@gmail.com
wrote:

 Hi,

 @Alvaro Herrera, thanks for the quick reply. I was on leave and hence not
 able to reply sooner.

 This issue was observed at a customer site. After a long discussion and
 digging into what happened around 2nd May 2015, we got to know that the
 NTP server suddenly went back in time to 1995. It remained there for some
 time until this was noticed and corrected. After correcting the NTP server
 time, the whole cluster was synced to the current date. After this change
 in time, auto-vacuum stopped. Since auto-vacuuming is triggered
 periodically, I suspect this time change has affected a timer!

 So I suspect the time change is the root cause! It would be great if
 someone could clarify whether this is the root cause for auto-vacuum
 stopping.


 On Wed, Jun 10, 2015 at 8:19 PM, Alvaro Herrera alvhe...@2ndquadrant.com
 wrote:

 Prakash Itnal wrote:
  Hello,
 
  Recently we encountered an issue where the disk space is continuously
  increasing towards 100%. A manual vacuum then freed the disk space, but
  it is increasing again. On digging more it was found that auto-vacuuming
  was not running, or it is stuck/hung.

 Hm, we have seen this on Windows, I think.

 Is the stats collector process running?  Is it stuck?

 If you attach to process 6504 (autovac launcher), what's the backtrace?

  4) Last run auto-vacuum:
  SELECT now(), schemaname, relname, last_vacuum, last_autovacuum,
 vacuum_count, autovacuum_count FROM pg_stat_user_tables;
 
                now              | schemaname | relname | last_vacuum |        last_autovacuum        | vacuum_count | autovacuum_count
  -------------------------------+------------+---------+-------------+-------------------------------+--------------+------------------
   2015-06-10 01:03:03.574212+02 | public     | abcd    |             | 2015-04-18 00:52:35.008874+02 |            0 |                2
   2015-06-10 01:03:03.574212+02 | public     | xyz     |             | 2015-05-02 06:01:35.220651+02 |            0 |               20
 
  NOTE: I changed the relname for above two tables due to confidentiality.

 Are there dead tuples in tables?  Maybe vacuums are getting executed and
 these values are not updated, for instance?

 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




 --
 Cheers,
 Prakash




-- 
Cheers,
Prakash


Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-15 Thread Prakash Itnal
Hi,

@Alvaro Herrera, thanks for the quick reply. I was on leave and hence not
able to reply sooner.

This issue was observed at a customer site. After a long discussion and
digging into what happened around 2nd May 2015, we got to know that the NTP
server suddenly went back in time to 1995. It remained there for some time
until this was noticed and corrected. After correcting the NTP server time,
the whole cluster was synced to the current date. After this change in
time, auto-vacuum stopped. Since auto-vacuuming is triggered periodically,
I suspect this time change has affected a timer!

So I suspect the time change is the root cause! It would be great if
someone could clarify whether this is the root cause for auto-vacuum
stopping.


On Wed, Jun 10, 2015 at 8:19 PM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 Prakash Itnal wrote:
  Hello,
 
  Recently we encountered an issue where the disk space is continuously
  increasing towards 100%. A manual vacuum then freed the disk space, but
  it is increasing again. On digging more it was found that auto-vacuuming
  was not running, or it is stuck/hung.

 Hm, we have seen this on Windows, I think.

 Is the stats collector process running?  Is it stuck?

 If you attach to process 6504 (autovac launcher), what's the backtrace?

  4) Last run auto-vacuum:
  SELECT now(), schemaname, relname, last_vacuum, last_autovacuum,
 vacuum_count, autovacuum_count FROM pg_stat_user_tables;
 
                now              | schemaname | relname | last_vacuum |        last_autovacuum        | vacuum_count | autovacuum_count
  -------------------------------+------------+---------+-------------+-------------------------------+--------------+------------------
   2015-06-10 01:03:03.574212+02 | public     | abcd    |             | 2015-04-18 00:52:35.008874+02 |            0 |                2
   2015-06-10 01:03:03.574212+02 | public     | xyz     |             | 2015-05-02 06:01:35.220651+02 |            0 |               20
 
  NOTE: I changed the relname for above two tables due to confidentiality.

 Are there dead tuples in tables?  Maybe vacuums are getting executed and
 these values are not updated, for instance?

 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




-- 
Cheers,
Prakash


[HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-10 Thread Prakash Itnal
Hello,

Recently we encountered an issue where the disk space is continuously
increasing towards 100%. A manual vacuum then freed the disk space, but it
is increasing again. On digging more it was found that auto-vacuuming was
not running, or it is stuck/hung.

Version: 9.1.12
Auto-vacuum is enabled: see the configuration details in the attached file.
The auto-vacuum daemon is running.
The stats show that auto-vacuum last ran more than a month back.
There are no error logs from the database.

The attached file has all these details. If any other details are needed,
please let me know; I will try to collect and share them. Please help to
analyze why auto-vacuum stopped suddenly.


-- 
Cheers,
Prakash
1) Auto-vacuum configuration:
=
 autovacuum                      | on   | Starts the autovacuum subprocess.
 autovacuum_analyze_scale_factor | 0.1  | Number of tuple inserts, updates or deletes prior to analyze as a fraction of reltuples.
 autovacuum_analyze_threshold    | 50   | Minimum number of tuple inserts, updates or deletes prior to analyze.
 autovacuum_freeze_max_age       | 2    | Age at which to autovacuum a table to prevent transaction ID wraparound.
 autovacuum_max_workers          | 3    | Sets the maximum number of simultaneously running autovacuum worker processes.
 autovacuum_naptime              | 1min | Time to sleep between autovacuum runs.
 autovacuum_vacuum_cost_delay    | 20ms | Vacuum cost delay in milliseconds, for autovacuum.
 autovacuum_vacuum_cost_limit    | -1   | Vacuum cost amount available before napping, for autovacuum.
 autovacuum_vacuum_scale_factor  | 0.2  | Number of tuple updates or deletes prior to vacuum as a fraction of reltuples.
 autovacuum_vacuum_threshold     | 50   | Minimum number of tuple updates or deletes prior to vacuum.
 log_autovacuum_min_duration     | -1   | Sets the minimum execution time above which autovacuum actions will be logged.
=

2) Auto-vacuum daemon is running:
ps -ef | grep autovacuum
5432  6504  6001  0 Apr14 ?  00:02:45 postgres: autovacuum launcher process

3) Table sizes:
SELECT nspname || '.' || relname AS relation,
       pg_size_pretty(pg_total_relation_size(C.oid)) AS total_size,
       pg_size_pretty(pg_total_relation_size(C.oid) - pg_relation_size(C.oid)) AS "External Size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('pg_catalog', 'information_schema')
    AND C.relkind <> 'i'
    AND nspname !~ '^pg_toast'
  ORDER BY pg_total_relation_size(C.oid) DESC
  LIMIT 20;

   relation   | total_size | External Size 
--------------+------------+---------------
 public.abcd  | 1108 MB    | 592 kB
 public.xyz   | 5904 kB    | 56 kB

4) Last run auto-vacuum:
SELECT now(), schemaname, relname, last_vacuum, last_autovacuum, vacuum_count, 
autovacuum_count FROM pg_stat_user_tables;

              now              | schemaname | relname | last_vacuum |        last_autovacuum        | vacuum_count | autovacuum_count
-------------------------------+------------+---------+-------------+-------------------------------+--------------+------------------
 2015-06-10 01:03:03.574212+02 | public     | abcd    |             | 2015-04-18 00:52:35.008874+02 |            0 |                2
 2015-06-10 01:03:03.574212+02 | public     | xyz     |             | 2015-05-02 06:01:35.220651+02 |            0 |               20

NOTE: I changed the relname for above two tables due to confidentiality.
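
A follow-up check, not in the original capture, that would show whether dead
tuples keep accumulating while autovacuum is stalled:

    SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC;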
 


Re: [HACKERS] [ISSUE] pg_dump: schema with OID 0 does not exist

2014-03-10 Thread Prakash Itnal
Hi,

Can someone confirm whether this is really an issue, or give any reasons
for the missing rows?


On Tue, Feb 25, 2014 at 3:51 PM, Prakash Itnal prakash...@gmail.com wrote:

 Hi,

 Recently we observed the error below while taking a dump after upgrading
 from 9.0.13 to 9.1.9.

 pg_dump: schema with OID 0 does not exist

 I referred to similar issues reported previously (
 http://www.postgresql.org/message-id/CAGWYGjXRJj=zugejv0ckvn4zf9hb92q+7e3aqfcvbgbmb9z...@mail.gmail.com)
 and got the issue resolved after identifying and inserting some of the
 missing rows into the *pg_opclass* table. Below are the rows that were
 missing after the upgrade (from *pg_opclass*):

   405 | aclitem_ops  | 11 | 10 | 2235 | 1033 | t |   0
   783 | box_ops      | 11 | 10 | 2593 |  603 | t |   0
   783 | point_ops    | 11 | 10 | 1029 |  600 | t | 603
   783 | poly_ops     | 11 | 10 | 2594 |  604 | t | 603
   783 | circle_ops   | 11 | 10 | 2595 |  718 | t | 603
  2742 | _int4_ops    | 11 | 10 | 2745 | 1007 | t |  23
  2742 | _text_ops    | 11 | 10 | 2745 | 1009 | t |  25
  2742 | _abstime_ops | 11 | 10 | 2745 | 1023 | t | 702


 Can someone help me understand whether this is an issue? If not, how and
 why did these rows disappear after the upgrade?

 Since we have a fully automated environment we do not encourage manual
 intervention, so I am trying to understand how we can handle these issues.

 --
 Cheers,
 Prakash




-- 
Cheers,
Prakash


[HACKERS] [ISSUE] pg_dump: schema with OID 0 does not exist

2014-02-25 Thread Prakash Itnal
Hi,

Recently we observed the error below while taking a dump after upgrading
from 9.0.13 to 9.1.9.

pg_dump: schema with OID 0 does not exist

I referred to similar issues reported previously (
http://www.postgresql.org/message-id/CAGWYGjXRJj=zugejv0ckvn4zf9hb92q+7e3aqfcvbgbmb9z...@mail.gmail.com)
and got the issue resolved after identifying and inserting some of the
missing rows into the *pg_opclass* table. Below are the rows that were
missing after the upgrade (from *pg_opclass*):

  405 | aclitem_ops  | 11 | 10 | 2235 | 1033 | t |   0
  783 | box_ops      | 11 | 10 | 2593 |  603 | t |   0
  783 | point_ops    | 11 | 10 | 1029 |  600 | t | 603
  783 | poly_ops     | 11 | 10 | 2594 |  604 | t | 603
  783 | circle_ops   | 11 | 10 | 2595 |  718 | t | 603
 2742 | _int4_ops    | 11 | 10 | 2745 | 1007 | t |  23
 2742 | _text_ops    | 11 | 10 | 2745 | 1009 | t |  25
 2742 | _abstime_ops | 11 | 10 | 2745 | 1023 | t | 702


Can someone help me understand whether this is an issue? If not, how and
why did these rows disappear after the upgrade?

Since we have a fully automated environment we do not encourage manual
intervention, so I am trying to understand how we can handle these issues.
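
For anyone hitting the same symptom, one way to spot missing operator
classes without eyeballing catalogs is to list them per index access method
and diff against a freshly initdb'd cluster of the same version (a
diagnostic sketch, not a fix):

    SELECT am.amname, oc.opcname, oc.opcintype::regtype
    FROM pg_opclass oc
    JOIN pg_am am ON am.oid = oc.opcmethod
    ORDER BY am.amname, oc.opcname;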

-- 
Cheers,
Prakash


[HACKERS] Possible deadlock issue when one transaction waiting on other and vice versa? Should, ideally, postgres recognize blocking situation?

2011-04-26 Thread Prakash Itnal
Hi,


I have created the following tables:
1. rnc table
CREATE TABLE act_rnc(rnc_id integer NOT NULL PRIMARY KEY, rnc_data BYTEA);
2. rncgen table
CREATE TABLE act_rncgen(rnc_id integer NOT NULL PRIMARY KEY, rncsubObj_Cnt
integer, rncgen_data BYTEA);
3. iuo table which has a foreign key reference to rnc table
CREATE TABLE act_iuo(iuo_id integer NOT NULL primary key, rnc_id integer NOT
NULL, iuo_data BYTEA, FOREIGN KEY(rnc_id) references act_rnc(rnc_id) on
delete cascade);

Now I open two transactions (separate psql sessions). In the first
transaction I issue the following SQL sequence:
begin;
update act_rnc set rnc_data='rnc_data' where rnc_id=1;

The transaction is left open.

In the second transaction I issue the following SQL sequence:
begin;
insert into act_iuo values (1,1,'iuo_data');

-- now the second transaction is blocked. I am working with PostgreSQL 9.0.

Some outputs:
select * from pg_locks;
   locktype    | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |       mode       | granted
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+------------------+---------
 tuple         |    16385 |    16427 |    0 |     8 |            |               |         |       |          | 3/80               |  9230 | ShareLock        | t
 relation      |    16385 |    10985 |      |       |            |               |         |       |          | 4/247              | 16535 | AccessShareLock  | t
 virtualxid    |          |          |      |       | 4/247      |               |         |       |          | 4/247              | 16535 | ExclusiveLock    | t
 relation      |    16385 |    16443 |      |       |            |               |         |       |          | 3/80               |  9230 | RowExclusiveLock | t
 transactionid |          |          |      |       |            |           584 |         |       |          | 3/80               |  9230 | ExclusiveLock    | t
 virtualxid    |          |          |      |       | 3/80       |               |         |       |          | 3/80               |  9230 | ExclusiveLock    | t
 relation      |    16385 |    16433 |      |       |            |               |         |       |          | 3/80               |  9230 | AccessShareLock  | t
 relation      |    16385 |    16427 |      |       |            |               |         |       |          | 5/535              |  2814 | RowExclusiveLock | t
 virtualxid    |          |          |      |       | 5/535      |               |         |       |          | 5/535              |  2814 | ExclusiveLock    | t
 transactionid |          |          |      |       |            |           583 |         |       |          | 5/535              |  2814 | ExclusiveLock    | t
 relation      |    16385 |    16449 |      |       |            |               |         |       |          | 3/80               |  9230 | RowExclusiveLock | t
 relation      |    16385 |    16427 |      |       |            |               |         |       |          | 3/80               |  9230 | RowShareLock     | t
 transactionid |          |          |      |       |            |           583 |         |       |          | 3/80               |  9230 | ShareLock        | f
 relation      |    16385 |    16433 |      |       |            |               |         |       |          | 5/535              |  2814 | RowExclusiveLock | t
(14 rows)
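
Not part of the original session, but on releases before pg_blocking_pids()
existed, waiters and holders can be paired by self-joining pg_locks on the
transaction id:

    -- The ungranted ShareLock on transactionid 583 marks the waiter; the
    -- session holding 583 with an ExclusiveLock is the blocker.
    SELECT waiter.pid AS waiting_pid,
           holder.pid AS blocking_pid,
           waiter.transactionid
    FROM pg_locks waiter
    JOIN pg_locks holder
      ON holder.transactionid = waiter.transactionid
     AND holder.granted
     AND holder.pid <> waiter.pid
    WHERE NOT waiter.granted;

Against the output above, this pairs pid 9230 (waiting) with pid 2814
(holding transaction 583).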

select relname, pg_class.oid from pg_class;
 act_rnc_pkey| 16433
 pg_inherits_parent_index|  2187
 pg_inherits_relid_seqno_index   |  2680
 pg_toast_16435  | 16438
 pg_trigger_oid_index|  2702
 pg_toast_16435_index| 16440
 act_rncgen  | 16435
 act_rncgen_pkey | 16441
 pg_toast_16443  | 16446
 pg_toast_16443_index| 16448
 act_iuo_pkey| 16449
 pg_amop |  2602
 act_iuo | 16443
 pg_largeobject  |  2613
 act_rnc | 16427
 pg_toast_11361  | 11363
 pg_toast_11361_index| 11365
 pg_toast_11366_index| 11370

I assume that the access to act_rnc_pkey causes the blocking, but why? And
how can I resolve the blocking? (Committing one transaction solves the
problem, but should Postgres not recognize the blocking situation and
release one transaction?) Is this an error in Postgres?
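
For context: what blocks here is the INSERT's foreign-key check. In 9.0 the
RI trigger locks the referenced act_rnc row with roughly the statement
below, and that FOR SHARE row lock conflicts with the first session's
uncommitted UPDATE of the same row, so the second session waits:

    -- Approximation of the 9.0 RI check fired by the INSERT on act_iuo:
    SELECT 1 FROM ONLY act_rnc x WHERE rnc_id = 1 FOR SHARE OF x;

This is plain blocking rather than a deadlock: only the second session is
waiting, so the deadlock detector has nothing to break, and the wait ends
when the first transaction commits or rolls back.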

-- 
Cheers,
Prakash