[HACKERS] Re[2]: [HACKERS] Remove fsync ON/OFF as a visible option?
Why do we pay so much attention to the fsync issue? In our real-world installations there are also tablespaces on tmpfs, WAL on tmpfs, disks with cache but without BBU, writeback writes and filesystems without ordering or journaling, all kinds of clouds, etc. etc. Moreover, not all of these issues are usually within the DBA's reach, and what a DBA really has to do is accept the ugly properties of the underlying storage and set up PITR/standby properly. And if we have PITR, then "fsync or not fsync" on the master host doesn't matter so much. What could matter is providing fsync on the archive host, but doing fsync in the archive is a workaround for the archive_command implementation. In conclusion: IMHO, full_page_writes is more meaningful than fsync once we guarantee PITR.

-- misha
[HACKERS] Re: [HACKERS] Streaming replication and WAL archive interactions
> This should be a very common setup in the field, so how are people doing it
> in practice?

One possible workaround combining archive and streaming was to use pg_receivexlog from the standby to copy/save WALs to the archive, though pg_receivexlog had an fsync issue of its own.

[ master ] -- streaming --> [ standby ] -- pg_receivexlog --> [ /archive ]

In that case the archive is always in a pre-standby state, which is better than being left with a broken archive on promote.

-- Misha
[HACKERS] Re: [HACKERS] On partitioning
Maybe Vertica's approach would be a useful example:
http://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/AdministratorsGuide/Partitions/PartitioningTables.htm
http://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/SQLReferenceManual/Statements/CREATETABLE.htm
... [ PARTITION BY partition-clause ]

-- Mikhail
[HACKERS] Re[2]: [HACKERS] Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)
Hi! But maybe postgres should provide its own subsystem, like Linux's active/inactive memory lists, on top of and/or alongside shared buffers? Some postgres-specific heuristics could go into such an approach. And does anyone know how the MySQL/InnoDB folks are getting on with similar issues? Thank you!
[HACKERS] freeze cannot be finished
Hello! Could anyone review the patch suggested by Jeff Janes? Initial thread:
http://www.postgresql.org/message-id/flat/1384356585.995240...@f50.i.mail.ru#1384356585.995240...@f50.i.mail.ru
Thanks in advance!

> On Wed, Nov 13, 2013 at 3:53 PM, Sergey Burladyan < eshkin...@gmail.com > wrote:
>> Jeff Janes < jeff.ja...@gmail.com > writes:
>>
>> If I am not mistaken, it looks like lazy_scan_heap() called from
>> lazy_vacuum_rel() (see [1]) skips pages even when it runs with
>> scan_all == true: lazy_scan_heap() does not increment scanned_pages if
>> lazy_check_needs_freeze() returns false, so if this happens during a
>> wraparound vacuum it cannot update pg_class, because pg_class is updated
>> via this code:
>>
>>     new_frozen_xid = FreezeLimit;
>>     if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
>>         new_frozen_xid = InvalidTransactionId;
>>
>>     vac_update_relstats(onerel,
>>                         new_rel_pages,
>>                         new_rel_tuples,
>>                         new_rel_allvisible,
>>                         vacrelstats->hasindex,
>>                         new_frozen_xid);
>>
>> So I think in our prevent-wraparound vacuum vacrelstats->scanned_pages is
>> always less than vacrelstats->rel_pages and pg_class relfrozenxid is never
>> updated.
>
> Yeah, I think that that is a bug.  If the clean-up lock is unavailable but
> the page is inspected without it and found not to need freezing, then the
> page needs to be counted as scanned, but is not so counted.
>
> commit bbb6e559c4ea0fb4c346beda76736451dc24eb4e
> Date: Mon Nov 7 21:39:40 2011 -0500
>
> But this was introduced in 9.2.0, so unless the OP didn't upgrade to 9.2
> until recently, I don't know why it just started happening.
>
> It looks like a simple fix (to HEAD attached), but I don't know how to test
> it.
>
> Also, it seems like it might be worth issuing a warning if scan_all is true
> but all was not scanned.
> Cheers,
>
> Jeff
>
> --
> Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
new file mode 100644
index e1d6d1c..6778c7d
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
*** lazy_scan_heap(Relation onerel, LVRelSta
*** 602,607 ****
--- 602,608 ----
      if (!lazy_check_needs_freeze(buf))
      {
          UnlockReleaseBuffer(buf);
+         vacrelstats->scanned_pages++;
          continue;
      }
      LockBuffer(buf, BUFFER_LOCK_UNLOCK);

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
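The accounting problem the patch fixes can be illustrated with a small simulation (Python with hypothetical page flags; this is not PostgreSQL code): a scan_all vacuum that skips a pinned page without counting it always ends with scanned_pages < rel_pages, so the computed new_frozen_xid is discarded.

```python
def scan_all_vacuum(pages, patched):
    """Simulate lazy_scan_heap()'s scanned_pages accounting during a
    scan_all (anti-wraparound) vacuum.

    Each page is a tuple (pinned_elsewhere, needs_freeze).  When the
    cleanup lock is unavailable and the page does not need freezing,
    the unpatched code skips the page without counting it as scanned.
    """
    scanned = 0
    for pinned_elsewhere, needs_freeze in pages:
        if pinned_elsewhere and not needs_freeze:
            if patched:
                scanned += 1  # the fix: count the page anyway
            continue          # unpatched: page silently not counted
        scanned += 1          # page processed normally
    return scanned

# Nine ordinary pages plus one page pinned by another backend.
pages = [(False, False)] * 9 + [(True, False)]
rel_pages = len(pages)

# Unpatched: scanned_pages < rel_pages, so vac_update_relstats() is called
# with new_frozen_xid = InvalidTransactionId and relfrozenxid never advances.
assert scan_all_vacuum(pages, patched=False) < rel_pages
assert scan_all_vacuum(pages, patched=True) == rel_pages
```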
[HACKERS] Re: [HACKERS] Wal sync odirect
I am talking about the case where wal_level is higher than MINIMAL (wal_level != minimal). From http://doxygen.postgresql.org/xlogdefs_8h_source.html :

"48 * Because O_DIRECT bypasses the kernel buffers, and because we never
 49 * read those buffers except during crash recovery or if wal_level != minimal"

>> Hi, list. Here is my proposal: I would like to talk about O_DIRECT for WAL
>> sync when wal_level is higher than minimal. I think in my case, where WAL
>> traffic is up to 1 GB per 2-3 minutes but the disk hardware has a 2 GB BBU
>> cache (or maybe an SSD under WAL), it would be better if WAL traffic could
>> not harm OS memory eviction. And I do not use streaming; my archive_command
>> could read WAL directly, bypassing the OS cache. Just an opinion, I have not
>> done any tests yet, but I am still seeing some memory-eviction anomaly.
>
> PostgreSQL already uses O_DIRECT for WAL writes if you use O_SYNC mode
> for WAL writes. See comments in src/include/access/xlogdefs.h (search
> for O_DIRECT). You should also examine
> src/backend/access/transam/xlog.c, particularly the function
> get_sync_bit(...)
>
> Try doing some tests with pg_test_fsync, see how performance looks. If
> your theory is right and WAL traffic is putting pressure on kernel write
> buffers, using fsync=open_datasync - which should be the default on
> Linux - may help.
>
> I'd recommend doing some detailed tracing and performance measurements
> before trying to proceed further.
>
> --
>  Craig Ringer                   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
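The rule quoted from xlogdefs.h can be condensed into a sketch of the decision get_sync_bit() makes (Python, with made-up flag values; the real logic lives in src/backend/access/transam/xlog.c): O_DIRECT is only added when nothing will ever read the WAL back through the kernel cache.

```python
# Illustrative flag values only; the real ones come from <fcntl.h>.
O_SYNC = 0x1000
O_DIRECT = 0x4000

def wal_open_flags(open_sync_method, wal_level):
    """Sketch of the O_DIRECT decision described in xlogdefs.h: bypassing
    the kernel buffers is only safe if WAL is never read back through
    them, i.e. only at wal_level = minimal (no archiving or streaming
    readers)."""
    flags = 0
    if open_sync_method:              # fsync = open_sync / open_datasync
        flags |= O_SYNC
        if wal_level == 'minimal':
            flags |= O_DIRECT
    return flags

# The poster's case: wal_level above minimal, so O_DIRECT is not used.
assert (wal_open_flags(True, 'hot_standby') & O_DIRECT) == 0
assert (wal_open_flags(True, 'minimal') & O_DIRECT) != 0
```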
[HACKERS] Auto explain target tables
Hi, list, again. The next proposal for auto_explain: one would be happy to be able to set a list of target tables and indexes. Sometimes it is very hard to detect who is using your indexes, but turning on full logging under thousands of transactions per second does not seem like a nice idea because of the size of the resulting log files (CPU utilization might not be so critical).
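The proposed knob boils down to a per-plan predicate: log a plan only when it touches one of the listed relations. A trivial sketch of that filter (Python; the names and structure are hypothetical, auto_explain has no such option today):

```python
def should_log_plan(plan_relations, target_relations):
    """Return True if an executed plan touched any of the DBA's target
    tables or indexes; everything else stays out of the log, so full
    statement logging is never needed."""
    return bool(set(plan_relations) & set(target_relations))

# Only plans touching the watched index would be logged.
assert should_log_plan({'orders', 'orders_pkey'}, {'orders_pkey'})
assert not should_log_plan({'users', 'users_pkey'}, {'orders_pkey'})
```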
[HACKERS] Wal sync odirect
Hi, list. Here is my proposal: I would like to talk about O_DIRECT for WAL sync when wal_level is higher than minimal. I think in my case, where WAL traffic is up to 1 GB per 2-3 minutes but the disk hardware has a 2 GB BBU cache (or maybe an SSD under WAL), it would be better if WAL traffic could not harm OS memory eviction. And I do not use streaming; my archive_command could read WAL directly, bypassing the OS cache. Just an opinion, I have not done any tests yet, but I am still seeing some memory-eviction anomaly.
[HACKERS] Re[2]: [HACKERS] Re: [HACKERS] high io BUT huge amount of free memory
> hm, in that case, wouldn't adding 48gb of physical memory have
> approximately the same effect?  or is something else going on?

IMHO, adding 48 GB would have no effect. The server already has 376 GB of memory and still has a lot of unused GB. Let me repeat: we added 80 GB to the file cache by decreasing shared_buffers from 64 GB to 16 GB. There used to be 150 GB unused, and now the unused part is only 70 GB.

Some links I have read about eviction:
http://linux-mm.org/PageReplacementDesign
http://linux-mm.org/PageReplacementRequirements

mikhail
[HACKERS] Re: [HACKERS] Re[3]: [HACKERS] high io BUT huge amount of free memory
VM state:

root@avi-sql09:~# /sbin/sysctl -a | grep vm
vm.overcommit_memory = 0
vm.panic_on_oom = 0
vm.oom_kill_allocating_task = 0
vm.oom_dump_tasks = 0
vm.overcommit_ratio = 50
vm.page-cluster = 3
vm.dirty_background_ratio = 10
vm.dirty_background_bytes = 0
vm.dirty_ratio = 20
vm.dirty_bytes = 0
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 3000
vm.nr_pdflush_threads = 0
vm.swappiness = 0
vm.nr_hugepages = 0
vm.hugetlb_shm_group = 0
vm.hugepages_treat_as_movable = 0
vm.nr_overcommit_hugepages = 0
vm.lowmem_reserve_ratio = 256 256 32
vm.drop_caches = 0
vm.min_free_kbytes = 65536
vm.percpu_pagelist_fraction = 0
vm.max_map_count = 65530
vm.laptop_mode = 0
vm.block_dump = 0
vm.vfs_cache_pressure = 100
vm.legacy_va_layout = 0
vm.zone_reclaim_mode = 0
vm.min_unmapped_ratio = 1
vm.min_slab_ratio = 5
vm.stat_interval = 1
vm.mmap_min_addr = 65536
vm.numa_zonelist_order = default
vm.scan_unevictable_pages = 0
vm.memory_failure_early_kill = 0
vm.memory_failure_recovery = 1
[HACKERS] Re[3]: [HACKERS] high io BUT huge amount of free memory
A typo in my previous mail:

> if ( user cpu + io wait ) is ~140% then i have ~140GB free.

140% should read 1400%: if ~14 cores are busy then ~140 GB is free, i.e. ~10 GB per process. Hmmm...
[HACKERS] Re[2]: [HACKERS] high io BUT huge amount of free memory
Thanks a lot for the responses.

1) Just to recap my case:
Intel 32 cores = 2 sockets * 8 cores * 2 threads
Linux 2.6.32-5-amd64 #1 SMP Sun May 6 04:00:17 UTC 2012 x86_64 GNU/Linux
PostgreSQL 9.2.2 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit
shared_buffers 64GB / constant hit rate - 99.18
max_connections 160 / with pgbouncer pools there cannot be more than 120 connections in total
work_mem 32M
checkpoint 1h 1.0
swap off, numa off, interleaving on
and! disk usage 100% (128 GB free! WHY?)
disk throughput - up to 30 MB/s (24r+6w)
io - up to 2.5-3K/s (0.5w + 2-2.5r)
typical workload - pk-index-scans
my warm working set is about 400 GB; the whole db - 700 GB

2) numactl
mtyurin@avi-sql09:~$ numactl --hardware
available: 1 nodes (0-0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
node 0 size: 393181 MB
node 0 free: 146029 MB
node distances:
node   0
  0:  10

3) !! I just found a suspicious relation between "active" processes and free memory: ~1 GB per process. With 376 GB total memory and 32 cores, if ( user cpu + io wait ) is ~140% then I have ~140 GB free. But it could be just a coincidence.

4) Now I am thinking of a) upgrading the Linux kernel (to 3.2!?) and then, if the problem persists, b) setting shared_buffers to something like 300-320 GB.

5) What do you know about the workload in Berkus's case http://www.databasesoup.com/2012/04/red-hat-kernel-cache-clearing-issue.html ?
[HACKERS] pending triggers infinite list
Hi all. I've run into suspicious behavior in a transaction with a deferrable trigger: if the trigger performs an UPDATE on a row of its own target table, we get infinite "recursion" with no stack-depth limit:

trigger -> update -> trigger -> update -> ...

... an infinite pending list :)

- Misha
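What makes this different from ordinary trigger recursion is that deferred events queue up instead of nesting, so no stack-depth check ever fires. A tiny simulation of the pending-list behavior (Python; not PostgreSQL code):

```python
def run_deferred_events(max_events):
    """Simulate a deferrable AFTER trigger whose body updates its own
    target table: firing one pending event enqueues another, so the
    pending list never drains.  The cap only exists so the demo stops;
    the real backend would keep processing the list indefinitely."""
    pending = ['update']              # event from the original UPDATE
    fired = 0
    while pending and fired < max_events:
        pending.pop(0)                # fire one deferred event...
        fired += 1
        pending.append('update')      # ...whose UPDATE queues another
    return fired, len(pending)

fired, still_pending = run_deferred_events(10000)
assert fired == 10000 and still_pending == 1  # work remains: it never terminates
```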
[HACKERS] Re[2]: [HACKERS] standby, pg_basebackup and last xlog file
Hello all, and Heikki personally. Thank you for your answer. I have some new points:

On 21.01.2013, 10:08 +02:00, Heikki Linnakangas wrote:
> On 21.01.2013 09:14, Миша Тюрин wrote:
>> Is there any reason why pg_basebackup has limitation in an online backup
>> from the standby: "The backup history file is not created in the database
>> cluster backed up." ?
>
> WAL archiving isn't active in a standby, so even if it created a backup
> history file, it wouldn't go anywhere. Also, the way the backup history
> files are named, if you take a backup on the master at the same time (or
> another backup at the same time in the standby), you would end up with
> multiple backup history files with the same name.
>
>> So i can't get last xlog file needed to restore :(
>
> Good point. That was an oversight in the patch that allowed base backups
> from a standby.
>
>> Also maybe i can use something like ( pg_last_xlog_replay_location() + 1 )
>> after pg_basebackup finished.
>
> Yeah, that should work.
>
>> Does anybody know true way to getting last xlog file in case of applying
>> pg_basebackup to standby?
>> How pg_basebackup gets last xlog file in case of standby and -x option?
>
> The server returns the begin and end WAL locations to pg_basebackup,
> pg_basebackup just doesn't normally print them out to the user. In
> verbose mode, it does actually print them out, but only with -x, so that
> doesn't help you either. If you can compile from source, you could
> modify pg_basebackup.c to print the locations without -x and --verbose,
> search lines that print out "transaction log start point / end position".
1) We can get the last xlog by using the control data's "Minimum recovery ending location".

> As a workaround, without modifying the source, you can do this after
> pg_basebackup has finished, to get the first WAL file you need:
>
> $ pg_controldata backupdir | grep "REDO WAL file"
> Latest checkpoint's REDO WAL file:00030009

And I would like to correct your suggestion about the first WAL file:
2.1) we should use backup_label to determine the first needed WAL;
2.2) and we must not use the REDO pointer from the checkpoint, because there might be more than one checkpoint during the base backup.

> And as you suggested, pg_last_xlog_replay_location() for the last WAL
> file you need.
>
> - Heikki

- Misha
[HACKERS] standby, pg_basebackup and last xlog file
Hello! I wrote to -general ( [GENERAL] standby, pg_basebackup and last xlog file ) some time ago but still haven't got any feedback:

Hello! Is there any reason why pg_basebackup has this limitation for an online backup from the standby: "The backup history file is not created in the database cluster backed up." ? So I can't get the last xlog file needed to restore :(

I think I can use the -x option to get the last xlog file, but I would like to minimize the size of the resulting backup. // I also get all WALs via archive_command, so there is no reason to keep two copies of each WAL during the base backup.

Also, maybe I can use something like ( pg_last_xlog_replay_location() + 1 ) after pg_basebackup has finished.

Does anybody know the right way to get the last xlog file when applying pg_basebackup to a standby? How does pg_basebackup get the last xlog file in the standby-plus--x case?
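The pg_last_xlog_replay_location() idea can be made concrete: the last WAL file needed is the segment containing the replay location when pg_basebackup finishes. A sketch of the LSN-to-filename mapping (Python), assuming the default 16 MB segment size and the standard naming of 8 hex digits each for timeline, log, and seg; version-specific corner cases around log-file boundaries are ignored:

```python
WAL_SEG_SIZE = 16 * 1024 * 1024          # default 16 MB WAL segments

def wal_file_containing(lsn, timeline=1):
    """WAL segment file name containing the given LSN ('hi/lo' hex form,
    as returned by pg_last_xlog_replay_location())."""
    hi, lo = lsn.split('/')
    pos = (int(hi, 16) << 32) | int(lo, 16)
    segno = pos // WAL_SEG_SIZE
    # Split the segment number into the 'log' and 'seg' name components.
    log, seg = divmod(segno, 0x100000000 // WAL_SEG_SIZE)
    return '%08X%08X%08X' % (timeline, log, seg)

# A replay location of 3/9A3C528 lies in segment 9 of log file 3:
assert wal_file_containing('3/9A3C528') == '000000010000000300000009'
```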