Re: pglz performance

2019-05-17 Thread Gasper Zejn

On 16. 05. 19 19:13, Andrey Borodin wrote:
>
>> On 15 May 2019, at 15:06, Andrey Borodin wrote:
>>
>> Owners of AMD and ARM devices are welcome.

I've tested according to instructions at the test repo
https://github.com/x4m/test_pglz

Test_pglz is at a97f63b and postgres at 6ba500.

Hardware is desktop AMD Ryzen 5 2600, 32GB RAM

Decompressor score (sum of all times):

NOTICE:  Decompressor pglz_decompress_hacked result 6.988909
NOTICE:  Decompressor pglz_decompress_hacked8 result 7.562619
NOTICE:  Decompressor pglz_decompress_hacked16 result 8.316957
NOTICE:  Decompressor pglz_decompress_vanilla result 10.725826
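
To put these scores in relative terms, here is a small sketch that divides the vanilla time by each variant's time. The numbers are copied from the NOTICE lines of this Ryzen 5 2600 run; the dictionary and function names are only illustrative.

```python
# Scores (sum of all times) from the Ryzen 5 2600 run; lower is better.
scores = {
    "pglz_decompress_hacked": 6.988909,
    "pglz_decompress_hacked8": 7.562619,
    "pglz_decompress_hacked16": 8.316957,
    "pglz_decompress_vanilla": 10.725826,
}


def speedup_over_vanilla(scores):
    """Return vanilla_time / variant_time for every decompressor."""
    baseline = scores["pglz_decompress_vanilla"]
    return {name: baseline / elapsed for name, elapsed in scores.items()}


# Print the variants from fastest to slowest relative to vanilla.
ratios = speedup_over_vanilla(scores)
for name in sorted(ratios, key=ratios.get, reverse=True):
    print(f"{name}: {ratios[name]:.2f}x")
```

On this machine that works out to roughly 1.5x for pglz_decompress_hacked.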


Attached is the full test run, if needed.

Kind regards,

Gasper

> The Yandex hardware R&D guys gave me an ARM server and a Power9 server.
> They are looking for AMD and some new Intel boxes.
>
> Meanwhile I made some enhancements to the test suite:
> 1. I've added a Shakespeare payload: a concatenation of the works of this
> prominent poet.
> 2. For each payload we compute a "sliced time": the time to decompress the
> payload if it were sliced into 2 KB or 8 KB pieces.
> 3. For each decompressor we compute a "score": (sum of the times to
> decompress each payload, and each payload sliced into 2 KB and 8 KB
> pieces) * 5 times
>
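
The scoring procedure in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: zlib stands in for pglz (which is not available from Python), the raw payload is assumed to be sliced before compression, the "* 5 times" is read as five repetitions, and the helper names slices, decompress_time and score are hypothetical rather than taken from test_pglz.

```python
import time
import zlib


def slices(payload, size):
    """Cut the raw payload into consecutive pieces of at most `size` bytes."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def decompress_time(chunks):
    """Time only the decompression of every chunk; compression is not timed."""
    compressed = [zlib.compress(chunk) for chunk in chunks]
    start = time.perf_counter()
    for blob in compressed:
        zlib.decompress(blob)
    return time.perf_counter() - start


def score(payloads, repeats=5):
    """Sum whole, 2 KB-sliced and 8 KB-sliced times for each payload, repeated."""
    total = 0.0
    for _ in range(repeats):
        for payload in payloads:
            total += decompress_time([payload])
            total += decompress_time(slices(payload, 2048))
            total += decompress_time(slices(payload, 8192))
    return total


if __name__ == "__main__":
    sample = b"To be, or not to be, that is the question. " * 2000
    print(f"score: {score([sample]):.6f}")
```

A lower score means a faster decompressor, matching the NOTICE lines below.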
> I've attached the full test logs; meanwhile, here are the results for
> different platforms.
>
> Intel Server
> NOTICE:  0: Decompressor pglz_decompress_hacked result 10.346763
> NOTICE:  0: Decompressor pglz_decompress_hacked8 result 11.192078
> NOTICE:  0: Decompressor pglz_decompress_hacked16 result 11.957727
> NOTICE:  0: Decompressor pglz_decompress_vanilla result 14.262256
>
> ARM Server
> NOTICE:  Decompressor pglz_decompress_hacked result 12.98
> NOTICE:  Decompressor pglz_decompress_hacked8 result 13.004935
> NOTICE:  Decompressor pglz_decompress_hacked16 result 13.043015
> NOTICE:  Decompressor pglz_decompress_vanilla result 18.239242
>
> Power9 Server
> NOTICE:  Decompressor pglz_decompress_hacked result 10.992974
> NOTICE:  Decompressor pglz_decompress_hacked8 result 11.747443
> NOTICE:  Decompressor pglz_decompress_hacked16 result 11.026342
> NOTICE:  Decompressor pglz_decompress_vanilla result 16.375315
>
> Intel laptop
> NOTICE:  Decompressor pglz_decompress_hacked result 9.445808
> NOTICE:  Decompressor pglz_decompress_hacked8 result 9.105360
> NOTICE:  Decompressor pglz_decompress_hacked16 result 9.621833
> NOTICE:  Decompressor pglz_decompress_vanilla result 10.661968
>
> From these results pglz_decompress_hacked looks best.
>
> Best regards, Andrey Borodin.
>
✔ ~/project/pgsql/contrib/test_pglz 
[master|…5] 

15:29 $ ./test.sh 
make -C ../../src/backend generated-headers
make[1]: Entering directory '/home/hruske/project/pgsql/src/backend'
make -C catalog distprep generated-header-symlinks
make[2]: Entering directory '/home/hruske/project/pgsql/src/backend/catalog'
make[2]: Nothing to be done for 'distprep'.
make[2]: Nothing to be done for 'generated-header-symlinks'.
make[2]: Leaving directory '/home/hruske/project/pgsql/src/backend/catalog'
make -C utils distprep generated-header-symlinks
make[2]: Entering directory '/home/hruske/project/pgsql/src/backend/utils'
make[2]: Nothing to be done for 'distprep'.
make[2]: Nothing to be done for 'generated-header-symlinks'.
make[2]: Leaving directory '/home/hruske/project/pgsql/src/backend/utils'
make[1]: Leaving directory '/home/hruske/project/pgsql/src/backend'
/bin/mkdir -p '/home/hruske/project/lib/postgresql'
/bin/mkdir -p '/home/hruske/project/share/postgresql/extension'
/bin/mkdir -p '/home/hruske/project/share/postgresql/extension'
/usr/bin/install -c -m 755  test_pglz.so '/home/hruske/project/lib/postgresql/test_pglz.so'
/usr/bin/install -c -m 644 ./test_pglz.control '/home/hruske/project/share/postgresql/extension/'
/usr/bin/install -c -m 644 ./test_pglz--1.0.sql ./00010006 ./00010001 ./00010008 ./16398 ./shakespeare.txt  '/home/hruske/project/share/postgresql/extension/'
The files belonging to this database system will be owned by user "hruske".
This user must also own the server process.

The database cluster will be initialized with locale "sl_SI.UTF-8".
The default database encoding has accordingly been set to "UTF8".
initdb: could not find suitable text search configuration for locale "sl_SI.UTF-8"
The default text search configuration will be set to "simple".

Data page checksums are disabled.

creating directory /home/hruske/DemoDb1 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default timezone ... Europe/Ljubljana
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing 

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Gasper Zejn
On 09. 04. 2018 15:42, Tomas Vondra wrote:
> On 04/09/2018 12:29 AM, Bruce Momjian wrote:
>> A crazy idea would be to have a daemon that checks the logs and
>> stops Postgres when something seems wrong.
>>
> That doesn't seem like a very practical way. It's better than nothing,
> of course, but I wonder how that would work with containers (where I
> think you may not have access to the kernel log at all). Also, I'm
> pretty sure the messages change based on kernel version (and possibly
> filesystem), so parsing them reliably seems rather difficult. And we
> probably don't want to PANIC after an I/O error on an unrelated device, so
> we'd need to understand which devices are related to PostgreSQL.
>
> regards
>

As a slightly less (or more) crazy idea, I'd imagine it shouldn't be that
hard to create a Linux kernel module that uses a kprobe/kretprobe to
capture the file passed to fsync, or even the byte range within the file,
along with the corresponding return value. Kprobes have been part of the
Linux kernel for a really long time, and at first glance it seems like it
could be backported to 2.6 too.

Then you could have stable log messages or implement some kind of "fsync
error log notification" via whatever is the sanest way to get this
out of the kernel.

If the kernel is new enough and has eBPF support (seems like >=4.4),
using bcc-tools[1] should enable you to write a quick script to get
exactly that info via perf events[2].

Obviously, that's a stopgap solution ...


Kind regards,
Gasper


[1] https://github.com/iovisor/bcc
[2]
https://blog.yadutaf.fr/2016/03/30/turn-any-syscall-into-event-introducing-ebpf-kernel-probes/



Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-04 Thread Gasper Zejn


On 04. 04. 2018 15:49, Bruce Momjian wrote:
> I can understand why kernel developers don't want to keep failed sync
> buffers in memory, and once they are gone we lose reporting of their
> failure.  Also, if the kernel is going to not retry the syncs, how long
> should it keep reporting the sync failure?  To the first fsync that
> happens after the failure?  How long should it continue to record the
> failure?  What if no fsync() ever happens, which is likely for
> non-Postgres workloads?  I think once they decided to discard failed
> syncs and not retry them, the fsync behavior we are complaining about
> was almost required.
Ideally the kernel would keep its data for as little time as possible.
With fsync, it doesn't really know which process is interested in
knowing about a write error, it just assumes the caller will know how to
deal with it. The most unfortunate issue is that there's no way to get
information about a write error.
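
Since the kernel leaves it to the caller to deal with the error, one defensive pattern on the application side is to treat any fsync failure as fatal for that file instead of retrying. A minimal sketch in Python; write_durably and DurabilityError are hypothetical names and not part of any PostgreSQL code.

```python
import os


class DurabilityError(RuntimeError):
    """Raised when the kernel reports that data may not have reached disk."""


def write_durably(path, data):
    """Write `data` to `path` and fsync it, failing hard on any fsync error."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        try:
            os.fsync(fd)
        except OSError as exc:
            # Do not retry: a second fsync may report success even though
            # the pages from the failed write-back were already discarded.
            raise DurabilityError(f"fsync failed for {path}") from exc
    finally:
        os.close(fd)
```

The caller would then have to recover from a known-good state rather than assume the file contents are on disk.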

Thinking aloud - couldn't/shouldn't a write error also be a file system
event reported by inotify? Admittedly that's only a thing on Linux, but
still.


Kind regards,
Gasper



Re: disable SSL compression?

2018-03-08 Thread Gasper Zejn
On 09. 03. 2018 06:24, Craig Ringer wrote:
> I'm totally unconvinced by the threat posed by exploiting a client by
> tricking it into requesting protocol compression - or any other
> protocol change the client lib doesn't understand - with a connection
> option in PGOPTIONS or the "options" connstring entry. The attacker
> must be able to specify either environment variables (in which case I
> present "LD_PRELOAD") or the connstr. If they can set a connstr they
> can direct the client to talk to a different host that tries to
> exploit the connecting client in whatever manner they wish by sending
> any custom crafted messages they like.
>
If the attacker has access to the client process or environment, he's
already won, and this is not where the compression vulnerability lies.

CRIME and BREACH attacks on (SSL) compression are chosen-plaintext
attacks, which require the attacker 1) to be able to observe the
encrypted data and 2) to have a way to influence the plaintext, in this
case the SQL query. In the case of the CRIME attack on HTTPS, compression
state was shared between attacker-influenced content and the request
headers, so by observing the size of the encrypted messages one could
guess cookie values, which are sent in HTTP headers, and steal
credentials even though the JavaScript making the requests was running on
a different domain.

So the vulnerability would lie in guessing some values in a request or
response that the application or protocol might want to keep hidden,
while somehow learning the size of the request or response from the database.
Thus, sharing compression state too widely might not be wise.
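
The size side channel described above can be illustrated with a short sketch. It uses Python's zlib rather than SSL compression, and the secret, the query text and the function name observed_size are all made up for the example; the point is only that a correct guess compresses better when it shares a compression context with the secret.

```python
import zlib

# Hypothetical secret that ends up in the same compressed stream as
# attacker-influenced text.
SECRET = "token=7f3a9c2e41d8"


def observed_size(attacker_text):
    """Length of the compressed message, which encryption does not hide."""
    message = (
        f"SELECT note FROM t WHERE note = '{attacker_text}'; /* {SECRET} */"
    )
    return len(zlib.compress(message.encode("utf-8")))


# Two guesses of equal length: only the first one matches the secret.
right = observed_size("token=7f3a9c2e41d8")
wrong = observed_size("token=k5w0xq8zj2mb")
print("correct guess:", right, "bytes; wrong guess:", wrong, "bytes")
```

The correct guess yields the smaller message because the compressor encodes the secret as a back-reference to the attacker's text. Keeping secrets and attacker-influenced data in separate compression contexts would remove that signal.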

Kind regards,
Gasper