Re: autovacuum is running but pg_stat_all_tables empty

2018-11-19 Thread Laurenz Albe
Mariel Cherkassky wrote:
> I'm trying to understand something weird in one of my environments.
> When I query pg_stat_all_tables, I see that most of the tables don't have any
> value in the last_autovacuum/analyze columns. In addition, the columns
> autovacuum_count/analyze_count are set to 0. However, when checking the logs,
> I see that autovacuum ran on some of those tables. I think that there is
> something wrong with the database statistics collector. In addition, the
> columns n_dead_tup and n_live_tup are set, and in some of the cases n_dead_tup
> is more than 20% of the table's tuples. Also, all tables have the default
> vacuum threshold.
> 
> Any idea what else I can check?
> The problem isn't only that dead tuples aren't deleted (I don't have
> long-running transactions that might cause it) but also that the statistics
> are inaccurate/wrong.

You can use the "pgstattuple" extension to check the actual dead tuple
percentage of such a table and see whether the statistics are accurate.
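
For example (a minimal sketch; "mytable" stands for one of the affected tables):

CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT dead_tuple_count, dead_tuple_percent
FROM pgstattuple('mytable');

Comparing that with n_dead_tup in pg_stat_all_tables tells you how far off
the collector's numbers are.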

To see the statistics collector's UDP socket, run

netstat -a | grep udp | grep ESTABLISHED

and check that it is there.  If it is on IPv6, make sure that IPv6 is up;
if it is not, that would explain why you have no accurate statistics.

Are there any log messages about statistics collection?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




autovacuum is running but pg_stat_all_tables empty

2018-11-19 Thread Mariel Cherkassky
Hi,
I'm trying to understand something weird in one of my environments.
When I query pg_stat_all_tables, I see that most of the tables don't have any
value in the last_autovacuum/analyze columns. In addition, the columns
autovacuum_count/analyze_count are set to 0. However, when checking the
logs, I see that autovacuum ran on some of those tables. I think that there
is something wrong with the database statistics collector. In addition, the
columns n_dead_tup and n_live_tup are set, and in some of the cases
n_dead_tup is more than 20% of the table's tuples. Also, all tables
have the default vacuum threshold.

Any idea what else I can check?
The problem isn't only that dead tuples aren't deleted (I don't have
long-running transactions that might cause it) but also that the statistics
are inaccurate/wrong.
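
For illustration, a query along these lines shows the columns in question
(a sketch; the schema filter is just an example):

SELECT relname, last_autovacuum, last_autoanalyze,
       autovacuum_count, autoanalyze_count,
       n_live_tup, n_dead_tup
FROM pg_stat_all_tables
WHERE schemaname = 'public'
ORDER BY n_dead_tup DESC;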


Re: PostgreSQL VS MongoDB: a use case comparison

2018-11-19 Thread Stephen Frost
Greetings,

* Fabio Pardi (f.pa...@portavita.eu) wrote:
> We are open to any kind of feedback, and we hope you enjoy reading it.

Looks like a lot of the difference being seen, and the comments made
about one being faster than the other, are because one system is
compressing *everything*, while PG (quite intentionally...) only
compresses the data sometimes, namely once it hits the TOAST limit.  That
likely also contributes to the on-disk size differences you are seeing.
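
As an illustration of where the on-disk space goes, something like this shows
how much of a table lives in its TOAST part (a sketch; 'documents' is a
placeholder table name):

SELECT pg_size_pretty(pg_table_size('documents'))    AS incl_toast,
       pg_size_pretty(pg_relation_size('documents')) AS main_fork_only;

pg_table_size() includes the TOAST relation, while pg_relation_size() counts
only the main heap.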

Of course, if you want to see where PG will really shine, you'd stop
thinking of the data as just blobs of JSON and actually define individual
fields in PG instead of just one 'jsonb' column, especially when you
know a field will always exist (which is obviously the case if you're
building an index on it, such as your MarriageDate), then remove
those fields from the jsonb and just reconstruct the JSON when you
query.  Doing that, you'll get the size down dramatically.
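
Roughly along these lines (a sketch; the table and column names are made up,
only MarriageDate comes from your blog post):

-- move the field into a real, indexable column
ALTER TABLE documents ADD COLUMN marriage_date date;

UPDATE documents
   SET marriage_date = (doc ->> 'MarriageDate')::date,
       doc = doc - 'MarriageDate';

CREATE INDEX ON documents (marriage_date);

-- reconstruct the original JSON at query time
SELECT doc || jsonb_build_object('MarriageDate', marriage_date) AS full_doc
  FROM documents
 WHERE marriage_date BETWEEN '2000-01-01' AND '2000-12-31';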

And that's without even going to the next-level stuff of actual
normalization, where you pull out duplicate data from across the JSON,
keep just one instance of that data in another, smaller table, and
use a JOIN to bring it all back together.  Even better is when you
realize that you then only have to update one row in this other table
when something changes in that subset of data, unlike when you
repeatedly store that data in individual JSON entries all across the
system, where such a change requires rewriting every single JSON object
in the entire system...
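
A sketch of what that could look like (all names here are hypothetical):

-- one row per organization instead of repeating it inside every document
CREATE TABLE organizations (
    org_id   serial PRIMARY KEY,
    org_data jsonb NOT NULL
);

ALTER TABLE documents ADD COLUMN org_id integer REFERENCES organizations;

-- a change to an organization is now a single-row update, and a JOIN
-- reassembles the full JSON when you need it
SELECT d.doc || jsonb_build_object('organization', o.org_data) AS full_doc
  FROM documents d
  JOIN organizations o USING (org_id);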

Lastly, as with any performance benchmark, please include full details:
all scripts used, all commands run, all data used, so that others can
reproduce your results.  I'm sure it'd be fun to take your JSON data,
create actual tables out of it, and see what it'd be like then.

Thanks!

Stephen




PostgreSQL VS MongoDB: a use case comparison

2018-11-19 Thread Fabio Pardi
Hi list,


For your consideration, I'm submitting research I did together with my
colleague Wouter.

We compared the two databases for our specific use case: medical data stored
using the FHIR standard.

This is admittedly a very restrictive use case, and moreover one under
read-only conditions. We are aware of that.

In the near future we might enlarge the scope of the research to include
read and write operations, more use cases, partitioning, and/or scaling
Mongo.

This should be considered a starting point. We would like to share our
results in the hope of getting constructive feedback and perhaps fixing
mistakes or inaccuracies you might find.

This is the link:

https://portavita.github.io/2018-10-31-blog_A_JSON_use_case_comparison_between_PostgreSQL_and_MongoDB/

We are open to any kind of feedback, and we hope you enjoy reading it.


Regards,

Fabio Pardi and Wouter van Teijlingen