Re: Writing 1100 rows per second

2020-02-09 Thread Jeff Janes
On Wed, Feb 5, 2020 at 12:25 PM Arya F  wrote:

> If I run the database on a server that has enough RAM to load all the
> indexes and tables into RAM, and it then updated the indexes on the HDD
> every x seconds, would that work to increase performance dramatically?
>

Perhaps, but probably not dramatically. If x seconds (in PostgreSQL terms,
the checkpoint interval) is not long enough for the entire index to have been
dirtied, then my finding is that writing out half of the pages of a file
(randomly interspersed), even in block order, still has the horrid
performance of a long sequence of random writes, not the much better
performance of a handful of sequential writes. This probably depends strongly
on your RAID controller, OS version, and such, though, so you should try it
for yourself on your own hardware.
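
If you do experiment with that, the knobs that control how often PostgreSQL
flushes dirty pages in bulk are the checkpoint settings. A minimal sketch,
with purely illustrative values rather than recommendations for your
hardware:

-- Illustrative only: longer checkpoints mean a page that is dirtied many
-- times between checkpoints gets written out once rather than repeatedly.
ALTER SYSTEM SET checkpoint_timeout = '30min';
ALTER SYSTEM SET max_wal_size = '16GB';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
SELECT pg_reload_conf();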

Cheers,

Jeff


Re: Writing 1100 rows per second

2020-02-06 Thread Ogden Brash
On Wed, Feb 5, 2020 at 9:12 AM Laurenz Albe wrote:

> One idea I can come up with is a table that is partitioned by a column
> that appears in a selective search condition, but with no indexes on the
> table, so that you always get away with a sequential scan of a single
> partition.
>
>
This is an approach that I am currently using successfully. We have a large
dataset that continues to be computed, so insert speed matters to us. The DB
currently has about 45 billion rows. Three columns are involved in all
searches of the data, and we have separate tables for every unique
combination of those 3 values (which gives us about 2000 tables). That lets
us save the space of storing those columns, since the name of the table
defines what those 3 values are for every row in it. We don't have any
indices on those tables (except for the default one created for the serial
primary key), so all searches involve exactly one table and a sequential
scan of that table. The logic to choose the correct table for insertions or
searches lives in our application code, not in SQL.
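
Roughly, the scheme looks like this (table and column names here are invented
for illustration; they are not our real ones):

-- Hypothetical example: one table per (a, b, c) combination, with those
-- three columns omitted because the table name encodes their values.
CREATE TABLE data_a1_b2_c3 (
    id      bigserial PRIMARY KEY,
    payload text,
    ts      timestamptz NOT NULL DEFAULT now()
);

-- The application formats the table name itself and routes the statement,
-- e.g.: INSERT INTO data_a1_b2_c3 (payload) VALUES ('...');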

The sizes of the 2000 tables form a Gaussian distribution, so our largest
table has about a billion rows and many tables have hundreds of millions of
rows. The ongoing insertions follow the same distribution, so the bulk of the
insertions go into the largest tables. It is not a speed demon, and I have
not run tests recently, but back-of-the-envelope calculations give me
confidence that we are definitely inserting more than 1100 rows per second.
And that is running the server on an old, puny i5 processor with regular
HDDs and only 32 GB of memory.


Re: Writing 1100 rows per second

2020-02-05 Thread Haroldo Kerry
Arya,
We ran into the issue of decreasing insert performance for tables of
hundreds of millions of rows, and it was indeed due to index updates.
We tested TimescaleDB (a PostgreSQL extension) with success in all of the use
cases we have. It does "behind the scenes" single-level partitioning that is
indeed very efficient.
I'm not sure about the 1100 inserts/s, as that is hardware-dependent, but we
got the flat response curve (inserts per second stayed stable with hundreds
of millions of rows, regardless of indexes).
My suggestion: have a look at
https://blog.timescale.com/timescaledb-vs-6a696248104e/ and do some PoCs.
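
For a first PoC the setup is small. The table and column names below are just
an example, not a recommendation for your schema:

-- timescaledb must be listed in shared_preload_libraries before this works.
CREATE EXTENSION IF NOT EXISTS timescaledb;

CREATE TABLE readings (
    ts     timestamptz NOT NULL,
    device bigint      NOT NULL,
    value  double precision
);

-- Turns the plain table into a hypertable; the time-based partitions
-- ("chunks") are then created behind the scenes as rows arrive.
SELECT create_hypertable('readings', 'ts');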

Regards,
Haroldo Kerry

On Wed, Feb 5, 2020 at 2:25 PM Arya F  wrote:

> If I run the database on a server that has enough RAM to load all the
> indexes and tables into RAM, and it then updated the indexes on the HDD
> every x seconds, would that work to increase performance dramatically?
>
> On Wed, Feb 5, 2020, 12:15 PM Justin Pryzby  wrote:
>
>> On Wed, Feb 05, 2020 at 12:03:52PM -0500, Arya F wrote:
>> > I'm looking to write about 1100 rows per second to tables of up to 100
>> > million rows. I'm trying to come up with a design where I can do all the
>> > writes to a database with no indexes, because with indexes the write
>> > performance slows down dramatically once a table grows past 30 million
>> > rows.
>> >
>> > I was thinking of having a server dedicated to all the writes and
>> > another server for reads that has indexes, using logical replication to
>> > update the read-only server.
>>
>> Wouldn't the read-only server still have bad performance for all the
>> writes being replicated to it?
>>
>> > Would that work? Or are there any recommendations for how I can achieve
>> > good performance for a lot of writes?
>>
>> Can you use partitioning so that the updates mostly affect only one table
>> at a time, and its indices are of a reasonable size, such that they can
>> fit easily in shared_buffers?
>>
>> BRIN indices may help for some, but likely not all, of your indices.
>>
>> Justin
>>
>

-- 

Haroldo Kerry

CTO/COO

Rua do Rócio, 220, 7° andar, conjunto 72

São Paulo – SP / CEP 04552-000

hke...@callix.com.br

www.callix.com.br


Re: Writing 1100 rows per second

2020-02-05 Thread Arya F
If I run the database on a server that has enough RAM to load all the
indexes and tables into RAM, and it then updated the indexes on the HDD
every x seconds, would that work to increase performance dramatically?

On Wed, Feb 5, 2020, 12:15 PM Justin Pryzby  wrote:

> On Wed, Feb 05, 2020 at 12:03:52PM -0500, Arya F wrote:
> > I'm looking to write about 1100 rows per second to tables of up to 100
> > million rows. I'm trying to come up with a design where I can do all the
> > writes to a database with no indexes, because with indexes the write
> > performance slows down dramatically once a table grows past 30 million
> > rows.
> >
> > I was thinking of having a server dedicated to all the writes and
> > another server for reads that has indexes, using logical replication to
> > update the read-only server.
>
> Wouldn't the read-only server still have bad performance for all the
> writes being replicated to it?
>
> > Would that work? Or are there any recommendations for how I can achieve
> > good performance for a lot of writes?
>
> Can you use partitioning so that the updates mostly affect only one table
> at a time, and its indices are of a reasonable size, such that they can
> fit easily in shared_buffers?
>
> BRIN indices may help for some, but likely not all, of your indices.
>
> Justin
>


Re: Writing 1100 rows per second

2020-02-05 Thread Justin Pryzby
On Wed, Feb 05, 2020 at 12:03:52PM -0500, Arya F wrote:
> I'm looking to write about 1100 rows per second to tables of up to 100
> million rows. I'm trying to come up with a design where I can do all the
> writes to a database with no indexes, because with indexes the write
> performance slows down dramatically once a table grows past 30 million rows.
> 
> I was thinking of having a server dedicated to all the writes and another
> server for reads that has indexes, using logical replication to update the
> read-only server.

Wouldn't the read-only server still have bad performance for all the writes
being replicated to it?

> Would that work? Or are there any recommendations for how I can achieve
> good performance for a lot of writes?

Can you use partitioning so that the updates mostly affect only one table at
a time, and its indices are of a reasonable size, such that they can fit
easily in shared_buffers?

BRIN indices may help for some, but likely not all, of your indices.
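
For example (only a sketch, with invented table, column, and partition names):

-- Range-partition by insertion time so new rows, and their index churn,
-- land in the newest partition, whose indexes stay small enough to fit
-- in shared_buffers.
CREATE TABLE events (
    id   bigserial,
    ts   timestamptz NOT NULL,
    body text
) PARTITION BY RANGE (ts);

CREATE TABLE events_2020_02 PARTITION OF events
    FOR VALUES FROM ('2020-02-01') TO ('2020-03-01');

-- A BRIN index on ts stays tiny because ts correlates with physical row order.
CREATE INDEX ON events_2020_02 USING brin (ts);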

Justin




Re: Writing 1100 rows per second

2020-02-05 Thread Laurenz Albe
On Wed, 2020-02-05 at 12:03 -0500, Arya F wrote:
> I'm looking to write about 1100 rows per second to tables of up to 100
> million rows. I'm trying to come up with a design where I can do all the
> writes to a database with no indexes, because with indexes the write
> performance slows down dramatically once a table grows past 30 million rows.
> 
> I was thinking of having a server dedicated to all the writes and another
> server for reads that has indexes, using logical replication to update the
> read-only server.
> 
> Would that work? Or are there any recommendations for how I can achieve
> good performance for a lot of writes?

Logical replication wouldn't make a difference, because with many indexes,
replay of the inserts would be slow as well, and replication would lag more
and more.

No matter what you do, there will be no magic way to have your tables
indexed and have fast inserts at the same time.

One idea I can come up with is a table that is partitioned by a column that
appears in a selective search condition, but with no indexes on the table,
so that you always get away with a sequential scan of a single partition.
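
As a sketch of that idea (names invented):

-- Partition on the selective search column and create no indexes; a query
-- on one value prunes to a single partition and does one sequential scan.
CREATE TABLE measurements (
    sensor_group int NOT NULL,
    ts           timestamptz NOT NULL,
    value        double precision
) PARTITION BY LIST (sensor_group);

CREATE TABLE measurements_g1 PARTITION OF measurements FOR VALUES IN (1);
CREATE TABLE measurements_g2 PARTITION OF measurements FOR VALUES IN (2);

-- Partition pruning limits this to a sequential scan of measurements_g1.
SELECT * FROM measurements WHERE sensor_group = 1;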

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Writing 1100 rows per second

2020-02-05 Thread Arya F
I'm looking to write about 1100 rows per second to tables of up to 100
million rows. I'm trying to come up with a design where I can do all the
writes to a database with no indexes, because with indexes the write
performance slows down dramatically once a table grows past 30 million rows.


I was thinking of having a server dedicated to all the writes and another
server for reads that has indexes, using logical replication to update the
read-only server.


Would that work? Or are there any recommendations for how I can achieve good
performance for a lot of writes?

Thank you