Re: [pmacct-discussion] bgp_as_path_radius / pgsql and bgp_aggregate

2018-12-10 Thread Karl O. Pinc
On Mon, 10 Dec 2018 16:10:23 +
Paolo Lucente  wrote:

> Thanks for your input - always very interesting. I sketched a new note
> in the SQL docs based on the thread so far:
> 
> https://github.com/pmacct/pmacct/commit/e4c594a9f10040b53d4523f7edecf455a20a9151
> 
> Hope it looks right enough.

Looks OK to me.  I can't speak to the performance
implications of fixed vs. variable-length string
columns in MySQL, and it probably depends on the
back-end storage engine anyway.  What you have
leaves everything performance-related up to the
end user, which is fine.



You might consider altering sql/pmacct-create-db.pgsql
so that the create database is:

CREATE DATABASE pmacct
  TEMPLATE=template0
  LC_COLLATE='C';

This should work from Postgres 8.4 onward (released 2009-07-01,
out of support since 2014-07-24).  RHEL 6 includes
Postgres 8.4 (or so I'm told here
https://developers.redhat.com/blog/2018/04/06/red-hat-open-source-data-bases-beta-adds-postgresql-10-mongodb-3-6-updates-mysql-5-7/
).  RHEL 5 is no longer supported.

It uses template0 because in theory template1 (the default) could
have been modified by the local db admin to contain
data which is not compatible with the C collation.
In practice template1 is probably never different
from template0.

Most likely the OS is using UTF-8, and the PG cluster
will be created with the OS default encoding.  So the db
will still be able to contain UTF-8 characters, which I
presume is what you want, but sorting/indexing will be in
UTF-8 codepoint (byte) order, which is much faster than a
dictionary sort.
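
To check what a given database actually ended up with, a query
along these lines should work (a sketch; 'pmacct' is the database
name used by the script):

  SELECT datname,
         pg_encoding_to_char(encoding) AS encoding,
         datcollate,
         datctype
    FROM pg_database
   WHERE datname = 'pmacct';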


While you're at it, a note in pmacct-create-db.pgsql
suggesting an increase to "shared_buffers"
in postgresql.conf might also be useful.  The default
is 128MB, which is a very small amount of memory these
days.

Something like:

  -- NOTE: Database performance is _much_ improved with minimal
  --   tuning.  Start by allocating more RAM to the database.
  --   Increase the "shared_buffers" setting in
  --   postgresql.conf to as much as 25% of RAM.
  --   Consider also setting "effective_cache_size".  It should
  --   be larger than shared_buffers.  "checkpoint_segments"
  --   should probably be at least 32, much more if write
  --   performance is an issue.  For details see:
  --   https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
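
On PostgreSQL 9.4 and later the same settings can also be changed
from SQL via ALTER SYSTEM instead of editing postgresql.conf by
hand.  A sketch, with illustrative values for a machine with 16GB
of RAM:

  ALTER SYSTEM SET shared_buffers = '4GB';         -- ~25% of RAM; takes effect after a restart
  ALTER SYSTEM SET effective_cache_size = '12GB';  -- larger than shared_buffers; a reload suffices

(ALTER SYSTEM does not exist on the old 8.4 release mentioned
above; there the settings have to go into postgresql.conf.)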

> On Sun, Dec 09, 2018 at 07:13:34PM -0600, Karl O. Pinc wrote:
> > On Fri, 7 Dec 2018 16:42:31 +
> > Paolo Lucente  wrote:
> >   
> > > You could make the field variable-length - optimizing space and
> > > saving you the trial & error of finding the sweet-spot size for
> > > the as path field, at the expense of more computing.
> > 
> > I would not expect a Postgres TEXT column, which is variable
> > length, to be significantly more CPU intensive than a
> > fixed length CHAR column.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] bgp_as_path_radius / pgsql and bgp_aggregate

2018-12-10 Thread Paolo Lucente


Hi Karl,

Thanks for your input - always very interesting. I sketched a new note
in the SQL docs based on the thread so far:

https://github.com/pmacct/pmacct/commit/e4c594a9f10040b53d4523f7edecf455a20a9151

Hope it looks right enough.

Paolo

On Sun, Dec 09, 2018 at 07:13:34PM -0600, Karl O. Pinc wrote:
> On Fri, 7 Dec 2018 16:42:31 +
> Paolo Lucente  wrote:
> 
> > You could make the field variable-length - optimizing space and
> > saving you the trial & error of finding the sweet-spot size for the
> > as path field, at the expense of more computing.
> 
> I would not expect a Postgres TEXT column, which is variable
> length, to be significantly more CPU intensive than a
> fixed length CHAR column.
> 
> From the PG docs regarding the various types which store strings:
> 
> ---
> There is no performance difference among these three types, apart from 
> increased storage space when using the blank-padded type, and a few extra CPU 
> cycles to check the length when storing into a length-constrained column. 
> While character(n) has performance advantages in some other database systems, 
> there is no such advantage in PostgreSQL; in fact character(n) is usually the 
> slowest of the three because of its additional storage costs. In most 
> situations text or character varying should be used instead.
> ---
> 
> So, when using Postgres, there's no particular reason to
> limit string column lengths.
> 
> What is important with Postgres is the collation sequence
> used by the database.  UTF-8 collation support slows down sorting
> significantly, and so affects indexes and so forth.
> If you need UTF-8 characters you can have them, but
> sort by byte value if you care about performance.
> If necessary, collation can even be set down to the column
> level in newer PG releases.  What's easiest is to avoid UTF-8
> entirely and set the locale for the whole db to "C".
> This gives you a single byte per character, with
> characters sorted by byte value.
> 
> https://www.postgresql.org/docs/11/charset.html
> 
> > On Fri, Dec 07, 2018 at 10:46:32AM +0100, Fabien VINCENT wrote:
> > > Dear List, 
> > > 
> > > I have an issue when nfacctd tries to push to the pgsql database:
> > > 
> > > PGSQL log file:
> > > 
> > > ERROR:  value too long for type character(80)
> > > CONTEXT: 
> > > 
> > > COPY flow _*_, line 74771, column as_path_src: "14061
> > > {46652,421001,4210010200,4210010201,4210010202,4210010297,4210010400,4210010402,4210010499..."
> > > 
> > > 
> > > In my nfacctd config file I have:
> > > 
> > > bgp_aspath_radius: 10 
> > > 
> > > because as_path_src is set to CHAR(80).  But it seems BGP
> > > aggregates break the rule?
> > > 
> > > Is there any way to limit / cut down BGP aggregates in the
> > > as_path_src column?
> 
> Karl 
> Free Software:  "You don't pay back, you pay forward."
>  -- Robert A. Heinlein

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] bgp_as_path_radius / pgsql and bgp_aggregate

2018-12-09 Thread Karl O. Pinc
On Fri, 7 Dec 2018 16:42:31 +
Paolo Lucente  wrote:

> You could make the field variable-length - optimizing space and
> saving you the trial & error of finding the sweet-spot size for the
> as path field, at the expense of more computing.

I would not expect a Postgres TEXT column, which is variable
length, to be significantly more CPU intensive than a
fixed length CHAR column.

From the PG docs regarding the various types which store strings:

---
There is no performance difference among these three types, apart from 
increased storage space when using the blank-padded type, and a few extra CPU 
cycles to check the length when storing into a length-constrained column. While 
character(n) has performance advantages in some other database systems, there 
is no such advantage in PostgreSQL; in fact character(n) is usually the slowest 
of the three because of its additional storage costs. In most situations text 
or character varying should be used instead.
---

So, when using Postgres, there's no particular reason to
limit string column lengths.
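
For instance, the overflowing column from the error quoted below
could be relaxed with something like this (a sketch; the "flow"
table and "as_path_src" column names are taken from the error
message and may differ in the real schema):

  ALTER TABLE flow ALTER COLUMN as_path_src TYPE text;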

What is important with Postgres is the collation sequence
used by the database.  UTF-8 collation support slows down sorting
significantly, and so affects indexes and so forth.
If you need UTF-8 characters you can have them, but
sort by byte value if you care about performance.
If necessary, collation can even be set down to the column
level in newer PG releases.  What's easiest is to avoid UTF-8
entirely and set the locale for the whole db to "C".
This gives you a single byte per character, with
characters sorted by byte value.

https://www.postgresql.org/docs/11/charset.html
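
As an illustration of the column-level option (available since PG
9.1), a hypothetical table definition might look like:

  CREATE TABLE acct_example (
      as_path_src text COLLATE "C"  -- byte-value comparisons, fast indexing
  );

Only this column then sorts and indexes in byte order; the rest of
the database keeps its default collation.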

> On Fri, Dec 07, 2018 at 10:46:32AM +0100, Fabien VINCENT wrote:
> > Dear List, 
> > 
> > I have an issue when nfacctd tries to push to the pgsql database:
> > 
> > PGSQL log file:
> > 
> > ERROR:  value too long for type character(80)
> > CONTEXT: 
> > 
> > COPY flow _*_, line 74771, column as_path_src: "14061
> > {46652,421001,4210010200,4210010201,4210010202,4210010297,4210010400,4210010402,4210010499..."
> > 
> > 
> > In my nfacctd config file I have:
> > 
> > bgp_aspath_radius: 10 
> > 
> > because as_path_src is set to CHAR(80).  But it seems BGP
> > aggregates break the rule?
> > 
> > Is there any way to limit / cut down BGP aggregates in the
> > as_path_src column?

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


[pmacct-discussion] bgp_as_path_radius / pgsql and bgp_aggregate

2018-12-07 Thread Fabien VINCENT
Dear List, 

I have an issue when nfacctd tries to push to the pgsql database:

PGSQL log file:

ERROR:  value too long for type character(80)
CONTEXT: 

COPY flow _*_, line 74771, column as_path_src: "14061
{46652,421001,4210010200,4210010201,4210010202,4210010297,4210010400,4210010402,4210010499..."


In my nfacctd config file I have:

bgp_aspath_radius: 10 

because as_path_src is set to CHAR(80).  But it seems BGP aggregates
break the rule?

Is there any way to limit / cut down BGP aggregates in the as_path_src
column?

Thanks =)

-- 

FABIEN VINCENT
---

@beufanet
---
_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists