Re: [Csync2] csync_check_del broken under PostgreSQL

2017-10-31 Thread Lars Ellenberg
On Fri, Oct 27, 2017 at 11:00:14PM +0100, James Le Cuirot wrote:
> On Mon, 23 Oct 2017 13:23:51 +0200
> Lars Ellenberg  wrote:
> 
> My locale and collation is entirely en_GB.UTF-8.

> > afaik, = < > compare "collate character units", while LIKE, appart from
> > supporting wildcards, compares "byte per byte" when used with literals.
> > I may be wrong.
> > 
> > Anyways, maybe alter table ... binary ... helps?
> > Or change "TEXT" to "BLOB"?
> > Something like that?
> 
> This gives the right result:
> 
> csync2=> SELECT filename from file where filename::bytea > 
> '/mnt/csync2/portage/app-editors/ng/' and filename::bytea < 
> '/mnt/csync2/portage/app-editors/ng0' ORDER BY filename;
> 
> Presumably this is comparing "byte for byte" as you say LIKE does but
> then why not just use LIKE in the first place?

"probably" (code is / was like that since over ten years)
at some point it was thought (or even measured) to help with
performance (presumably back when it was sqlite2 only).
Maybe that can be re-evaluated.

Also we probably should
CREATE TABLE ... ( ... TEXT COLLATE "C" NOT NULL ...)
Or use BYTEA / BLOB. Not sure which option is better.

You want to check what "select 'ABC' = 'abc';"
returns for various locales/collations,
and then double check for your favorite
umlauts or accented characters...

> > I don't think many installations out there use csync2 with mysql or 
> > postgres,
> > so it is very likely that there are a number of shortcomings burried in 
> > there,
> > I'd expect "very long path names" to show problems as well, where "very
> > long" may be simply "longer than the default supported unique key length
> > on text fields.
> 
> I figured that few people use this, especially since you can't specify
> the database in the configuration file.

Supposedly you just put something like this into the config file:
database "pgsql://james:secret@localhost/";
optionally with full database name.
Same string you'd put after the "-D" command line switch.

> I happen to have PostgreSQL running on both nodes anyway and thought
> it might be faster than SQLite.

If you are willing to do some benchmarks,
let us know the results.

In "normal usage" of csync2, I doubt that the database
would ever become the bottleneck.

> I gather the unique key length varies but is generally around 2000-3000
> characters. I don't expect to have paths that long but thanks for the
> heads up. Does SQLite not have such a limit? I couldn't find any
> reference to one.

No, sqlite apparently does not care for text field length there.

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R, Integration, Ops, Consulting, Support

DRBD® and LINBIT® are registered trademarks of LINBIT
___
Csync2 mailing list
Csync2@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/csync2


Re: [Csync2] csync_check_del broken under PostgreSQL

2017-10-27 Thread James Le Cuirot
On Mon, 23 Oct 2017 13:23:51 +0200
Lars Ellenberg  wrote:

> On Sun, Oct 22, 2017 at 11:33:55AM +0100, James Le Cuirot wrote:
> > When checking for deleted files recursively, csync2 issues a SQL
> > statement like this.
> >   
> > => SELECT filename from file where filename > 
> > '/mnt/csync2/portage/app-editors/ng/' and filename < 
> > '/mnt/csync2/portage/app-editors/ng0' ORDER BY filename;  
> >  filename 
> > --
> > (0 rows)
> > 
> > This never returns anything under PostgreSQL. The behaviour is actually
> > rather odd. Using LIKE gives the result we expect.
> >   
> > => SELECT filename FROM file WHERE filename LIKE 
> > '/mnt/csync2/portage/app-editors/ng/%';  
> >filename   
> > --
> >  /mnt/csync2/portage/app-editors/ng/metadata.xml
> >  /mnt/csync2/portage/app-editors/ng/files
> >  /mnt/csync2/portage/app-editors/ng/files/ng-1.5beta1-ncurses.patch
> >  /mnt/csync2/portage/app-editors/ng/files/ng-1.5beta1-configure.patch
> >  /mnt/csync2/portage/app-editors/ng/Manifest
> >  /mnt/csync2/portage/app-editors/ng/ng-1.5_beta1-r2.ebuild
> > (6 rows)
> > 
> > But watch what happens when we change the 0 to a g.
> >   
> > => SELECT filename from file where filename > 
> > '/mnt/csync2/portage/app-editors/ng/' and filename < 
> > '/mnt/csync2/portage/app-editors/ngg' ORDER BY filename;  
> >filename   
> > --
> >  /mnt/csync2/portage/app-editors/ng/files
> >  /mnt/csync2/portage/app-editors/ng/files/ng-1.5beta1-configure.patch
> >  /mnt/csync2/portage/app-editors/ng/files/ng-1.5beta1-ncurses.patch
> > (3 rows)
> > 
> > It's almost like the / in ng/files isn't used in the comparison. I
> > couldn't find anything in the PostgreSQL documentation specifically
> > about using the <> operators with strings.  
> 
> What are your locale and collate settings?

My locale and collation is entirely en_GB.UTF-8.

> afaik, = < > compare "collate character units", while LIKE, appart from
> supporting wildcards, compares "byte per byte" when used with literals.
> I may be wrong.
> 
> Anyways, maybe alter table ... binary ... helps?
> Or change "TEXT" to "BLOB"?
> Something like that?

This gives the right result:

csync2=> SELECT filename from file where filename::bytea > 
'/mnt/csync2/portage/app-editors/ng/' and filename::bytea < 
'/mnt/csync2/portage/app-editors/ng0' ORDER BY filename;

Presumably this is comparing "byte for byte" as you say LIKE does but
then why not just use LIKE in the first place?

> I don't think many installations out there use csync2 with mysql or postgres,
> so it is very likely that there are a number of shortcomings burried in there,
> I'd expect "very long path names" to show problems as well, where "very
> long" may be simply "longer than the default supported unique key length
> on text fields.

I figured that few people use this, especially since you can't specify
the database in the configuration file. I happen to have PostgreSQL
running on both nodes anyway and thought it might be faster than
SQLite.

I gather the unique key length varies but is generally around 2000-3000
characters. I don't expect to have paths that long but thanks for the
heads up. Does SQLite not have such a limit? I couldn't find any
reference to one.

-- 
James Le Cuirot (chewi)
Gentoo Linux Developer


pgpBPL8lIHXkp.pgp
Description: OpenPGP digital signature
___
Csync2 mailing list
Csync2@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/csync2


Re: [Csync2] csync_check_del broken under PostgreSQL

2017-10-23 Thread Lars Ellenberg
On Sun, Oct 22, 2017 at 11:33:55AM +0100, James Le Cuirot wrote:
> When checking for deleted files recursively, csync2 issues a SQL
> statement like this.
> 
> => SELECT filename from file where filename > 
> '/mnt/csync2/portage/app-editors/ng/' and filename < 
> '/mnt/csync2/portage/app-editors/ng0' ORDER BY filename;
>  filename 
> --
> (0 rows)
> 
> This never returns anything under PostgreSQL. The behaviour is actually
> rather odd. Using LIKE gives the result we expect.
> 
> => SELECT filename FROM file WHERE filename LIKE 
> '/mnt/csync2/portage/app-editors/ng/%';
>filename   
> --
>  /mnt/csync2/portage/app-editors/ng/metadata.xml
>  /mnt/csync2/portage/app-editors/ng/files
>  /mnt/csync2/portage/app-editors/ng/files/ng-1.5beta1-ncurses.patch
>  /mnt/csync2/portage/app-editors/ng/files/ng-1.5beta1-configure.patch
>  /mnt/csync2/portage/app-editors/ng/Manifest
>  /mnt/csync2/portage/app-editors/ng/ng-1.5_beta1-r2.ebuild
> (6 rows)
> 
> But watch what happens when we change the 0 to a g.
> 
> => SELECT filename from file where filename > 
> '/mnt/csync2/portage/app-editors/ng/' and filename < 
> '/mnt/csync2/portage/app-editors/ngg' ORDER BY filename;
>filename   
> --
>  /mnt/csync2/portage/app-editors/ng/files
>  /mnt/csync2/portage/app-editors/ng/files/ng-1.5beta1-configure.patch
>  /mnt/csync2/portage/app-editors/ng/files/ng-1.5beta1-ncurses.patch
> (3 rows)
> 
> It's almost like the / in ng/files isn't used in the comparison. I
> couldn't find anything in the PostgreSQL documentation specifically
> about using the <> operators with strings.

What are your locale and collate settings?

afaik, = < > compare "collate character units", while LIKE, appart from
supporting wildcards, compares "byte per byte" when used with literals.
I may be wrong.

Anyways, maybe alter table ... binary ... helps?
Or change "TEXT" to "BLOB"?
Something like that?

I don't think many installations out there use csync2 with mysql or postgres,
so it is very likely that there are a number of shortcomings burried in there,
I'd expect "very long path names" to show problems as well, where "very
long" may be simply "longer than the default supported unique key length
on text fields.

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R, Integration, Ops, Consulting, Support

DRBD® and LINBIT® are registered trademarks of LINBIT
___
Csync2 mailing list
Csync2@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/csync2