Re: [HACKERS] WIP: Upper planner pathification

2016-03-01 Thread Oleg Bartunov
On Tue, Mar 1, 2016 at 3:11 PM, Teodor Sigaev  wrote:

> The basic point of this patch is to apply the generate-and-compare-Paths
>> paradigm to the planning steps after query_planner(), which only covers
>>
> ...
> > The present patch addresses this problem by inventing Path nodes to
> > represent every post-scan/join step
>
> I'm really glad to see that. Separating path nodes for the later steps opens
> new ways to optimize queries. At first glance, consider
> select * from a left outer join b on a.i = b.i limit 1;
> The Limit node could be pushed down to the scan over table 'a' if b.i is unique.
>

This patch opens a lot of possibilities for our ongoing project on indexing
subselects, which we plan to use for jsonb. Having it in 9.6 would
certainly facilitate this, so I'm +1 for this patch, even if we have to
postpone 9.6 a bit. I hope Robert, Teodor, and other reviewers can help Tom
with this patch.


>
> --
> Teodor Sigaev   E-mail: teo...@sigaev.ru
>WWW:
> http://www.sigaev.ru/
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Oleg Bartunov
On Tue, Mar 1, 2016 at 7:03 PM, Robert Haas  wrote:

> On Tue, Mar 1, 2016 at 10:37 AM, Bruce Momjian  wrote:
> > On Tue, Mar  1, 2016 at 10:19:45AM -0500, Robert Haas wrote:
> >> > Two reasons:
> >> > 1. There is no ideal implementation of DTM which will fit all
> possible needs
> >> > and be  efficient for all clusters.
> >>
> >> Hmm, what is the reasoning behind that statement?  I mean, it is
> >> certainly true that there are some places where we have decided that
> >> one-size-fits-all is not the right approach.  Indexing, for example.
> >
> > Uh, is that even true of indexing?  While the plug-in nature of indexing
> > allows for easier development and testing, does anyone create plug-in
> > indexing that isn't shipped by us?  I thought WAL support was something
> > that prevented external indexing solutions from working.
>
> True.  There is an API, though, and having pluggable WAL support seems
> desirable too.  At the same time, I don't think we know of anyone
> maintaining a non-core index AM ... and there are probably good
> reasons for that.  We end up revising the index AM API pretty
>

We'd love to develop new specialized index AMs; that's why we are all for
pluggable WAL. I think there will be other AM developers once we open the
door to that.


> regularly every time somebody wants to do something new, so it's not
> really a stable API that extensions can just tap into.  I suspect that
> a transaction manager API would end up similarly situated.
>

I don't expect many other TM developers, so there is no problem with
improving the API. We started from practical needs and analysis of many
academic papers. We spent a year playing with several prototypes to prove
our proposed API (expect more in several months). Anybody can download and
test them. I wish we could do the same with the FDW-based sharding solution.

Of course, we can fork Postgres as the XC/XL people did, and we certainly
will eventually if the community doesn't accept our proposal, since it's
very difficult to work on cross-release projects. But then there will be no
winners, so why are we all so aggressively refusing to understand each
other? I watched XC/XL for years and decided I didn't want to go down that
path of isolation from the community, so we chose to make the TM pluggable,
to stay with the community and let everybody prove their concepts. If you
have ideas on how to improve the TM API, we are open; if you know it's
broken by design, help us fix it. I have my own understanding of FDW, but I
deliberately don't participate in some very hot discussions, simply because
I don't feel committed to working on it. Your group is very enthusiastic
about FDW, and that's fine as long as you improve FDW in a general way; I'm
very happy with the current work. I would prefer that you show a prototype
of the sharding solution that convinces us of its functionality and
performance. I agree with Tomas Vondra that we don't want to wait years to
see the result; we want results based on a prototype, which should be
doable between releases. If you don't have enough resources for this, let's
do it together with the community. Nobody I've seen is against FDW
sharding; people complained about it being declared "the only sharding
solution" in Postgres without proof.





>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>


Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Oleg Bartunov
On Wed, Mar 2, 2016 at 4:36 AM, Tomas Vondra 
wrote:

Hi,
>
> On 03/01/2016 08:02 PM, Bruce Momjian wrote:
>
>> On Tue, Mar  1, 2016 at 07:56:58PM +0100, Petr Jelinek wrote:
>>
>>> Note that I am not saying that other discussed approaches are any
>>> better, I am saying that we should know approximately what we
>>> actually want and not just beat FDWs with a hammer and hope sharding
>>> will eventually emerge and call that the plan.
>>>
>>
>> I will say it again --- FDWs are the only sharding method I can think
>> of that has a chance of being accepted into Postgres core.
>>
>
>
>
> While I disagree with Simon on various things, I absolutely understand why
> he was asking about a prototype, and some sort of analysis of what usecases
> we expect to support initially/later/never, and what pieces are missing to
> get the sharding working. IIRC at the FOSDEM Dev Meeting you've claimed
> you're essentially working on a prototype - once we have the missing FDW
> pieces, we'll know if it works. I disagree with that - it's not a prototype
> if it takes several years to see the outcome.
>
>
Fully agree. Probably we all need to help build a prototype in the
between-releases period. I see no other way to resolve the situation.


>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>


Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Oleg Bartunov
On Mar 3, 2016 4:47 AM, "Michael Paquier"  wrote:
>
> On Wed, Mar 2, 2016 at 6:54 PM, Alexander Korotkov
>  wrote:
> > If FDWs had existed when Postgres XC/XL were being developed, then I
> > believe they would have tried to build a full-featured prototype of
> > FDW-based sharding. If this prototype had succeeded, then we could make a
> > full roadmap.
>
> Speaking here with my XC hat, that's actually the case. A couple of
> years back when I worked on it, there were discussions about reusing
> FDW routines for the purpose of XC, which would have been roughly
> reusing postgres_fdw + the possibility to send XID, snapshot and
> transaction timestamp to the remote nodes after getting that from the
> GTM (global transaction manager ensuring global data visibility and
> consistency), and have the logic for query pushdown in the FDW itself
> when planning query on what would have been roughly foreign tables
> (not entering in the details here, those would have not been entirely
> foreign tables). At this point the global picture was not completely
> set, XC being based on 9.1~9.2 and the FDW base routines were not as
> extended as they are now. As history has told, this global picture never
> showed up, though it would have, had XC been merged with 9.3.
> The point is that XC would have moved toward using the FDW approach, as a
> set of plugins.
>
> This was a reason behind this email of 2013 on -hackers actually:
>
> http://www.postgresql.org/message-id/cab7npqtdjf-58wuf-xz01nkj7wf0e+eukggqhd0igvsod4h...@mail.gmail.com

Good to remember!

> Michael
>
>


Re: [HACKERS] SP-GiST support for inet datatypes

2016-03-02 Thread Oleg Bartunov
On Wed, Mar 2, 2016 at 11:56 PM, Emre Hasegeli  wrote:

> Attached patches add SP-GiST support to the inet datatypes.  The operator
> class comes with a small change on the SP-GiST framework to allow fixed
> number of child nodes.
>
> The index is like prefix tree except that it doesn't bother to split the
> addresses into parts as text is split.  It also doesn't use labels to know
> the part after the prefix, but relies on static node numbers.
>
>
Thanks, Emre, for the interesting SP-GiST work. We are a bit busy, but will
take a look at your patches when we come to our own SP-GiST patch.


> The GiST index released with version 9.4 performs really bad with real
> world data.  SP-GiST works much better with the query posted to the
> performance list [1] a while ago:
>
> > hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn =
> 2914;
> > SELECT 732
> >
> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
> routes.route && hmm.route;
> >QUERY PLAN
> >
> 
> >  Nested Loop  (cost=0.41..571742.27 rows=2248 width=7) (actual
> time=12.643..20474.813 rows=8127 loops=1)
> >->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.017..0.524 rows=732 loops=1)
> >->  Index Only Scan using route_gist on routes  (cost=0.41..552.05
> rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)
> >  Index Cond: (route && (hmm.route)::inet)
> >  Heap Fetches: 8127
> >  Planning time: 1.507 ms
> >  Execution time: 20475.605 ms
> > (7 rows)
> >
> > hasegeli=# DROP INDEX route_gist;
> > DROP INDEX
> >
> > hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);
> > CREATE INDEX
> >
> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
> routes.route && hmm.route;
> >   QUERY PLAN
> >
> -
> >  Nested Loop  (cost=0.41..588634.27 rows=2248 width=7) (actual
> time=0.081..16.961 rows=8127 loops=1)
> >->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.022..0.079 rows=732 loops=1)
> >->  Index Only Scan using route_spgist on routes  (cost=0.41..575.13
> rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)
> >  Index Cond: (route && (hmm.route)::inet)
> >  Heap Fetches: 8127
> >  Planning time: 1.376 ms
> >  Execution time: 15.936 ms
>
> [1]
> http://www.postgresql.org/message-id/flat/alpine.DEB.2.11.1508251504160.31004@pyrite#alpine.DEB.2.11.1508251504160.31004@pyrite
>
>
>
>
>


Re: [HACKERS] SP-GiST support for inet datatypes

2016-03-02 Thread Oleg Bartunov
On Thu, Mar 3, 2016 at 8:51 AM, Oleg Bartunov  wrote:

>
>
> On Wed, Mar 2, 2016 at 11:56 PM, Emre Hasegeli  wrote:
>
>> Attached patches add SP-GiST support to the inet datatypes.  The
>> operator class comes with a small change on the SP-GiST framework to allow
>> fixed number of child nodes.
>>
>> The index is like prefix tree except that it doesn't bother to split the
>> addresses into parts as text is split.  It also doesn't use labels to know
>> the part after the prefix, but relies on static node numbers.
>>
>>
> Thanks, Emre, for the interesting SP-GiST work. We are a bit busy, but will
> take a look at your patches when we come to our own SP-GiST patch.
>
>

Emre, I checked the original thread and didn't find the sample data. Could
you provide it for testing?


> The GiST index released with version 9.4 performs really bad with real
>> world data.  SP-GiST works much better with the query posted to the
>> performance list [1] a while ago:
>>
>> > hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn =
>> 2914;
>> > SELECT 732
>> >
>> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
>> routes.route && hmm.route;
>> >QUERY
>> PLAN
>> >
>> 
>> >  Nested Loop  (cost=0.41..571742.27 rows=2248 width=7) (actual
>> time=12.643..20474.813 rows=8127 loops=1)
>> >->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
>> time=0.017..0.524 rows=732 loops=1)
>> >->  Index Only Scan using route_gist on routes  (cost=0.41..552.05
>> rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)
>> >  Index Cond: (route && (hmm.route)::inet)
>> >  Heap Fetches: 8127
>> >  Planning time: 1.507 ms
>> >  Execution time: 20475.605 ms
>> > (7 rows)
>> >
>> > hasegeli=# DROP INDEX route_gist;
>> > DROP INDEX
>> >
>> > hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);
>> > CREATE INDEX
>> >
>> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
>> routes.route && hmm.route;
>> >   QUERY PLAN
>> >
>> -
>> >  Nested Loop  (cost=0.41..588634.27 rows=2248 width=7) (actual
>> time=0.081..16.961 rows=8127 loops=1)
>> >->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
>> time=0.022..0.079 rows=732 loops=1)
>> >->  Index Only Scan using route_spgist on routes  (cost=0.41..575.13
>> rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)
>> >  Index Cond: (route && (hmm.route)::inet)
>> >  Heap Fetches: 8127
>> >  Planning time: 1.376 ms
>> >  Execution time: 15.936 ms
>>
>> [1]
>> http://www.postgresql.org/message-id/flat/alpine.DEB.2.11.1508251504160.31004@pyrite#alpine.DEB.2.11.1508251504160.31004@pyrite
>>
>>
>>
>>
>>
>


Re: [HACKERS] The plan for FDW-based sharding

2016-03-08 Thread Oleg Bartunov
On Tue, Mar 8, 2016 at 6:40 AM, Craig Ringer  wrote:



> Either that, or bless experimental features/API as an official concept.
> I'd quite like that myself - stuff that's in Pg, but documented as "might
> change or go away in the next release, experimental feature". As we're
> doing more stuff that spans multiple release cycles, where patches in a
> prior cycle might need revision based on what we learn in a later one, we
> might need more freedom to change things that're committed and user visible.
>
>
+1


> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] SP-GiST support for inet datatypes

2016-03-08 Thread Oleg Bartunov
On Thu, Mar 3, 2016 at 11:45 AM, Emre Hasegeli  wrote:

> > Emre, I checked the original thread and didn't find the sample data. Could
> you provide it for testing?
>
> I found it on the Git history:
>
>
> https://github.com/job/irrexplorer/blob/9e8b5330d7ef0022abbe1af18291257e044eb24b/data/irrexplorer_dump.sql.gz?raw=true
>

Thanks!

The SP-GiST index builds 2 times faster than GiST, but the index size is
noticeably bigger:

\di+ route_*
List of relations
 Schema | Name | Type  |  Owner   | Table  |  Size  | Description
+--+---+--+++-
 public | route_gist   | index | postgres | routes | 96 MB  |
 public | route_spgist | index | postgres | routes | 132 MB |
(2 rows)

The SP-GiST scan touches far fewer buffers than GiST - 12149 pages vs 1334760!



EXPLAIN (ANALYZE, buffers) SELECT routes.route FROM routes JOIN hmm ON
routes.route && hmm.route;
   QUERY PLAN

 Nested Loop  (cost=0.41..570430.27 rows=2338 width=7) (actual
time=5.730..12085.747 rows=8127 loops=1)
   Buffers: shared hit=1334760
   ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
time=0.013..0.528 rows=732 loops=1)
 Buffers: shared hit=4
   ->  Index Only Scan using route_gist on routes  (cost=0.41..550.26
rows=22900 width=7) (actual time=2.491..16.503 rows=11 loops=732)
 Index Cond: (route && (hmm.route)::inet)
 Heap Fetches: 8127
 Buffers: shared hit=1334756
 Planning time: 0.827 ms
 Execution time: 12086.513 ms
(10 rows)

EXPLAIN (ANALYZE, buffers) SELECT routes.route FROM routes JOIN hmm ON
routes.route && hmm.route;
   QUERY PLAN
-
 Nested Loop  (cost=0.41..588634.27 rows=2338 width=7) (actual
time=0.043..12.150 rows=8127 loops=1)
   Buffers: shared hit=12149
   ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
time=0.013..0.075 rows=732 loops=1)
 Buffers: shared hit=4
   ->  Index Only Scan using route_spgist on routes  (cost=0.41..575.13
rows=22900 width=7) (actual time=0.011..0.015 rows=11 loops=732)
 Index Cond: (route && (hmm.route)::inet)
 Heap Fetches: 8127
 Buffers: shared hit=12145
 Planning time: 0.779 ms
 Execution time: 12.603 ms
(10 rows)


Re: [HACKERS] SP-GiST support for inet datatypes

2016-03-08 Thread Oleg Bartunov
On Tue, Mar 8, 2016 at 11:17 PM, Oleg Bartunov  wrote:

>
>
> On Thu, Mar 3, 2016 at 11:45 AM, Emre Hasegeli  wrote:
>
>> > Emre, I checked the original thread and didn't find the sample data. Could you
>> provide it for testing?
>>
>> I found it on the Git history:
>>
>>
>> https://github.com/job/irrexplorer/blob/9e8b5330d7ef0022abbe1af18291257e044eb24b/data/irrexplorer_dump.sql.gz?raw=true
>>
>
> Thanks!
>
> The SP-GiST index builds 2 times faster than GiST, but the index size is
> noticeably bigger:
>
> \di+ route_*
> List of relations
>  Schema | Name | Type  |  Owner   | Table  |  Size  | Description
> +--+---+--+++-
>  public | route_gist   | index | postgres | routes | 96 MB  |
>  public | route_spgist | index | postgres | routes | 132 MB |
> (2 rows)
>
> The SP-GiST scan touches far fewer buffers than GiST - 12149 pages vs 1334760!
>

I also noticed that SP-GiST is much faster than GiST for other inet
operators. I'd like to see this in 9.6.



>
>
>
> EXPLAIN (ANALYZE, buffers) SELECT routes.route FROM routes JOIN hmm ON
> routes.route && hmm.route;
>QUERY PLAN
>
> 
>  Nested Loop  (cost=0.41..570430.27 rows=2338 width=7) (actual
> time=5.730..12085.747 rows=8127 loops=1)
>Buffers: shared hit=1334760
>->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.013..0.528 rows=732 loops=1)
>  Buffers: shared hit=4
>->  Index Only Scan using route_gist on routes  (cost=0.41..550.26
> rows=22900 width=7) (actual time=2.491..16.503 rows=11 loops=732)
>  Index Cond: (route && (hmm.route)::inet)
>  Heap Fetches: 8127
>  Buffers: shared hit=1334756
>  Planning time: 0.827 ms
>  Execution time: 12086.513 ms
> (10 rows)
>
> EXPLAIN (ANALYZE, buffers) SELECT routes.route FROM routes JOIN hmm ON
> routes.route && hmm.route;
>QUERY PLAN
>
> -
>  Nested Loop  (cost=0.41..588634.27 rows=2338 width=7) (actual
> time=0.043..12.150 rows=8127 loops=1)
>Buffers: shared hit=12149
>->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.013..0.075 rows=732 loops=1)
>  Buffers: shared hit=4
>->  Index Only Scan using route_spgist on routes  (cost=0.41..575.13
> rows=22900 width=7) (actual time=0.011..0.015 rows=11 loops=732)
>  Index Cond: (route && (hmm.route)::inet)
>  Heap Fetches: 8127
>  Buffers: shared hit=12145
>  Planning time: 0.779 ms
>  Execution time: 12.603 ms
> (10 rows)
>
>
>
>
>


Re: [HACKERS] The plan for FDW-based sharding

2016-03-11 Thread Oleg Bartunov
On Fri, Mar 11, 2016 at 9:09 AM, Bruce Momjian  wrote:

>
>
>
> 3.  I have tried to encourage others to get involved, with limited
> success.  I do think the FDW is perhaps the only reasonable way to get
> _built-in_ sharding.  The external sharding solutions are certainly
> viable, but external.  It is possible we will make all the FDW
> improvements, find out it doesn't work, but find out the improvements
> allow us to go in another direction.
>

I remember last summer's emails; we really wanted to participate in the
development, but it happened that all the slots were occupied by EDB and
NTT people. We wanted to work on distributed transactions and proposed our
XTM. Our feeling from the discussion at the time was that we were invited,
but all the doors were closed. It was a very bad experience. Hopefully, we
now understand our mutual misunderstanding.


>
> There seems to be serious interest in how this idea came about, so let
> me say what I remember.
>

I think the idea was obvious, so let's not discuss this.


>
> As for why there is so much hostility, I think this is typical for any
> ill-defined feature development.  There was simmering hostility to the
> Windows port and pg_upgrade for many years because those projects were
> not easy to define and risky, and had few active developers.  The
> agreement was that work could continue as long as destabilization wasn't
> introduced.  Ideally everything would have a well-defined plan, but it is
> sometimes hard to do.  Similar to our approach on parallelism (which is
> also super-important and doesn't have many active developers), sometimes you
> just need to create infrastructure and see how well it solves problems.
>
>

Our XTM is yet another example of the infrastructure we need to work on
clustering. Should we wait for some other smart guy to start thinking about
distributed transactions? We described our API at
https://wiki.postgresql.org/wiki/DTM; it is just a wrapper over existing
functions, but it will allow us, and fortunately others, to play with their
ideas. We built several prototypes, including an FDW-based one, to
demonstrate the viability of the API, and we plan to continue our work on
built-in high availability and multi-master. Of course, there will be a lot
to learn, but it will be much easier if XTM exists in core rather than as a
separate patch, which is really small anyway.


>
> --
>   Bruce Momjian  http://momjian.us
>   EnterpriseDB http://enterprisedb.com
>
> + As you are, so once was I. As I am, so you will be. +
> + Roman grave inscription +
>
>
>


Re: [HACKERS] WIP: Access method extendability

2016-03-11 Thread Oleg Bartunov
On Wed, Mar 9, 2016 at 8:31 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> Hi!
>
> On Wed, Mar 9, 2016 at 3:27 PM, Alvaro Herrera 
> wrote:
>
>> Hi.  As I just said to Tomas Vondra: since your patch creates a new
>> object type, please make sure to add a case to cover it in the
>> object_address.sql test.  That verifies some things such as
>> pg_identify_object and related.
>>
>
> Good catch, thanks! Tests were added.
> I also introduced numbering into patch names to make evident the order to
> their application.
>
>
Nice to see progress! I hope to see Alexander's work in 9.6.

Teodor and I will show at PGCon a new GIN AM as an extension, optimized for
full-text search (millisecond FTS), which we gave up on pushing into core.


> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>
>
>


Re: [HACKERS] eXtensible Transaction Manager API (v2)

2016-03-11 Thread Oleg Bartunov
On Fri, Mar 11, 2016 at 7:11 PM, David Steele  wrote:

> On 2/10/16 12:50 PM, Konstantin Knizhnik wrote:
>
> > The Postgres Professional cluster team wants to propose a new version of
> > the eXtensible Transaction Manager API.
> > Previous discussion concerning this patch can be found here:
> >
> >
> http://www.postgresql.org/message-id/f2766b97-555d-424f-b29f-e0ca0f6d1...@postgrespro.ru
>
> I see a lot of discussion on this thread but little in the way of
> consensus.
>
> > The API patch itself is small enough, but we think that it will be
> > strange to provide just API without examples of its usage.
>
> It's not all that small, though it does apply cleanly even after a few
> months.  At least that indicates there is not a lot of churn in this area.
>
> I'm concerned about the lack of response or reviewers for this patch.
> It may be because everyone believes they had their say on the original
> thread, or because it seems like a big change to go into the last CF, or
> for other reasons altogether.
>

We'll prepare an easy setup for playing with our solutions, so any
developer can see how it works. Hopefully this weekend we'll post something
about this.



>
> I think you should try to make it clear why this patch would be a win
> for 9.6.
>

It looks like the discussion has shifted to a different thread; we'll answer here.



>
> Is anyone willing to volunteer a review or make an argument for the
> importance of this patch?
>
> --
> -David
> da...@pgmasters.net
>
>


Re: [HACKERS] Background Processes and reporting

2016-03-12 Thread Oleg Bartunov
On Sat, Mar 12, 2016 at 12:45 AM, Andres Freund  wrote:

> On 2016-03-12 02:24:33 +0300, Alexander Korotkov wrote:
>
>

> > So, situation looks like dead-end.  I have no idea how to convince Robert
> > about any kind of advanced functionality of wait monitoring to
> PostgreSQL.
> > I'm thinking about implementing sampling extension over current
> > infrastructure just to make community see that it sucks. Andres, it would
> > be very nice if you have any idea how to move this situation forward.
>
> I've had my share of conflicts with Robert. But if I were in his shoes,
> targeted by this kind of rhetoric, I'd be very tempted to just ignore
> any further arguments from the origin.  So I think the way forward is
> for everyone to cool off, and to see how we can incrementally make
> progress from here on.
>
>
We are all very different people from different cultures, so online
discussion of ill-defined topics doesn't work well. Let's get back to work.


> > Another aspect is that EnterpriseDB offers waits monitoring in
> proprietary
> > fork [5].
>
>
> So?
>

So, Robert already has experience with the subject; probably he has had a
bad experience with the EDB implementation and would like to see something
better in the community version. That's fair, and I accept his position.

Wait monitoring is one of the popular requirements of Russian companies
that have migrated from Oracle. The overwhelming majority of them use
Linux, so I suggest having a configure flag to include wait monitoring at
compile time (disabled by default), or a GUC variable that is also off by
default, so that monitoring has zero to minimal overhead. That way we'll
satisfy many enterprises and help them choose Postgres, get feedback from
production use, and have time to improve the feature.



>
> Greetings,
>
> Andres Freund
>
>
>


Re: [HACKERS] [PATCH] we have added support for box type in SP-GiST index

2016-03-15 Thread Oleg Bartunov
On Mon, Mar 14, 2016 at 9:01 PM, David Steele  wrote:

> On 2/15/16 10:29 AM, Teodor Sigaev wrote:
>
>> It's a great pity, but the author is not able to continue work on this
>> patch, and I would like to raise this flag.
>>
>> I'd like to add some comments about patches:
>>
>> traversalValue patch adds an arbitrary value associated with a branch in
>> the SP-GiST tree walk. Unlike reconstructedValue, it can be just a pointer;
>> it doesn't have to be a regular pgsql type. Also, it can be used
>> independently of reconstructedValue. This patch is used by both of the
>> following attached patches.
>>
>> range patch just switches from reconstructedValue to traversalValue in the
>> range opclasses. reconstructedValue was used only because it was an
>> acceptable workaround in the case of the range type. The range opclass
>> stores the full value in leaves and doesn't need reconstructedValue to
>> return tuples in an index-only scan.
>> See http://www.postgresql.org/message-id/5399.1343512...@sss.pgh.pa.us
>>
>> q4d patch implements an index over boxes using SP-GiST. The basic idea was
>> the observation that the range opclass treats one-dimensional ranges as 2D
>> points. Following this idea, a 2D box (which is 2 points, or 4 numbers) can
>> represent a 4D point. We hoped this approach would be much more effective
>> than a traditional R-tree in the case of many overlaps in the box
>> collection. Performance results are coming shortly.
>>
>
> It appears that the issues raised in this thread have been addressed but
> the patch still has not gone though a real review.
>
> Anybody out there willing to take a crack at a review?  All three patches
> apply (with whitespace issues).
>

Emre Hasegeli will review the patch.


>
> Thanks,
> --
> -David
> da...@pgmasters.net
>


Re: [HACKERS] Background Processes and reporting

2016-03-15 Thread Oleg Bartunov
On Tue, Mar 15, 2016 at 7:43 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Tue, Mar 15, 2016 at 12:57 AM, Robert Haas 
> wrote:
>
>> On Mon, Mar 14, 2016 at 4:42 PM, Andres Freund 
>> wrote:
>> > On 2016-03-14 16:16:43 -0400, Robert Haas wrote:
>> >> > I have already shown [0, 1] the overhead of measuring timings in
>> linux on
>> >> > representative workload. AFAIK, these tests were the only one that
>> showed
>> >> > any numbers. All other statements about terrible performance have
>> been and
>> >> > remain unconfirmed.
>> >>
>> >> Of course, those numbers are substantial regressions which would
>> >> likely make it impractical to turn this on on a heavily-loaded
>> >> production system.
>> >
>> > A lot of people operating production systems are fine with trading a <=
>> > 10% impact for more insight into the system; especially if that
>> > configuration can be changed without a restart.  I know a lot of systems
>> > that use pg_stat_statements, track_io_timing = on, etc; just to get
>> > that. In fact there's people running perf more or less continuously in
>> > production environments; just to get more insight.
>> >
>> > I think it's important to get as much information out there without
>> > performance overhead, so it can be enabled by default. But I don't think
>> > it makes sense to not allow features in that cannot be enabled by
>> > default, *if* we tried to make them cheap enough beforehand.
>>
>> Hmm, OK.  I would have expected you to be on the other side of this
>> question, so maybe I'm all wet.  One point I am concerned about is
>> that, right now, we have only a handful of types of wait events.  I'm
>> very interested in seeing us add more, like I/O and client wait.  So
>> any overhead we pay here is likely to eventually be paid in a lot of
>> places - thus it had better be extremely small.
>>
>
> OK. Let's start to produce light, not heat.
>
> As I get it, we have two features which we suspect of introducing overhead:
> 1) Recording parameters of wait events, which requires some kind of
> synchronization protocol.
> 2) Recording the time of wait events, because time measurements might be
> expensive on some platforms.
>
> Simultaneously there are machines and workloads where both of these
> features don't produce measurable overhead.  And we're talking not about
> toy databases: Vladimir is a DBA from Yandex, which is in the TOP-20 (by
> traffic) internet companies in the world.  They run both of these features
> in a production high-load database without noticing any overhead.
>

> It would be great progress, if we decide that we could add both of these
> features controlled by GUC (off by default).
>

enable_waits_statistics?


>
> If we decide so, then let's start working on this. At first, we should
> construct list of machines and workloads for testing. Each list of machines
> and workloads would be not comprehensive. But let's find something that
> would be enough for testing of GUC controlled, off by default features.
> Then we can turn our conversation from theoretical thoughts to particular
> benchmarks which would be objective and convincing to everybody.
>

Vladimir, could you provide a test suite, so other people can measure the
overhead on their machines?




>
> Otherwise, let's just add these features to the list of unwanted
> functionality and close this question.
>
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>


Re: [HACKERS] Bug in searching path in jsonb_set when walking through JSONB array

2016-03-23 Thread Oleg Bartunov
On Wed, Mar 23, 2016 at 6:37 AM, Vitaly Burovoy 
wrote:

> Hello, Hackers!
>
> While I was reviewing a patch adding a "json_insert" function, I found a
> bug which isn't related to the patch and is reproducible on master.
>
> It complains about a non-integer path element whereas the input values are
> obviously integers within the allowed range.
> More testing led to the understanding that it appears when the number's
> length is a multiple of 4:
>
> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }',
> '"4"');
> ERROR:  path element at the position 2 is not an integer
>

Hmm, I see in master

select version();
 version
-
 PostgreSQL 9.6devel on x86_64-apple-darwin15.4.0, compiled by Apple LLVM
version 7.3.0 (clang-703.0.29), 64-bit
(1 row)

select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"');
 jsonb_set

 {"a": [[], 1, 2, 3, "4"], "b": []}
(1 row)




>
> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"b", 1000}',
> '"4"');
> ERROR:  path element at the position 2 is not an integer
>
> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", -999}',
> '"4"');
> ERROR:  path element at the position 2 is not an integer
>
> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a",
> 1000}', '"4"');
> ERROR:  path element at the position 2 is not an integer
>
> Close values are ok:
> postgres=# select jsonb_set('{"a":[[],1,2,3]}', '{"a", 0, 999}', '"4"');
> jsonb_set
> -
>  {"a": [["4"], 1, 2, 3]}
> (1 row)
>
> postgres=# select jsonb_set('{"a":[[],1,2,3]}', '{"a", 0, 1}', '"4"');
> jsonb_set
> -
>  {"a": [["4"], 1, 2, 3]}
> (1 row)
>
>
> Research led to setPathArray, where a string obtained via
> VARDATA_ANY is passed to strtol, which expects a cstring.
>
> When the string's length is not a multiple of 4, the remaining bytes
> are padded with '\0'; when the length is a multiple of 4 there is no
> padding, just garbage after the last digit of the value.
>
> Proposed patch in an attachment fixes it.
>
> There is a magic number "20" used as the length of an array into which a
> key from the path is copied before being passed to strtol. It is the
> maximal length of a value which can be parsed by the function. I could not
> find a proper constant for it. I also found similar hard-coded values
> elsewhere in the code (e.g. in numeric.c).
>
> I've added a comment, I hope it is enough for it.
>
>
> --
> Best regards,
> Vitaly Burovoy
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


Re: [HACKERS] Draft release notes for next week's releases

2016-03-28 Thread Oleg Bartunov
On Mar 28, 2016 09:44, "Peter Geoghegan"  wrote:
>
> On Sat, Mar 26, 2016 at 4:34 PM, Tom Lane  wrote:
> > Probably the most discussion-worthy item is whether we can say
> > anything more about the strxfrm mess.  Should we make a wiki
> > page about that and have the release note item link to it?
>
> I think that there is an argument against doing so, which is that
> right now, all we have to offer on that are weasel words. However, I'm
> still in favor of a Wiki page, because I would not be at all surprised
> if our understanding of this problem evolved, and we were able to
> offer better answers in several weeks. Realistically, it will probably
> take at least that long before affected users even start to think
> about this.

Should we start thinking about ICU? I compared Postgres with and without
ICU and found a 27x improvement in btree index creation for Russian
strings. This includes the effect of abbreviated keys as well as of ICU
itself. Also, we'd get system-independent locales.
>
>
> --
> Peter Geoghegan
>
>


Re: [HACKERS] Draft release notes for next week's releases

2016-03-28 Thread Oleg Bartunov
On Mon, Mar 28, 2016 at 1:21 PM, Peter Geoghegan  wrote:

> On Mon, Mar 28, 2016 at 12:08 AM, Oleg Bartunov 
> wrote:
> > Should we start thinking about ICU ? I compare Postgres with ICU and
> without
> > and found 27x improvement in btree index creation for russian strings.
> This
> > includes effect of abbreviated keys and ICU itself. Also, we'll get
> system
> > independent locale.
>
> I think we should. I want to develop a detailed proposal before
> talking about it more, though, because the idea is controversial.
>
> Did you use the FreeBSD ports patch? Do you have your own patch that
> you could share?
>

We'll post the patch. Teodor did something to get abbreviated keys working,
as I remember. I should say that I got the 27x improvement on my MacBook; I
will check on Linux.

>
> I'm not surprised that ICU is so much faster, especially now that
> UTF-8 is not a second class citizen (it's been possible to build ICU
> to specialize all its routines to handle UTF-8 for years now). As you
> may know, ICU supports partial sort keys, and sort key compression,
> which may have also helped:
> http://userguide.icu-project.org/collation/architecture
>

>
> That page also describes how binary sort keys are versioned, which
> allows them to be stored on disk. It says "A common example is the use
> of keys to build indexes in databases". We'd be crazy to trust Glibc
> strxfrm() to be stable *on disk*, but ICU already cares deeply about
> the things we need to care about, because it's used by other database
> systems like DB2, Firebird, and in some configurations SQLite [1].
>
> Glibc strxfrm() is not great with codepoints from the Cyrillic
> alphabet -- it seems to store 2 bytes per code-point in the primary
> weight level. So ICU might also do better in your test case for that
> reason.
>

Yes, I see on this page that ICU is ~3 times faster for Russian text.
http://site.icu-project.org/charts/collation-icu4c48-glibc


>
> [1]
> https://www.sqlite.org/src/artifact?ci=trunk&filename=ext/icu/README.txt
> --
> Peter Geoghegan
>


Re: [HACKERS] Draft release notes for next week's releases

2016-03-28 Thread Oleg Bartunov
On Mon, Mar 28, 2016 at 2:06 PM, Peter Geoghegan  wrote:

> On Mon, Mar 28, 2016 at 12:55 AM, Oleg Bartunov 
> wrote:
> >  We'll post the patch.
>
> Cool.
>
> > Teodor made something to get abbreviated keys work as
> > I remember. I should say, that 27x improvement I got on my macbook. I
> will
> > check on linux.
>
> I think that Linux will be much faster. The strxfrm() blob produced by
> Mac OSX will have a horribly low concentration of entropy. For an 8
> byte Datum, you get only 2 distinguishing bytes. It's really, really
> bad. Mac OSX probably makes very little use of strxfrm() in practice;
> there are proprietary APIs that do something similar, but all using
> UTF-16 only.
>

Yes, Linux is much, much faster; I see no difference in performance using
the latest ICU 57.1. I tested on Ubuntu 14.04.4.  But still, ICU gives us
abbreviated keys and collation stability, so let's add --with-icu.


>
> --
> Peter Geoghegan
>


Re: [HACKERS] Dealing with collation and strcoll/strxfrm/etc

2016-03-29 Thread Oleg Bartunov
On Mon, Mar 28, 2016 at 5:57 PM, Stephen Frost  wrote:

> All,
>
> Changed the thread name (we're no longer talking about release
> notes...).
>
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > Oleg Bartunov  writes:
> > > Should we start thinking about ICU ?
> >
> > Isn't it still true that ICU fails to meet our minimum requirements?
> > That would include (a) working with the full Unicode character range
> > (not only UTF16) and (b) working with non-Unicode encodings.  No doubt
> > we could deal with (b) by inserting a conversion, but that would take
> > a lot of shine off the performance numbers you mention.
> >
> > I'm also not exactly convinced by your implicit assumption that ICU is
> > bug-free.
>
> We have a wiki page about ICU.  I'm not sure that it's current, but if
> it isn't and people are interested then perhaps we should update it:
>
> https://wiki.postgresql.org/wiki/Todo:ICU
>
>
Good point, I forgot about this page.



> If we're going to talk about minimum requirements, I'd like to argue
> that we require whatever system we're using to have versioning (which
> glibc currently lacks, as I understand it...) to avoid the risk that
> indexes will become corrupt when whatever we're using for collation
> changes.  I'm pretty sure that's already bitten us on at least some
> RHEL6 -> RHEL7 migrations in some locales, even forgetting the issues
> with strcoll vs. strxfrm.
>

Agreed.


>
> Regarding key abbreviation and performance, if we are confident that
> strcoll and strxfrm are at least independently internally consistent
> then we could consider offering an option to choose between them.
> We'd need to identify what each index was built with to do so, however,
> as they would need to be rebuilt if the choice changes, at least
> until/unless they're made to reliably agree.  Even using only one or the
> other doesn't address the versioning problem though, which is a problem
> for all currently released versions of PG and is just going to continue
> to be an issue.
>

Ideally, we should benchmark all locales on all platforms for all kinds of
indexes. But that's a big project.


>
> Thanks!
>
> Stephen
>


Re: [HACKERS] Draft release notes for next week's releases

2016-03-29 Thread Oleg Bartunov
On Mon, Mar 28, 2016 at 1:36 PM, Thomas Kellerer  wrote:

> Oleg Bartunov-2 wrote
> > But still, icu provides us abbreviated keys and collation stability,
>
> Does including ICU mean that collation handling is identical across
> platforms?
> E.g. a query on Linux involving string comparison would yield the same
> result on MacOS and Windows?
>

Yes, it does, and that's the most important point for us.


>
> If that is the case I'm all for it.
>
> Currently the different behaviour in handling collation-aware string
> comparisons is a bug in my eyes from a user's perspective. I do understand
> and can accept the technical reasons for it, but it still feels odd that a
> query yields different results (with identical data) just because it runs
> on a different platform.
>
>
>
>
> --
> View this message in context:
> http://postgresql.nabble.com/Draft-release-notes-for-next-week-s-releases-tp5895357p5895484.html
> Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
>
>
>


Re: [HACKERS] Draft release notes for next week's releases

2016-03-29 Thread Oleg Bartunov
On Mon, Mar 28, 2016 at 6:08 PM, Robert Haas  wrote:

> On Mon, Mar 28, 2016 at 10:24 AM, Tom Lane  wrote:
> > I'm also not exactly convinced by your implicit assumption that ICU is
> > bug-free.
>

> Noah spent some time looking at ICU back when he was at EnterpriseDB, and
> his conclusion was that ICU collations weren't stable across releases,
> which is pretty much the same problem we're running into with glibc
> collations.  Now it might still be true that they have the equivalent
> of strxfrm() and strcoll() and that those things behave consistently
> with each other, and that would be very good.  Everybody seems to
> agree it's faster, and that's good, too.  But I wonder what we do
> about the fact that, as with glibc, an ICU upgrade involves a REINDEX
> of every potentially affected index.  It seems like ICU has some
> facilities built into it that might be useful for detecting and
> handling such situations, but I don't understand them well enough to
> know whether they'd solve our versioning problems or how effectively
> they would do so, and I think there are packaging issues that tie into
> it, too.  http://userguide.icu-project.org/design mentions building
> with specific configure flags if you need to link with multiple server
> versions, and I don't know what operating system packagers typically
> do about that stuff.
>
> In any case, I agree that we'd be very unwise to think that ICU is
> necessarily going to be bug-free without testing it carefully.
>

Agreed.

In another thread I wrote:
"Ideally, we should benchmark all locales on all platforms for all kinds of
indexes. But that's a big project."


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] New committer

2016-01-28 Thread Oleg Bartunov
On Thu, Jan 28, 2016 at 5:37 PM, Magnus Hagander 
wrote:

> Hello!
>
> The PostgreSQL core team would like to welcome Dean Rasheed as a new
> committer for the PostgreSQL project.
>
> Dean - welcome! Now let's see how quickly you can break the buildfarm!
>

Congratulations, of course!

I'd like to see, next time, a short introduction of the new committer, so
other developers could see what they are missing :)


>
> --
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/
>


[HACKERS] thanks for FOSDEM/PGDay 2016 Developer Meeting

2016-01-28 Thread Oleg Bartunov
I read https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2016_Developer_Meeting
and would like to say thanks for such a nice summary of the meeting.

Oleg


Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-29 Thread Oleg Bartunov
On Fri, Jan 29, 2016 at 1:11 PM, Alvaro Herrera 
wrote:

> Artur Zakirov wrote:
>
> > What status of this patch? In commitfest it is "Needs review".
>
> "Needs review" means it needs a reviewer to go over it and, uh, review
> it.  Did I send an email to you prodding you to review patches?  I sent
> such an email to several people from PostgresPro, but I don't remember
> getting a response from anyone, and honestly I don't see you guys/gal
> doing much review on-list.  If you can please talk to your colleagues so
> that they look over your patch, while at the same time your review their
> patches, that would help not only this one patch but everyone else's
> patches as well.
>

I think Teodor is planning to review these patches.


>
> > Can this patch get the status "Ready for Commiter"?
>
> Sure, as soon as it has gotten enough review to say it's past the "needs
> review" phase.  Just like all patches.
>
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Oleg Bartunov
On Tue, Feb 2, 2016 at 10:18 AM, Andreas Joseph Krogh 
wrote:

> On Tuesday, 2 February 2016 at 04:22:57, Michael Paquier <
> michael.paqu...@gmail.com> wrote:
>
>
>
> On Mon, Feb 1, 2016 at 8:21 PM, Dmitry Ivanov 
> wrote:
>>
>> This patch was originally developed by Teodor Sigaev and Oleg Bartunov in
>> 2009, so all credit goes to them. Any feedback is welcome.
>
> This is not a small patch:
> 28 files changed, 2441 insertions(+), 380 deletions(-)
> And the last CF of 9.6 should not contain rather large patches.
> --
> Michael
>
>
>
> OTOH; It would be extremely nice to get this into 9.6.
>

We'll see what the community decides.
Anyway, it's already in our distribution.



>
>
> --
> *Andreas Joseph Krogh*
>


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Oleg Bartunov
On Tue, Feb 2, 2016 at 2:04 PM, Alvaro Herrera 
wrote:

> Andreas Joseph Krogh wrote:
>
> > Which seems to indicate it has received a fair amount of testing and is
> quite
> > stable.
> > Hopefully it integrates into the 9.6 codebase without too much risk.
>
> Yes, yes, that's all very good, but we're nearing the closure of the 9.6
> development cycle and we only have one commitfest left.  If someone had
> lots of community brownie points because of doing lots of reviews of
> other people's patches, they might push their luck by posting this patch
> to the final commitfest.  But if that someone didn't, then it wouldn't
> be fair, and if I were the commitfest manager of that commitfest I would
> boot their patch to the 9.7-First commitfest.
>
> The current commitfest which I'm trying to close still has 24 patches in
> needs-review state and 11 patches ready-for-committer; the next one (not
> closed yet) has 40 patches that will need review.  That means a total of
> 75 patches, and those should all be processed ahead of this one.  The
> effort needed to process each of those patches is not trivial, and I'm
> sorry I have to say this but I don't see PostgresPro contributing enough
> reviews, even though I pinged a number of people there, so putting one
> more patch on the rest of the community's shoulders doesn't seem fair to
> me.
>

I'll talk about this.


>
> Everybody has their favorite patch that they want in the next release,
> but we only have so much manpower to review and integrate those patches.
> All review help is welcome.
>
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>


[HACKERS] old bug in full text parser

2016-02-10 Thread Oleg Bartunov
It looks like there is a very old bug in the full text parser (somebody
pointed it out to me), which appeared after tsearch2 was moved into core.
The problem is in how the full text parser processes hyphenated words. Our
original idea was to report the hyphenated word itself as well as its
parts, and to ignore the hyphen. That is how tsearch2 worked.

This behaviour changed after tsearch2 was moved into core:
1. The hyphen is now reported by the parser, which is useless.
2. Hyphenated words with numbers ('4-dot', 'dot-4') are processed
differently from ones with plain-text words like 'four-dot': no hyphenated
word itself is reported.

I think we should consider this a bug and produce a fix for all supported
versions.

After  investigation we found this commit:

commit 73e6f9d3b61995525785b2f4490b465fe860196b
Author: Tom Lane 
Date:   Sat Oct 27 19:03:45 2007 +

    Change text search parsing rules for hyphenated words so that digit
    strings containing decimal points aren't considered part of a
    hyphenated word.  Sync the hyphenated-word lookahead states with the
    subsequent part-by-part reparsing states so that we don't get
    different answers about how much text is part of the hyphenated word.
    Per my gripe of a few days ago.


8.2.23

select tok_type, description, token from ts_debug('dot-four');
  tok_type   |  description  |  token
-+---+--
 lhword  | Latin hyphenated word | dot-four
 lpart_hword | Latin part of hyphenated word | dot
 lpart_hword | Latin part of hyphenated word | four
(3 rows)

select tok_type, description, token from ts_debug('dot-4');
  tok_type   |  description  | token
-+---+---
 hword   | Hyphenated word   | dot-4
 lpart_hword | Latin part of hyphenated word | dot
 uint| Unsigned integer  | 4
(3 rows)

select tok_type, description, token from ts_debug('4-dot');
 tok_type |   description| token
--+--+---
 uint | Unsigned integer | 4
 lword| Latin word   | dot
(2 rows)

8.3.23

select alias, description, token from ts_debug('dot-four');
  alias  |   description   |  token
-+-+--
 asciihword  | Hyphenated word, all ASCII  | dot-four
 hword_asciipart | Hyphenated word part, all ASCII | dot
 blank   | Space symbols   | -
 hword_asciipart | Hyphenated word part, all ASCII | four
(4 rows)

select alias, description, token from ts_debug('dot-4');
   alias   |   description   | token
---+-+---
 asciiword | Word, all ASCII | dot
 int   | Signed integer  | -4
(2 rows)

select alias, description, token from ts_debug('4-dot');
   alias   |   description| token
---+--+---
 uint  | Unsigned integer | 4
 blank | Space symbols| -
 asciiword | Word, all ASCII  | dot
(3 rows)


Regards,
Oleg


Re: [HACKERS] old bug in full text parser

2016-02-10 Thread Oleg Bartunov
On Wed, Feb 10, 2016 at 12:28 PM, Oleg Bartunov  wrote:

> It looks like there is a very old bug in the full text parser (somebody
> pointed it out to me), which appeared after tsearch2 was moved into core.
> The problem is in how the full text parser processes hyphenated words.
> Our original idea was to report the hyphenated word itself as well as its
> parts, and to ignore the hyphen. That is how tsearch2 worked.
>
> This behaviour changed after tsearch2 was moved into core:
> 1. The hyphen is now reported by the parser, which is useless.
> 2. Hyphenated words with numbers ('4-dot', 'dot-4') are processed
> differently from ones with plain-text words like 'four-dot': no
> hyphenated word itself is reported.
>
> I think we should consider this a bug and produce a fix for all supported
> versions.
>
> After  investigation we found this commit:
>
> commit 73e6f9d3b61995525785b2f4490b465fe860196b
> Author: Tom Lane 
> Date:   Sat Oct 27 19:03:45 2007 +
>
> Change text search parsing rules for hyphenated words so that digit
> strings
> containing decimal points aren't considered part of a hyphenated word.
> Sync the hyphenated-word lookahead states with the subsequent
> part-by-part
> reparsing states so that we don't get different answers about how much
> text
> is part of the hyphenated word.  Per my gripe of a few days ago.
>
>
> 8.2.23
>
> select tok_type, description, token from ts_debug('dot-four');
>   tok_type   |  description  |  token
> -+---+--
>  lhword  | Latin hyphenated word | dot-four
>  lpart_hword | Latin part of hyphenated word | dot
>  lpart_hword | Latin part of hyphenated word | four
> (3 rows)
>
> select tok_type, description, token from ts_debug('dot-4');
>   tok_type   |  description  | token
> -+---+---
>  hword   | Hyphenated word   | dot-4
>  lpart_hword | Latin part of hyphenated word | dot
>  uint| Unsigned integer  | 4
> (3 rows)
>
> select tok_type, description, token from ts_debug('4-dot');
>  tok_type |   description| token
> --+--+---
>  uint | Unsigned integer | 4
>  lword| Latin word   | dot
> (2 rows)
>
> 8.3.23
>
> select alias, description, token from ts_debug('dot-four');
>   alias  |   description   |  token
> -+-+--
>  asciihword  | Hyphenated word, all ASCII  | dot-four
>  hword_asciipart | Hyphenated word part, all ASCII | dot
>  blank   | Space symbols   | -
>  hword_asciipart | Hyphenated word part, all ASCII | four
> (4 rows)
>
> select alias, description, token from ts_debug('dot-4');
>alias   |   description   | token
> ---+-+---
>  asciiword | Word, all ASCII | dot
>  int   | Signed integer  | -4
> (2 rows)
>
> select alias, description, token from ts_debug('4-dot');
>alias   |   description| token
> ---+--+---
>  uint  | Unsigned integer | 4
>  blank | Space symbols| -
>  asciiword | Word, all ASCII  | dot
> (3 rows)
>
>

Oh, one more bug, which existed even in tsearch2.

select tok_type, description, token from ts_debug('4-dot');
 tok_type |   description| token
--+--+---
 uint | Unsigned integer | 4
 lword| Latin word   | dot
(2 rows)




>
> Regards,
> Oleg
>


Re: [HACKERS] old bug in full text parser

2016-02-10 Thread Oleg Bartunov
On Wed, Feb 10, 2016 at 7:45 PM, Mike Rylander  wrote:

> On Wed, Feb 10, 2016 at 4:28 AM, Oleg Bartunov 
> wrote:
> > It looks like there is a very old bug in the full text parser (somebody
> > pointed it out to me), which appeared after tsearch2 was moved into
> > core. The problem is in how the full text parser processes hyphenated
> > words. Our original idea was to report the hyphenated word itself as
> > well as its parts, and to ignore the hyphen. That is how tsearch2
> > worked.
> >
> > This behaviour changed after tsearch2 was moved into core:
> > 1. The hyphen is now reported by the parser, which is useless.
> > 2. Hyphenated words with numbers ('4-dot', 'dot-4') are processed
> > differently from ones with plain-text words like 'four-dot': no
> > hyphenated word itself is reported.
> >
> > I think we should consider this a bug and produce a fix for all
> > supported versions.
> >
>
> The Evergreen project has long depended on tsearch2 (both as an
> extension and as in-core FTS), and one thing we've struggled with is date
> range parsing, such as birth and death years for authors in the form of
> 1979-2014, for instance.  Strings like that end up being parsed as two
> lexemes, "1979" and "-2014".  We work around this by pre-normalizing
> strings matching /(\d+)-(\d+)/ into two numbers separated by a space
> instead of a hyphen, but if fixing this bug removed the need for
> such a preprocessing step it would be a great help to us.  Would such
> strings be parsed "properly" into lexemes of the form "1979" and
> "2014" with your proposed change?
>
>
I'd love to treat all hyphenated "words" in one way, regardless of what "a
word" is (a number or plain text); namely, 'w1-w2' should be reported as
{'w1-w2', 'w1', 'w2'}. The problem is in the definition of "word".


We'll definitely look at the parser again; fortunately, we can just fork
the default parser and develop a new one so as not to break compatibility.
You have a chance to help us produce a "consistent" view of which tokens
the new parser should recognize and how to process them.





> Thanks!
>
> --
> Mike Rylander
>
> > After  investigation we found this commit:
> >
> > commit 73e6f9d3b61995525785b2f4490b465fe860196b
> > Author: Tom Lane 
> > Date:   Sat Oct 27 19:03:45 2007 +
> >
> > Change text search parsing rules for hyphenated words so that digit
> > strings
> > containing decimal points aren't considered part of a hyphenated
> word.
> > Sync the hyphenated-word lookahead states with the subsequent
> > part-by-part
> > reparsing states so that we don't get different answers about how
> much
> > text
> > is part of the hyphenated word.  Per my gripe of a few days ago.
> >
> >
> > 8.2.23
> >
> > select tok_type, description, token from ts_debug('dot-four');
> >   tok_type   |  description  |  token
> > -+---+--
> >  lhword  | Latin hyphenated word | dot-four
> >  lpart_hword | Latin part of hyphenated word | dot
> >  lpart_hword | Latin part of hyphenated word | four
> > (3 rows)
> >
> > select tok_type, description, token from ts_debug('dot-4');
> >   tok_type   |  description  | token
> > -+---+---
> >  hword   | Hyphenated word   | dot-4
> >  lpart_hword | Latin part of hyphenated word | dot
> >  uint| Unsigned integer  | 4
> > (3 rows)
> >
> > select tok_type, description, token from ts_debug('4-dot');
> >  tok_type |   description| token
> > --+--+---
> >  uint | Unsigned integer | 4
> >  lword| Latin word   | dot
> > (2 rows)
> >
> > 8.3.23
> >
> > select alias, description, token from ts_debug('dot-four');
> >   alias  |   description   |  token
> > -+-+--
> >  asciihword  | Hyphenated word, all ASCII  | dot-four
> >  hword_asciipart | Hyphenated word part, all ASCII | dot
> >  blank   | Space symbols   | -
> >  hword_asciipart | Hyphenated word part, all ASCII | four
> > (4 rows)
> >
> > select alias, description, token from ts_debug('dot-4');
> >alias   |   description   | token
> > ---+-+---
> >  asciiword | Word, all ASCII | dot
> >  int   | Signed integer  | -4
> > (2 rows)
> >
> > select alias, description, token from ts_debug('4-dot');
> >alias   |   description| token
> > ---+--+---
> >  uint  | Unsigned integer | 4
> >  blank | Space symbols| -
> >  asciiword | Word, all ASCII  | dot
> > (3 rows)
> >
> >
> > Regards,
> > Oleg
>


Re: [HACKERS] old bug in full text parser

2016-02-10 Thread Oleg Bartunov
On Wed, Feb 10, 2016 at 7:21 PM, Tom Lane  wrote:

> Oleg Bartunov  writes:
> > It looks like there is a very old bug in the full text parser (somebody
> > pointed it out to me), which appeared after tsearch2 was moved into
> > core. The problem is in how the full text parser processes hyphenated
> > words. Our original idea was to report the hyphenated word itself as
> > well as its parts, and to ignore the hyphen. That is how tsearch2
> > worked.
>
> > This behaviour changed after tsearch2 was moved into core:
> > 1. The hyphen is now reported by the parser, which is useless.
> > 2. Hyphenated words with numbers ('4-dot', 'dot-4') are processed
> > differently from ones with plain-text words like 'four-dot': no
> > hyphenated word itself is reported.
>
> > I think we should consider this a bug and produce a fix for all
> > supported versions.
>
> I don't see anything here that looks like a bug, more like a definition
> disagreement.  As such, I'd be pretty dubious about back-patching a
> change.  But it's hard to debate the merits when you haven't said exactly
> what you'd do instead.
>

Yeah, better to say not a bug but an inconsistency. We definitely should
work on a better, "consistent" parser with predictable behaviour.


>
> I believe the commit you mention was intended to fix this inconsistency:
>
> http://www.postgresql.org/message-id/6269.1193184...@sss.pgh.pa.us
>
> so I would be against simply reverting it.  In any case, the examples
> given there make it look like there was already inconsistency about mixed
> words and numbers.  Do we really think that "4-dot" should be considered
> a hyphenated word?  I'm not sure.
>

I agree that we shouldn't just revert it. My idea is to work on a new
parser and leave the old one as-is for compatibility reasons. Fortunately,
FTS is flexible enough that we can add a new parser at any time as an
extension.



>
> regards, tom lane
>


Re: [HACKERS] [WIP] ALTER ... OWNER TO ... CASCADE

2016-02-16 Thread Oleg Bartunov
On Mon, Feb 15, 2016 at 7:25 PM, Tom Lane  wrote:

> Teodor Sigaev  writes:
> >> So basically, a generic CASCADE facility sounds like a lot of work to
> >> produce something that would seldom be anything but a foot-gun.
>
> > DELETE FROM or TRUNCATE could be a foot-gun too, but that's not a
> > reason to remove them. I faced this problem when I tried to change the
> > owner of a database with all objects inside. I think this feature could
> > be useful, although it should be restricted to superusers only.
>
> That's a pretty weak argument, and I do not think you have thought through
> all the consequences.  It is not hard at all to imagine cases where using
> this sort of thing could be a security vulnerability.  Are you familiar
> with the reasons why Unix systems don't typically allow users to "give
> away" ownership of files?  The same problem exists here.
>

yes, I remember AT&T and BSD :)



>
> To be concrete about it:
>
> 1. Alice does, say, "CREATE EXTENSION cube".
>
> 2. Bob creates a security-definer function owned by himself, using a
>"cube"-type parameter so that it's dependent on the extension.
>(It needn't actually do anything with that parameter.)
>
> 3. Alice does ALTER EXTENSION cube OWNER TO charlie CASCADE.
>
> 4. Bob now has a security-definer function owned by (and therefore
>executing as) Charlie, whose contents were determined by Bob.
>Game over for Charlie ... and for everyone else too, if Charlie is
>a superuser, which is not unlikely for an extension owner.
>
> The only way Alice can be sure that the ALTER EXTENSION is safe is if
> she manually inspects every dependent object, in which case she might
> as well not use CASCADE.
>
> Moreover, the use case you've sketched (ie, change ownership of all
> objects inside a database) doesn't actually have anything to do with
> following dependencies.  It's a lot closer to REASSIGN OWNED ... in
> fact, it's not clear to me why REASSIGN OWNED doesn't solve that
> use-case already.
>
> I remain of the opinion that this is a terrible idea.
>

+1. I also suggest checking REASSIGN OWNED.


>
>
> regards, tom lane
>
>
>


[HACKERS] commitfest problem ?

2016-02-16 Thread Oleg Bartunov
This entry https://commitfest.postgresql.org/8/419/ contains unrelated
patches from another commitfest, I think.


Oleg


Re: [HACKERS] pglogical - logical replication contrib module

2016-02-16 Thread Oleg Bartunov
On Tue, Feb 16, 2016 at 5:38 PM, Bruce Momjian  wrote:

> On Tue, Jan 26, 2016 at 08:14:26PM -0800, Joshua Drake wrote:
> > On 12/31/2015 03:34 PM, Petr Jelinek wrote:
> > >Hi,
> > >
> > >I'd like to submit the replication solution which is based on the
> > >pglogical_output [1] module (which is obviously needed for this to
> > >compile).
> >
> > This is fantastic! However, history presents itself here and
> > PostgreSQL in the past has not "blessed" a single solution for
> > Replication. Obviously that changed a bit with streaming replication
> > but this is a bit different than that. As I understand it, PgLogical
> > is Logical Replication (similar to Slony and Londiste). I wouldn't
> > be surprised (although I don't know) if Slony were to start using
> > some of the pglogical_output module features in the future.
> >
> > If we were to accept PgLogical into core, it will become the default
> > blessed solution for PostgreSQL. While that is great in some ways
> > it is a different direction than the project has taken in the past.
> > Is this what we want to do?
>
> Replying late here, but I think with binary replication, we decided
> that, assuming you were happy with the features provided, our streaming
> binary replication solution was going to be the best and recommended way
> of doing it.
>
> I don't think we ever had that feeling with Slony or Londiste in that
> there were so many limitations and so many different ways of
> implementing logical replication that we never recommended a best way.
>
> So, the question is, do we feel that PgLogical is best and recommended
> way to do logical replication.  If it is, then having it in core makes
> sense.
>

DDL support is what it's missing for now.


>
> --
>   Bruce Momjian  http://momjian.us
>   EnterpriseDB http://enterprisedb.com
>
> + As you are, so once was I. As I am, so you will be. +
> + Roman grave inscription +
>
>
>


Re: [HACKERS] tsearch_extras extension

2016-02-16 Thread Oleg Bartunov
On Wed, Feb 17, 2016 at 6:57 AM, Tim Abbott  wrote:

> Just following up here since I haven't gotten a reply -- I'd love to work
> with someone from the Postgres community on a plan to make the
> tsearch_extras functionality available as part of mainline postgres.
>
>
> -Tim Abbott
>
> On Wed, Feb 3, 2016 at 9:41 PM, Tim Abbott  wrote:
>
>> Hello,
>>
>> I'm a maintainer of the Zulip open source group chat application.  Zulip
>> depends on a small (~200 lines) postgres extension called tsearch_extras (
>> https://github.com/zbenjamin/tsearch_extras) that returns the actual
>> (offset, length) pairs of all the matches for a postgres full text search
>> query.  This extension allows Zulip to do its own highlighting of the full
>> text search matches, using a more complicated method than what Postgres
>> supports natively.
>>
>> I think tsearch_extras is probably of value to others using postgres
>> full-text search (and I'd certainly love to get out of the business of
>> maintaining an out-of-tree postgres extension), so I'm wondering if this
>> feature (or a variant of it) would be of interest to upstream?
>>
>> Thanks!
>>
>> -Tim Abbott
>>
>> (See
>> http://www.postgresql.org/message-id/flat/52c7186d.8010...@strangersgate.com#52c7186d.8010...@strangersgate.com
>> for the discussion on postgres mailing lists that caused us to write this
>> module in the first place.)
>>
>
Tim,

take a look at this patch (https://commitfest.postgresql.org/7/385/) and
contact the author.  It contains everything you need for your purposes.
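To illustrate the (offset, length) pairs Tim describes, here is a toy
Python matcher (purely illustrative — real full text search applies
dictionaries and stemming; this only lowercases and does prefix matching):

```python
import re

def match_positions(text, lexemes):
    """Toy illustration: return (offset, length) pairs for every word
    whose lowercased form starts with one of the given lexemes, roughly
    the information tsearch_extras exposes for highlighting."""
    positions = []
    for token in re.finditer(r'\w+', text):
        word = token.group().lower()
        if any(word.startswith(lex) for lex in lexemes):
            positions.append((token.start(), len(token.group())))
    return positions

print(match_positions("Search searching searched elsewhere", ["search"]))
```

With the byte offsets in hand, an application can do its own markup of
the original text instead of relying on ts_headline.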

BTW, Stas, check the "Returned with feedback" status!


Regards,
Oleg


Re: [HACKERS] Figures in docs

2016-02-16 Thread Oleg Bartunov
On Wed, Feb 17, 2016 at 4:17 AM, Tatsuo Ishii  wrote:

> It seems there's no figures/diagrams in our docs. I vaguely recall that
> we used to have a few diagrams in our docs. If so, was there any
> technical reason to remove them?
>

I don't know the reason, but it's a shame that we are still in SGML!

We already do our translations (like others) in XML using custom scripting.
XML gives us better integration with the available tools and the ability to
insert graphics. Last time we asked in -docs about moving to XML,
Alexander demonstrated an acceptable speed for the XML build, but there was no
reply from Peter, who is (?) responsible for our documentation
infrastructure. Probably we should just create a patch and submit it to the
commitfest.  You can check this thread
http://www.postgresql.org/message-id/1428009501118.85...@postgrespro.ru


> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
>
>


Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

2016-02-17 Thread Oleg Bartunov
On Wed, Feb 17, 2016 at 6:55 PM, Constantin S. Pan  wrote:

> On Sat, 16 Jan 2016 01:38:39 +0300
> "Constantin S. Pan"  wrote:
>
> > The task of building GIN can require lots of time and eats 100 % CPU,
> > but we could easily make it use more than a 100 %, especially since we
> > now have parallel workers in postgres.
> >
> > The process of building GIN looks like this:
> >
> > 1. Accumulate a batch of index records into an rbtree in maintenance
> > work memory.
> >
> > 2. Dump the batch to disk.
> >
> > 3. Repeat.
> >
> > I have a draft implementation which divides the whole process between
> > N parallel workers, see the patch attached. Instead of a full scan of
> > the relation, I give each worker a range of blocks to read.
> >
> > This speeds up the first step N times, but slows down the second one,
> > because when multiple workers dump item pointers for the same key,
> > each of them has to read and decode the results of the previous one.
> > That is a huge waste, but there is an idea on how to eliminate it.
> >
> > When it comes to dumping the next batch, a worker does not do it
> > independently. Instead, it (and every other worker) sends the
> > accumulated index records to the parent (backend) in ascending key
> > order. The backend, which receives the records from the workers
> > through shared memory, can merge them and dump each of them once,
> > without the need to reread the records N-1 times.
> >
> > In current state the implementation is just a proof of concept
> > and it has all the configuration hardcoded, but it already works as
> > is, though it does not speed up the build process more than 4 times
> > on my configuration (12 CPUs). There is also a problem with temporary
> > tables, for which the parallel mode does not work.
>
> Hey Hackers!
>
> I have made some progress on the proposal (see the attached patch):
>
> 0. Moved some repeated code to functions (e.g. ginDumpAccumulator,
> ginbuildCommon).
>
> 1. Implemented results merging on backend.
>
> 2. Disabled the usage of parallel mode when creating index on temporary
> tables. No point in using parallel mode for temporary tables anyway,
> right?
>
> 3. Added GUC parameter to control the number of workers for GIN
> building.
>
> 4. Hit the 8x speedup limit. Made some analysis of the reasons (see the
> attached plot or the data file).
>
> In order to analyze the performance issues, I have made the following:
>
> create table t (k int, v int[]);
>
> create or replace
> function randarray(width int, low int, high int)
> returns int[] as
> $$
> select array(select (random()*(high-low) + low)::int
> from generate_series(1,width))
> $$ language sql;
>
> insert into t select k, randarray(3000, 0, 10)
> from generate_series(1, 10) as k;
>
> create index t_v_idx on t using gin (v);
>
> This creates 10 arrays of 3000 random numbers each. The random
> numbers are in range [0, 10]. Then I measure how long the gin
> building steps take. There are two steps: scan and merge.
>
> The results show that 'scan' step is sped up perfectly. But the
> 'merge' step takes longer as you increase the number of workers. The
> profiler shows that the bottleneck here is ginMergeItemPointers(), which
> I use to merge the results.
>
> Also, I did encounter the problem with workers deadlocking during
> heap_open, but that seems to have been resolved by Robert Haas in his
> commit regarding group locking.
>
> Please leave your feedback!
>

My feedback is (Mac OS X 10.11.3)

set gin_parallel_workers=2;
create index message_body_idx on messages using gin(body_tsvector);
LOG:  worker process: parallel worker for PID 5689 (PID 6906) was
terminated by signal 11: Segmentation fault
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably me
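Setting the crash aside: the merge bottleneck Constantin describes
(ginMergeItemPointers rereading each worker's results) is essentially a
k-way merge of sorted item-pointer streams. A minimal, purely
illustrative Python sketch of the single-pass alternative:

```python
import heapq

def merge_worker_streams(streams):
    """Merge N ascending streams of (block, offset) item pointers into
    one sorted, deduplicated stream in a single k-way pass -- the job
    ginMergeItemPointers does pairwise (illustrative only, not the
    patch's actual C code)."""
    merged = []
    for ptr in heapq.merge(*streams):
        if not merged or merged[-1] != ptr:
            merged.append(ptr)
    return merged

worker1 = [(1, 1), (2, 3), (5, 1)]
worker2 = [(1, 1), (2, 4), (6, 2)]
print(merge_worker_streams([worker1, worker2]))
```

Each input pointer is touched once, instead of being re-read and
re-decoded up to N-1 times.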

Re: [HACKERS] The plan for FDW-based sharding

2016-02-24 Thread Oleg Bartunov
On Wed, Feb 24, 2016 at 12:17 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> Hi, Bruce!
>
> The important point for me is to distinguish different kind of plans:
> implementation plan and research plan.
> If we're talking about implementation plan then it should be proven that
> proposed approach works in this case. I.e research should be already done.
> If we're talking about research plan then we should realize that result is
> unpredictable. And we would probably need to dramatically change our way.
>
> This two things would work with FDW:
> 1) Pull data from data nodes to coordinator.
> 2) Pushdown computations from coordinator to data nodes: joins, aggregates
> etc.
> It's proven and clear. This is good.
> Another point is that these FDW advances are useful by themselves. This is
> good too.
>
> However, the model of FDW assumes that communication happen only between
> coordinator and data node. But a full-weight distributed optimizer can't be
> done under this restriction, because it requires every node to communicate
> every other node if it makes distributed query faster. And as I get, FDW
> approach currently have no research and no particular plan for that.
>
> As I get from Robert Haas's talk (
> https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxyb2JlcnRtaGFhc3xneDo1ZmFhYzBhNjNhNzVhMDM0
> )
>
>> Before we consider repartitioning joins, we should probably get
>> everything previously discussed working first.
>> – Join Pushdown For Parallelism, FDWs
>> – PartialAggregate/FinalizeAggregate
>> – Aggregate Pushdown For Parallelism, FDWs
>> – Declarative Partitioning
>> – Parallel-Aware Append
>
>
> So, as I get we didn't ever think about possibility of data redistribution
> using FDW. Probably, something changed since that time. But I haven't heard
> about it.
>
> On Tue, Feb 23, 2016 at 7:43 PM, Bruce Momjian  wrote:
>
>> Second, as part of this staged implementation, there are several use
>> cases that will be shardable at first, and then only later, more complex
>> ones.  For example, here are some use cases and the technology they
>> require:
>>
>> 1. Cross-node read-only queries on read-only shards using aggregate
>> queries, e.g. data warehouse:
>>
>> This is the simplest to implement as it doesn't require a global
>> transaction manager, global snapshot manager, and the number of rows
>> returned from the shards is minimal because of the aggregates.
>>
>> 2. Cross-node read-only queries on read-only shards using non-aggregate
>> queries:
>>
>> This will stress the coordinator to collect and process many returned
>> rows, and will show how well the FDW transfer mechanism scales.
>>
>
> FDW would work for queries which fits pull-pushdown model. I see no plan
> to make other queries work.
>
>
>> 3. Cross-node read-only queries on read/write shards:
>>
>> This will require a global snapshot manager to make sure the shards
>> return consistent data.
>>
>> 4. Cross-node read-write queries:
>>
> This will require a global transaction manager and a global snapshot manager.
>>
>
> At this point, it unclear why don't you refer work done in the direction
> of distributed transaction manager (which is also distributed snapshot
> manager in your terminology)
> http://www.postgresql.org/message-id/56bb7880.4020...@postgrespro.ru
>
>
>> In 9.6, we will have FDW join and sort pushdown
>> (http://thombrown.blogspot.com/2016/02/postgresql-96-part-1-horizontal-s
>> calability.html
>> ).
>> Unfortunately I don't think we will have aggregate
>> pushdown, so we can't test #1, but we might be able to test #2, even in
>> 9.5.  Also, we might have better partitioning syntax in 9.6.
>>
>> We need things like parallel partition access and replicated lookup
>> tables for more join pushdown.
>>
>> In a way, because these enhancements are useful independent of sharding,
>> we have not tested to see how well an FDW sharding setup will work and
>> for which workloads.
>>
>
> This is the point I agree. I'm not objecting against any single FDW
> advance, because it's useful by itself.
>
> We know Postgres XC/XL works, and scales, but we also know they require
>> too many code changes to be merged into Postgres (at least based on
>> previous discussions).  The FDW sharding approach is to enhance the
>> existing features of Postgres to allow as much sharding as possible.
>>
>
> This comparison doesn't seem correct to me. Postgres XC/XL supports data
> redistribution between nodes. And I haven't heard any single idea of
> supporting this in FDW. You are comparing not equal things.
>
>
>> Once that is done, we can see what workloads it covers and
>> decide if we are willing to copy the volume of code necessary
>> to implement all supported Postgres XC or XL workloads.
>> (The Postgres XL license now matches the Postgres license,
>> http://www.postgres-xl.org/2015/07/license-change-and-9-5-merge/.
>> Postgres XC has

Re: [HACKERS] The plan for FDW-based sharding

2016-02-26 Thread Oleg Bartunov
On Fri, Feb 26, 2016 at 3:50 PM, Robert Haas  wrote:

> On Wed, Feb 24, 2016 at 3:05 PM, Oleg Bartunov 
> wrote:
> > I already several times pointed, that we need XTM to be able to continue
> > development in different directions, since there is no clear winner.
> > Moreover, I think there is no fits-all  solution and while I agree we
> need
> > one built-in in the core, other approaches should have ability to exists
> > without patching.
>
> I don't think I necessarily agree with that.  Transaction management
> is such a fundamental part of the system that I think making it
> pluggable is going to be really hard.  I understand that you've done
> several implementations based on your proposed API, and that's good as
> far as it goes, but how do we know that's really going to be general
> enough for what other people might need?


Right now the tm is hardcoded, so whether "other people might need"
anything else doesn't matter at all.  We at least provide developers ("other
people") the ability to work on their own implementations, and the patch is
safe and doesn't sacrifice anything in core.



> And what makes us think we
> really need multiple transaction managers, anyway?



If you are brave enough to say that one tm fits all, and you are able to teach
the existing tm to play well in various clustering environments during the
development period, which is short, then probably we don't need multiple
tms. But that's too perfect to believe in, and the practical solution is to
let multiple groups work on their own solutions.



> Even writing one
> good distributed transaction manager seems like a really hard project
> - why would we want to write two or three or five?
>

Again, right now it's simply impossible for any bright person to work on
dtms.  It's time to start working on a dtm, I believe. The fact that you don't
think about distributed transaction support doesn't mean there are no "other
people" who have different ideas about the future of Postgres.  That's why we
propose this patch; let's play the game!



>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


[HACKERS] Fwd: [Snowball-discuss] Greek stemmer

2016-08-25 Thread Oleg Bartunov
This is a chance to add a default configuration for the Greek language, if
somebody with good knowledge of it could follow this development.

Oleg

-- Forwarded message --
From: Oleg Smirnov 
Date: Thu, Aug 25, 2016 at 5:26 PM
Subject: [Snowball-discuss] Greek stemmer
To: "snowball-discu." 


Hi all,

 I have implemented a stemmer for Modern Greek language [1] based on a
thesis by G. Ntais [2] with improvements proposed by S. Saroukos [3]

 I'm pretty new to Snowball so it will be great if someone could
review my code. Any feedback is much appreciated.

 1. https://github.com/snowballstem/snowball/pull/44
 2. http://sais.se/mthprize/2007/ntais2007.pdf
 3. http://tampub.uta.fi/bitstream/handle/10024/80480/gradu03463.pdf

--
Regards,
Oleg Smirnov

___
Snowball-discuss mailing list
snowball-disc...@lists.tartarus.org
http://lists.tartarus.org/mailman/listinfo/snowball-discuss




Re: [HACKERS] Better tracking of free space during SP-GiST index build

2016-09-25 Thread Oleg Bartunov
On Sat, Sep 24, 2016 at 11:32 PM, Tomas Vondra
 wrote:
> On 09/22/2016 07:37 PM, Tom Lane wrote:
>>
>> Tomas Vondra  writes:
>>
>>> ... I've tried increasing the cache size to 768
>>> entries, with vast majority of them (~600) allocated to leaf pages.
>>> Sadly, this seems to only increase the CREATE INDEX duration a bit,
>>> without making the index significantly smaller (still ~120MB).
>>
>>
>> Yeah, that's in line with my results: not much further gain from a
>> larger cache.  Though if you were testing with the same IRRExplorer
>> data, it's not surprising that our results would match.  Would be
>> good to try some other cases...
>>
>
> Agreed, but I don't have any other data sets at hand. One possibility would
> be to generate something randomly (e.g. it's not particularly difficult to
> generate random IP addresses), but I'd much rather use some real-world data
> sets.

Tomas, I have one real dataset, which I used for testing spgist
(https://www.postgresql.org/message-id/caf4au4zxd2xov0a__fu7xohxsiwjzm1z2xhs-ffat1dzb9u...@mail.gmail.com)
Let me know if you need it.
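As Tomas notes, a synthetic dataset is also easy to produce; a quick
sketch of generating random IPv4 addresses (e.g. to COPY into a table
with an inet column and build an SP-GiST index on it):

```python
import random

def random_ips(n, seed=42):
    """Generate n random IPv4 addresses as text; seeded so the dataset
    is reproducible across runs."""
    rng = random.Random(seed)
    return ['%d.%d.%d.%d' % tuple(rng.randrange(256) for _ in range(4))
            for _ in range(n)]

# Write a COPY-ready file, one address per line.
with open('ips.txt', 'w') as f:
    f.write('\n'.join(random_ips(100000)))
```

Of course a real-world dataset (like the IRRExplorer data) exercises the
page-cache behavior differently than uniformly random values.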

>
>>>
>>>
>>> One thing I'd change is making the SpGistLUPCache dynamic, i.e.
>>> storing the size and lastUsedPagesMap on the meta page. That
>>> should allow us resizing the cache and tweak lastUsedPagesMap in
>>> the future.
>>
>>
>> Yeah, probably a good idea. I had thought of bumping
>> SPGIST_MAGIC_NUMBER again if we want to revisit the cache size; but
>> keeping it as a separate field won't add noticeable cost, and it
>> might save some trouble.
>>
>
> I see you plan to track only the cache size, while I proposed to track also
> the map, i.e. number of pages per category. I think that'd useful in case we
> come up with better values (e.g. more entries for leaf pages), or even
> somewhat adaptive way.
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>




Re: [HACKERS] How to implement a SP-GiST index as a extension module?

2017-10-30 Thread Oleg Bartunov
On Sun, Oct 29, 2017 at 10:07 AM, Connor Wolf
 wrote:
> Hi there!
>
> I'm looking at implementing a custom indexing scheme, and I've been having
> trouble understanding the proper approach.
>
> Basically, I need a BK tree, which is a tree-structure useful for indexing
> arbitrary discrete metric-spaces (in my case, I'm interested in indexing
> across the hamming edit-distance of perceptual hashes, for fuzzy image
> searching). I'm pretty sure a SP-GiST index is the correct index type, as my
> tree is intrinsically unbalanced.
>
> I have a functional stand-alone implementation of a BK-Tree, and it works
> very well, but the complexity of managing what is basically a external index
> for my database has reached the point where it's significantly problematic,
> and it seems to be it should be moved into the database.
>
> Anyways, looking at the contents of postgres/src/backend/access/spgist, it
> looks pretty straightforward in terms of the actual C implementation, but
> I'm stuck understanding how to "install" a custom SP-GiST implementation.
> There are several GiST indexing implementations in the contrib directory,
> but no examples for how I'd go about implementing a loadable SP-GiST index.
>
> Basically, my questions are:
>
> Is it possible to implement a SP-GiST indexing scheme as a loadable module?
>
> If so, how?
> And is there an example I can base my implementation off of?

Look at the RUM access method ( https://github.com/postgrespro/rum ), which we
developed using the API available since 9.6.


>
> I'm relatively comfortable with C (much moreso with C++), but I haven't
> spent a lot of time looking at the postgresql codebase.  I don't think I
> could start from a empty folder and make a properly-implemented module in
> any reasonable period of time, so if I have a working example for some sort
> of index that uses the same interfaces that would really help a lot.
>
> Thanks!
> Connor
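For what it's worth, the BK-tree Connor describes is small enough to
sketch. A toy Python version over hamming distance on integer hashes
(illustrative of the data structure only; unrelated to the SP-GiST API):

```python
def hamming(a, b):
    """Hamming distance between two integer hashes."""
    return bin(a ^ b).count('1')

class BKTree:
    """Minimal BK-tree over a discrete metric; children are keyed by
    their distance to the node's item."""
    def __init__(self, item):
        self.item = item
        self.children = {}  # distance -> subtree

    def add(self, item):
        d = hamming(item, self.item)
        if d in self.children:
            self.children[d].add(item)
        else:
            self.children[d] = BKTree(item)

    def search(self, query, radius):
        """All stored items within `radius` of query; the triangle
        inequality prunes subtrees outside [d - radius, d + radius]."""
        d = hamming(query, self.item)
        hits = [self.item] if d <= radius else []
        for child_d, child in self.children.items():
            if d - radius <= child_d <= d + radius:
                hits.extend(child.search(query, radius))
        return hits

t = BKTree(0b1010)
for h in (0b1000, 0b1111, 0b0101):
    t.add(h)
print(sorted(t.search(0b1010, 1)))
```

The SP-GiST "choose" and "inner_consistent" callbacks map fairly
naturally onto the add and search steps above.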




Re: [HACKERS] How to implement a SP-GiST index as a extension module?

2017-10-30 Thread Oleg Bartunov
On Mon, Oct 30, 2017 at 12:05 PM, Oleg Bartunov  wrote:
> On Sun, Oct 29, 2017 at 10:07 AM, Connor Wolf
>  wrote:
>> Hi there!
>>
>> I'm looking at implementing a custom indexing scheme, and I've been having
>> trouble understanding the proper approach.
>>
>> Basically, I need a BK tree, which is a tree-structure useful for indexing
>> arbitrary discrete metric-spaces (in my case, I'm interested in indexing
>> across the hamming edit-distance of perceptual hashes, for fuzzy image
>> searching). I'm pretty sure a SP-GiST index is the correct index type, as my
>> tree is intrinsically unbalanced.
>>
>> I have a functional stand-alone implementation of a BK-Tree, and it works
>> very well, but the complexity of managing what is basically a external index
>> for my database has reached the point where it's significantly problematic,
>> and it seems to be it should be moved into the database.
>>
>> Anyways, looking at the contents of postgres/src/backend/access/spgist, it
>> looks pretty straightforward in terms of the actual C implementation, but
>> I'm stuck understanding how to "install" a custom SP-GiST implementation.
>> There are several GiST indexing implementations in the contrib directory,
>> but no examples for how I'd go about implementing a loadable SP-GiST index.
>>
>> Basically, my questions are:
>>
>> Is it possible to implement a SP-GiST indexing scheme as a loadable module?
>>
>> If so, how?
>> And is there an example I can base my implementation off of?
>
> Look at the RUM access method ( https://github.com/postgrespro/rum ), which we
> developed using the API available since 9.6.

Or, even simpler, there is the contrib/bloom access method, which illustrates
developing an access method as an extension.

>
>
>>
>> I'm relatively comfortable with C (much moreso with C++), but I haven't
>> spent a lot of time looking at the postgresql codebase.  I don't think I
>> could start from a empty folder and make a properly-implemented module in
>> any reasonable period of time, so if I have a working example for some sort
>> of index that uses the same interfaces that would really help a lot.
>>
>> Thanks!
>> Connor




Re: [HACKERS] Custom compression methods

2017-11-05 Thread Oleg Bartunov
On Thu, Nov 2, 2017 at 6:02 PM, Craig Ringer  wrote:
> On 2 November 2017 at 17:41, Ildus Kurbangaliev
>  wrote:
>
>> In this patch compression methods is suitable for MAIN and EXTENDED
>> storages like in current implementation in postgres. Just instead only
>> of LZ4 you can specify any other compression method.
>
> We've had this discussion before.
>
> Please read the "pluggable compression support" thread. See you in a
> few days ;) sorry, it's kinda long.
>
> https://www.postgresql.org/message-id/flat/20130621000900.GA12425%40alap2.anarazel.de#20130621000900.ga12...@alap2.anarazel.de
>

The proposed patch provides "pluggable" compression and lets users
decide on their own which algorithm to use.
The Postgres core isn't responsible for any patent problems.


> IIRC there were some concerns about what happened with pg_upgrade,
> with consuming precious toast bits, and a few other things.

yes, pg_upgrade may be a problem.

>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>




Re: [HACKERS] need info about extensibility in other databases

2017-11-08 Thread Oleg Bartunov
On Wed, Nov 8, 2017 at 2:37 PM, Li Song  wrote:
> Hi,
>
> When is the English version of "GiST programming tutorial" available?

Well, I wrote it many years ago, so it needs to be updated. For now,
you can use Google Translate, which seems to work fine:
https://translate.google.com/translate?sl=auto&tl=en&js=y&prev=_t&hl=en&ie=UTF-8&u=http%3A%2F%2Fwww.sai.msu.su%2F~megera%2Fpostgres%2Ftalks%2Fgist_tutorial.html&edit-text=

Also, we have 7 papers about indexing, also in Russian; check
https://habrahabr.ru/company/postgrespro/

>
> Regards,
> Song
>
>
>
> --
> Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
>
>




Re: [HACKERS] [PATCH] Generic type subscription

2016-10-04 Thread Oleg Bartunov
On Sat, Oct 1, 2016 at 12:52 PM, Dmitry Dolgov <9erthali...@gmail.com>
wrote:

> > I've tried to compile this patch with current state of master (commit
> > 51c3e9fade76c12)  and found out that, when configured with
> --enable-cassert,
> > it doesn't pass make check.
>
> Thanks for the feedback. Yes, unexpectedly for me, `ExecEvalExpr` can
> return
> expanded `jbvArray` and `jbvObject` instead `jbvBinary` in both cases. It's
> interesting, that this doesn't break anything, but obviously violates
> the `pushJsonbValueScalar` semantics. I don't think `ExecEvalExpr` should
> be
> changed for jsonb, we can handle this situation in `pushJsonbValue`
> instead. I've
> attached a new version of patch with this modification.
>
>
Have you ever run 'make check'?

=
 53 of 168 tests failed.
=



> On 27 September 2016 at 19:08, Victor Wagner  wrote:
>
>> On Fri, 9 Sep 2016 18:29:23 +0700
>> Dmitry Dolgov <9erthali...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > Regarding to the previous conversations [1], [2], here is a patch
>> > (with some improvements from Nikita Glukhov) for the generic type
>> > subscription. It allows
>> > to define type-specific subscription logic for any data type and has
>> > implementations for the array and jsonb types. There are following
>> > changes in this
>> > patch:
>>
>> I've tried to compile this patch with current state of master (commit
>> 51c3e9fade76c12)  and found out that, when configured with
>> --enable-cassert, it doesn't pass make check.
>>
>> LOG:  server process (PID 4643) was terminated by signal 6: Aborted
>> DETAIL:  Failed process was running:
>> update test_jsonb_subscript set test_json['a'] = '{"b":
>> 1}'::jsonb;
>>
>>
>>
>> --
>>  Victor
>>
>>
>>
>
>
>
>
>


Re: [HACKERS] [PATCH] Generic type subscription

2016-10-05 Thread Oleg Bartunov
On Wed, Oct 5, 2016 at 6:48 AM, Dmitry Dolgov <9erthali...@gmail.com> wrote:

> On 5 October 2016 at 03:00, Oleg Bartunov  wrote:
>
>>
>> have you ever run 'make check' ?
>>
>> =
>>  53 of 168 tests failed.
>> =
>>
>>
> Sure, how else could I write tests for this feature? But right now on my
> machine
> everything is ok (the same for `make installcheck`):
>
> $ make check
> 
> ===
>  All 168 tests passed.
> ===
>

Oops, something was wrong in my setup :)


Re: [HACKERS] [PATCH] few fts functions for jsonb

2017-02-28 Thread Oleg Bartunov
The proposed patch may not look very important, but I consider it an
important feature, which Oracle and Microsoft already have; that's why I asked
Dmitry to work on this and make it before the feature freeze. My comments
follow below the post.

On Tue, Feb 28, 2017 at 1:59 PM, Dmitry Dolgov <9erthali...@gmail.com>
wrote:

> Hi all
>
> I would like to propose patch with a set of new small functions for fts in
> case of
> jsonb data type:
>
> * to_tsvector(config, jsonb) - make a tsvector from all string values and
>   elements of jsonb object. To prevent the situation, when tsquery can
> find a
>   phrase consisting of lexemes from two different values/elements, this
>   function will add an increment to position of each lexeme from every new
>   value/element.
>
> * ts_headline(config, jsonb, tsquery, options) - generate a headline
> directly
>   from jsonb object
>
> Here are the examples how they work:
>
> ```
> =# select to_tsvector('{"a": "aaa bbb", "b": ["ccc ddd"], "c": {"d": "eee
> fff"}}'::jsonb);
>to_tsvector
> -
>  'aaa':1 'bbb':2 'ccc':4 'ddd':5 'eee':7 'fff':8
> (1 row)
>
>
> =# select ts_headline('english', '{"a": "aaa bbb", "b": {"c": "ccc
> ddd"}}'::jsonb, tsquery('bbb & ddd & hhh'), 'StartSel = <, StopSel = >');
>  ts_headline
> --
>  aaa  ccc 
> (1 row)
> ```
>

> Any comments or suggestions?
>

1. Add json support.
2. ts_headline should return the original json with highlighting.  As a
first try the proposed ts_headline could be OK, but it probably needs a
special option.
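The position-increment idea in Dmitry's to_tsvector(config, jsonb) can
be illustrated with a toy Python walker; it reproduces the positions
from the example above (the gap size between values is an assumption for
illustration, and real lexing/stemming is omitted):

```python
def collect_strings(node, out=None):
    """Collect all string values/elements of a JSON document in order --
    the set that to_tsvector(config, jsonb) would index."""
    if out is None:
        out = []
    if isinstance(node, str):
        out.append(node)
    elif isinstance(node, dict):
        for v in node.values():
            collect_strings(v, out)
    elif isinstance(node, list):
        for v in node:
            collect_strings(v, out)
    return out

def positioned_lexemes(doc):
    """Assign word positions, skipping one position between values so a
    phrase query can never match across a value boundary."""
    pos, result = 0, []
    for value in collect_strings(doc):
        for word in value.split():
            pos += 1
            result.append((word.lower(), pos))
        pos += 1  # extra increment between values
    return result

doc = {"a": "aaa bbb", "b": ["ccc ddd"], "c": {"d": "eee fff"}}
print(positioned_lexemes(doc))
```

The gap between positions 2 and 4 (and 5 and 7) is what keeps a phrase
like 'bbb <-> ccc' from matching across two distinct values.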



>
>


[HACKERS] SQL/JSON in PostgreSQL

2017-02-28 Thread Oleg Bartunov
Hi there,


The attached patch is an implementation of the SQL/JSON data model from the
SQL-2016 standard (ISO/IEC 9075-2:2016(E)), which was published 2016-12-15 and
is available only for purchase from the ISO web site (
https://www.iso.org/standard/63556.html). Unfortunately I didn't find any
public sources of the standard or any preview documents, but Oracle's
implementation of json support in 12c release 2 is very close (
http://docs.oracle.com/database/122/ADJSN/json-in-oracle-database.htm);
we also used https://livesql.oracle.com/ to understand some details.

Postgres already has two json data types - json and jsonb - and implementing
another json data type that strictly conforms to the standard would not be a
good idea. Moreover, the SQL standard doesn’t describe a data type, but only a
data model, which “comprises SQL/JSON items and SQL/JSON sequences. The
components of the SQL/JSON data model are:

1) An SQL/JSON item is defined recursively as any of the following:

a) An SQL/JSON scalar, defined as a non-null value of any of the following
predefined (SQL) types:

character string with character set Unicode, numeric, Boolean, or datetime.

b) An SQL/JSON null, defined as a value that is distinct from any value of
any SQL type.

NOTE 122 — An SQL/JSON null is distinct from the SQL null value.

c) An SQL/JSON array, defined as an ordered list of zero or more SQL/JSON
items, called the SQL/JSON

elements of the SQL/JSON array.

d) An SQL/JSON object, defined as an unordered collection of zero or more
SQL/JSON members….

“

Our jsonb corresponds to SQL/JSON with UNIQUE KEYS and implicit ordering of
keys, and our main intention was to provide support for jsonb as the most
important and usable data type.

We created a repository for reviewing (ask for write access):
https://github.com/postgrespro/sqljson/tree/sqljson


Examples of usage can be found in src/test/regress/sql/sql_json.sql

The documentation about json support needs to be reorganized and extended,
and we plan to do this before release. We need the community's help here.

Our goal is to provide support for the main features of SQL/JSON in release
10, as we discussed at the developers meeting in Brussels (Andrew Dunstan has
kindly agreed to review the patch).

We did not have much time to develop complete support (because of the
standard's limited availability), but we hope all major features are here,
namely, all nine functions as described in the standard (but see
implementation notes below):

“All manipulation (e.g., retrieval, creation, testing) of SQL/JSON items is
performed through a number of SQL/JSON functions. There are nine such
functions, categorized as SQL/JSON retrieval functions and SQL/JSON
construction functions. The SQL/JSON retrieval functions are characterized
by operating on JSON data and returning an SQL value (possibly a Boolean
value) or a JSON value. The SQL/JSON construction functions return JSON
data created from operations on SQL data or other JSON data.

The SQL/JSON retrieval functions are:

— JSON_VALUE: extracts an SQL value of a predefined type from a JSON text.

— JSON_QUERY: extracts a JSON text from a JSON text.

— JSON_TABLE: converts a JSON text to an SQL table.

— IS JSON: tests whether a string value is or is not properly formed JSON
text.

— JSON_EXISTS: tests whether an SQL/JSON path expression returns any
SQL/JSON items.

The SQL/JSON construction functions are:

— JSON_OBJECT: generates a string that is a serialization of an SQL/JSON
object.

— JSON_ARRAY: generates a string that is a serialization of an SQL/JSON
array.

— JSON_OBJECTAGG: generates, from an aggregation of SQL data, a string that
is a serialization of an SQL/JSON object.

— JSON_ARRAYAGG: generates, from an aggregation of SQL data, a string that
is a serialization of an SQL/JSON array.

A JSON-returning function is an SQL/JSON construction function or
JSON_QUERY.”

The standard describes the SQL/JSON path language, which is used by SQL/JSON
query operators to query JSON. It defines the path language as a string
literal. We implemented the path language as a JSONPATH data type, since
other approaches are not friendly to the planner and executor.

The functions and JSONPATH provide new functionality for json support,
namely, the ability to operate (in a standard-specified way) on json
structure at the SQL level - a feature often requested by users.
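To give a feel for what a path language does, here is a toy evaluator for a tiny `$.key` / `[n]` / `[*]` subset in Python (purely illustrative; the patch's JSONPATH grammar and semantics are far richer):

```python
import re

# Tokens of the toy subset: ".key" member accessor, "[n]"/"[*]" element accessors.
PATH_TOKEN = re.compile(r"\.(\w+)|\[(\*|\d+)\]")

def jsonpath(path, doc):
    """Evaluate a tiny jsonpath-like subset, returning the list of matched items."""
    assert path.startswith('$'), "paths start at the context item '$'"
    items = [doc]
    for m in PATH_TOKEN.finditer(path[1:]):
        key, idx = m.group(1), m.group(2)
        matched = []
        for item in items:
            if key is not None and isinstance(item, dict) and key in item:
                matched.append(item[key])        # .key member accessor
            elif idx == '*' and isinstance(item, list):
                matched.extend(item)             # [*] wildcard element accessor
            elif idx not in (None, '*') and isinstance(item, list) and int(idx) < len(item):
                matched.append(item[int(idx)])   # [n] element accessor
        items = matched
    return items

doc = {"a": {"b": [1, 2, 3]}}
print(jsonpath('$.a.b[*]', doc))  # -> [1, 2, 3]
print(jsonpath('$.a.b[1]', doc))  # -> [2]
```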

The patch consists of about 15000 insertions (about 5000 lines are from
tests), passes all regression tests and doesn’t touch critical parts, so we
hope, with community help, to bring it to a committable state.

Authors: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov and Alexander Korotkov

Implementation notes:


1. We didn't implement 'datetime' support, since it's not clear from the
   standard.

2. JSON_OBJECT/JSON_OBJECTAGG (KEY <key> VALUE <value>, ...) is not
   implemented; only (<key> : <value>, …) and (<key> VALUE <value>, …) are
   supported, because of grammar conflicts with the leading KEY keyword.

3. FORMAT (JSON|JSONB) in JSON_ARRAYAGG with a subquery is not supported,
   because of grammar conflicts with the non-reserved word FORMAT.

4. JSO

Re: [HACKERS] SQL/JSON in PostgreSQL

2017-02-28 Thread Oleg Bartunov
On Tue, Feb 28, 2017 at 10:55 PM, Pavel Stehule 
wrote:

> Hi
>
>
> 2017-02-28 20:08 GMT+01:00 Oleg Bartunov :
>
>> Hi there,
>>
>>
>> Attached patch is an implementation of SQL/JSON data model from SQL-2016
>> standard (ISO/IEC 9075-2:2016(E)), which was published 2016-12-15 and is
>> available only for purchase from ISO web site (
>> https://www.iso.org/standard/63556.html). Unfortunately I didn't find
>> any public sources of the standard or any preview documents, but Oracle
>> implementation of json support in 12c release 2 is very close (
>> http://docs.oracle.com/database/122/ADJSN/json-in-oracle-database.htm),
>> also we used https://livesql.oracle.com/  to understand some details.
>>
>>
>> <...>

Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-07 Thread Oleg Bartunov
On Fri, Mar 3, 2017 at 11:49 PM, David Steele  wrote:

> Hi Oleg,
>
> On 2/28/17 2:55 PM, Pavel Stehule wrote:
> > 2017-02-28 20:08 GMT+01:00 Oleg Bartunov  >
> > Attached patch is an implementation of SQL/JSON data model from
> > SQL-2016 standard (ISO/IEC 9075-2:2016(E)), which was published
> > 2016-12-15 and is available only for purchase from ISO web site
> > (https://www.iso.org/standard/63556.html
> > <https://www.iso.org/standard/63556.html>). Unfortunately I didn't
> > find any public sources of the standard or any preview documents,
> > but Oracle implementation of json support in 12c release 2 is very
> > close
> > (http://docs.oracle.com/database/122/ADJSN/json-in-
> oracle-database.htm
> > <http://docs.oracle.com/database/122/ADJSN/json-in-
> oracle-database.htm>),
> > also we used https://livesql.oracle.com/  to understand some
> details.
>
> <...>
>
> > This is last commitfest for current release cycle - are you sure, so is
> > good idea to push all mentioned features?
>
> Implementing standards is always a goal of the PostgreSQL community, but
> this is a very large patch arriving very late in the release cycle with
> no prior discussion.
>

We discussed this in Brussels, but I agree, the patch is rather big.


>
> That the patch proposed follows a standard which will not be available
> to the majority of reviewers is very worrisome, let alone the sheer
> size.  While much of the code is new, I see many changes to core data
> structures that could very easily be destabilizing.
>

I don't know when the standard will be publicly available.


>
> I propose we move this patch to the 2017-07 CF so further development
> and review can be done without haste and as the standard becomes more
> accessible.
>

I wanted to have one more good feature in 10 and let postgres be on par with
its competitors. SQL/JSON adds many interesting features, and users will be
disappointed if we postpone it for the next two years. Let's wait for the
reviewers; probably they will find the patch is not very intrusive. We have
plenty of time, and we have dedicated one full-time developer to this
project.


>
> Regards,
> --
> -David
> da...@pgmasters.net
>


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-08 Thread Oleg Bartunov
On Wed, Mar 8, 2017 at 12:43 AM, Sven R. Kunze  wrote:

> Hi,
>
> about the datetime issue: as far as I know, JSON does not define a
> serialization format for dates and timestamps.
>
> On the other hand, YAML (as a superset of JSON) already supports a
> language-independent date(time) serialization format (
> http://yaml.org/type/timestamp.html).
>
> I haven't had a glance into the SQL/JSON standard yet and a quick search
> didn't reveal anything. However, reading your test case here
> https://github.com/postgrespro/sqljson/blob/5a8a241/src/
> test/regress/sql/sql_json.sql#L411 it seems as if you intend to parse all
> strings in the form of "YYYY-MM-DD" as dates. This is problematic in case a
> string happens to look like this but is not intended to be a date.
>

SQL/JSON defines methods in jsonpath, in particular:

<JSON method> ::=
    ...
  | datetime [ <left paren> <JSON datetime template> <right paren> ]
  | keyvalue <left paren> <right paren>

<JSON datetime template> ::=
  <character string literal>

The datetime template is also specified in the standard (very rich):

<datetime template> ::=
  { <datetime template part> }...
<datetime template part> ::=
    <datetime template field>
  | <datetime template delimiter>
<datetime template field> ::=
    <datetime template year>
  | <datetime template rounded year>
  | <datetime template month>
  | <datetime template day of month>
  | <datetime template day of year>
  | <datetime template 12-hour>
  | <datetime template 24-hour>
  | <datetime template minute>
  | <datetime template second of minute>
  | <datetime template second of day>
  | <datetime template fraction>
  | <datetime template am/pm>
  | <datetime template time zone hour>
  | <datetime template time zone minute>
<datetime template delimiter> ::=
    <minus sign>
  | <period>
  | <solidus>
  | <comma>
  | <apostrophe>
  | <semicolon>
  | <colon>
  | <space>
<datetime template year> ::=
  YYYY | YYY | YY | Y
<datetime template rounded year> ::=
  RRRR | RR
<datetime template month> ::=
  MM
<datetime template day of month> ::=
  DD
<datetime template day of year> ::=
  DDD
<datetime template 12-hour> ::=
  HH | HH12
<datetime template 24-hour> ::=
  HH24
<datetime template minute> ::=
  MI
<datetime template second of minute> ::=
  SS
<datetime template second of day> ::=
  SSSSS
<datetime template fraction> ::=
  FF1 | FF2 | FF3 | FF4 | FF5 | FF6 | FF7 | FF8 | FF9
<datetime template am/pm> ::=
  A.M. | P.M.
<datetime template time zone hour> ::=
  TZH
<datetime template time zone minute> ::=
  TZM
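As a rough illustration of how such a template maps onto more familiar format directives, here is a toy translator to Python strptime codes (a hypothetical helper covering only a handful of fields; the patch implements template parsing in C):

```python
from datetime import datetime

# Assumed mapping of a few template fields to strptime directives.
FIELD_MAP = {
    'YYYY': '%Y', 'MM': '%m', 'DD': '%d',
    'HH24': '%H', 'HH12': '%I', 'MI': '%M', 'SS': '%S',
}

def template_to_strptime(template):
    """Translate a datetime template to a strptime format, longest fields first."""
    for field in sorted(FIELD_MAP, key=len, reverse=True):
        template = template.replace(field, FIELD_MAP[field])
    return template

fmt = template_to_strptime('DD.MM.YYYY HH24:MI')   # -> '%d.%m.%Y %H:%M'
print(datetime.strptime('10.03.2017 12:34', fmt))  # -> 2017-03-10 12:34:00
```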



> Just for the sake of completeness: YAML solves this issue by omitting the
> quotation marks around the date string (just as JSON integers have no
> quotations marks around them).
>

Interesting idea, but I need to dig into the standard first.


>
> Regards,
> Sven
>


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-08 Thread Oleg Bartunov
On Wed, Mar 8, 2017 at 7:05 AM, David Steele  wrote:

> On 3/7/17 11:38 AM, Andres Freund wrote:
>
> <...>
>
> We have a plenty of time and we dedicate one full-time developer for
>>> this project.
>>>
>>
>> How about having that, and perhaps others, developer participate in
>> reviewing patches and getting to the bottom of the commitfest?  Should
>> we end up being done early, we can look at this patch...  There's not
>> been review activity corresponding to the amount of submissions from
>> pgpro...
>>
>
> This patch has been moved to CF 2017-07.
>

Yes; after committing XMLTABLE, we need to extend its infrastructure to
support JSON_TABLE anyway.


>
> --
> -David
> da...@pgmasters.net
>


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-11 Thread Oleg Bartunov
On Fri, Mar 10, 2017 at 7:07 AM, Petr Jelinek 
wrote:

> On 09/03/17 19:50, Peter van Hardenberg wrote:
> > Anecdotally, we just stored dates as strings and used a convention (key
> > ends in "_at", I believe) to interpret them. The lack of support for
> > dates in JSON is well-known, universally decried... and not a problem
> > the PostgreSQL community can fix.
> >
>
> The original complaint was about JSON_VALUE extracting dates, but I don't
> understand why there is a problem with that; SQL/JSON defines that
> behavior. The RETURNING clause there is more or less just shorthand for
> casting with some advanced options.
>

There is no problem with serializing dates, and SQL/JSON describes it rather
well. There is, however, no correct procedure to deserialize a date from a
valid json string, and the standard keeps silent about this; we now
understand that date[time] is actually virtual, and its only use is in
jsonpath (filter) expressions.



>
> --
>   Petr Jelinek  http://www.2ndQuadrant.com/
>   PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-13 Thread Oleg Bartunov
On Mon, Mar 13, 2017 at 9:24 AM, Nico Williams 
wrote:

> On Thu, Mar 09, 2017 at 07:12:07PM +0100, Sven R. Kunze wrote:
> > From my day-to-day work I can tell, the date(time) type is the only
> missing
> > piece of JSON to make it perfect for business applications (besides,
> maybe,
> > a "currency" type).
>
> And a binary type.  And a chunked-string type (to avoid having to escape
> strings).  And an interval type.  And...
>

Let's first get this basic implementation into postgres; then we'll add
extensibility support, not only for types but also for operators.
Right now I see in our regression tests:

select _jsonpath_object(
'["10.03.2017 12:34 +1", "10.03.2017 12:35 +1", "10.03.2017 12:36 +1",
"10.03.2017 12:35 +2", "10.03.2017 12:35 -2"]',
'$[*].datetime("dd.mm.yyyy HH24:MI TZH") ? (@ < "10.03.2017 12:35
+1".datetime("dd.mm.yyyy HH24:MI TZH"))'
);
 _jsonpath_object
--
 "2017-03-10 14:34:00+03"
 "2017-03-10 13:35:00+03"
(2 rows)


Re: [HACKERS] Why is it "JSQuery"?

2014-06-10 Thread Oleg Bartunov
People,

we have many other tasks than guessing the language name.
jsquery is just an extension, which we invented to test our indexing
stuff.  Eventually, it grew up.  We'll think of a better name
if developers agree to have it in core. For now, jsquery is good
enough for us.

The jsquery name doesn't need to be used at all, by the way.

Oleg

On Tue, Jun 10, 2014 at 10:04 PM, David E. Wheeler
 wrote:
> On Jun 6, 2014, at 3:50 PM, Josh Berkus  wrote:
>
>> Maybe we should call it "jsonesque"  ;-)
>
> I propose JOQL: JSON Object Query Language.
>
> Best,
>
> David
>
> PS: JAQL sounds better, but [already exists](http://code.google.com/p/jaql/).




Re: [HACKERS] Why is it "JSQuery"?

2014-06-10 Thread Oleg Bartunov
The closest problem we have is jsonb statistics (the lack of them,
actually), which prevents using the full power of jsquery. I hope Jan
Urbański can work on this.

Oleg

On Tue, Jun 10, 2014 at 11:06 PM, Oleg Bartunov  wrote:
> People,
>
> <...>
>
> Oleg
>
> On Tue, Jun 10, 2014 at 10:04 PM, David E. Wheeler
>  wrote:
>> On Jun 6, 2014, at 3:50 PM, Josh Berkus  wrote:
>>
>>> Maybe we should call it "jsonesque"  ;-)
>>
>> I propose JOQL: JSON Object Query Language.
>>
>> Best,
>>
>> David
>>
>> PS: JAQL sounds better, but [already exists](http://code.google.com/p/jaql/).




Re: [HACKERS] Minmax indexes

2014-08-07 Thread Oleg Bartunov
+1 for BRIN !

On Thu, Aug 7, 2014 at 6:16 PM, Simon Riggs  wrote:
> On 7 August 2014 14:53, Robert Haas  wrote:
>> On Wed, Aug 6, 2014 at 4:06 PM, Nicolas Barbier
>>  wrote:
>>> 2014-08-06 Claudio Freire :
>>>
>>>> So, I like blockfilter a lot. I change my vote to blockfilter ;)
>>>
>>> +1 for blockfilter, because it stresses the fact that the "physical"
>>> arrangement of rows in blocks matters for this index.
>>
>> I don't like that quite as well as summary, but I'd prefer either to
>> the current naming.
>
> Yes, "summary index" isn't good. I'm not sure where the block or the
> filter part comes in though, so -1 to "block filter", not least
> because it doesn't have a good abbreviation (bfin??).
>
> A better description would be "block range index" since we are
> indexing a range of blocks (not just one block). Perhaps a better one
> would be simply "range index", which we could abbreviate to RIN or
> BRIN.
>
> --
>  Simon Riggs   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>




Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-14 Thread Oleg Bartunov
I did a quick test on the same bookmarks dataset to compare the performance
of 9.4beta2 and 9.4beta2+patch.

The query was the same one we used in the PGCon presentation:
SELECT count(*) FROM jb WHERE jb @> '{"tags":[{"term":"NYC"}]}'::jsonb;

                 table size | time (ms)
9.4beta2:          1374 MB  | 1160
9.4beta2+patch:    1373 MB  | 1213


Yes, performance degrades, but not by much. There is also a small win in
table size, but the bookmark documents are not big, so it's difficult to
draw conclusions about compression.

Oleg
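For context, the @> containment operator used in the benchmark query above can be modeled in a few lines of Python (a simplified sketch of jsonb containment semantics; edge cases such as top-level scalar-in-array containment are ignored):

```python
def jsonb_contains(left, right):
    """Toy model of jsonb's @> operator: does `left` contain `right`?"""
    if isinstance(right, dict):
        # every key/value pair of the right object must be contained on the left
        return (isinstance(left, dict)
                and all(k in left and jsonb_contains(left[k], v)
                        for k, v in right.items()))
    if isinstance(right, list):
        # every right element must be contained by some left element
        return (isinstance(left, list)
                and all(any(jsonb_contains(l, r) for l in left) for r in right))
    return left == right  # scalars must match exactly

doc = {"title": "x", "tags": [{"term": "NYC"}, {"term": "food"}]}
print(jsonb_contains(doc, {"tags": [{"term": "NYC"}]}))  # -> True
print(jsonb_contains(doc, {"tags": [{"term": "LA"}]}))   # -> False
```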



On Thu, Aug 14, 2014 at 9:57 PM, Tom Lane  wrote:

> Bruce Momjian  writes:
> > On Thu, Aug 14, 2014 at 12:22:46PM -0400, Tom Lane wrote:
> >> This gets back to the problem of what test case are we going to consider
> >> while debating what solution to adopt.
>
> > Uh, we just one need one 12k JSON document from somewhere.  Clearly this
> > is something we can easily get.
>
> I would put little faith in a single document as being representative.
>
> To try to get some statistics about a real-world case, I looked at the
> delicio.us dataset that someone posted awhile back (1252973 JSON docs).
> These have a minimum length (in text representation) of 604 bytes and
> a maximum length of 5949 bytes, which means that they aren't going to
> tell us all that much about large JSON docs, but this is better than
> no data at all.
>
> Since documents of only a couple hundred bytes aren't going to be subject
> to compression, I made a table of four columns each containing the same
> JSON data, so that each row would be long enough to force the toast logic
> to try to do something.  (Note that none of these documents are anywhere
> near big enough to hit the refuses-to-compress problem.)  Given that,
> I get the following statistics for pg_column_size():
>
>                                 min     max     avg
>
> JSON (text) representation      382     1155    526.5
>
> HEAD's JSONB representation     493     1485    695.1
>
> all-lengths representation      440     1257    615.3
>
> So IOW, on this dataset the existing JSONB representation creates about
> 32% bloat compared to just storing the (compressed) user-visible text,
> and switching to all-lengths would about halve that penalty.
>
> Maybe this is telling us it's not worth changing the representation,
> and we should just go do something about the first_success_by threshold
> and be done.  I'm hesitant to draw such conclusions on the basis of a
> single use-case though, especially one that doesn't really have that
> much use for compression in the first place.  Do we have other JSON
> corpuses to look at?
>
> regards, tom lane
>
>
>
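For reference, the 32% and "about half" figures in Tom's message follow directly from the avg column of his table:

```python
# avg pg_column_size() values quoted above
text_avg, head_avg, lengths_avg = 526.5, 695.1, 615.3

bloat_head = (head_avg / text_avg - 1) * 100        # ~32% over compressed text
bloat_lengths = (lengths_avg / text_avg - 1) * 100  # ~17%, about half the penalty

print(round(bloat_head), round(bloat_lengths))  # -> 32 17
```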


Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-03-22 Thread Oleg Bartunov
On Wed, Mar 22, 2017 at 8:04 PM, Arseny Sher  wrote:

> > While I admire your fearlessness, I think the chances of you being
> > able to bring a project of this type to a successful conclusion are
> > remote.  Here is what I said about this topic previously:
> >
> > http://postgr.es/m/CA+Tgmoa=kzHJ+TwxyQ+vKu21nk3prkRjSdbhjubN7qvc8UKuG
> g...@mail.gmail.com
>
> Well, as I said, I don't pretend that I will support full functionality:
> >> instead, we should decide which part of this work (if any) is
> >> going to be done in the course of GSoC. Probably, all TPC-H queries
> >> with and without index support is a good initial target, but this
> >> needs to be discussed.
>
> I think that successfull completion of this project should be a clear
> and justified answer to the question "Is this idea is good enough to
> work on merging it into the master?", not the production-ready patches
> themselves. Nevertheless, of course project success criterion must be
> reasonably formalized -- e.g. implement nodes X with features Y, etc.
>

How many GSoC slots and prospective students do we have?

Should we reject this interesting project, which is based on several years of
research by an academic group at the institute? Maybe it would be better to
help him reformulate the scope of the project and let him work. I don't know
whether the results of a GSoC project have to be committed, but as a research
project it would certainly be useful for the community.


>
>
>


Re: [HACKERS] [PATCH] few fts functions for jsonb

2017-03-31 Thread Oleg Bartunov
On 30 Mar 2017 23:43, "Dmitry Dolgov" <9erthali...@gmail.com> wrote:

On 31 March 2017 at 00:01, Andrew Dunstan 
wrote:
>
> I have just noticed as I was writing/testing the non-existent docs for
> this patch that it doesn't supply variants of to_tsvector that take a
> regconfig as the first argument. Is there a reason for that? Why
> should the json(b) versions be different from the text versions?

No, there is no reason, I just missed that. Here is a new version of the
patch (only the functions part)
to add those variants.


Congratulations on getting the patch committed. Who will write the
additional documentation? I think we need to touch both the FTS and JSON
parts.


Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiST indexes from gevel

2017-07-25 Thread Oleg Bartunov
On Mon, Jul 24, 2017 at 11:38 PM, Robert Haas  wrote:
> On Fri, Jul 21, 2017 at 8:05 AM, Alexey Chernyshov
>  wrote:
>> the following patch transfers functionality from gevel module
>> (http://www.sai.msu.su/~megera/wiki/Gevel) which provides functions for
>> analyzing GIN and GiST indexes to pageinspect. Gevel was originally
>> designed by Oleg Bartunov, and Teodor Sigaev for developers of GiST and
>> GIN indexes.
>
> It's not clear from the web site in question that the relevant code is
> released under the PostgreSQL license.

git clone git://sigaev.ru/gevel

from README.gevel

License

Stable version, included into PostgreSQL distribution, released under
BSD license. Development version, available from this site, released under
the GNU General Public License, version 2 (June 1991)

We would be happy to write anything the community likes :)

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>




Re: [HACKERS] CTE inlining

2017-05-02 Thread Oleg Bartunov
On Mon, May 1, 2017 at 7:22 AM, Pavel Stehule  wrote:
>
>
> 2017-05-01 1:21 GMT+02:00 Andres Freund :
>>
>> On 2017-04-30 07:19:21 +0200, Pavel Stehule wrote:
>> > why we cannot to introduce GUC option - enable_cteoptfence ?
>>
>> Doesn't really solve the issue, and we've generally shied away from GUCs
>> that influence behaviour after a few bad experiences.  What if you want
>> one CTE inlined, but another one not?
>
>
> It changes behavior in the same sense as enable_nestloop, enable_hashjoin, ...
> with the same limits.

And then we recall  plan hints :)

>
> Regards
>
> Pavel
>
>>
>>
>> - Andres
>
>




Re: [HACKERS] SQL/JSON in PostgreSQL

2017-09-15 Thread Oleg Bartunov
On Fri, Sep 15, 2017 at 7:31 PM, Robert Haas  wrote:
> On Fri, Sep 15, 2017 at 10:10 AM, Daniel Gustafsson  wrote:
>> Can we expect a rebased version of this patch for this commitfest?  Since it’s
>> a rather large feature it would be good to get it in as early as we can in the
>> process.
>
> Again, given that this needs a "major" rebase and hasn't been updated
> in a month, and given that the CF is already half over, this should
> just be bumped to the next CF.  We're supposed to be trying to review
> things that were ready to go by the start of the CF, not the end.

We are supporting v10 branch in our github repository
https://github.com/postgrespro/sqljson/tree/sqljson_v10

Since the first post we have made a lot of changes, mostly because of a
better understanding of the standard and the availability of the technical
report
(http://standards.iso.org/ittf/PubliclyAvailableStandards/c067367_ISO_IEC_TR_19075-6_2017.zip).
The most important are:

1. We abandoned FORMAT support, which could confuse our users, since we have
the json[b] data types.

2. We use the XMLTABLE infrastructure, extended for JSON_TABLE support.

3. We reorganized the commits, so we could split one big patch into several
smaller patches, which can be reviewed independently.

4. The biggest problem is documentation; we are working on it.

Nikita will submit patches soon.

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company




Re: [HACKERS] SQL/JSON in PostgreSQL

2017-09-17 Thread Oleg Bartunov
On 16 Sep 2017 02:32, "Nikita Glukhov"  wrote:

On 15.09.2017 22:36, Oleg Bartunov wrote:

<...>
>

Attached archive with 9 patches rebased onto latest master.

0001-jsonpath-v02.patch:
 - jsonpath type
 - jsonpath execution on jsonb type
 - jsonpath operators for jsonb type
 - GIN support for jsonpath operators

0002-jsonpath-json-v02.patch:
 - jsonb-like iterators for json type
 - jsonpath execution on json type
 - jsonpath operators for json type

0003-jsonpath-extensions-v02.patch:
0004-jsonpath-extensions-tests-for-json-v02.patch:
 - some useful standard extensions with tests

0005-sqljson-v02.patch:
 - SQL/JSON constructors (JSON_OBJECT[AGG], JSON_ARRAY[AGG])
 - SQL/JSON query functions (JSON_VALUE, JSON_QUERY, JSON_EXISTS)
 - IS JSON predicate

0006-sqljson-json-v02.patch:
 - SQL/JSON support for json type and tests

0007-json_table-v02.patch:
 - JSON_TABLE using XMLTABLE infrastructure

0008-json_table-json-v02.patch:
 - JSON_TABLE support for json type

0009-wip-extensions-v02.patch:
 - FORMAT JSONB
 - jsonb to/from bytea casts
 - jsonpath operators
 - some unfinished jsonpath extensions


Originally, JSON path was implemented only for the jsonb type, and I decided
to add jsonb-like iterators for the json type so that json support could be
implemented with minimal changes in the JSON path code.  This solution (see
jsonpath_json.c from patch 0002) looks a little dubious to me, so I separated
json support into independent patches.

The last WIP patch 0009 is unfinished and contains a lot of FIXMEs.  But the
ability to use arbitrary Postgres operators in JSON path with explicitly
specified types is rather interesting, and I think it should be shown now to
get some kind of pre-review.

We are supporting v11 and v10 branches in our github repository:

https://github.com/postgrespro/sqljson/tree/sqljson
https://github.com/postgrespro/sqljson/tree/sqljson_wip
https://github.com/postgrespro/sqljson/tree/sqljson_v10
https://github.com/postgrespro/sqljson/tree/sqljson_v10_wip


We provide a web interface to our build:
http://sqlfiddle.postgrespro.ru/#!21/



Attached patches can be produced simply by combining groups of consecutive
commits from these branches.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-17 Thread Oleg Bartunov
On 3 Aug 2017 16:29, "Oliver Ford"  wrote:

Adds to the to_number() function the ability to convert Roman numerals
to a number. This feature is on the formatting.c TODO list. It is not
currently implemented in either Oracle, MSSQL or MySQL so gives
PostgreSQL an edge :-)


I see a use for this in full text search, as a dictionary. It's useful for
indexing and searching historical documents. Probably better to have it as
a contrib module.


==Usage==

Call: to_number(numerals, 'RN') or to_number(numerals, 'rn').

Example: to_number('MMMXIII', 'RN') returns 3013. to_number('xiv',
'rn') returns 14.

The function is case insensitive for the numerals. It accepts a mix of
cases and treats them the same. So to_number('MCI', 'rn') and
to_number('McI', 'RN') both return 1101. The format mask must however
be either 'RN' or 'rn'. If there are other elements in the mask, those
other elements will be ignored. So to_number('MMM', 'FMRN') returns
3000.

Whitespace before the numerals is ignored.

==Validation==

The new function roman_to_int() in formatting.c performs the
conversion. It strictly validates the numerals based on the following
Roman-to-Arabic conversion rules:

1. The power-of-ten numerals (I, X, C, M) can be repeated up to three
times in a row. The beginning-with-5 numerals (V, L, D) can each
appear only once.

2. Subtraction from a power-of-ten numeral cannot occur if a
beginning-with-5 numeral appears later.

3. Subtraction cannot occur if the smaller numeral is less than a
tenth of the greater numeral (so IX is valid, but IC is invalid).

4. There cannot be two subtractions in a row.

5. A beginning-with-5 numeral cannot subtract.

If any of these rules are violated, an error is raised.
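Taken together, the rules above give behaviour along these lines (a sketch
based on the description; the exact error messages are those of the patch):

```sql
-- Valid conversions (mixed case is accepted)
SELECT to_number('MMMXIII', 'RN');  -- 3013
SELECT to_number('xiv', 'rn');      -- 14
SELECT to_number('McI', 'RN');      -- 1101

-- Each of these violates one of the rules above and raises an error:
SELECT to_number('IIII', 'RN');  -- rule 1: I repeated four times
SELECT to_number('IXV', 'RN');   -- rule 2: subtraction with a 5-numeral later
SELECT to_number('IC', 'RN');    -- rule 3: I is less than a tenth of C
SELECT to_number('VX', 'RN');    -- rule 5: V cannot subtract
```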

==Testing==

This has been tested on a Windows build of the master branch with
MinGW. The included regression tests positively test every value from
1 to 3999 (the Roman numeral max value) by calling the existing
to_char() function to get the Roman value, then converting it back to
an Arabic value. There are also negative tests for each invalid code
path and some positive mixed-case tests.

Documentation is updated to include this new feature.

==References==

http://sierra.nmsu.edu/morandi/coursematerials/RomanNumerals.html
Documents the strict Roman numeral standard.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Generic type subscripting

2017-09-25 Thread Oleg Bartunov
On Fri, Sep 22, 2017 at 3:51 PM, Peter Eisentraut
 wrote:
> On 9/21/17 11:24, Dmitry Dolgov wrote:
>> One last thing that I need to clarify. Initially there was an idea to
>> minimize changes in `pg_type`
>
> I see, but there is no value in that if it makes everything else more
> complicated.

+1



>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>




Re: [HACKERS] Surjective functional indexes

2017-09-28 Thread Oleg Bartunov
On Thu, May 25, 2017 at 7:30 PM, Konstantin Knizhnik
 wrote:
> Right now Postgres determines whether an update operation touches an index
> based only on the set of affected columns.
> But in case of functional indexes such policy quite frequently leads to
> unnecessary index updates.
> For example, functional index are widely use for indexing JSON data:
> info->>'name'.
>
> JSON data may contain multiple attributes and only few of them may be
> affected by update.
> Moreover, index is used to build for immutable attributes (like "id",
> "isbn", "name",...).
>
> Functions like (info->>'name') are named "surjective" in mathematics.
> I have strong feeling that most of functional indexes are based on
> surjective functions.
> For such indexes the current PostgreSQL index update policy is very
> inefficient. It causes disabling of HOT updates
> and so leads to a significant degradation of performance.
>
> Without this patch Postgres is slower than Mongo on YCSB benchmark with (50%
> update,50 % select)  workload.
> And after applying this patch Postgres beats Mongo at all workloads.

I confirm that the patch helps for workload A of YCSB, but actually it
just extends the range of #clients where postgres outperforms mongodb (see
attached picture).  If we increase #clients > 100, postgres quickly
degrades not only for workload A, but even for workload B (5%
updates), while mongodb and mysql behave much, much better; but this is
another problem, which we will discuss in a different thread.

>
> My proposal is to check the value of the function for functional indexes
> instead of just comparing the set of affected attributes.
> Obviously, for some complex functions it may have a negative effect on
> update speed.
> This is why I have added a "surjective" option to the index. By default it
> is switched on for all functional indexes (based on my assumption
> that most functions used in functional indexes are surjective). But it is
> possible to explicitly disable it and make the decision whether the index
> needs to be updated based only on the set of affected attributes.
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>
>
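To make the quoted proposal concrete, here is a sketch of how such an index
might be declared. The "surjective" option name comes from the mail above, but
its exact spelling as a per-index reloption is an assumption, not the patch's
final form:

```sql
-- Hypothetical syntax; the per-index "surjective" option is described above,
-- but its exact reloption spelling is an assumption.
CREATE TABLE users (id int, info jsonb);
CREATE INDEX users_name_idx ON users ((info->>'name'));  -- option on by default

-- Explicitly disable the recheck and fall back to comparing affected columns:
CREATE INDEX users_name_idx2 ON users ((info->>'name'))
  WITH (surjective = false);

-- With the recheck enabled, this update can stay a HOT update, because the
-- indexed expression (info->>'name') evaluates to the same value:
UPDATE users SET info = info || '{"visits": 10}' WHERE id = 1;
```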




Re: [HACKERS] Speakers Wanted for pgDay Cuba

2015-07-23 Thread Oleg Bartunov
I'm probably going, but my company supports this trip.

On Thu, Jul 23, 2015 at 10:35 PM, Josh Berkus  wrote:

> Contributors:
>
> We are looking for one or more PostgreSQL contributors to travel to
> Havana, Cuba, to present at the second pgDay Cuba.  In order to
> encourage this young community, we will be funding[1] up to $3000 in
> total travel funds for one or two speakers.
>
> Information on the event is here:
> http://www.postgresql.org/about/event/1813/
>
> It will be from October 19 to 23 at the Universidad de las Cien-
> cias InformĂĄticas, La Habana, Cuba.
>
> Proficiency in Spanish is desirable, but according to the conference,
> translators will be available for visiting international speakers.
> Contributors wanting to take advantage of travel funding must first be
> accepted by the conference, and then may apply for funding.
>
> IMPORTANT: if you travel to Cuba on SPI funds, there will be significant
> restrictions on your travel due to US regulations regarding Cuba.  These
> apply even if you are not a US Citizen.
>
> Contact c...@postgresql.org for more information about the conference and
> to submit potential speaking topics.
>
> Contact j...@postgresql.org for information about travel funding and
> travel restrictions, or to request travel funds.
>
> Deadline is September 3, but earlier is better.
>
> (1: travel funds are donated community funds held at SPI, Inc.  Speakers
> will be reimbursed after attending the conference by check or wire
> transfer.)
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
>
>


Re: [HACKERS] "A huge debt of gratitude" - Michael Stonebraker

2015-07-23 Thread Oleg Bartunov
Nice to hear from you again, Jolly!

On Wed, Jul 22, 2015 at 1:42 AM, Jolly Chen  wrote:

> Hey everyone,
>
> You have probably heard that Mike Stonebraker recently won the Turing
> award.  A recording of his award lecture is available at:
> https://www.youtube.com/watch?v=BbGeKi6T6QI
>
> It is an entertaining talk overall. If you fast forward to about the 1:07
> mark, he makes some comments about postgres.
>
> Here’s my rough transcription:
>
> "The abstract data type system in postgres has been added to a lot of
> relational database systems. It's kind of de facto table stakes for
> relational databases these days, essentially intact.  That idea was really
> a good one. It was mentioned in the citation for my Turing award winning.
> However, serendipity played a huge role, which is, the biggest impact of
> postgres by far came from two Berkeley students that I'll affectionately
> call Grumpy and Sleepy.  They converted the academic postgres prototype
> from QUEL to SQL in 1995. This was in parallel to the commercial activity.
> And then a pick-up team of volunteers, none of whom have anything to do
> with me or Berkeley, have been shepherding that open source system ever
> since 1995. The system that you get off the web for postgres comes from
> this pick-up team.  It is open source at its best and I want to just
> mention that I have nothing to do with that and that collection of folks we
> all owe a huge debt of gratitude to, because they have robustize that code
> line and made it so it really works.”
>
> Thank you all so much for your hard work over the last twenty years!!
>
> Affectionately,
>
> Grumpy
>
>
>
>


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-08-12 Thread Oleg Bartunov
On Thu, Aug 13, 2015 at 2:19 AM, Jeff Janes  wrote:

> I've written a function which allows users to clean up the pending list.
> It takes the index name and returns the number of pending list pages
> deleted.
>
> # select * from gin_clean_pending_list('foo_text_array_idx');
>  gin_clean_pending_list
> 
> 278
> (1 row)
>
> Time: 31994.880 ms
>
>
> This is needed because there needs to be a way to offload this duty from
> the user backends, and the only other way to intentionally clean up the list
> is by vacuum (and the rest of a vacuum can take days to run on a large
> table).  Autoanalyze will also do it, but it's hard to arrange for those to
> occur at need, and unless you drop default_statistics_target very low they
> can also take a long time.  And if you do lower the target, it screws up
> your statistics, of course.
>
> I've currently crammed it into pageinspect, simply because that is where I
> found the existing code which I used as an exemplar for writing this code.
>
> But where does this belong?  Core?  Its own separate extension?
>

For a long time we have had the gevel extension (
http://www.sigaev.ru/git/gitweb.cgi?p=gevel.git;a=summary), which was
originally developed to help us debug indexes, but it also contains
user-facing functions.  Your function could go there, but I have no firm
opinion about gevel itself. Probably it should be restructured and moved to
contrib. Do you have time to look into it?



>
> Cheers,
>
> Jeff
>
>
>
>


Re: [HACKERS] jsonb array-style subscripting

2015-08-18 Thread Oleg Bartunov
On Mon, Aug 17, 2015 at 11:26 PM, Peter Geoghegan  wrote:

> On Mon, Aug 17, 2015 at 12:26 PM, Merlin Moncure 
> wrote:
> > ...is a good idea. postgres operators tend to return immutable copies
> > of the item they are referring to.
>
> This patch does not add an operator at all, actually. If feels like
> there ought to be an operator, but in fact there is not. The parser is
> hard-coded to recognize array-style subscripts, which this uses.
>
> While I'm certainly glad that Dmitry took the time to work on this, I
> think we will need an operator, too. Or, more accurately, there should
> probably be a way to make something like this use some available GIN
> index:
>
> postgres=# explain analyze select * from testjsonb where p['a'] = '[1]';
>  QUERY PLAN
>
> -
>  Seq Scan on testjsonb  (cost=0.00..27.00 rows=7 width=32) (actual
> time=0.022..0.023 rows=1 loops=1)
>Filter: (p['a'] = '[1]'::jsonb)
>  Planning time: 0.070 ms
>  Execution time: 0.054 ms
> (4 rows)
>
> This doesn't really matter with arrays, but ISTM that it matters here.
> I have no strong feelings on how it should work, but certain things do
> seem to suggest themselves. For example, maybe the parser can be made
> to create a query tree that uses an indexable operator based on
> special-case logic. Although maybe that's a kludge too far, since I
> can imagine it breaking other legitimate things. My sense is that this
> will need to be discussed.
>

Peter, we are thinking about better indexing of subselects; let's first
get the syntactic sugar that Dmitry implemented into core.
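For context, this is the syntax under discussion, based on the examples
earlier in the thread (the assignment form is an assumption about where the
patch could head, not something shown above):

```sql
-- Array-style subscripting on jsonb, as proposed in Dmitry's patch.
CREATE TABLE testjsonb (p jsonb);
INSERT INTO testjsonb VALUES ('{"a": [1], "b": 2}');

SELECT p['a'] FROM testjsonb;                   -- reads the "a" member
SELECT * FROM testjsonb WHERE p['a'] = '[1]';   -- currently a seq scan, per above

-- Hypothetical write form:
UPDATE testjsonb SET p['b'] = '3';
```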



>
> --
> Peter Geoghegan
>
>
>


Re: [HACKERS] Proposal: Implement failover on libpq connect level.

2015-08-19 Thread Oleg Bartunov
On Wed, Aug 19, 2015 at 4:46 PM, Andres Freund  wrote:

> On 2015-08-19 09:41:32 -0400, Tom Lane wrote:
> > In fact, they'd still need to use DNS balancing for Postgres,
> > because not everything connects with libpq (think JDBC for instance).
>
> It already does support this though.
>
> https://jdbc.postgresql.org/documentation/head/connect.html :
>
> > Connection Fail-over
> >
> > To support simple connection fail-over it is possible to define multiple
> > endpoints (host and port pairs) in the connection url separated by
> > commas. The driver will try to once connect to each of them in order
> > until the connection succeeds. If none succeed, a normal connection
> > exception is thrown.
> >
> > The syntax for the connection url is:
> >
> > jdbc:postgresql://host1:port1,host2:port2/database
>

Yes, I also wanted to show this, but you were quicker.


>
>
> > So I think we ought to reject this proposal, full stop.  I see no
> > reason to re-invent this wheel, and there are good reasons not to.
>
> I don't really buy this argument. Allowing to connect to several
> endpoints isn't exactly "new tech" either. A lot of database connectors
> do support something very close to the above pgjdbc feature.
>


mysql, for example.
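For comparison, here is the pgjdbc syntax quoted above next to a sketch of
what an equivalent libpq-level connection URL might look like under this
proposal. The libpq form is hypothetical; no such syntax exists at the time
of this thread:

```
jdbc:postgresql://host1:5432,host2:5432/mydb    -- existing pgjdbc fail-over syntax
postgresql://host1:5432,host2:5432/mydb         -- hypothetical libpq equivalent
```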


>
> Greetings,
>
> Andres Freund
>
>
>


Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Oleg Bartunov
On Sun, Aug 30, 2015 at 5:31 PM, Simon Riggs  wrote:

> On 30 August 2015 at 03:17, Bruce Momjian  wrote:
>
>> I have recently increased my public statements about the idea of adding
>> horizontal scaling/sharding to Postgres.
>
>
> Glad to see it. Many people have been pushing such things for years, so it
> is good to finally see some debate about this on Hackers.
>
>
>> I wanted to share with hackers
>> a timeline of how we got here, and where I think we are going in the
>> short term:
>>
>> 2012-2013:  As part of writing my scaling talk
>> (http://momjian.us/main/presentations/overview.html#scaling), studying
>> Oracle RAC, and talking to users, it became clear that an XC-like
>> architecture (sharding) was the only architecture that was going to allow
>> for write scaling.
>>
>
> What other architectures were discussed? Where was that discussion?
>
>
>> Users and conference attendees I talked to were increasingly concerned
>> about the ability of Postgres to scale for high write volumes.  They
>> didn't
>> necessarily need that scale now, but they needed to know they could get
>> it if they wanted it, and wouldn't need to switch to a new database in
>> the future.  This is similar to wanting a car that can get you on a
>> highway
>> on-ramp fast --- even if you don't need it, you want to know it is there.
>>
>
> +1
>
>
>> 2014:  I started to shop around the idea that we could use FDWs,
>> parallelism, and a transaction/snapshot manager to get XC features
>> as built-in to Postgres.  (I don't remember where the original idea
>> came from.)  It was clear that having separate forks of the source code
>> in XC and XL was never going to achieve critical mass --- there just
>> aren't enough people who need high write scale right now, and the fork
>> maintenance overhead is a huge burden.
>>
>
> I personally support the view that we should put scalability features into
> Postgres core, rather than run separate forks.
>
>
>> I realized that we would never get community acceptance to dump the XC
>> (or XL) code needed for sharding into community Postgres
>
>
> How or why did you realize that? There has never been any such discussion,
> AFAIK. Surely it can be possible to move required subsystems across?
>
>
>> , but with FDWs,
>> we could add the features as _part_ of improving FDWs, which would benefit
>> FDWs _and_ would be useful for sharding.  (We already see some of those
>> FDW features in 9.5.)
>>
>
> That is a huge presumption. Not discussed or technically analyzed in any
> way with the community.
>
>
>> October, 2014:  EDB and NTT started working together in the community
>> to start improving FDWs as a basis for an FDW-based sharding solution.
>> Many of the 9.5 FDW improvements that also benefit sharding were developed
>> by a combined EDB/NTT team.  The features improved FDWs independent of
>> sharding, so they didn't need community buy-in on sharding to get them
>> accepted.
>>
>> June, 2015:  I attended the PGCon sharding unconference session and
>> there was a huge discussion about where we should go with sharding.
>> I think the big take-away was that most people liked the FDW approach,
>> but had business/customer reasons for wanting to work on XC or XL because
>> those would be production-ready faster.
>>
>
> Cough, cough. You must surely be joking that "most people liked the FDW
> approach"? How did we measure the acceptance of this approach?
>
> What actually is the FDW approach? Since its not been written down
> anywhere, or even explained verbally, how can anyone actually agree to it?
>
>
>> July, 2015:  Oleg Bartunov and his new company Postgres Professional (PP)
>> started to think about joining the FDW approach, rather than working on
>> XL, as they had stated at PGCon in June.  A joint NTT/EDB/PP phone-in
>> meeting is scheduled for September 1.
>>
>
>

A little correction about Postgres Professional. We are concentrating on
the idea of a single distributed transaction manager, originally called DTM,
now renamed XTM, which is neutral to the actual cluster implementation. For
example, we are testing it with XL, ported to 9.4, but we were planning to
extend the tests to pg_shard and postgres_fdw. My idea was to have at least
XTM committed to 9.6, so that all parties could work on their
implementations much more easily.



> August, 2015:  While speaking at SFPUG, Citus Data approached me about
>> joining the FDW sharding team.  They have been invited to the September
>> 1 meeting, as have the XC and XL people.

Re: [HACKERS] Horizontal scalability/sharding

2015-08-31 Thread Oleg Bartunov
On Mon, Aug 31, 2015 at 5:48 AM, Bruce Momjian  wrote:

> On Sun, Aug 30, 2015 at 10:08:06PM -0400, Bruce Momjian wrote:
> > On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote:
> > > Well, I have had many such discussions with XC/XL folks, and that
> was my
> > > opinion.  I have seen almost no public discussion about this
> because the
> > > idea had almost no chance of success.  If it was possible, someone
> would
> > > have already suggested it on this list.
> > >
> > >
> > > Or perhaps people invested in this area had other obligations or lacked
> > > motivation and/or time to work to push up for things in core. That's
> not
> > > possible to know, and what is done is done.
> >
> > Well, I have talked to everyone privately about this, and concluded that
> > while horizontal scalability/sharding is useful, it is unlikely that the
> > code volume of something like XC or XL would be accepted into the
> > community, and frankly, now that we have FDWs, it is hard to imagine why
> > we would _not_ go in the FDW direction.
>
> Actually, there was hope that XC or XL would get popular enough that it
> would justify adding their code into community Postgres, but that never
> happened.
>

AFAIK, XC/XL already has some customers, and that puts additional pressure
on their development team, which is now called X2. I don't know exactly how
Huawei's internal MPPDB is connected to XC/XL.

We need a community test suite for clusters, and our company is working on
this. It's non-trivial work, but the community will never accept any cluster
solution without thorough testing of functionality and performance. Our
XC/XL experience was not good.



>
> --
>   Bruce Momjian  http://momjian.us
>   EnterpriseDB http://enterprisedb.com
>
>   + Everyone has their own god. +
>
>
>


Re: [HACKERS] Horizontal scalability/sharding

2015-09-02 Thread Oleg Bartunov
On Tue, Sep 1, 2015 at 7:08 PM, Robert Haas  wrote:

> On Tue, Sep 1, 2015 at 12:00 AM, Pavan Deolasee
>  wrote:
> > My worry is that if we start implementing them again from scratch, it
> will
> > take a few years before we get them in a usable state. What XC/XL lacked
> is
> > probably a Robert Haas or a Tom Lane who could look at the work and
> suggest
> > major edits. If that had happened, the quality of the product could have
> > been much better today. I don't mean to derate the developers who worked
> on
> > XC/XL, but there is no harm in accepting that if someone with a much
> better
> > understanding of the whole system was part of the team, that would have
> > positively impacted the project. Is that an angle worth exploring? Does
> it
> > make sense to commit some more resources to say XC or XL and try to
> improve
> > the quality of the product even further? To be honest, XL is in far far
> > better shape (haven't really tried XC in a while) and some more
> QA/polishing
> > can make it production ready much sooner.
>
> From my point of view, and EnterpriseDB's point of view, anything that
> doesn't go into the core PostgreSQL distribution isn't really getting
> us where we need to be.  If there's code in XL that would be valuable
> to merge into core PostgreSQL, then let's do it.  If the code cannot
> be used but there are lessons we can learn that will make what does go
> into core PostgreSQL better, let's learn them.  However, I don't think
> it's serving anybody very well that we have the XC fork, and multiple
> forks of the XC fork, floating around out there and people are working
> on those instead of working on core PostgreSQL.  The reality is that
> we don't have enough brainpower to spread it across 2 or 3 or 4 or 5
> different projects and have all of them be good.  The reality is,
> also, that horizontal scalability isn't an optional feature.  There
> was a point in time at which the PostgreSQL project's official policy
> on replication was that it did not belong in core.  That was a bad
> policy; thankfully, it was reversed, and the result was Hot Standby
> and Streaming Replication, incredibly important technologies without
> which we would not be where we are today. Horizontal scalability is
> just as essential.
>

Agree with you, Robert.

One lesson we got from XL is that we need a testing framework for clusters,
so that any cluster project at least passes functional and performance
testing. XL was very easy to break, and I wonder how many corner cases
still exist. We tried several other approaches, and while reading the
papers was fun, in practice we found many devilish details, which made the
paper just a paper.



>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


[HACKERS] Fwd: [Snowball-discuss] New website

2015-09-07 Thread Oleg Bartunov
The Snowball stemmer has moved to the new site http://snowballstem.org; I made
a simple patch for master. I'm not sure if it's OK for 9.5.


-- Forwarded message --
From: Olly Betts 
Date: Mon, Sep 7, 2015 at 1:46 AM
Subject: [Snowball-discuss] New website
To: snowball-disc...@lists.tartarus.org


On Wed, Mar 25, 2015 at 01:22:40AM +, Richard Boulton wrote:
>  - The website at http://snowball.tartarus.org/ will be preserved in
> essentially its current state, for the foreseeable future. Snowball has
> been referenced in many academic papers and other places, so we feel it
> important to preserve the website in essentially its current state.

I have set up a new website for snowball:

http://snowballstem.org/

The content is forked from the old website.

Currently there are some dead internal links I need to fix (to things
like the source code of the stemmers), and the demo page isn't
operational.  Please report any other problems.

I'd encourage people to update external links to "the snowball project"
to point to the new site.

Cheers,
Olly

___
Snowball-discuss mailing list
snowball-disc...@lists.tartarus.org
http://lists.tartarus.org/mailman/listinfo/snowball-discuss


textsearch-doc.patch
Description: Binary data



Re: [HACKERS] nested hstore patch

2014-01-14 Thread Oleg Bartunov
Erik,

thanks for the docs fixes, we have even more :)

Oleg

On Tue, Jan 14, 2014 at 4:18 AM, Erik Rijkers  wrote:
> On Mon, January 13, 2014 18:30, Andrew Dunstan wrote:
>>
>>
>> On 01/13/2014 11:16 AM, Oleg Bartunov wrote:
>>> Andrew,
>>>
>>> did you run perl script ? Actually, I found, that operator table needs
>>> to be fixed.
>>>
>>
>> No. My build machine doesn't actually have DBD::Pg installed. Can you
>> send me a patch if you don't want to push it yourself, or maybe Erik can
>> send a pacth top adjust the table.
>>
>
>> [ nested_hstore_and_jsonb-2.patch ]
>
> ( centos 6.5, gcc 4.8.2. )
>
> The patch applies & compiles with warnings (see below).
>
> The opr_sanity test fails during make check: regression.diffs attached.
>
> Also attached are changes to hstore.sgml, to operator + functions table, plus 
> some typos.
>
> Thanks,
> Erik Rijkers
>
>
> make
>
> jsonfuncs.c: In function ‘each_object_field_end_jsonb’:
> jsonfuncs.c:1328:7: warning: assignment from incompatible pointer type 
> [enabled by default]
>val = DatumGetPointer(DirectFunctionCall1(jsonb_in, 
> CStringGetDatum(cstr)));
>^
> jsonfuncs.c: In function ‘elements_array_element_end_jsonb’:
> jsonfuncs.c:1530:8: warning: assignment from incompatible pointer type 
> [enabled by default]
>   jbval = DatumGetPointer(DirectFunctionCall1(jsonb_in, 
> CStringGetDatum(cstr)));
> ^
>
>
> make contrib:
>
> hstore_io.c: In function ‘array_to_hstore’:
> hstore_io.c:1694:30: warning: ‘result’ may be used uninitialized in this 
> function [-Wmaybe-uninitialized]
>   PG_RETURN_POINTER(hstoreDump(result));
>
>
>
>
>




Re: [HACKERS] nested hstore patch - FailedAssertion("!(value->array.nelems == 1)

2014-01-14 Thread Oleg Bartunov
It doesn't crash in the latest version in our repository.

=# select  'x'::hstore || ('a=>"1"':: hstore) ;
   ?column?
---
 "x", "a", "1"
(1 row)



On Wed, Jan 15, 2014 at 3:53 AM, Erik Rijkers  wrote:
> On Mon, January 13, 2014 16:36, Andrew Dunstan wrote:
>> A new version of the patch is attached. It includes all of Erik's docs
>
>> [ nested_hstore_and_jsonb-2.patch ]
>
>
> This crashes the server:
>
> testdb=#  select  'x' || ('a=>"1"':: hstore) ;
> The connection to the server was lost. Attempting reset: Failed.
>
> logging:
> TRAP: FailedAssertion("!(value->array.nelems == 1)", File: "jsonb_support.c", 
> Line: 904)
> 2014-01-15 00:32:01.854 CET 1206 LOG:  server process (PID 3918) was 
> terminated by signal 6: Aborted
> 2014-01-15 00:32:01.854 CET 1206 DETAIL:  Failed process was running: select  
> 'x' || ('a=>"1"':: hstore) ;
>
>
> Btw, I find it strange that:
>
> testdb=#  select  ('a=>""':: hstore) #%> '{a}' ;
>  ?column?
> --
>  ""
> (1 row)
>
> so that:
>
> Time: 0.641 ms
> testdb=#  select ( ('a=>""':: hstore) #%> '{a}' ) = '' ;
>  ?column?
> --
>  f
> (1 row)
>
> testdb=#  select ( ('a=>""':: hstore) #%> '{a}' ) = '""' ;
>  ?column?
> --
>  t
> (1 row)
>
> Maybe there is a rationale, but it seems to me that
>('a=>""':: hstore) #%> '{a}'
> should deliver the empty string '', and not two double quotes.
>
>
> Thanks,
>
> Erik Rijkers
>
>
>
>
>
>
>
>




Re: [HACKERS] nested hstore patch - FailedAssertion("!(value->array.nelems == 1)

2014-01-15 Thread Oleg Bartunov
https://github.com/feodor/postgres


On Wed, Jan 15, 2014 at 11:10 AM, Erik Rijkers  wrote:
> On Wed, January 15, 2014 08:01, Oleg Bartunov wrote:
>> It doesn't crashed in the last version in our repository.
>>
>> =# select  'x'::hstore || ('a=>"1"':: hstore) ;
>>?column?
>> ---
>>  "x", "a", "1"
>> (1 row)
>>
>
> OK, shall I use that repository instead of the latest posted patch?
> No point in testing old code ( I used nested_hstore_and_jsonb-2.patch ).
>
> Could you send a link to where that repository is?
>
> ( btw, your query is not quite the same as the one I used:
>   select  'x' || ('a=>"1"':: hstore)
> but your query also crashes my server here so I suppose
> it triggers the same bug )
>
>
> thanks,
>
> Erik Rijkers
>




Re: [HACKERS] jsonb and nested hstore

2014-01-31 Thread Oleg Bartunov
Hmm,
neither I nor Teodor have experience and knowledge with
populate_record(), and moreover hstore here is virgin territory and we don't
know the right behaviour, so I think we'd better take it from jsonb once
Andrew implements it. Andrew?

On Fri, Jan 31, 2014 at 4:52 AM, Andrew Dunstan  wrote:
>
> On 01/30/2014 07:21 PM, Merlin Moncure wrote:
>
>> Something seems off:
>>
>> postgres=# create type z as (a int, b int[]);
>> CREATE TYPE
>> postgres=# create type y as (a int, b z[]);
>> CREATE TYPE
>> postgres=# create type x as (a int, b y[]);
>> CREATE TYPE
>>
>> -- test a complicated construction
>> postgres=# select row(1, array[row(1, array[row(1,
>> array[1,2])::z])::y])::x;
>>   row
>>
>> -
>>
>> (1,"{""(1,\\""{\\""\\""(1,\\""\\""{1,2}\\""\\"")\\""\\""}\\"")""}")
>>
>> postgres=# select hstore(row(1, array[row(1, array[row(1,
>> array[1,2])::z])::y])::x);
>>  hstore
>>
>> --
>>   "a"=>1,
>> "b"=>"{\"(1,\\\"{\\\"\\\"(1,\\\"\\\"{1,2}\\\"\\\")\\\"\\\"}\\\")\"}"
>>
>> here, the output escaping has leaked into the internal array
>> structures.  istm we should have a json expressing the internal
>> structure.
>
>
> What has this to do with json at all? It's clearly a failure in the hstore()
> function.
>
>
>
>>It does (weirdly) map back however:
>>
>> postgres=# select populate_record(null::x, hstore(row(1, array[row(1,
>> array[row(1, array[1,2])::z])::y])::x));
>> populate_record
>>
>> -
>>
>> (1,"{""(1,\\""{\\""\\""(1,\\""\\""{1,2}\\""\\"")\\""\\""}\\"")""}")
>>
>>
>> OTOH, if I go via json route:
>>
>> postgres=# select row_to_json(row(1, array[row(1, array[row(1,
>> array[1,2])::z])::y])::x);
>>row_to_json
>> ---
>>   {"a":1,"b":[{"a":1,"b":[{"a":1,"b":[1,2]}]}]}
>>
>>
>> so far, so good.  let's push to hstore:
>> postgres=# select row_to_json(row(1, array[row(1, array[row(1,
>> array[1,2])::z])::y])::x)::jsonb::hstore;
>>row_to_json
>> ---
>>   "a"=>1, "b"=>[{"a"=>1, "b"=>[{"a"=>1, "b"=>[1, 2]}]}]
>>
>> this ISTM is the 'right' behavior.  but what if we bring it back to
>> record object?
>>
>> postgres=# select populate_record(null::x, row_to_json(row(1,
>> array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb::hstore);
>> ERROR:  malformed array literal: "{{"a"=>1, "b"=>{{"a"=>1, "b"=>{1,
>> 2}"
>>
>> yikes. The situation as I read it is that (notwithstanding my comments
>> upthread) there is no clean way to slide rowtypes to/from hstore and
>> jsonb while preserving structure.  IMO, the above query should work
>> and the populate function record above should return the internally
>> structured row object, not the text escaped version.
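The escaping blow-up Merlin observes can be reproduced outside the database. A minimal Python sketch, where `pg_quote` is a simplified stand-in for PostgreSQL's composite/array text quoting rules, not the actual implementation:

```python
import json

def pg_quote(s):
    # Simplified stand-in for PostgreSQL composite/array text quoting:
    # wrap in double quotes, backslash-escaping backslashes and quotes.
    return '"' + s.replace('\\', '\\\\').replace('"', '\\"') + '"'

# Each nesting level re-escapes the already-escaped string, so the
# number of backslashes grows geometrically with depth:
v = "x"
counts = []
for _ in range(3):
    v = pg_quote(v)
    counts.append(v.count('\\'))
print(counts)  # [0, 2, 8]

# A structured format (what jsonb keeps internally) nests without
# re-escaping anything:
j = json.dumps([[["x"]]])
print(j, j.count('\\'))  # [[["x"]]] 0
```

This is why round-tripping deeply nested rowtypes through a flat text representation degrades so quickly, while a structured representation stays stable.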
>
>
>
> And this is a failure in populate_record().
>
> I think we possibly need to say that handling of nested composites and
> arrays is an area that needs further work. OTOH, the refusal of
> json_populate_record() and json_populate_recordset() to handle these in 9.3
> has not generated a flood of complaints, so I don't think it's a tragedy,
> just a limitation, which should be documented if it's not already. (And of
> course hstore hasn't handled nested anything before now.)
>
> Meanwhile, maybe Teodor can fix the two hstore bugs shown here.
>
> cheers
>
> andrew
>
>
>


Re: [HACKERS] GIN improvements part2: fast scan

2014-02-02 Thread Oleg Bartunov
Tomas, it'd be nice if you used real data in your testing.

One very good application of GIN fast-scan is a dramatic performance
improvement of the hstore/jsonb @> operator; see slides 57-58 of
http://www.sai.msu.su/~megera/postgres/talks/hstore-dublin-2013.pdf.
I'd like not to lose this benefit :)

Oleg

PS. I used data delicious-rss-1250k.gz from
http://randomwalker.info/data/delicious/

On Mon, Feb 3, 2014 at 5:44 AM, Tomas Vondra  wrote:
> On 3.2.2014 00:13, Tomas Vondra wrote:
>> On 2.2.2014 11:45, Heikki Linnakangas wrote:
>>> On 01/30/2014 01:53 AM, Tomas Vondra wrote:
 (3) A file with explain plans for 4 queries suffering ~2x slowdown,
  and explain plans with 9.4 master and Heikki's patches is available
  here:

http://www.fuzzy.cz/tmp/gin/queries.txt

  All the queries have 6 common words, and the explain plans look
  just fine to me - exactly like the plans for other queries.

  Two things now caught my eye. First some of these queries actually
  have words repeated - either exactly like "database & database" or
  in negated form like "!anything & anything". Second, while
  generating the queries, I use "dumb" frequency, where only exact
  matches count. I.e. "write != written" etc. But the actual number
  of hits may be much higher - for example "write" matches exactly
  just 5% documents, but using @@ it matches more than 20%.

  I don't know if that's the actual cause though.
>>>
>>> Ok, here's another variant of these patches. Compared to git master, it
>>> does three things:
>>>
>>> 1. It adds the concept of ternary consistent function internally, but no
>>> catalog changes. It's implemented by calling the regular boolean
>>> consistent function "both ways".
>>>
>>> 2. Use a binary heap to get the "next" item from the entries in a scan.
>>> I'm pretty sure this makes sense, because arguably it makes the code
>>> more readable, and reduces the number of item pointer comparisons
>>> significantly for queries with a lot of entries.
>>>
>>> 3. Only perform the pre-consistent check to try skipping entries, if we
>>> don't already have the next item from the entry loaded in the array.
>>> This is a tradeoff, you will lose some of the performance gain you might
>>> get from pre-consistent checks, but it also limits the performance loss
>>> you might get from doing useless pre-consistent checks.
>>>
>>> So taken together, I would expect this patch to make some of the
>>> performance gains less impressive, but also limit the loss we saw with
>>> some of the other patches.
>>>
>>> Tomas, could you run your test suite with this patch, please?
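Heikki's point 1, deriving a ternary (true / false / maybe) result by calling the boolean consistent function "both ways", can be sketched roughly like this (the names are illustrative, not the actual GIN API):

```python
def ternary_consistent(bool_consistent, known, unknown):
    # Call the boolean consistent function twice: once with every
    # unknown entry assumed absent (False), once assumed present (True).
    # If both calls agree, the answer is definite; otherwise MAYBE.
    lo = bool_consistent({**known, **{k: False for k in unknown}})
    hi = bool_consistent({**known, **{k: True for k in unknown}})
    return lo if lo == hi else "MAYBE"

# Query "a AND b", where entry b has not been fetched for this item yet:
query = lambda e: e["a"] and e["b"]
r1 = ternary_consistent(query, {"a": False}, ["b"])  # definitely False
r2 = ternary_consistent(query, {"a": True}, ["b"])
print(r1, r2)  # False MAYBE
```

A definite False means the item can be skipped without fetching the remaining entries, which is the basis of the pre-consistent check discussed here.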
>>
>> Sure, will do. Do I get it right that this should be applied instead of
>> the four patches you've posted earlier?
>
> So, I was curious and did a basic testing - I've repeated the tests on
> current HEAD and 'HEAD with the new patch'. The complete data are
> available at [http://www.fuzzy.cz/tmp/gin/gin-scan-benchmarks.ods] and
> I've updated the charts at [http://www.fuzzy.cz/tmp/gin/] too.
>
> Look for branches named 9.4-head-2 and 9.4-heikki-2.
>
> To me it seems that:
>
> (1) The main issue was that with common words, it used to be much
> slower than HEAD (or 9.3). This seems to be fixed, i.e. it's not
> slower than before. See
>
>   http://www.fuzzy.cz/tmp/gin/3-common-words.png (previous patch)
>   http://www.fuzzy.cz/tmp/gin/3-common-words-new.png (new patch)
>
> for comparison vs. 9.4 HEAD. With the new patch there's no slowdown,
> which seems nice. Compared to 9.3 it looks like this:
>
>   http://www.fuzzy.cz/tmp/gin/3-common-words-new-vs-93.png
>
> so there's a significant speedup (thanks to the modified layout).
>
> (2) The question is whether the new patch works fine on rare words. See
> this for comparison of the patches against HEAD:
>
>   http://www.fuzzy.cz/tmp/gin/3-rare-words.png
>   http://www.fuzzy.cz/tmp/gin/3-rare-words-new.png
>
> and this is the comparison of the two patches:
>
>   http://www.fuzzy.cz/tmp/gin/patches-rare-words.png
>
> That seems fine to me - some queries are slower, but we're talking
> about queries taking 1 or 2 ms, so the measurement error is probably
> the main cause of the differences.
>
> (3) With higher numbers of frequent words, the differences (vs. HEAD or
> the previous patch) are not that dramatic as in (1) - the new patch
> is consistently by ~20% faster.
>
> Tomas
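Heikki's point 2, using a binary heap to pick the "next" item across many entry streams, can be sketched as a toy model with sorted integers standing in for item pointers:

```python
import heapq

def merge_next_items(entry_streams):
    # Keep the current head of each sorted stream in a binary heap, so
    # finding the minimum "next" item costs O(log n) comparisons per
    # step instead of scanning all n entry heads every time.
    iters = [iter(s) for s in entry_streams]
    heap = []
    for i, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heap.append((first, i))
    heapq.heapify(heap)
    while heap:
        item, i = heapq.heappop(heap)
        yield item
        nxt = next(iters[i], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt, i))

streams = [[1, 4, 9], [2, 4, 7], [3, 8]]
merged = list(merge_next_items(streams))
print(merged)  # [1, 2, 3, 4, 4, 7, 8, 9]
```

This is only a sketch of the data structure choice, not the patch itself; the real code also has to deduplicate equal item pointers and interact with the consistent checks.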
>
>


Re: [HACKERS] jsonb and nested hstore

2014-02-04 Thread Oleg Bartunov
Andrew provided us with more information and we'll work on recv. What
do people think about testing this stuff?  Btw, we don't have any
regression tests for this.

Oleg

On Wed, Feb 5, 2014 at 2:03 AM, Josh Berkus  wrote:
> On 02/03/2014 07:27 AM, Andres Freund wrote:
>> On 2014-02-03 09:22:52 -0600, Merlin Moncure wrote:
 I lost my stomach (or maybe it was the glass of red) somewhere in the
 middle, but I think this needs a lot of work. Especially the io code
 doesn't seem ready to me. I'd consider ripping out the send/recv code
 for 9.4, that seems the biggest can of worms. It will still be usable
 without.
>>>
>>> Not having type send/recv functions is somewhat dangerous; it can
>>> cause problems for libraries that run everything through the binary
>>> wire format.  I'd give jsonb a pass on that, being a new type, but
>>> would be concerned if hstore had that ability revoked.
>>
>> Yea, removing it for hstore would be a compat problem...
>>
>>> offhand note: hstore_send seems pretty simply written and clean; it's
>>> a simple nonrecursive iterator...
>>
>> But a send function is pretty pointless without the corresponding recv
>> function... And imo recv simply is to dangerous as it's currently
>> written.
>> I am not saying that it cannot be made work, just that it's still nearly
>> as ugly as when I pointed out several of the dangers some weeks back.
>
> Oleg, Teodor, any comments on the above?
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
>


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Oleg Bartunov
Yes, the repository you mentioned is the latest version of our
development. It contains various fixes for issues raised by Andres, but we
are waiting for Andrew, who is working on the jsonb stuff.

On Mon, Feb 24, 2014 at 5:34 PM, Tomas Vondra  wrote:
> On 7.2.2014 00:47, Andrew Dunstan wrote:
>>
>> On 02/05/2014 10:36 AM, Teodor Sigaev wrote:
>>>  Should I make new version of patch? Right now it's placed on github.
>>> May be Andrew wants to change something?
>>>
>>
>>
>> Attached are updated patches.
>>
>> Apart from the things Teodor has fixed, this includes
>>
>>  * switching to using text representation in jsonb send/recv
>>  * implementation of jsonb_array_elements_text that we need now we have
>>json_array_elements_text
>>  * some code fixes requested in code reviews, plus some other tidying
>>and refactoring.
>>
>> cheers
>
> Hi,
>
> I'm slightly uncertain if this is the current version of the patches, or
> whether I should look at
> https://github.com/feodor/postgres/tree/jsonb_and_hstore which contains
> slightly modified code.
>
> Anyway, the only thing I noticed in the v10 version so far is slight
> difference in naming - while we have json_to_hstore/hstore_to_json, we
> have jsonb2hstore/hstore2jsonb. I propose to change this to
> jsonb_to_hstore/hstore_to_jsonb.
>
> May not be needed if the implicit casts go through.
>
> regards
> Tomas
>
>


Re: [HACKERS] jsonb and nested hstore

2014-03-03 Thread Oleg Bartunov
Andres,

you can always look at our development repository:
https://github.com/feodor/postgres/tree/hstore - hstore only,
https://github.com/feodor/postgres/tree/jsonb_and_hstore - hstore with jsonb

Since we have been concentrating on the jsonb_and_hstore branch, we usually
wait for Andrew, who publishes the patch.  Your last issues were addressed
in both branches.

Oleg

PS.

We are not native English speakers and may not understand your criticism
well, but please try to be a little bit polite.  We are working
together and our common goal is to make postgres better.  Your notes
are very important for the quality of postgres, but sometimes you drive us
...

On Mon, Mar 3, 2014 at 7:00 PM, Andres Freund  wrote:
> On 2014-03-03 08:57:59 -0600, Merlin Moncure wrote:
>> On Fri, Feb 28, 2014 at 2:01 PM, Andres Freund  
>> wrote:
>> > On 2014-02-28 14:45:29 -0500, Andrew Dunstan wrote:
>> >> Well, the jsonb portion of this is arguably the most ready, certainly it's
>> >> had a lot more on-list review.
>> >
>> > Having crossread both patches I tend to agree with this. I don't think
>> > it's unrealistic to get jsonb committable, but the hstore bits are
>> > another story.
>>
>> hm, do you have any specific concerns/objections about hstore?
>
> I've listed a fair number in various emails, some have been addressed
> since I think. But just take a look at the patch, at least last when I
> looked, it was simply far from ready. And it's quite a bit of code, so
> it's not something that can be addressed within 5min.
>
> Greetings,
>
> Andres Freund
>
> --
>  Andres Freund http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>


Re: [HACKERS] jsonb and nested hstore

2014-03-03 Thread Oleg Bartunov
On Mon, Mar 3, 2014 at 7:22 PM, Andres Freund  wrote:
> Hi Oleg,
>
> On 2014-03-03 19:17:12 +0400, Oleg Bartunov wrote:
>> Since we were concentrated on the jsonb_and_hstore branch we usually
>> wait Andrew, who publish patch.  You last issues were addressed in
>> both branches.
>
> I'll try to have look sometime soon.
>
>> We are not native English speakers and may not understand your criticism
>> well, but please try to be a little bit polite.  We are working
>> together and our common goal is to make postgres better.  Your notes
>> are very important for the quality of postgres, but sometimes you drive us
>> ...
>
> I am sorry if I came over as impolite. I just tried to point at things I
> thought needed improvement, and imo there were quite some. A patch
> needing polishing isn't something that carries shame, blame or
> anything. It's just a state a patch can be in.

We don't have much time to dig into 100-message threads, and sometimes we
just lose direction.

>
> Greetings,
>
> Andres Freund
>
> PS: Not a native speaker either...

That explains it all :)


>
> --
>  Andres Freund http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] jsonb and nested hstore

2014-03-04 Thread Oleg Bartunov
Thanks, looks like a bug.

On Tue, Mar 4, 2014 at 12:38 PM, Peter Geoghegan  wrote:
> Hi Oleg,
>
> On Mon, Mar 3, 2014 at 7:17 AM, Oleg Bartunov  wrote:
>> you can always look at our development repository:
>
> I think I found a bug:
>
> [local]/postgres=# \d+ bar
>  Table "public.bar"
>  Column | Type  | Modifiers | Storage  | Stats target | Description
> +---+---+--+--+-
>  i  | jsonb |   | extended |  |
> Indexes:
> "f" gin (i)
> Has OIDs: no
>
> [local]/postgres=# insert into bar values ('{
> "firstName": "John",
> "lastName": "Smith",
> "age": 25,
> "address": {
> "streetAddress": "21 2nd Street",
> "city": "New York",
> "state": "NY",
> "postalCode": 10021
> },
> "phoneNumbers": [
> {
> "type": "home",
> "number": "212 555-1234"
> },
> {
> "type": "fax",
> "number": "646 555-4567"
> }
> ]
> }');
> INSERT 0 1
> Time: 7.635 ms
> [local]/postgres=# select * from bar where i @> '{"age":25.0}'::jsonb;
>  i
> ---
> (0 rows)
>
> Time: 2.443 ms
> [local]/postgres=# explain select * from bar where i @> '{"age":25.0}'::jsonb;
>QUERY PLAN
> -
>  Bitmap Heap Scan on bar  (cost=16.01..20.02 rows=1 width=32)
>Recheck Cond: ((i)::hstore @> '"age"=>25.0'::hstore)
>->  Bitmap Index Scan on f  (cost=0.00..16.01 rows=1 width=0)
>  Index Cond: ((i)::hstore @> '"age"=>25.0'::hstore)
>  Planning time: 0.161 ms
> (5 rows)
>
> [local]/postgres=# set enable_bitmapscan = off;
> SET
> Time: 6.052 ms
> [local]/postgres=# select * from bar where i @> '{"age":25.0}'::jsonb;
> -[ RECORD 1 ]
> i | {"age": 25, "address": {"city": "New York", "state": "NY",
> "postalCode": 10021, "streetAddress": "21 2nd Street"}, "lastName":
> "Smith", "firstName": "John", "phoneNumbers": [{"type": "home",
> "number": "212 555-1234"}, {"type": "fax", "number": "646 555-4567"}]}
>
> Time: 6.479 ms
> [local]/postgres=# explain select * from bar where i @> '{"age":25.0}'::jsonb;
>  QUERY PLAN
> -
>  Seq Scan on bar  (cost=0.00..26.38 rows=1 width=32)
>Filter: ((i)::hstore @> '"age"=>25.0'::hstore)
>  Planning time: 0.154 ms
> (3 rows)
>
> Time: 6.565 ms
>
> --
> Peter Geoghegan




Re: [HACKERS] jsonb and nested hstore

2014-03-04 Thread Oleg Bartunov
I tried try.mongodb.com

> 25 == 25.0
true

On Tue, Mar 4, 2014 at 2:18 PM, Peter Geoghegan  wrote:
> On Tue, Mar 4, 2014 at 2:18 AM, Teodor Sigaev  wrote:
>> That is because compareJsonbValue compares numeric values with a help of
>> numeric_cmp() instead of comparing text representation. This inconsistent
>> will be fixed.
>
> Cool.
>
>
> --
> Peter Geoghegan
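The inconsistency Teodor describes, text comparison in one code path versus numeric comparison in the other, is easy to demonstrate; Python's decimal module plays the role of numeric_cmp() here:

```python
from decimal import Decimal

# Compared as text, 25 and 25.0 are different keys, so a text-keyed
# index lookup for {"age": 25.0} misses a stored {"age": 25}.
text_equal = ("25" == "25.0")
print(text_equal)  # False

# Compared as numerics (what numeric_cmp() does), they are equal,
# which is why the sequential scan finds the row above.
numeric_equal = (Decimal("25") == Decimal("25.0"))
print(numeric_equal)  # True
```

Whichever convention is chosen, the index keys and the recheck/scan comparison have to agree, or queries return different results depending on the plan.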




Re: [HACKERS] jsonb and nested hstore

2014-03-06 Thread Oleg Bartunov
On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan  wrote:
> On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev  wrote:
>> That's possible to introduce GUC variable for i/o functions which will
>> control old "bug-to-bug" behavior. IMHO, this is much better option that
>> stopping hstore development or split hstore to two branches.
>
> A GUC that controls i/o functions is generally considered to be an
> unacceptable hack.
>
> In what sense are we really stopping hstore development if hstore2
> lives as jsonb? I have a hard time imagining someone dealing with the
> incompatibility that a user-facing hstore2 would introduce, while
> still preferring hstore syntax over json syntax given the choice.
> There are very rich facilities for manipulating json available in
> every programming language. The same is not true of hstore.
>
> Having looked at the issue today, I think that the amount of redundant
> code between a hstore2 in core as jsonb and hstore1 will be
> acceptable. The advantages of making a clean-break in having to
> support the legacy hstore disk format strengthen the case for doing so
> too.

Heh, let's not make an impulsive decision about hstore2. I agree that jsonb
has a lot of facilities, but don't forget that json(b) has to follow the
standard, and in that sense it's more constrained than hstore, which we
could develop further to support some interesting features that will never
be implemented in json(b).  Also, it'd be a bit awkward, after working on
nested hstore and presenting it at several conferences (Engine Yard has
sponsored part of our hstore work), to suddenly break people's expectations
and say that our work has moved to core to give json some very cool
features; good bye, hstore users :(   I'm afraid people will not
understand us.




Re: [HACKERS] jsonb and nested hstore

2014-03-06 Thread Oleg Bartunov
Hi there,

Looks like the consensus is settled. Teodor and I are not happy with it, but
what can we do :)   One thing I want to do is to preserve credit for our
contribution to the flagship feature (jsonb), particularly: "binary
storage for nested structures and indexing. Their work was sponsored
by Engine Yard".

As for the old hstore, I think it'd be nice to add gin_hstore_hash_ops,
so hstore users will benefit from the 9.4 release. There is no
compatibility issue, so I think this would be harmless.

Oleg

On Thu, Mar 6, 2014 at 7:25 PM, Tom Lane  wrote:
> Bruce Momjian  writes:
>> OK, just to summarize:
>
>>   JSONB and everything it shares with hstore will be in core
>>   hstore-specific code stays in contrib
>>   hstore contrib will create an hstore type to call contrib and core code
>>   9.4 hstore has some differences from pre-9.4
>
> I've got a problem with the last part of that.  AFAICS, the value
> proposition for hstore2 largely fails if it's not 100% upward compatible
> with existing hstore, both as to on-disk storage and as to application-
> visible behavior.  If you've got to adapt your application anyway, why
> not switch to JSONB which is going to offer a lot of benefits in terms
> of available code you can work with?
>
> Although I've not looked at the patch, it was claimed upthread that there
> were changes in the I/O format for existing test cases, for example.
> IMO, that's an absolute dead no-go.
>
>> The question is whether we change/improve hstore in 9.4, or create an
>> hstore2 that is the improved hstore for 9.4 and keep hstore identical to
>> pre-9.4.  That last option looks an awful like the dreaded VARCHAR2.
>
> I think hstore2 as a separate type isn't likely to be a win either.
>
> The bottom line here is that hstore2 is more or less what we'd agreed to
> doing back at the last PGCon, but that decision has now been obsoleted by
> events in the JSON area.  If jsonb gets in, I think we probably end up
> rejecting hstore2 as such.  Or at least, that's what we should do IMO.
> contrib/hstore is now a legacy type and we shouldn't be putting additional
> work into it, especially not work that breaks backwards compatibility.
>
> regards, tom lane




Re: [HACKERS] jsonb and nested hstore

2014-03-06 Thread Oleg Bartunov
I meant in the Release Notes for 9.4.

On Thu, Mar 6, 2014 at 10:26 PM, Andrew Dunstan  wrote:
>
>
>
>
>
> On 03/06/2014 12:50 PM, Oleg Bartunov wrote:
>>
>> Hi there,
>>
>> Looks like consensus is done. I and Teodor are not happy with it, but
>> what we can do :)   One thing I  want to do is to reserve our
>> contribution to the flagship feature (jsonb), particularly, "binary
>> storage for nested structures and indexing. Their work was sponsored
>> by Engine Yard".
>
>
>
> We don't normally credit sponsors in commits, but if I'm doing the commit I
> promise you guys would certainly get major credit as authors.
>
>
> cheers
>
> andrew
>




Re: [HACKERS] jsonb and nested hstore

2014-03-06 Thread Oleg Bartunov
On Thu, Mar 6, 2014 at 10:54 PM, Josh Berkus  wrote:
>
> On 03/06/2014 09:50 AM, Oleg Bartunov wrote:
>> Looks like consensus is
>> done. I and Teodor are not happy with it, but
>> what we can do :)   One thing I  want to do is to reserve our
>> contribution to the flagship feature (jsonb), particularly, "binary
>> storage for nested structures and indexing. Their work was sponsored
>> by Engine Yard".
>
> We don't generally credit companies in the release notes, since if we
> started, where would we stop?  However, we *do* credit them in the press
> release, and I'll make a note of the EY sponsorship, especially since
> it's also good PR.

I think the press release is fine. We waited a long time for sponsorship
of our work, and EY's help was crucial.




Re: [HACKERS] Hstore: Query speedups with Gin index

2013-08-26 Thread Oleg Bartunov
Michael, take a look at http://obartunov.livejournal.com/171959.html
As for the indexing stuff, we have already thought many times about key&value
mixing, but the real solution could probably come from an spgist and gin
combination.  I mean, spgist (a suffix array) instead of a btree, to avoid
key duplication, which is the real stopper for key.value mixing, especially
for deeply nested structures. We'll research further and will probably
develop a prototype of such a hybrid search tree.
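As a rough illustration of the key-duplication problem (this shows prefix sharing in a toy trie, not the spgist suffix-array design itself; the dotted paths are hypothetical nested-key spellings):

```python
# Flat key storage repeats the shared path prefix for every nested key;
# a prefix-sharing tree stores each path component only once.
paths = ["a.b.c.x", "a.b.c.y", "a.b.c.z", "a.b.d.x"]

flat_chars = sum(len(p) for p in paths)  # characters a flat index stores

trie = {}
nodes = 0
for p in paths:
    cur = trie
    for part in p.split("."):
        if part not in cur:
            cur[part] = {}
            nodes += 1  # each distinct path component stored once
        cur = cur[part]

print(flat_chars)  # 28
print(nodes)       # 8
```

The deeper the nesting, the worse the flat duplication gets, which is why a shared-structure index looks attractive for hierarchical hstore.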

Oleg


On Mon, Aug 26, 2013 at 6:11 AM, Tom Lane  wrote:

> Michael Paquier  writes:
> > On Thu, Aug 22, 2013 at 11:55 PM, Blake Smith 
> wrote:
> >> The combined entry is used to support "contains (@>)" queries, and the
> key
> >> only item is used to support "key contains (?)" queries. This change
> seems
> >> to help especially with hstore keys that have high cardinalities.
> Downsides
> >> of this change is that it requires an index rebuild, and the index will
> be
> >> larger in size.
>
> > Index rebuild would be a problem only for minor releases,
>
> That's completely false; people have expected major releases to be
> on-disk-compatible for several years now.  While there probably will be
> future releases in which we are willing to break storage compatibility,
> a contrib module doesn't get to dictate that.
>
> What might be a practical solution, especially if this isn't always a
> win (which seems likely given the index-bloat risk), is to make hstore
> offer two different GIN index opclasses, one that works the traditional
> way and one that works this way.
>
> Another thing that needs to be taken into account here is Oleg and
> Teodor's in-progress work on extending hstore:
> https://www.pgcon.org/2013/schedule/events/518.en.html
> I'm not sure if this patch would conflict with that at all, but it
> needs to be considered.
>
> regards, tom lane
>
>


Re: [HACKERS] [v9.4] row level security

2013-08-27 Thread Oleg Bartunov
Btw, there is a serious problem with row-level security and constraints. For
example, a user with a low security level could use a unique constraint to
learn about the existence of a row with higher security.  I don't know what
the best practice is to avoid this.
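A toy simulation of this covert channel, with a Python set standing in for the unique index and a hidden row the user cannot SELECT (the key name is made up for illustration):

```python
# Rows that exist but are invisible to the low-privilege user under RLS:
hidden_rows = {"secret-login"}

def try_insert(key):
    # A unique constraint must reject duplicates even against rows the
    # user cannot see, so the error itself leaks the row's existence.
    if key in hidden_rows:
        return "ERROR: duplicate key"
    return "INSERT 0 1"

print(try_insert("secret-login"))  # ERROR: duplicate key
print(try_insert("other"))         # INSERT 0 1
```

The probing user never reads the hidden row, yet learns whether a given key value exists, one value per INSERT attempt.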


On Wed, Aug 28, 2013 at 1:37 AM, Greg Smith  wrote:

> On 7/20/13 10:08 AM, Kohei KaiGai wrote:
>
>> Hmm. I didn't have this idea. It seems to me fair enough and kills
>> necessity to enhance RangeTblEntry and getrelid() indeed.
>> I try to fix up this implementation according to your suggestion.
>>
>
> How is that going?  I'm going to do a serious review of this myself over
> the next few weeks.  I have a good chunk of time set aside for it as part
> of a larger project.  I'm hoping to get more people here involved in that
> effort too, starting in the November CF if that works out.
>
> I've been trying to catch up with your larger plan for this feature for
> 9.4.  You made this comment earlier:
>
> > Also, I'd like to have discussion for this feature in earlier half of
> > v9.4 to keep time for the remaining features, such as check on
> > writer-side, integration with selinux, and so on
>
> Is any of that code around yet?  I see that you have split your
> submissions so that a smaller program can be reviewed today.  I'd like to
> start taking a look at the next step too though.  For the project I'm
> starting to work on here, getting the integration with labeling also done
> is a very important thing to target for 9.4.  It would be nice to see how
> that fits together today, even if the code for it isn't being reviewed
> heavily yet.
>
> I don't quite understand yet what's missing on the writer side.  If you
> could help explain what's missing there, I would like to read about that.
>
> --
> Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
>
>


Re: [HACKERS] [v9.4] row level security

2013-08-29 Thread Oleg Bartunov
Any constraint could be a "covert channel".


On Wed, Aug 28, 2013 at 4:17 PM, Kohei KaiGai  wrote:

> 2013/8/28 Oleg Bartunov :
> > btw, there is serious problem with row-level security and constraints.
> For
> > example, user with low security level could use unique constraint to know
> > about existence of a row with higher security.  I don't know, what is the
> > best practice to avoid this.
> >
> It is out of scope for this feature. We usually call this type of
> information leakage a "covert channel"; it is not avoidable in principle.
> However, its significance is minor, because the attacker must already know
> the exact data to probe for, or must probe every possible value.
> The solution is simple: the DBA should not use a confidential value as a
> unique key. If needed, our recommendation is a surrogate key instead of a
> natural key, because its value by itself carries no information.
>
> I should add a note of caution onto the documentation according to
> the previous consensus, however, I noticed it had gone from the sgml files
> while I was unaware. So, let me add description on the documentation.
>
> Thanks,
> --
> KaiGai Kohei 
>


Re: [HACKERS] Hstore: Query speedups with Gin index

2013-09-06 Thread Oleg Bartunov
Blake,

I think it's better to implement this patch as a separate opclass, so users
will have the option to choose the indexing method.

Oleg


On Tue, Sep 3, 2013 at 6:24 PM, Blake Smith  wrote:

> Thanks for the feedback everyone. I've attached the patch that we are now
> running in production to service our hstore include queries. We rebuilt the
> index to account for the on-disk incompatibility. I've submitted the patch
> to commitfest here:
> https://commitfest.postgresql.org/action/patch_view?id=1203
>
> Michael: I don't have a formal benchmark, but several of our worst queries
> went from 10-20 seconds per query down to 50-400 ms. These are numbers
> we've seen when testing real production queries against our production
> dataset with real world access patterns.
> Oleg: Thanks for your thoughts on this change. As for the spgist / gin
> work you're doing, is there anything you need help with or are you still in
> the research phase? I'd love to help get something more robust merged into
> mainline if you think there's collaborative work to be done (even if it's
> only user testing).
>
> Thanks,
>
> Blake
>
>
>
>
> On Wed, Aug 28, 2013 at 12:40 PM, Andres Freund wrote:
>
>> On 2013-08-28 13:31:22 -0400, Bruce Momjian wrote:
>> > On Sun, Aug 25, 2013 at 10:11:50PM -0400, Tom Lane wrote:
>> > > Michael Paquier  writes:
>> > > > On Thu, Aug 22, 2013 at 11:55 PM, Blake Smith <
>> blakesmi...@gmail.com> wrote:
>> > > >> The combined entry is used to support "contains (@>)" queries, and
>> the key
>> > > >> only item is used to support "key contains (?)" queries. This
>> change seems
>> > > >> to help especially with hstore keys that have high cardinalities.
>> Downsides
>> > > >> of this change is that it requires an index rebuild, and the index
>> will be
>> > > >> larger in size.
>> > >
>> > > > Index rebuild would be a problem only for minor releases,
>> > >
>> > > That's completely false; people have expected major releases to be
>> > > on-disk-compatible for several years now.  While there probably will
>> be
>> > > future releases in which we are willing to break storage
>> compatibility,
>> > > a contrib module doesn't get to dictate that.
>> > >
>> > > What might be a practical solution, especially if this isn't always a
>> > > win (which seems likely given the index-bloat risk), is to make hstore
>> > > offer two different GIN index opclasses, one that works the
>> traditional
>> > > way and one that works this way.
>> > >
>> > > Another thing that needs to be taken into account here is Oleg and
>> > > Teodor's in-progress work on extending hstore:
>> > > https://www.pgcon.org/2013/schedule/events/518.en.html
>> > > I'm not sure if this patch would conflict with that at all, but it
>> > > needs to be considered.
>> >
>> > We can disallow in-place upgrades for clusters that use certain contrib
>> > modules --- we have done that in the past.
>>
>> But that really cannot be acceptable for hstore. The probably most
>> widely used extension there is.
>>
>> Greetings,
>>
>> Andres Freund
>>
>> --
>>  Andres Freund http://www.2ndQuadrant.com/
>>  PostgreSQL Development, 24x7 Support, Training & Services
>>
>
>
>
> --
> Blake Smith
> http://blakesmith.me
> @blakesmith
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>
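For context while reading the thread: the patch targets two hstore query shapes, containment ("contains", @>) and key existence ("key contains", ?). A minimal sketch of both against a GIN-indexed hstore column; the table, column, and data below are hypothetical:

```sql
-- Hypothetical setup; requires the hstore contrib extension.
CREATE EXTENSION IF NOT EXISTS hstore;
CREATE TABLE items (id serial PRIMARY KEY, attrs hstore);

-- GIN index using the stock (default) hstore opclass.
CREATE INDEX items_attrs_gin ON items USING gin (attrs);

-- Containment (@>): rows whose attrs include the given key/value pair.
SELECT id FROM items WHERE attrs @> 'color=>red'::hstore;

-- Key existence (?): rows whose attrs include the key at all.
SELECT id FROM items WHERE attrs ? 'color';
```

Per the messages above, the patch changes which entries the opclass extracts (combined key/value entries for @>, key-only entries for ?) so that both query shapes stay fast for high-cardinality keys, at the cost of an index rebuild and a larger index.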


Re: [HACKERS] Hstore: Query speedups with Gin index

2013-09-10 Thread Oleg Bartunov
Blake,

Teodor will review your patch, but I have one consideration about it in the
context of the future hstore, which supports hierarchical structures. In that
case the overhead of composite keys will be enormous, and the only way forward
is to consider a suffix array instead of a btree for storing the keys. But that
is another big task, and I'm afraid to think about it now.

Oleg


On Mon, Sep 9, 2013 at 6:55 PM, Blake Smith  wrote:

> Thanks for getting back to me about this change Oleg. I took your advice
> and reworked the patch by adding a new hstore gin opclass
> (gin_hstore_combined_ops) and leaving the functionality of the default
> hstore gin opclass the same. This should prevent the on-disk compatibility
> issues from the first patch, and allow users to select the different
> indexing method when they build the index. The hstore regression suite is
> passing for me locally with the --enable-cassert configure flag. Please let
> me know what you think and if there is any other work that would need to be
> done (style cleanups, updating documentation, etc) to get this merged.
>
> Thanks!
>
> Blake
>
>
>
>
>
>
> On Fri, Sep 6, 2013 at 1:47 PM, Oleg Bartunov  wrote:
>
>> Blake,
>>
>> I think it's better to implement this patch as a separate opclass, so
>> users will have option to choose indexing.
>>
>> Oleg
>>
>>
>> On Tue, Sep 3, 2013 at 6:24 PM, Blake Smith wrote:
>>
>>> Thanks for the feedback everyone. I've attached the patch that we are
>>> now running in production to service our hstore include queries. We rebuilt
>>> the index to account for the on-disk incompatibility. I've submitted the
>>> patch to commitfest here:
>>> https://commitfest.postgresql.org/action/patch_view?id=1203
>>>
>>> Michael: I don't have a formal benchmark, but several of our worst
>>> queries went from 10-20 seconds per query down to 50-400 ms. These are
>>> numbers we've seen when testing real production queries against our
>>> production dataset with real world access patterns.
>>> Oleg: Thanks for your thoughts on this change. As for the spgist / gin
>>> work you're doing, is there anything you need help with or are you still in
>>> the research phase? I'd love to help get something more robust merged into
>>> mainline if you think there's collaborative work to be done (even if it's
>>> only user testing).
>>>
>>> Thanks,
>>>
>>> Blake
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Blake Smith
>>> http://blakesmith.me
>>> @blakesmith
>>>
>>>
>>>
>>>
>>
>
>
> --
> Blake Smith
> http://blakesmith.me
> @blakesmith
>
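If the reworked patch were applied, choosing between the old and new behavior would become an index-build-time decision via the opclass name. A sketch, assuming the opclass is exposed under the name given in Blake's patch (gin_hstore_combined_ops) and using a hypothetical table:

```sql
-- Hypothetical table; gin_hstore_combined_ops comes from the patch
-- under review and is not part of stock hstore.
CREATE TABLE items (id serial PRIMARY KEY, attrs hstore);

-- Traditional behavior: the default hstore GIN opclass.
CREATE INDEX items_attrs_default ON items USING gin (attrs);

-- Patched behavior: combined and key-only entries, trading index size
-- for faster @> and ? lookups on high-cardinality keys.
CREATE INDEX items_attrs_combined
    ON items USING gin (attrs gin_hstore_combined_ops);
```

This follows Tom's earlier suggestion of offering two opclasses side by side, rather than changing the on-disk format of the default one.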


[HACKERS] \h open

2013-09-15 Thread Oleg Bartunov
I noticed there is nothing available in psql's built-in help about the OPEN
command. Is that intentional?

postgres=# \h open
No help available for "open".
Try \h with no arguments to see available help.

Oleg


Re: [HACKERS] Looking for information on our elephant

2013-09-19 Thread Oleg Bartunov
Tatsuo,

I have emails going back to 1995! You can see this historical message here:
http://www.pgsql.ru/db/mw/msg.html?mid=1238939

Re: [HACKERS] PostgreSQL logo.
*Author:* yang( at )sjuphil( dot )sju( dot )edu
*Date:* 1997-04-03 20:36:33
*Subject:* Re: [HACKERS] PostgreSQL logo.

Some other ideas:

derivative: a sword (derivative of the Dragon book cover -- Postgres as a tool)
illustrative: a bowl of Alphabet Soup, with letters spelling out POSTGRESQL
obscure: a revolver/hit man (Grosse Pt is an anagram of Postgres, and an
abbreviation of the title of the new John Cusack movie)

but if you want an animal-based logo, how about some sort of elephant?
After all, as the Agatha Christie title read, elephants can remember ...

David yangy...@sju.edu



Oleg


On Fri, Sep 20, 2013 at 3:16 AM, Tatsuo Ishii  wrote:

> Hi,
>
> I'm looking for information on our elephant: especially how/why we
> chose elephant as our mascot.
>
> According to:
> http://wiki.postgresql.org/wiki/Logo
>
> The discussion was back to 1997 on the hackers list. Unfortunately the
> official archive for 1997 was lost. Mine seems to be gone too. Any
> information will be appreciated.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>
>
>


Re: [HACKERS] Looking for information on our elephant

2013-09-22 Thread Oleg Bartunov
Tatsuo,

You could ask Marc about the archives. He probably has the original mbox files.

Oleg


On Sun, Sep 22, 2013 at 11:18 AM, Tatsuo Ishii  wrote:

> Oleg,
>
> Unfortunately the archives seem to miss attached files. I love to see
> the attached files because they are logo images.  Any idea?
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>
>


  1   2   3   4   5   6   7   8   9   10   >