Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-07-23 Thread Anton Dignös
On Sat, Jul 23, 2016 at 12:01 AM, David Fetter <da...@fetter.org> wrote:
> On Fri, Jul 22, 2016 at 01:15:17PM +0200, Anton Dignös wrote:
>> Hi hackers,
>>
>> we are a group of researches that work on temporal databases.  Our
>> main focus is the processing of data with time intervals, such as
>> the range types in PostgreSQL.
>
> Thanks for your hard work so far!
>
> [Explanation and examples elided]
>
> To what extent, if any, are you attempting to follow the SQL:2011
> standard?
>
> http://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf

The querying in the SQL:2011 standard is based on simple SQL range restrictions
and period predicates (OVERLAPS, PRECEDES, FOR SYSTEM_TIME AS OF, etc.),
functionality that is already covered in PostgreSQL by the operators and
functions on range types.
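
For illustration, each of those predicates maps directly onto an existing
range operator (a sketch; the table and column names are made up):

CREATE TABLE prices (
    item     text,
    price    numeric,
    validity tstzrange
);

-- OVERLAPS: the && operator
SELECT * FROM prices WHERE validity && tstzrange('2016-01-01', '2016-02-01');

-- PRECEDES: the << (strictly left of) operator
SELECT * FROM prices WHERE validity << tstzrange('2016-06-01', NULL);

-- FOR SYSTEM_TIME AS OF: containment of a point in time, the @> operator
SELECT * FROM prices WHERE validity @> timestamptz '2016-03-15';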

Operations such as aggregation, outer joins, and set operations on ranges
(mentioned in Section 2.5, "Future directions", of the above paper) are not
yet part of the standard. These are the operations that require the
adjustment (or splitting) of ranges.
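
To make the 'splitting' concrete: a temporal COUNT(*) over two rows with
validity [1,5) and [3,8) has to split the timeline at every boundary,
yielding [1,3) with one row, [3,5) with two, and [5,8) with one. The
split points can even be derived with stock SQL (a sketch only):

WITH r(v) AS (VALUES (int4range(1,5)), (int4range(3,8))),
     b(p) AS (SELECT lower(v) FROM r UNION SELECT upper(v) FROM r),
     s(seg) AS (SELECT int4range(p, lead(p) OVER (ORDER BY p)) FROM b)
SELECT seg, count(*)
FROM s JOIN r ON r.v && s.seg
GROUP BY seg
ORDER BY seg;
--  seg   | count
--  [1,3) |     1
--  [3,5) |     2
--  [5,8) |     1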

Best,

Anton




Re: [HACKERS] Protocol buffer support for Postgres

2016-06-24 Thread Flavius Anton
On Fri, Jun 24, 2016 at 11:35 AM, Álvaro Hernández Tortosa
<a...@8kdata.com> wrote:
>
>
> On 24/06/16 14:23, Flavius Anton wrote:
>>
>> On Thu, Jun 23, 2016 at 2:54 PM, Flavius Anton <f.v.an...@gmail.com>
>> wrote:
>>>
>>> On Thu, Jun 23, 2016 at 1:50 PM, Greg Stark <st...@mit.edu> wrote:
>>>>
>>>> On Thu, Jun 23, 2016 at 8:15 PM, Flavius Anton <f.v.an...@gmail.com>
>>>> wrote:
>>>>>
>>>>> I'd love to talk more about this.
>>>>
>>>> I thought quite a bit about this a few years ago but never really
>>>> picked it up to work on.
>>
>> Any other thoughts on this? My guess is that it might be an important
>> addition to Postgres that can attract even more users, but I am not
>> sure if there's enough interest from the community. If I want to pick
>> this task, how should I move forward? Do I need to write a design
>> document or similar or should I come up with a patch that implements a
>> draft prototype? I am new to this community and don't know the code
>> yet, so I'd appreciate some guidance from an older, more experienced
>> member.
>>
>>
>
> Other than protobuf, there are also other serialization formats that
> might be worth considering. Not too long ago I suggested working
> specifically on serialization formats for the json/jsonb types:
> https://www.postgresql.org/message-id/56CB8A62.40100%408kdata.com I believe
> this effort is in the same boat.

Sure, there are a bunch of these, some of the most popular being:
* Cap'n Proto
* Flatbuffers
* Thrift

A longer list can be found here[1].

Meanwhile, I came across another interesting idea. What if, for
starters, we don't introduce a completely new serialization format,
like Protocol Buffers, but rather make the JSON support stronger and
more interesting? What I am thinking of is a JSON schema (optionally)
associated with a JSON column. That way, you don't have to store the
keys on disk anymore, and the database checks JSON validity at INSERT
time. I think those are two big advantages. What do you think about
this one?
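
There is no built-in JSON schema support today, of course; the closest
approximation with current features is probably a CHECK constraint that
validates the structure at INSERT time (a sketch with made-up names; it
gives the validation but not the on-disk key elision):

CREATE TABLE events (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL,
    CHECK (payload ? 'user_id'
       AND payload ? 'action'
       AND jsonb_typeof(payload->'user_id') = 'number'
       AND jsonb_typeof(payload->'action') = 'string')
);

A schema-aware column type could enforce this declaratively and strip the
keys from the stored representation.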

--
Flavius

[1] https://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats




Re: [HACKERS] Protocol buffer support for Postgres

2016-06-24 Thread Flavius Anton
On Thu, Jun 23, 2016 at 2:54 PM, Flavius Anton <f.v.an...@gmail.com> wrote:
> On Thu, Jun 23, 2016 at 1:50 PM, Greg Stark <st...@mit.edu> wrote:
>> On Thu, Jun 23, 2016 at 8:15 PM, Flavius Anton <f.v.an...@gmail.com> wrote:
>>>
>>> I'd love to talk more about this.
>>
>> I thought quite a bit about this a few years ago but never really
>> picked it up to work on.

Any other thoughts on this? My guess is that it might be an important
addition to Postgres that can attract even more users, but I am not
sure if there's enough interest from the community. If I want to pick
this task, how should I move forward? Do I need to write a design
document or similar or should I come up with a patch that implements a
draft prototype? I am new to this community and don't know the code
yet, so I'd appreciate some guidance from an older, more experienced
member.

Thanks.

--
Flavius




Re: [HACKERS] Protocol buffer support for Postgres

2016-06-23 Thread Flavius Anton
On Thu, Jun 23, 2016 at 1:50 PM, Greg Stark <st...@mit.edu> wrote:
> On Thu, Jun 23, 2016 at 8:15 PM, Flavius Anton <f.v.an...@gmail.com> wrote:
>>
>> I'd love to talk more about this.
>
> I thought quite a bit about this a few years ago but never really
> picked it up to work on.
>
> Another option would be to allow the output of your select query to be
> returned as a protobuf. That might be a feature to add in libpq rather
> than in the server, or perhaps as a new protocol feature that libpq
> would then just switch to, which might make it easier to use from other
> languages. That would make it easier to use Postgres as a data store
> for an environment where everything is in protobufs, without losing the
> full power of SQL schemas in the database.

I agree on this one; I think it's the most /natural/ way of doing things.

> As an aside, have you seen Cap’n Proto? It looks more interesting to
> me than protobufs. It fixes a lot of the corner cases where protobufs
> end up with unbounded computational complexity and seems a lot
> cleaner.

I've seen it around for quite some time, but my fear is that it is not
(yet?) widely adopted. Protocol Buffers themselves are not that
popular, let alone Cap'n Proto. By the way, there's also the newer
FlatBuffers[1] project, also from Google. It seems a lot like Cap'n
Proto, though. I think it's doable to provide some sort of abstraction
for these protocols in PostgreSQL, so that it can eventually support
all of them, with minimal effort for adding a new one. However, I am
skeptical about the practical advantages of having /all/ of them
supported.
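
In PostgreSQL terms, the abstraction could be as thin as one data type per
format, each exposing the same surface: parse and validate on input, plus
a common accessor function. None of the following exists; it only shows
the shape:

CREATE TYPE protobuf;   -- shell types; the per-format C input/output
CREATE TYPE thrift;     -- functions would do the actual parsing

-- a common, hypothetical accessor working on any of these types:
-- SELECT get(payload, 'user.id') FROM events;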

--
Flavius

[1] https://github.com/google/flatbuffers




Re: [HACKERS] Protocol buffer support for Postgres

2016-06-23 Thread Flavius Anton
On Tue, Apr 26, 2016 at 2:40:49PM -0700, David Fetter wrote:
> Should we see about making a more flexible serialization
> infrastructure?  What we have is mostly /ad hoc/, and has already
> caused real pain to the PostGIS folks, this even after some pretty
> significant and successful efforts were made in this direction.

Hi all. Is anybody working on this right now? I would like to pick
this task for the summer. First of all, what do you think about what
David said? Should we try and design a generic infrastructure for
similar serialization datatypes? If so, will we need to refactor some
pieces from the JSON/XML implementation? I looked over the code and it
seems nicely decoupled, but I am not sure what this would involve.
I've done this before for MySQL[1] (not yet completed), but I'd love
to try it for PostgreSQL too.

On Tue, Apr 26, 2016 at 11:23:11AM -0700, José Luis Tallón wrote:
> Have you investigated JSONB vs ProtoBuf space usage ?
>(the key being  the "B" -- Postgres' own binary JSON
> implementation)

This is something I can investigate further, but another (possibly
major) benefit of Protocol Buffers over JSON is that they *still*
have a schema. I think they combine advantages of both structured
and schemaless data.
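
For the JSONB side, a first-order measurement is easy to get (table and
column names are hypothetical):

SELECT avg(pg_column_size(payload))::int AS avg_bytes FROM events;

pg_column_size() reports the storage size of each value, after any TOAST
compression.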

My best guess is that we shouldn't focus on abstracting *any*
serialization paradigm, but only the ones that have a schema (like
Thrift or Protocol Buffers). Speaking of schemas, where is the best
place to keep them? For MySQL I opted for a plain text file similar to
.trg files (the ones used by MySQL for storing triggers).
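
In Postgres, a plain catalog-style table could play the same role (a
sketch; nothing like this exists today):

CREATE TABLE serialization_schemas (
    relid  oid  NOT NULL,   -- table whose column is described
    attnum int2 NOT NULL,   -- column number within that table
    format text NOT NULL,   -- 'protobuf', 'thrift', ...
    source text NOT NULL,   -- the .proto / .thrift definition itself
    PRIMARY KEY (relid, attnum)
);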

I'd love to talk more about this.

Thank you.
Flavius Anton

[1] https://github.com/google/mysql-protobuf




Re: [HACKERS] Box type equality

2015-09-29 Thread Jeff Anton

The Box type is the oldest non-linear type in the Postgres system.
We used it as the template for extensibility in the original system 
about thirty years ago.  We had R-trees for box indexing.  If you want 
fuzzy box matching, that seems possible with R-trees and some creativity: 
say, matching against a slightly larger box using containment, and maybe 
also excluding anything contained in a slightly smaller box.  This is the 
idea behind strategies: you can use existing operations to build a new 
operation.


If you have to force this onto B-trees, I think you will have to choose 
one edge to index on (i.e. one of the four values), then fuzzy match that 
through the index and have a secondary condition to further restrict the 
matches.
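
Roughly like this, say (a sketch with a made-up table; note that ~= is the
existing 'same as' operator, which itself still uses the fuzzy EPSILON
comparison discussed below):

-- a box can be subscripted as an array of two corner points,
-- so (b[0])[0] is the x coordinate of one corner:
CREATE INDEX boxes_corner_x_idx ON boxes (((b[0])[0]));

SELECT *
FROM boxes
WHERE (b[0])[0] BETWEEN 1.0 - 1e-6 AND 1.0 + 1e-6  -- fuzzy match via index
  AND b ~= box '((1,1),(2,2))';                    -- secondary condition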


As with all the geometric types, you can use containment boxes for them 
and have the secondary condition checks.


It's all just a few lines of code, as Stonebraker used to say.

Jeff Anton


On 09/29/15 08:43, Tom Lane wrote:

Stanislav Kelvich <s.kelv...@postgrespro.ru> writes:

I've faced an issue with Box type comparison that has existed for almost
five years.


Try twenty-five years.  The code's been like that since Berkeley.


   That can be fixed by b-tree equality for boxes, but we need some
   decisions there.


The problem with inventing a btree opclass for boxes is much more
fundamental than fuzzy comparisons, unfortunately.  Btree requires a
linear sort order, and there's no plausible linear ordering of boxes,
unless you compare areas, which won't give the equality semantics you want.

We could perhaps invent an exact-equality operator and construct just
a hash opclass for it, no btree.

In any case I think it would be a mistake to consider only boxes; all
the built-in geometric types have related issues.

regards, tom lane







Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Jeff Anton

Seems to me that there are a bunch of agendas here.

I read about not wanting to be trapped in a proprietary system.  You 
can be trapped in any software you depend upon.  Compilers, platforms, 
SCM, and issue tracking are all places to be trapped.


Postgres and PostgreSQL have been around a very long time by software-world 
standards.  The project has outlived several of the systems it has depended 
upon over those many years.  I hope and expect that Postgres will 
continue to outlive some of these platforms.


So do not get hung up on having been 'burned' in the past.  Expect to be 
'burned' again.  Take steps to minimize that pain in the future.


For an issue tracker, open source or proprietary, I would want raw 
database dumps and schema information.  Postgres is a database after 
all.  If you truly have the data, you should be able to migrate it.


Also, does the system you adopt use Postgres?  You are your own best open 
source software.  If performance starts to go downhill, you are in the 
best position to fix it if you understand and control it.  How 
responsive will whatever system you choose be to your changing needs?  If 
you partner with an external group, the two groups will benefit from each 
other if they are truly sharing the technologies.


Jeff Anton




[HACKERS] optimization join on random value

2015-05-03 Thread Anton
Hello gurus of Postgres, is it possible to tune a query with a join on a
random value?

I know that it is not a real-life example, but I need it for tests.

soe=# explain
soe-#  SELECT   ADDRESS_ID,
soe-#   CUSTOMER_ID,
soe-#   DATE_CREATED,
soe-#   HOUSE_NO_OR_NAME,
soe-#   STREET_NAME,
soe-#   TOWN,
soe-#   COUNTY,
soe-#   COUNTRY,
soe-#   POST_CODE,
soe-#   ZIP_CODE
soe-# FROM ADDRESSES
soe-# WHERE customer_id = trunc( random()*45000) ;
                            QUERY PLAN
---------------------------------------------------------------------
 Seq Scan on addresses  (cost=0.00..165714.00 rows=22500 width=84)
   Filter: ((customer_id)::double precision = trunc((random() * 45000::double precision)))
(2 rows)

soe=# \d addresses
                        Table "public.addresses"
      Column      |            Type             | Modifiers
------------------+-----------------------------+-----------
 address_id       | bigint                      | not null
 customer_id      | bigint                      | not null
 date_created     | timestamp without time zone | not null
 house_no_or_name | character varying(60)       |
 street_name      | character varying(60)       |
 town             | character varying(60)       |
 county           | character varying(60)       |
 country          | character varying(60)       |
 post_code        | character varying(12)       |
 zip_code         | character varying(12)       |
Indexes:
    "addresses_pkey" PRIMARY KEY, btree (address_id)
    "addresses_cust_ix" btree (customer_id)
Foreign-key constraints:
    "add_cust_fk" FOREIGN KEY (customer_id) REFERENCES customers(customer_id) DEFERRABLE




The same query in Oracle uses an index access path:

00:05:23 (1)c##bushmelev_aa@orcl explain plan for
 SELECT   ADDRESS_ID,
  CUSTOMER_ID,
  DATE_CREATED,
  HOUSE_NO_OR_NAME,
  STREET_NAME,
  TOWN,
  COUNTY,
  COUNTRY,
  POST_CODE,
  ZIP_CODE
FROM soe.ADDRESSES
 WHERE customer_id = dbms_random.value();

Explained.

Elapsed: 00:00:00.05
00:05:29 (1)c##bushmelev_aa@orcl @utlxpls

PLAN_TABLE_OUTPUT
--
Plan hash value: 317664678

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |     2 |   150 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ADDRESSES       |     2 |   150 |     5   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ADDRESS_CUST_IX |     2 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---
   2 - access(CUSTOMER_ID=DBMS_RANDOM.VALUE())
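
The plans differ because random() is volatile in PostgreSQL: it is
re-evaluated for every row, so the planner cannot treat the comparison
value as a constant to probe the index with (the comparison also happens
in double precision rather than bigint). Evaluating it once in a scalar
subquery, and casting back to bigint, allows the index to be used, for
example:

EXPLAIN
SELECT address_id, customer_id
FROM addresses
WHERE customer_id = (SELECT trunc(random() * 45000)::bigint);

The subquery becomes an InitPlan, executed once per statement, and the
plan becomes an Index Scan using addresses_cust_ix with an index
condition on customer_id.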



Re: [HACKERS] Zero-padding and zero-masking fixes for to_char(float)

2015-03-24 Thread Jeff Anton
The issue of significant (decimal) digits to and from floating point 
representation is a complex one.


What 'significant' means may depend upon the intent.

There are (at least) two different tests which may need to be used.

* How many digits can be stored and then accurately returned?
or
* How many decimal digits are needed to recreate a floating point value? 
 Or in longer form, if you have a floating point value, you may want to 
print it in decimal form and then later scan that to recreate the exact 
bit pattern from the original floating point value.  How many decimal 
digits do you need?


The first question produces a smaller number of digits than the second one!
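
A quick way to see the two counts for float8 (an IEEE 754 double: 15
decimal digits always survive a decimal round trip, but up to 17 are
needed to reproduce an arbitrary binary value exactly):

SET extra_float_digits = 0;           -- the default
SELECT float8 '0.1' + float8 '0.2';   -- prints 0.3
SET extra_float_digits = 3;           -- ask for the extra digits
SELECT float8 '0.1' + float8 '0.2';   -- prints 0.300000000000000044,
                                      -- i.e. the sum is not exactly 0.3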

The idea of zero padding is, IMO, a bad idea altogether.  It makes 
people feel better, but it adds inaccuracy.  I've lost this argument 
so many times now that I only mention it for the real 
number geeks out there.


Postgresql seems to be using the first interpretation and reporting 
fewer digits.  I've noticed this with pg_dump: a dump and restore 
of floating point values does not produce the same floating point 
values.  To me, that is inexcusable.  Using the -Fc format, real values 
are preserved.  I have a large database of security prices; I want 
accuracy above all.


I do not have time right now to produce the needed evidence for all 
these cases of floating point values.  If there is interest I can 
produce this in a day or so.


Jeff Anton

BTW:  This is my first posting to this list.  I should introduce myself.
I'm Jeff Anton.  I was the first Postgres project lead programmer 
working for Michael Stonebraker at U.C. Berkeley a very long time ago.
The first version was never released.  I've since worked for several db 
companies.



On 03/24/15 06:47, Noah Misch wrote:

On Sun, Mar 22, 2015 at 10:53:12PM -0400, Bruce Momjian wrote:

On Sun, Mar 22, 2015 at 04:41:19PM -0400, Noah Misch wrote:

On Wed, Mar 18, 2015 at 05:52:44PM -0400, Bruce Momjian wrote:

This junk digit zeroing matches the Oracle behavior:

SELECT to_char(1.123456789123456789123456789d, 
'9.9') as x from dual;
--
1.12345678912345680

Our output with the patch would be:

SELECT to_char(float8 '1.123456789123456789123456789', 
'9.9');
--
1.12345678912345000



These outputs show Oracle treating 17 digits as significant while PostgreSQL
treats 15 digits as significant.  Should we match Oracle in this respect while
we're breaking compatibility anyway?  I tend to think yes.


Uh, I am hesitant to adjust our precision to match Oracle as I don't
know what they are using internally.


http://sqlfiddle.com/#!4/8b4cf/5 strongly implies 17 significant digits for
float8 and 9 digits for float4.







Re: [HACKERS] Native XML

2011-03-09 Thread Anton
On 03/09/2011 08:21 PM, Yeb Havinga wrote:
 On 2011-03-09 19:30, Robert Haas wrote:
 On Wed, Mar 9, 2011 at 1:11 PM, Bruce Momjian <br...@momjian.us> wrote:
 
 Robert Haas wrote:
   
  On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
 
 Well, in principle we could allow them to work on both, just the same
 way that (for instance) + is a standardized operator but works on more
 than one datatype.  But I agree that the prospect of two parallel types
 with essentially duplicate functionality isn't pleasing at all.
   
 The real issue here is whether we want to store XML as text (as we do
 now) or as some predigested form which would make output the whole
 thing slower but speed up things like xpath lookups.  We had the same
 issue with JSON, and due to the uncertainty about which way to go with
 it we ended up integrating nothing into core at all.  It's really not
 clear that there is one way of doing this that is right for all use
 cases.  If you are storing xml in an xml column just to get it
 validated, and doing no processing in the DB, then you'd probably
 prefer our current representation.  If you want to build functional
 indexes on xpath expressions, and then run queries that extract data
 using other xpath expressions, you would probably prefer the other
 representation.
 
 Someone should measure how much overhead the indexing of xml values
 might have.  If it is minor, we might be OK with only an indexed xml
 type.
   
 I think the relevant thing to measure would be how fast the
 predigested representation speeds up the evaluation of xpath
 expressions.
 
 About a predigested representation, I hope I'm not insulting anyone's
 education here, but a lot of XML database 'accelerators' seem to be
 using the pre and post orders (see
 http://en.wikipedia.org/wiki/Tree_traversal) of the document nodes.
 The following two pdfs show how these orders can be used to query for
 e.g. all ancestors of a node: second pdf slide 10: for nodes x,y : x
 is an ancestor of y when x.pre < y.pre AND x.post > y.post.

 www.cse.unsw.edu.au/~cs4317/09s1/tutorials/tutor4.pdf  about the format
 www.cse.unsw.edu.au/~cs4317/09s1/tutorials/tutor10.pdf about querying
 the format

 regards,
 Yeb Havinga

This looks rather like a special kind of XML shredding, which is
listed at http://wiki.postgresql.org/wiki/Todo
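
For concreteness, the ancestor test in that scheme is easy to express in
SQL once every node carries its two ranks (a sketch; the table layout is
made up):

CREATE TABLE xml_nodes (
    doc_id int,
    pre    int,    -- preorder rank of the node
    post   int,    -- postorder rank of the node
    name   text
);

-- all ancestors of the nodes named 'target':
SELECT x.*
FROM xml_nodes x
JOIN xml_nodes y ON y.doc_id = x.doc_id
WHERE y.name = 'target'
  AND x.pre < y.pre
  AND x.post > y.post;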

About the predigested / plain storage and the evaluation: I haven't yet
fully given up the idea to play with it, even though on a purely
experimental basis (i.e. with little or no ambition to contribute to the
core product). If I do, it might also be interesting to use TOAST
slicing during XPath evaluation, so that large documents are not
fetched as a whole when the XPath expression is rather 'short'.

But first I should let all the thoughts 'settle down'. There may well be
other areas of Postgres where it's worth spending some time, whether
writing something or just reading.


Re: [HACKERS] Native XML

2011-02-28 Thread Anton
On 02/27/2011 11:57 PM, Peter Eisentraut wrote:
 On Sun, 2011-02-27 at 10:45 -0500, Tom Lane wrote:
   
 Hmm, so this doesn't rely on libxml2 at all?  Given the amount of pain
 that library has caused us, getting out from under it seems like a
 mighty attractive idea.
 
 This doesn't replace the existing xml functionality, so it won't help
 getting rid of libxml.

   
Right, what I published on github.com doesn't replace the libxml2
functionality, and I didn't say it does at this moment. The idea is to
design (or rather start designing) a low-level XML API on which SQL/XML
functionality can be based. As long as XSLT can be considered a sort of
separate topic, Postgres uses a very small subset of what libxml2
offers, and thus it might not be that difficult to implement the same
level of functionality in a new way.

In addition, I think that using a low-level API that the Postgres
development team fully controls would speed up enhancements of the XML
functionality in the future. When I thought of implementing some
functionality listed on the official TODO, I was a little bit
discouraged by the workarounds that need to be added in order to deal
with libxml2 memory management. Also, parsing the document each time it's
accessed (which involves parser initialization and finalization) is
neither convenient nor, in the end, efficient.

A question is, of course, whether a potential new implementation must
necessarily replace the existing one, immediately or at all. What I
published is implemented as a new data type, and thus pg_type.h and
pg_proc.h are the only files where something needs to be merged. From a
technical point of view, the new type can easily co-exist with the
existing one.

This however raises the question whether such co-existence (whether
temporary or permanent) would be acceptable for users, i.e. whether it
wouldn't bring some, or even significant, confusion. That's something
I'm not able to answer.




Re: [HACKERS] Native XML

2011-02-28 Thread Anton
On 02/28/2011 05:23 PM, Robert Haas wrote:
 On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
   
 Well, in principle we could allow them to work on both, just the same
 way that (for instance) + is a standardized operator but works on more
 than one datatype.  But I agree that the prospect of two parallel types
 with essentially duplicate functionality isn't pleasing at all.
 
 The real issue here is whether we want to store XML as text (as we do
 now) or as some predigested form which would make output the whole
 thing slower but speed up things like xpath lookups.  We had the same
 issue with JSON, and due to the uncertainty about which way to go with
 it we ended up integrating nothing into core at all.  It's really not
 clear that there is one way of doing this that is right for all use
 cases.  If you are storing xml in an xml column just to get it
 validated, and doing no processing in the DB, then you'd probably
 prefer our current representation.  If you want to build functional
 indexes on xpath expressions, and then run queries that extract data
 using other xpath expressions, you would probably prefer the other
 representation.
   
Yes, it was actually the focal point of my considerations: whether to
store plain text or 'something else'.
It's interesting to know that such uncertainty already existed in
another area. Maybe it's common to other open source projects too...
 I tend to think that it would be useful to have both text and
 predigested types for both XML and JSON, but I am not too eager to
 begin integrating more stuff into core or contrib until it spends some
 time on pgfoundry or github or wherever people publish their
 PostgreSQL extensions these days and we have a few users prepared to
 testify to its awesomeness.
   
It definitely makes sense to develop this new functionality separately for
some time.
It's kind of exciting to develop something new, but spending significant
effort on 'native XML' probably needs a somewhat higher level of consensus
than what appeared in this discussion. In that context, the remark about
users and their needs is something that I can't ignore.

Thanks to all for contributions to this discussion.
 In any case, the definitional problems with xpath_table(), and/or the
 memory management problems with libxml2, are not the basis on which we
 should be making this decision.

   



Fwd: Re: [HACKERS] Native XML

2011-02-27 Thread Anton
Sorry for resending, I forgot to add 'pgsql-hackers' to CC.

 Original Message 
Subject:Re: [HACKERS] Native XML
Date:   Sun, 27 Feb 2011 23:18:03 +0100
From:   Anton <antonin.hou...@gmail.com>
To:     Tom Lane <t...@sss.pgh.pa.us>



On 02/27/2011 04:45 PM, Tom Lane wrote:
 Anton <antonin.hou...@gmail.com> writes:
   
 I've been playing with 'native XML' for a while and now wondering if
 further development of such a feature makes sense for Postgres.
 ...
 Unlike 'libxml2', the parser uses palloc()/pfree(). The output format is
 independent from any 3rd party code.
 
 Hmm, so this doesn't rely on libxml2 at all?  Given the amount of pain
 that library has caused us, getting out from under it seems like a
 mighty attractive idea.  How big a chunk of code do you think it'd be
 by the time you complete the missing features?

   regards, tom lane
   
Right, no dependency; everything is coded from scratch.
For the initial stable version, my plan is to make the parser conform to
the standard as much as possible, and the same for XMLPath / XMLQuery.
(In all cases the question is which version of the standard to start at.)

Integration of SQL & XML data in queries is my primary interest. I
didn't really think of re-implementing XSLT. For those who really need to
use XSLT functionality at the database level, can't the API be left for
optional installation?

Also I'm not sure if document validation is necessary for the initial
version - I still see a related item on the current TODO list.

Sincerely,
Tony.



[HACKERS] Native XML

2011-02-26 Thread Anton
Hello,
I've been playing with 'native XML' for a while and now wondering if
further development of such a feature makes sense for Postgres.
(By not having brought this up earlier I'm taking the chance that the
effort will be wasted, but that's not something you should worry about.)

The code is available here:
https://github.com/ahouska/postgres/commit/bde3d3ab05915e91a0d831a8877c2fed792693c7

For whoever is interested in my suggestions, I recommend starting with the
test (it needs to be executed standalone; pg_regress is not aware of it
yet):

src/test/regress/sql/xmlnode.sql
src/test/expected/xmlnode.out

In a few words, 'xmlnode' is a structured type that stores an XML
document in the form of a tree, as opposed to plain text.
Parsing is only performed on insert or update (for update it would also
make sense to implement functions that add/remove nodes at the low
level, w/o dumping & parsing).

Unlike 'libxml2', the parser uses palloc()/pfree(). The output format is
independent from any 3rd-party code.
The binary (parsed) XML node is a single chunk of memory, independent of
the address where it was allocated.
The parser does not yet fully conform to the XML standard and some
functionality is still missing (DTD, PI, etc.; see comments in the code
if you're interested in details).

The 'xquery()' function evaluates (so far just simple) XMLPath expressions
and for each document it returns a set of matching nodes/subtrees.
'xmlpath' is a parsed XMLPath (i.e. the expression + some metadata). It
helps to avoid repeated parsing of the XMLPath expressions by the
xquery() function.
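
A usage sketch based on the description above (the exact signatures are in
the test file mentioned earlier; this is just a guess at the surface):

CREATE TABLE docs (id int, doc xmlnode);
INSERT INTO docs VALUES (1, '<a><b>hello</b></a>');  -- parsed here, once
SELECT xquery('/a/b', doc) FROM docs;                -- set of matching nodes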

I don't pretend to have invented this concept: DB2, Oracle and
probably some other commercial databases have had it for years.
Even though the mission of Postgres is not simply to copy features
from other DBMSs, I think structured XML makes sense as such.
It allows for better integration of relational and XML data, especially
joining relational columns with XML node sets.

In the future, interesting features could be built on it. For example,
an XML node/subtree can be located quickly within an xmlnode value and as
such it could be indexed (even though the existing indexes / access
methods might not be appropriate for that).

When reviewing my code, please focus on the ideas rather than the code
quality :-) I'm aware that some refactoring will have to be done if
this subproject goes on.

Thanks in advance for any feedback,
Tony.







[HACKERS] Restore deleted rows

2009-04-26 Thread Anton Egorov
Hi!

I need to recover deleted rows from a table. After I deleted those rows I
stopped Postgres immediately and created a tar archive of the database. I
found a solution (http://archives.postgresql.org/pgsql-hackers/2005-03/msg00965.php),
but is there another (easier) way to do it?


[HACKERS] Compiling trigger function with MinGW

2008-04-26 Thread Anton Burkun

Hello All.

Now I am trying to link a DLL with MinGW, following the example in the
Postgres help. The linker shows me this error:

D:\users\anthony\kursor\abzcrm\c\foo> gcc -shared foo.o -o foo.dll -L 
d:/files/local/PostgreSQL/8.3/lib -l postgres

Cannot export ⌂postgres_NULL_THUNK_DATA: symbol not found
collect2: ld returned 1 exit status

What should I do?

--
Anton Burkun
+380 66 757 70 27



[HACKERS] REMOVE

2003-05-31 Thread Anton V. Kozub