Re: [HACKERS] Proposal to add --single-row to psql

2013-05-20 Thread Bruce Momjian
On Sat, May 11, 2013 at 07:22:55PM -0400, Robert Haas wrote:
 On Sat, May 11, 2013 at 12:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  By the time you've got an expression tree, the problem is mostly solved,
  at least so far as parser extension is concerned.
 
 Right.
 
  More years ago than I care to admit, I worked on systems that had
  run-time-extensible parsers at Hewlett-Packard, so technology for this
  does exist.  But my (vague) memory of those systems is that the parser's
  language capabilities were more limited than bison's, perhaps only
  LL(1).  Parsing spec-compatible SQL that way might be a challenge.
 
 If I understand bison correctly, it basically looks at the current
 parser state and the next token and decides to either shift that token
 onto the stack or reduce the stack using some rule.  If there's no
 matching rule, we error out.  If someone wants to inject new rules
 into the grammar, those state tables are all going to change.  But if
 we could contrive things so that the state tables are built
 dynamically and can be change as rules are added and removed, then it
 seems to me that we could let a loadable module supply (or delete)
 whatever grammar rules it likes.  Whenever it does this, we recompile
 the grammar on next use (and complain if we get ambiguities).  This
 does not sound all that easy to code, but at least in theory it seems
 doable.
 
 We'd also need a way to add keywords.  To be quite frank, I think our
 whole approach to keywords is massive screwed up right now.  Aside

FYI, one trick that is often used when parsing C code is to allow parsed
code to add to the keyword list, so when the parser goes to get the next
token, it might get keyword back based on what it has seen already. 
This is required to handle C typedefs, and might be a method we could
use.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-20 Thread Darren Duncan
I have actually been working on the task discussed in this thread, most relevant 
parts quoted below, for awhile now, and hope to have something concrete that you 
can use by the end of this summer.


My in-development Muldis D language is homoiconic as a core feature, its source 
code is data to it, and the native syntax of the language resembles an abstract 
syntax tree.  This tree of nodes primarily defines behavior of the code, but it 
also supports arbitrary metadata per node, which for example can be used to 
preserve concrete syntax for any programming language that can be parsed into 
Muldis D nodes or conversely generated from said.


For example, you can have one type of node defining a function, and its details 
are defined by its attributes or child nodes, such as its result type, its 
parameters, whether it is declared associative/commutative/etc or not, and the 
expression(s) defining its body.  Another type of node defines a call to a 
function, another type defines a text literal, another a relation literal, and 
so on.  Conversely, a node can define a schema or package etc.  Example 
metadata, which is also structured, could include line numbers or code comments 
or whitespace or exact numeric literal formats or quoting formats or exact 
keyword choices, for example.


Using these node data types, we have a fairly normalized representation of any 
source code (or data) that is easy to introspect or transform with code.  A key 
design of Muldis D is the clear separation of syntax and behavior.


A parser is just a function that takes (typically) a character string as input 
and produces a node (tree) as output.  A compiler is just a function or 
operation that takes a node (tree) as input and produces machine code.  An 
optimizer can be a function that derives one node tree from another, either as 
its own operation or typically as part of the compiler stage.


A compiler or optimizer generally can trivially ignore the node metadata, but a 
code generator or debugger can use it; metadata can be stripped without 
affecting behavior.  The canonical tree form can also easily be mapped 
losslessly with relation forms, such as typical information schemas have.


Practically all behavior is defined in terms of generic type and routine 
definitions.  Practically all system-defined types and routines are defined in 
one or more libraries/modules/packages that have the same format as those users 
would write like extensions.  So, all the relational operators have the same 
syntax as say string or numeric or array operators.


I envision that the most effective way to use Muldis D to handle an arbitrary 
programming language, including the native SQL syntax+behavior of each SQL DBMS, 
is to have a Muldis D library+parser pair for it.


For example, to represent PostgreSQL 9.2 most directly, we have a library with 
an explicitly defined type and routine for every built-in that Pg 9.2 has, and 
we also have a parser function that takes the SQL syntax that Pg 9.2 understands 
and produces a Muldis D node tree consisting of calls to the routines of that 
library or value selectors of types in that library (things like SELECT and 
INSERT etc are each mapped to a routine too).  That way, even with a standard 
node format that isn't specific to a typical language or version, the code for 
parsing Pg 9.2 SQL has the minimal amount of indirection that it has to deal 
with, as each syntax element would have a direct library call counterpart. 
Similarly, the existing Pg 9.2 SQL compiler would have the least indirection to 
take said nodes and execute them.  (The library would be named eg Postgres_9_2 
for example, which makes it easier say to also have one side-by-side for other 
versions, shims, legacy code you want to more easily support compatibility with.)


Where one decides to do cross-compilation, say make Oracle SQL run on Pg, that 
could be done as simply as defining a library for Oracle SQL with the 
routines+types that has, and then mapping it to a Pg one just in terms of shim 
calls (which the compiler can optimize away as applicable), and so parsers or 
compilers never necessarily have to deal with behavior compatibility issues.


I am presently working out the exact set of such language nodes, and making a 
reference implementation which is mostly self-defined and would compile to Perl 
code, and hope to have the first Muldis D executable by the end of this summer.


I am confident that an adaption of this design into C or whatever would serve 
Postgres greatly in letting it effectively parse multiple languages, anywhere 
from application programming languages to multiple SQL dialects or versions.


Even if you don't decide to use my design specifically (which is open source and 
you can influence it), I think you should find some of the general design 
principles I stated above to be useful.  Representing behavior as libraries the 
AST being flexible enough for any concrete language without 

Re: [HACKERS] Proposal to add --single-row to psql

2013-05-14 Thread Amit Kapila
On Sunday, May 12, 2013 4:53 AM Robert Haas wrote:
 On Sat, May 11, 2013 at 12:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  By the time you've got an expression tree, the problem is mostly
 solved,
  at least so far as parser extension is concerned.
 
 Right.
 
  More years ago than I care to admit, I worked on systems that had
  run-time-extensible parsers at Hewlett-Packard, so technology for
 this
  does exist.  But my (vague) memory of those systems is that the
 parser's
  language capabilities were more limited than bison's, perhaps only
  LL(1).  Parsing spec-compatible SQL that way might be a challenge.
 
 If I understand bison correctly, it basically looks at the current
 parser state and the next token and decides to either shift that token
 onto the stack or reduce the stack using some rule.  If there's no
 matching rule, we error out.  If someone wants to inject new rules
 into the grammar, those state tables are all going to change.  But if
 we could contrive things so that the state tables are built
 dynamically and can be change as rules are added and removed, then it
 seems to me that we could let a loadable module supply (or delete)
 whatever grammar rules it likes.  Whenever it does this, we recompile
 the grammar on next use (and complain if we get ambiguities).  This
 does not sound all that easy to code, but at least in theory it seems
 doable.

Can we think of a pluggable parser whose responsibility is to produce raw
parse tree structure similar to PostgreSQL, but internally it can handle
different database syntax (e.g Oracle).
For example below syntaxes of Oracle can be converted to PostgreSQL raw
parse tree

Oracle syntax: 
SELECT 1 + 10 * 100 + 1000 from DUAL; 
PostgreSQL syntax: 
SELECT 1 + 10 * 100 + 1000; 


Oracle syntax: 
CREATE USER  IDENTIFIED BY 'password'; 
ALTER USER  IDENTIFiED BY 'password'; 
PostgreSQL syntax: 
CREATE USER  PASSWORD 'password'; 
ALTER USER  PASSWORD 'password';

Currently if somebody has to migrate from Sybase, Oracle, SQL Server, MySQL
to PostgreSQL, he has to either use some converter tool or enhance existing
gram.y to support it.
In case current PG parser is pluggable, he can write it's parser which can
produce PG specific parse tree for another database syntax. User can load
database specific 
Parser during server start time by specifying corresponding parser dll/so in
shared_preload_libraries. 
Advantage of writing PG based parser for other database syntax is user can
directly use script files of other database, no need of conversion every
time.

Though I am not sure, but I think 70~80% of commonly used SQL syntax is
common among different databases as all standard databases are SQL-spec
compliant, so if such a parser facility
is provided, it can save lot of time taken for conversion for customers who
want to migrate from other database syntax to PostgreSQL.


With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-13 Thread Peter Geoghegan
On Sun, May 12, 2013 at 1:18 PM, Jim Nasby j...@nasby.net wrote:
 FWIW, I've wanted the ability to plug into the parser not for an extension,
 but so that I could programmaticly enforce certain coding conventions.

Depending on the exact requirements, that probably wouldn't be too
difficult. It'd likely entail adding a hook for plugins to inspect raw
parse trees.

The pg_stat_statements core infrastructure changes for 9.2 added a
hook that allowed pg_stat_statements to inspect the post-parse
analysis (query) tree. That wouldn't be suitable for what you want,
though, because that tree loses many of the details that you are
likely to care about but are inconsequential to the semantics of the
query. For example, you couldn't tell from the post-parse analysis
tree if the query text was originally select all, the, columns from
table or select * from table.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-12 Thread Jim Nasby

On 5/11/13 11:27 AM, Tom Lane wrote:

David Fetter da...@fetter.org writes:

On Sat, May 11, 2013 at 11:17:03AM -0400, Robert Haas wrote:

Some kind of extendable parser would be awesome.  It would need to tie
into the rewriter also.

No, I don't have a clue what the design looks like.



That's a direction several of the proprietary RDBMS vendors have
proposed.  I think it'd be great :)



Pre-coffee (yeah, I know.  Bad idea.) sketch of an idea: create an API
to pass expression trees in and out.  This could have other benefits
as to clustering space, shortening the planning cycle, etc., but let's
not go there for now.  My knowledge is very, very sketchy, but when I
squint, the expression trees we use look a lot like JSON.  Are they
isomorphic?


By the time you've got an expression tree, the problem is mostly solved,
at least so far as parser extension is concerned.

More years ago than I care to admit, I worked on systems that had
run-time-extensible parsers at Hewlett-Packard, so technology for this
does exist.  But my (vague) memory of those systems is that the parser's
language capabilities were more limited than bison's, perhaps only
LL(1).  Parsing spec-compatible SQL that way might be a challenge.

A larger issue is that if you don't have the whole grammar available
to check, it's difficult to be sure there are no parsing conflicts.
I seem to remember that we hit some conflicts between different
extension ROMs back at HP :-(

Another point is that extensions that are actually interesting require
a lot more than new syntax.  Robert mentioned the rewriter, but you'd
typically need planner and executor additions as well.  It's possible to
see how whole new plan node types might be added by a plugin so far as
the executor is concerned, but I haven't a clue how we'd get the planner
to emit them ...


FWIW, I've wanted the ability to plug into the parser not for an extension, but 
so that I could programmaticly enforce certain coding conventions. TBH, it 
would be ideal if that was possible without requiring a backend (ie, some kind 
of stand-alone library/tool), but exposing it via a backend would be better 
than nothing.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-11 Thread Robert Haas
On Thu, May 9, 2013 at 7:36 AM, Michael Paquier
michael.paqu...@gmail.com wrote:

 Some of this is getting solved by making PostgreSQL more pluggable in
 ways that isolate the proprietary stuff, i.e. make people not have to
 touch the PostgreSQL core code much, if at all, in order to provide
 whatever special features they provide.  Hooks and FDWs are two such
 pluggable components.

  Extensions and the lovely background workers as well.

Some kind of extendable parser would be awesome.  It would need to tie
into the rewriter also.

No, I don't have a clue what the design looks like.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-11 Thread David Fetter
On Sat, May 11, 2013 at 11:17:03AM -0400, Robert Haas wrote:
 On Thu, May 9, 2013 at 7:36 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 
  Some of this is getting solved by making PostgreSQL more pluggable in
  ways that isolate the proprietary stuff, i.e. make people not have to
  touch the PostgreSQL core code much, if at all, in order to provide
  whatever special features they provide.  Hooks and FDWs are two such
  pluggable components.
 
   Extensions and the lovely background workers as well.
 
 Some kind of extendable parser would be awesome.  It would need to tie
 into the rewriter also.
 
 No, I don't have a clue what the design looks like.

That's a direction several of the proprietary RDBMS vendors have
proposed.  I think it'd be great :)

Pre-coffee (yeah, I know.  Bad idea.) sketch of an idea: create an API
to pass expression trees in and out.  This could have other benefits
as to clustering space, shortening the planning cycle, etc., but let's
not go there for now.  My knowledge is very, very sketchy, but when I
squint, the expression trees we use look a lot like JSON.  Are they
isomorphic?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-11 Thread Tom Lane
David Fetter da...@fetter.org writes:
 On Sat, May 11, 2013 at 11:17:03AM -0400, Robert Haas wrote:
 Some kind of extendable parser would be awesome.  It would need to tie
 into the rewriter also.
 
 No, I don't have a clue what the design looks like.

 That's a direction several of the proprietary RDBMS vendors have
 proposed.  I think it'd be great :)

 Pre-coffee (yeah, I know.  Bad idea.) sketch of an idea: create an API
 to pass expression trees in and out.  This could have other benefits
 as to clustering space, shortening the planning cycle, etc., but let's
 not go there for now.  My knowledge is very, very sketchy, but when I
 squint, the expression trees we use look a lot like JSON.  Are they
 isomorphic?

By the time you've got an expression tree, the problem is mostly solved,
at least so far as parser extension is concerned.

More years ago than I care to admit, I worked on systems that had
run-time-extensible parsers at Hewlett-Packard, so technology for this
does exist.  But my (vague) memory of those systems is that the parser's
language capabilities were more limited than bison's, perhaps only
LL(1).  Parsing spec-compatible SQL that way might be a challenge.

A larger issue is that if you don't have the whole grammar available
to check, it's difficult to be sure there are no parsing conflicts.
I seem to remember that we hit some conflicts between different
extension ROMs back at HP :-(

Another point is that extensions that are actually interesting require
a lot more than new syntax.  Robert mentioned the rewriter, but you'd
typically need planner and executor additions as well.  It's possible to
see how whole new plan node types might be added by a plugin so far as
the executor is concerned, but I haven't a clue how we'd get the planner
to emit them ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-11 Thread Robert Haas
On Sat, May 11, 2013 at 12:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 By the time you've got an expression tree, the problem is mostly solved,
 at least so far as parser extension is concerned.

Right.

 More years ago than I care to admit, I worked on systems that had
 run-time-extensible parsers at Hewlett-Packard, so technology for this
 does exist.  But my (vague) memory of those systems is that the parser's
 language capabilities were more limited than bison's, perhaps only
 LL(1).  Parsing spec-compatible SQL that way might be a challenge.

If I understand bison correctly, it basically looks at the current
parser state and the next token and decides to either shift that token
onto the stack or reduce the stack using some rule.  If there's no
matching rule, we error out.  If someone wants to inject new rules
into the grammar, those state tables are all going to change.  But if
we could contrive things so that the state tables are built
dynamically and can be change as rules are added and removed, then it
seems to me that we could let a loadable module supply (or delete)
whatever grammar rules it likes.  Whenever it does this, we recompile
the grammar on next use (and complain if we get ambiguities).  This
does not sound all that easy to code, but at least in theory it seems
doable.

We'd also need a way to add keywords.  To be quite frank, I think our
whole approach to keywords is massive screwed up right now.  Aside
from the parser bloat, adding even unreserved keywords breaks stuff,
e.g. SELECT 1 this_is_not_yet_a_keyword.  EnterpriseDB's internal
bug-tracking system must have at least half a dozen bug reports open
right now that are attributable to keywords either existing at all or
being more reserved than they are in Oracle.  That aside, the amount
of trouble we've caused for PostgreSQL users over the years, either by
adding new keywords or disallowing them in contexts where they used to
work, is not small.  I don't have a specific proposal for what we
should do to make this problem less painful, but I think if we ever
revise our parser infrastructure we ought to use that as an
opportunity to try to come up with something better than what exists
today, because the status quo is awfully painful.

 Another point is that extensions that are actually interesting require
 a lot more than new syntax.  Robert mentioned the rewriter, but you'd
 typically need planner and executor additions as well.  It's possible to
 see how whole new plan node types might be added by a plugin so far as
 the executor is concerned, but I haven't a clue how we'd get the planner
 to emit them ...

I don't have any ideas either.  Still, solving some of the problems
would be better than solving none of them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-09 Thread Michael Paquier
On Thu, May 9, 2013 at 9:12 AM, David Fetter da...@fetter.org wrote:

 On Wed, May 08, 2013 at 06:08:28PM -0500, Jim Nasby wrote:
  I believe that makes it significantly harder for them to actually
  contribute code back that doesn't give them a business advantage, as
  well as making it harder to justify hacking on the community
  codebase because they'll just face a very large hurdle when it comes
  to pulling that code back into their proprietary product.
 
  I don't know of any good way to solve that problem. Maybe it's not
  worth solving... but I do suspect there's some useful stuff that the
  community has lost out on because of this.

 Some of this is getting solved by making PostgreSQL more pluggable in
 ways that isolate the proprietary stuff, i.e. make people not have to
 touch the PostgreSQL core code much, if at all, in order to provide
 whatever special features they provide.  Hooks and FDWs are two such
 pluggable components.

 Extensions and the lovely background workers as well.
-- 
Michael


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-08 Thread Jim Nasby

On 5/1/13 7:36 PM, Robert Haas wrote:

On Mon, Apr 29, 2013 at 4:33 PM, Jim Nasbyj...@nasby.net  wrote:

On 4/28/13 7:50 AM, Craig Ringer wrote:


I find it frustrating that I've never seen an @paraccel email address here
and that few of the other vendors of highly customised Pg offshoots are
contributing back. It's almost enough to make me like the GPL.


FWIW, I think there's a pretty large barrier to these folks contributing
back. Would the community really want to add a bunch of hooks to support
something like Redshift? Or Greenplum? Or etc, etc.? Most of these guys have
to change significant amounts of PG code, so much so that it's actually hard
for them to stay current (which is why most of them just fork).

I do think this is a shame, but I'm not sure of any good way to fix it.

Yep.  There are plenty of things that we do at EDB for good and valid
business reasons that I can't imagine the community accepting under
any circumstances.  For example, Oracle compatibility is not something
the community values as highly as EnterpriseDB (and our customers) do.
  I'm sure that many of those vendors are in similar situations - they
write code that only runs on specialized hardware, or (rather
commonly, I suspect) they remove parts of the functionality in order
to make certain things very fast.  Those are not trade-offs that make
sense for PostgreSQL, but I find it hard to understand what we'd gain
from preventing other people from making them.  There are in fact a
pretty large number of companies - EnterpriseDB, obviously, but there
are many, many others - that are choosing to build businesses around
PostgreSQL precisely because it*isn't*  GPL.  Personally, I think
that's a good thing for our community in terms of mindshare even when
companies choose not to contribute back - and it's even better when
they do.


FWIW, one point I was trying to make that was overlooked is that it seems to be 
exceptionally difficult for companies to fork Postgres and then stay current 
(AFAIK EnterpriseDB and Mammoth are the only products that have pulled that 
feat off). I believe that makes it significantly harder for them to actually 
contribute code back that doesn't give them a business advantage, as well as 
making it harder to justify hacking on the community codebase because they'll 
just face a very large hurdle when it comes to pulling that code back into 
their proprietary product.

I don't know of any good way to solve that problem. Maybe it's not worth 
solving... but I do suspect there's some useful stuff that the community has 
lost out on because of this.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-08 Thread David Fetter
On Wed, May 08, 2013 at 06:08:28PM -0500, Jim Nasby wrote:
 On 5/1/13 7:36 PM, Robert Haas wrote:
 On Mon, Apr 29, 2013 at 4:33 PM, Jim Nasbyj...@nasby.net  wrote:
 On 4/28/13 7:50 AM, Craig Ringer wrote:
 
 I find it frustrating that I've never seen an @paraccel email address 
 here
 and that few of the other vendors of highly customised Pg offshoots are
 contributing back. It's almost enough to make me like the GPL.
 
 FWIW, I think there's a pretty large barrier to these folks contributing
 back. Would the community really want to add a bunch of hooks to support
 something like Redshift? Or Greenplum? Or etc, etc.? Most of these guys 
 have
 to change significant amounts of PG code, so much so that it's actually 
 hard
 for them to stay current (which is why most of them just fork).
 
 I do think this is a shame, but I'm not sure of any good way to fix it.
 Yep.  There are plenty of things that we do at EDB for good and valid
 business reasons that I can't imagine the community accepting under
 any circumstances.  For example, Oracle compatibility is not something
 the community values as highly as EnterpriseDB (and our customers) do.
   I'm sure that many of those vendors are in similar situations - they
 write code that only runs on specialized hardware, or (rather
 commonly, I suspect) they remove parts of the functionality in order
 to make certain things very fast.  Those are not trade-offs that make
 sense for PostgreSQL, but I find it hard to understand what we'd gain
 from preventing other people from making them.  There are in fact a
 pretty large number of companies - EnterpriseDB, obviously, but there
 are many, many others - that are choosing to build businesses around
 PostgreSQL precisely because it*isn't*  GPL.  Personally, I think
 that's a good thing for our community in terms of mindshare even when
 companies choose not to contribute back - and it's even better when
 they do.
 
 FWIW, one point I was trying to make that was overlooked is that it
 seems to be exceptionally difficult for companies to fork Postgres
 and then stay current (AFAIK EnterpriseDB and Mammoth are the only
 products that have pulled that feat off).

VMware and CitusDB are doing pretty well so far, but it's early days.

 I believe that makes it significantly harder for them to actually
 contribute code back that doesn't give them a business advantage, as
 well as making it harder to justify hacking on the community
 codebase because they'll just face a very large hurdle when it comes
 to pulling that code back into their proprietary product.
 
 I don't know of any good way to solve that problem. Maybe it's not
 worth solving... but I do suspect there's some useful stuff that the
 community has lost out on because of this.

Some of this is getting solved by making PostgreSQL more pluggable in
ways that isolate the proprietary stuff, i.e. make people not have to
touch the PostgreSQL core code much, if at all, in order to provide
whatever special features they provide.  Hooks and FDWs are two such
pluggable components.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-01 Thread Robert Haas
On Mon, Apr 29, 2013 at 4:33 PM, Jim Nasby j...@nasby.net wrote:
 On 4/28/13 7:50 AM, Craig Ringer wrote:

 I find it frustrating that I've never seen an @paraccel email address here
 and that few of the other vendors of highly customised Pg offshoots are
 contributing back. It's almost enough to make me like the GPL.

 FWIW, I think there's a pretty large barrier to these folks contributing
 back. Would the community really want to add a bunch of hooks to support
 something like Redshift? Or Greenplum? Or etc, etc.? Most of these guys have
 to change significant amounts of PG code, so much so that it's actually hard
 for them to stay current (which is why most of them just fork).

 I do think this is a shame, but I'm not sure of any good way to fix it.

Yep.  There are plenty of things that we do at EDB for good and valid
business reasons that I can't imagine the community accepting under
any circumstances.  For example, Oracle compatibility is not something
the community values as highly as EnterpriseDB (and our customers) do.
 I'm sure that many of those vendors are in similar situations - they
write code that only runs on specialized hardware, or (rather
commonly, I suspect) they remove parts of the functionality in order
to make certain things very fast.  Those are not trade-offs that make
sense for PostgreSQL, but I find it hard to understand what we'd gain
from preventing other people from making them.  There are in fact a
pretty large number of companies - EnterpriseDB, obviously, but there
are many, many others - that are choosing to build businesses around
PostgreSQL precisely because it *isn't* GPL.  Personally, I think
that's a good thing for our community in terms of mindshare even when
companies choose not to contribute back - and it's even better when
they do.

I was at the MySQL show a couple of years back and they had a vendor
area.  And I got talking to one the vendors there who had implemented
some new kind of database - I forget the details exactly - and he told
me that it spoke the PostgreSQL wire protocol.  I was of course a bit
surprised as PostgreSQL was not exactly what people at this show were
excited about.  So I asked him why not the MySQL wire protocol, and
he, basically, that they were afraid of being accused of a GPL
violation, because MySQL AB had a very expansive interpretation of
what the GPL did and did not allow.  We could perhaps argue about
whether he was right to be worried about that... but he was.  I can't
think of all the names right now, but I've talked with a bunch of
other companies over the last few years who also chose PostgreSQL for
licensing reasons.  I'm pretty happy about that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-01 Thread Josh Berkus

 I find it frustrating that I've never seen an @paraccel email address
 here and that few of the other vendors of highly customised Pg offshoots
 are contributing back. It's almost enough to make me like the GPL.

Well, Paraccel never ended up contributing any code, but in years back
(2006-2008) they did contribute money to PostgreSQL (event sponsorships,
SPI).  I actually talked to them about contributing code, but somehow it
never happened.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-01 Thread Craig Ringer
On 05/02/2013 08:36 AM, Robert Haas wrote:

 I've talked with a bunch of other companies over the last few years
 who also chose PostgreSQL for licensing reasons. I'm pretty happy
 about that. 

I think it's a pretty good thing too, personally ... but I do wish
they'd contribute a bit more to the client-side drivers like psqlODBC
and PgJDBC.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-05-01 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 ... There are in fact a
 pretty large number of companies - EnterpriseDB, obviously, but there
 are many, many others - that are choosing to build businesses around
 PostgreSQL precisely because it *isn't* GPL.  Personally, I think
 that's a good thing for our community in terms of mindshare even when
 companies choose not to contribute back - and it's even better when
 they do.

Yeah, exactly.  People forking PG may not contribute much directly,
but they still add to mindshare and general familiarity with the PG
code base.  In the long run, that cannot help but add to our goal of
world domination ;-)

We had the GPL-vs-BSD-license discussion in excruciating length years
ago, mid-2000 if memory serves, so you can find a whole lot more about
this topic in the mailing list archives from back then.  But personally
I wouldn't consider changing even if it were somehow practical.  The
license choice has served us well and will continue to do so.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-04-29 Thread Jim Nasby

On 4/28/13 7:50 AM, Craig Ringer wrote:

I find it frustrating that I've never seen an @paraccel email address here and 
that few of the other vendors of highly customised Pg offshoots are 
contributing back. It's almost enough to make me like the GPL.


FWIW, I think there's a pretty large barrier to these folks contributing back. 
Would the community really want to add a bunch of hooks to support something 
like Redshift? Or Greenplum? Or etc, etc.? Most of these guys have to change 
significant amounts of PG code, so much so that it's actually hard for them to 
stay current (which is why most of them just fork).

I do think this is a shame, but I'm not sure of any good way to fix it.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-04-28 Thread Craig Ringer
On 04/25/2013 10:55 PM, Andrew Dunstan wrote:

 It's an Amazon product based on release 8.0, but with many many
 features removed (e.g. Indexes!) 
More specifically, it's a hacked-up column-store-ized Pg for OLAP and
analytics work. As I understand it Amazon didn't develop it themselves;
they bought/licensed Paraccel ( http://www.paraccel.com/) and customised
to integrated into Amazon's IAM authentication/authorization, usage
accounting, dynamic provisioning, etc.

http://docs.aws.amazon.com/redshift/latest/dg/c_redshift-and-postgres-sql.html
http://en.wikipedia.org/wiki/ParAccel

I find it frustrating that I've never seen an @paraccel email address
here and that few of the other vendors of highly customised Pg offshoots
are contributing back. It's almost enough to make me like the GPL.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] Proposal to add --single-row to psql

2013-04-25 Thread Tom Lane
Christopher Manning c...@christophermanning.org writes:
 Fabrízio and Tom,
 I know that you can use  --variable=FETCH_COUNT=1 from the
 psql command line, but internally that uses a CURSOR to batch the rows and
 [Redshift doesn't support CURSOR](
 https://forums.aws.amazon.com/thread.jspa?threadID=122664tstart=0) so it's
 not an option when using psql to download data from Redshift.

I don't know what redshift is, but I have very little patience with the
idea of burdening psql with yet another random feature in order to work
around deficiencies in somebody else's software.  Considering that the
earliest any such thing could reach the field would be 9.4, it seems not
unlikely that the need for it would be gone by next year anyway.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-04-25 Thread Andrew Dunstan


On 04/25/2013 10:42 AM, Tom Lane wrote:

Christopher Manning c...@christophermanning.org writes:

Fabrízio and Tom,
I know that you can use  --variable=FETCH_COUNT=1 from the
psql command line, but internally that uses a CURSOR to batch the rows and
[Redshift doesn't support CURSOR](
https://forums.aws.amazon.com/thread.jspa?threadID=122664tstart=0) so it's
not an option when using psql to download data from Redshift.

I don't know what redshift is,



It's an Amazon product based on release 8.0, but with many many features 
removed (e.g. Indexes!)



but I have very little patience with the
idea of burdening psql with yet another random feature in order to work
around deficiencies in somebody else's software.  Considering that the
earliest any such thing could reach the field would be 9.4, it seems not
unlikely that the need for it would be gone by next year anyway.





Plus there is the fact that we have no way to test it against redshift 
anyway.


It should be up to Amazon to produce a useful psql program that works 
with redshift, not us. We have enough to do to support our own product.


If this is to be justified at all it needs to be without reference to 
redshift.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-04-25 Thread Joshua D. Drake


On 04/25/2013 07:42 AM, Tom Lane wrote:

Christopher Manning c...@christophermanning.org writes:

Fabr�zio and Tom,
I know that you can use  --variable=FETCH_COUNT=1 from the
psql command line, but internally that uses a CURSOR to batch the rows and
[Redshift doesn't support CURSOR](
https://forums.aws.amazon.com/thread.jspa?threadID=122664tstart=0) so it's
not an option when using psql to download data from Redshift.


I don't know what redshift is,


It is a PostgreSQL fork based on ancient source code. From Amazon:

Amazon Redshift is based on PostgreSQL 8.0.2. Amazon Redshift and 
PostgreSQL have a number of very important differences that you must be 
aware of as you design and develop your data warehouse applications.



but I have very little patience with the
idea of burdening psql with yet another random feature in order to work
around deficiencies in somebody else's software.  Considering that the
earliest any such thing could reach the field would be 9.4, it seems not
unlikely that the need for it would be gone by next year anyway.


+1 this is really an amazon problem not a postgresql problem.

Sincerely,

Joshua D. Drake






--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-04-24 Thread Fabrízio de Royes Mello
On Tue, Apr 23, 2013 at 1:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Isn't there already a way to set FETCH_COUNT from the command line?
 (ie, I think there's a generic variable-assignment facility that could
 do this)


Christopher,

Tom is all right... from psql [1] command line we can do that:

$ bin/psql --variable=FETCH_COUNT=100
psql (9.3devel)
Type help for help.

fabrizio=# \echo :FETCH_COUNT
100

Regards,

[1]
http://www.postgresql.org/docs/current/interactive/app-psql.html#AEN84903

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Proposal to add --single-row to psql

2013-04-24 Thread Christopher Manning
Fabrízio and Tom,

I know that you can use  --variable=FETCH_COUNT=1 from the
psql command line, but internally that uses a CURSOR to batch the rows and
[Redshift doesn't support CURSOR](
https://forums.aws.amazon.com/thread.jspa?threadID=122664tstart=0) so it's
not an option when using psql to download data from Redshift.

Pavel's idea of having a single row mode option to replace FETCH_COUNT is
interesting, does anyone have any problems with that or alternative ideas?

Regards,
Christopher



On Wed, Apr 24, 2013 at 7:04 AM, Fabrízio de Royes Mello 
fabriziome...@gmail.com wrote:


 On Tue, Apr 23, 2013 at 1:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  Isn't there already a way to set FETCH_COUNT from the command line?
  (ie, I think there's a generic variable-assignment facility that could
  do this)
 

 Christopher,

 Tom is all right... from psql [1] command line we can do that:

 $ bin/psql --variable=FETCH_COUNT=100
 psql (9.3devel)
 Type help for help.

 fabrizio=# \echo :FETCH_COUNT
 100

 Regards,

 [1]
 http://www.postgresql.org/docs/current/interactive/app-psql.html#AEN84903

 --
 Fabrízio de Royes Mello
 Consultoria/Coaching PostgreSQL
  Blog sobre TI: http://fabriziomello.blogspot.com
  Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
  Twitter: http://twitter.com/fabriziomello



Re: [HACKERS] Proposal to add --single-row to psql

2013-04-23 Thread Pavel Stehule
Hello

It is redundant to current FETCH_COUNT implementation, so I don't see sense
to use it together. Maybe we can drop FETCH_COUNT and replace it by
--single-row mode and probably it can simplify code.

Regards

Pavel


2013/4/23 Christopher Manning c...@christophermanning.org

 psql currently collects the query result rows in memory before writing
 them to a file and can cause out of memory problems for large results in
 low memory environments like ec2. I can't use COPY TO STDOUT or FETCH_COUNT
 since I'm using Redshift and it doesn't support [writing to STDOUT](
 http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html) or [CURSOR](
 https://forums.aws.amazon.com/thread.jspa?threadID=122664).

 [Single Row Mode is available in Postgres 9.2](
 http://www.postgresql.org/docs/9.2/static/libpq-single-row-mode.html) but
 [it doesn't look like](
 http://www.postgresql.org/docs/9.2/static/app-psql.html) you can tell
 psql to use single row mode when writing to a file (using --output).

 I'm proposing to add a --single-row option to psql that would allow the
 result rows of a query to be streamed to a file without collecting them in
 memory first.

 I'm new to the postgres source, but I was considering doing this by adding
 an elseif at [this line in bin/psql/common.c](
 https://github.com/postgres/postgres/blob/master/src/bin/psql/common.c#L955) 
 that
 would call [PQsetSingleRowMode](
 https://github.com/postgres/postgres/blob/master/src/interfaces/libpq/fe-exec.c#L1581)
  and
 ideally use something very similar to [ExecQueryUsingCursor](
 https://github.com/postgres/postgres/blob/master/src/bin/psql/common.c#L1081
 )

 Please let me know if that would be an acceptable addition and if there's
 anything in particular I should be aware of when adding the feature.

 Thank you,
 Christopher




Re: [HACKERS] Proposal to add --single-row to psql

2013-04-23 Thread Tom Lane
Christopher Manning c...@christophermanning.org writes:
 I'm proposing to add a --single-row option to psql that would allow the
 result rows of a query to be streamed to a file without collecting them in
 memory first.

Isn't there already a way to set FETCH_COUNT from the command line?
(ie, I think there's a generic variable-assignment facility that could
do this)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers