Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Jov
what about runtime code generation using LLVM?
http://blog.cloudera.com/blog/2013/02/inside-cloudera-impala-runtime-code-generation/
http://llvm.org/devmtg/2013-11/slides/Wanderman-Milne-Cloudera.pdf

Jov
blog: http://amutu.com/blog


2014-04-22 6:41 GMT+08:00 Simon Riggs si...@2ndquadrant.com:

 I've discussed 2ndQuadrant's involvement in the AXLE project a few
 times publicly, but never on this mailing list. The project relates to
 innovation and improvement in Business Intelligence for systems based
 upon PostgreSQL in the range of 10-100TB.

 Our work will span the 9.5 and 9.6 cycles. We're looking to make
 measurable improvements in a number of cases; one of those is TPC-H,
 since it's a publicly accessible benchmark, another is a more private
 benchmark on healthcare data. In brief, this means speeding up the
 performance of large queries, data loading and looking at very large
 systems issues.

 Some areas of R&D are definitely on the roadmap, others are more
 flexible. Some of this is in progress, other stuff is not even at the
 design stage - yet, just a few paragraphs along the lines of "we will
 look at these topics". If we have room, it's possible we may
 accommodate other topics; this is not carte blanche, but the reason
 for posting here is so people know we will take input, following the
 normal community process. Detailed in-person discussions at PGCon are
 expected and the Wiki pages will be updated for each aspect.

 BI-related Indexing
 * MinMax indexes
 * Bitmap indexes

 Large Systems
 * Freeze avoidance
 * Storage management issues for very large systems

 Storage Efficiency
 * Compression
 * Column Orientation

 Optimisation
 * Bulk loading speed improvements
 * Bulk FK evaluation
 * Executor tuning for very large queries

 Query tuning
 * Approximate queries, sampling
 * Materialized Views

 ...and possibly some other aspects.

 2ndQuadrant is also assisting other researchers on GPU and FPGA
 topics, which may also yield work of interest to the PostgreSQL project.

 Couple of points: The project is time limited, so if work gets pushed
 back beyond that then we'll lose the opportunity to contribute. Please
 support our work with timely objections, assistance in defining the
 path forwards and limiting the scope to something that avoids wasting
 this opportunity. Further funding is possible if we don't squander
 this. We are being funded to make best efforts to contribute to open
 source PostgreSQL, not pay-for-commit.

 AXLE is funded by the EU under FP7 Grant Agreement 318633.  Further
 details are available here http://www.axleproject.eu/

 (There are also other 2ndQuadrant development projects in progress,
 this is just one of the larger ones).

 Best Regards

 --
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Simon Riggs
On 22 April 2014 00:24, Josh Berkus j...@agliodbs.com wrote:
 On 04/21/2014 03:41 PM, Simon Riggs wrote:
 Storage Efficiency
 * Compression
 * Column Orientation

 You might look at turning this:

 http://citusdata.github.io/cstore_fdw/

 ... into a more integrated part of Postgres.

Of course I'm aware of that work - credit to them. Certainly, many
people feel that it is now time to do as you suggest and include
column store features within PostgreSQL.

As to turning it into a more integrated part of Postgres, we have a
few problems there

1. cstore_fdw code has an incompatible licence

2. I don't think FDWs are the right place for complex new
architectures such as column store, massively parallel processing or
sharding. The fact that it is probably the best place to implement it
in user space doesn't mean it transfers well into core code. That's a
shame and I don't know what to do about it, because it would be nice
to simply ask for change of licence and then integrate it, but it
seems more work than that (to me).

cstore_fdw uses ORC, which interestingly stores lightweight index
values that look exactly like MinMax indexes, so at least PostgreSQL
should be getting that soon.
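
For readers unfamiliar with the idea, here is a minimal Python sketch of how per-block min/max summaries let a range scan skip blocks. It is an illustration only, not PostgreSQL or ORC code; the block size, function names and sample data are all assumptions made for the example.

```python
# Illustrative sketch only: block-level min/max summaries.
# BLOCK_SIZE, build_minmax and range_scan are hypothetical names.
from typing import List, Tuple

BLOCK_SIZE = 4  # assumed number of rows per block


def build_minmax(values: List[int], block_size: int = BLOCK_SIZE) -> List[Tuple[int, int]]:
    """Return one (min, max) pair per block of consecutive rows."""
    summaries = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        summaries.append((min(block), max(block)))
    return summaries


def range_scan(values, summaries, lo, hi, block_size=BLOCK_SIZE):
    """Return (matching rows, number of blocks read) for lo <= v <= hi."""
    matches, blocks_read = [], 0
    for i, (bmin, bmax) in enumerate(summaries):
        if bmax < lo or bmin > hi:
            continue  # the summary proves no row in this block can match
        blocks_read += 1
        block = values[i * block_size:(i + 1) * block_size]
        matches.extend(v for v in block if lo <= v <= hi)
    return matches, blocks_read


values = [1, 3, 2, 4, 10, 12, 11, 13, 20, 22, 21, 23]
summaries = build_minmax(values)
rows, blocks_read = range_scan(values, summaries, 10, 12)
print(summaries, rows, blocks_read)
```

The summaries are tiny compared with the data, so the approach pays off mainly when values are physically clustered, as they are in the sample data here.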

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Simon Riggs
On 22 April 2014 10:42, Jov am...@amutu.com wrote:

 what about runtime code generation using LLVM?
 http://blog.cloudera.com/blog/2013/02/inside-cloudera-impala-runtime-code-generation/
 http://llvm.org/devmtg/2013-11/slides/Wanderman-Milne-Cloudera.pdf

Those techniques have been in use for at least 20 years on various platforms.

The main issue PostgreSQL faces is supporting many platforms and
compilers, while at the same time supporting extensible data types.

I believe there is some research work into run-time compilation in
progress, but that seems unlikely to make it into Postgres core.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Hannu Krosing
On 04/22/2014 01:24 AM, Josh Berkus wrote:
 On 04/21/2014 03:41 PM, Simon Riggs wrote:
 Storage Efficiency
 * Compression
 * Column Orientation
 You might look at turning this:

 http://citusdata.github.io/cstore_fdw/

 ... into a more integrated part of Postgres.
What would be more generally useful is probably
better planning for, and better performance of, the FDW interface.

So instead of integrating one specific FDW it would make
sense to improve PostgreSQL so that it can use (properly written)
FDWs at native speeds.

Regards

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread MauMau

From: Simon Riggs si...@2ndquadrant.com

Some areas of R&D are definitely on the roadmap, others are more
flexible. Some of this is in progress, other stuff is not even at the
design stage - yet, just a few paragraphs along the lines of "we will
look at these topics". If we have room, it's possible we may
accommodate other topics; this is not carte blanche, but the reason
for posting here is so people know we will take input, following the
normal community process. Detailed in-person discussions at PGCon are
expected and the Wiki pages will be updated for each aspect.

BI-related Indexing
* MinMax indexes
* Bitmap indexes

Large Systems
* Freeze avoidance
* Storage management issues for very large systems

Storage Efficiency
* Compression
* Column Orientation

Optimisation
* Bulk loading speed improvements
* Bulk FK evaluation
* Executor tuning for very large queries

Query tuning
* Approximate queries, sampling
* Materialized Views


Great!  I'm looking forward to seeing PostgreSQL evolve as an analytics 
database for data warehousing.  Is there any reason why in-memory database 
and MPP are not included?


Are you planning to include the above features in 9.5 and 9.6?  Are you 
recommending other developers not implement these features to avoid 
duplication of work with AXLE?


Regards
MauMau





Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Hannu Krosing
On 04/22/2014 02:04 PM, Simon Riggs wrote:
 On 22 April 2014 00:24, Josh Berkus j...@agliodbs.com wrote:
 On 04/21/2014 03:41 PM, Simon Riggs wrote:
 Storage Efficiency
 * Compression
 * Column Orientation
 You might look at turning this:

 http://citusdata.github.io/cstore_fdw/

 ... into a more integrated part of Postgres.
 Of course I'm aware of that work - credit to them. Certainly, many
 people feel that it is now time to do as you suggest and include
 column store features within PostgreSQL.

 As to turning it into a more integrated part of Postgres, we have a
 few problems there

 1. cstore_fdw code has an incompatible licence

 2. I don't think FDWs are the right place for complex new
 architectures such as column store, massively parallel processing or
 sharding. 
I agree that FDW is not an end-all solution for all these, but it is a
reasonable starting point and it just might be that the extra things
needed could be added to our FDW API instead of sewing it directly
into backend guts.


I recently tried to implement sharding at FDW level and the main
problem I ran into was a missing join type for efficiently using it
for certain queries.

The specific use case was queries of form

select l.*, r.*
from remotetable r
join localtable l
on l.key1 = r.id and l.n = N;

PostgreSQL offered only two options:

1) full scan on remote table

2) single id=$ selects

Neither of these is what is actually needed: the first performs badly
if there are more than a few rows in the remote table, and the second
performs badly if l.n = N returns more than a few rows.

When I manually rewrote the query to

select l.*, r.*
from remotetable r
join localtable l
on l.key1 = r.id and l.n = N
where r.id = ANY(ARRAY(select key1 from localtable where n = N));

it ran really well.

Unfortunately this is not something that PostgreSQL considers by itself
while optimising.
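
To make the trade-off concrete, here is a minimal Python simulation of the three access patterns discussed above (full remote scan, one select per id, and a single batched `= ANY(array)` fetch). It is a sketch under stated assumptions, not PostgreSQL or FDW code: the `RemoteTable` class, its methods, and the table sizes are hypothetical and exist only to count round trips and rows shipped.

```python
# Illustrative simulation only; RemoteTable and its methods are hypothetical.
class RemoteTable:
    def __init__(self, rows):
        self.rows = rows          # maps id -> payload
        self.round_trips = 0      # number of remote queries issued
        self.rows_sent = 0        # number of rows shipped back

    def full_scan(self):
        """Option 1: pull the whole remote table."""
        self.round_trips += 1
        self.rows_sent += len(self.rows)
        return dict(self.rows)

    def fetch_one(self, key):
        """Option 2: one parameterised select per id."""
        self.round_trips += 1
        if key in self.rows:
            self.rows_sent += 1
            return {key: self.rows[key]}
        return {}

    def fetch_many(self, keys):
        """Rewritten form: a single query with r.id = ANY(array_of_keys)."""
        self.round_trips += 1
        found = {k: self.rows[k] for k in set(keys) if k in self.rows}
        self.rows_sent += len(found)
        return found


def join(local_rows, n, remote, strategy):
    """Join local rows having the given n against the remote table."""
    wanted = [l for l in local_rows if l["n"] == n]
    if strategy == "full_scan":
        fetched = remote.full_scan()
    elif strategy == "per_key":
        fetched = {}
        for l in wanted:
            fetched.update(remote.fetch_one(l["key1"]))
    else:  # "batched"
        fetched = remote.fetch_many([l["key1"] for l in wanted])
    return [(l, fetched[l["key1"]]) for l in wanted if l["key1"] in fetched]


remote_rows = {i: f"payload-{i}" for i in range(1000)}
local_rows = [{"key1": i * 10, "n": i % 2} for i in range(100)]

stats = {}
for strategy in ("full_scan", "per_key", "batched"):
    remote = RemoteTable(remote_rows)
    result = join(local_rows, 1, remote, strategy)
    # (joined rows, remote round trips, remote rows shipped)
    stats[strategy] = (len(result), remote.round_trips, remote.rows_sent)
print(stats)
```

In this toy setup all three strategies return the same join result; the batched form is the only one that keeps both the round-trip count and the shipped row count low.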

BTW, this kind of optimisation should also be a win for really large IN
queries if we could have an indexed IN which would not start each lookup
from the index root, but rather would sort the IN contents and do an index
merge via skipping from the current position.
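
A rough Python sketch of that idea, using a sorted list as a stand-in for the index (an assumption; a real B-tree walk differs in detail). The function names are hypothetical. The first function restarts every probe from scratch, while the second sorts the IN list and resumes each search from the position the previous one reached.

```python
# Illustrative sketch only: a sorted list stands in for the index.
from bisect import bisect_left


def in_lookup_from_root(index, keys):
    """Probe the sorted index once per key, each search starting from scratch."""
    hits = []
    for k in keys:
        pos = bisect_left(index, k)
        if pos < len(index) and index[pos] == k:
            hits.append(k)
    return hits


def in_lookup_merge(index, keys):
    """Sort the IN list, then search only to the right of the previous position."""
    hits, pos = [], 0
    for k in sorted(set(keys)):
        pos = bisect_left(index, k, pos)  # resume from the current position
        if pos == len(index):
            break  # every remaining key is larger than anything in the index
        if index[pos] == k:
            hits.append(k)
    return hits


index = list(range(0, 100, 3))      # 0, 3, 6, ..., 99
keys = [42, 7, 99, 3, 42, 100]      # unsorted IN list with a duplicate
print(in_lookup_from_root(index, keys))
print(in_lookup_merge(index, keys))
```

Both functions find the same set of keys; the merge variant narrows each search window as it goes and can stop early once it runs off the end of the index.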


Cheers












Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Andrew Dunstan


On 04/22/2014 08:15 AM, MauMau wrote:



Are you planning to include the above features in 9.5 and 9.6? Are you 
recommending other developers not implement these features to avoid 
duplication of work with AXLE?






Without pointing any fingers, I should note that I have learned the hard 
way to take such recommendations with a grain of salt. More than once I 
have been stopped from working on something because someone else said 
they were, only for nothing to appear, and in the interests of full 
disclosure I can think of two significant instances when I have been 
similarly guilty, although the most serious of those has since been 
rectified by someone else.


cheers

andrew





Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Andrew Dunstan


On 04/22/2014 08:04 AM, Simon Riggs wrote:

On 22 April 2014 00:24, Josh Berkus j...@agliodbs.com wrote:

On 04/21/2014 03:41 PM, Simon Riggs wrote:

Storage Efficiency
* Compression
* Column Orientation

You might look at turning this:

http://citusdata.github.io/cstore_fdw/

... into a more integrated part of Postgres.

Of course I'm aware of that work - credit to them. Certainly, many
people feel that it is now time to do as you suggest and include
column store features within PostgreSQL.

As to turning it into a more integrated part of Postgres, we have a
few problems there

1. cstore_fdw code has an incompatible licence

2. I don't think FDWs are the right place for complex new
architectures such as column store, massively parallel processing or
sharding. The fact that it is probably the best place to implement it
in user space doesn't mean it transfers well into core code. That's a
shame and I don't know what to do about it, because it would be nice
to simply ask for change of licence and then integrate it, but it
seems more work than that (to me).





I agree, and indeed that was something like my first reaction to hearing 
about this development - FDW seems like a very odd way to handle this. 
But the notion of builtin columnar storage suggests to me that we really 
need first to tackle how various storage engines might be incorporated 
into Postgres. I know this has been a bugbear for many years, but maybe 
now with serious proposals for alternative storage engines on the 
horizon we can no longer afford to put off the evil day when we grapple 
with it.


cheers

andrew




Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Simon Riggs
On 22 April 2014 13:15, MauMau maumau...@gmail.com wrote:

 Great!  I'm looking forward to seeing PostgreSQL evolve as an analytics
 database for data warehousing.  Is there any reason why in-memory database
 and MPP are not included?

Those ideas are valid; the features are bounded by resource
constraints of time and money, as well as by technical skills/
capacities of my fellow developers. My analysis has been that
implementing parallelism has a lower benefit/cost ratio than other
features, as well as requiring more expensive servers (for MPP). I
expect MPP to be an eventual end goal of the BDR project.

 Are you planning to include the above features in 9.5 and 9.6?

Yes

 Are you
 recommending other developers not implement these features to avoid
 duplication of work with AXLE?

This was more to draw attention to the work so that all interested
parties can participate in producing something useful.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Stephen Frost
* Andrew Dunstan (and...@dunslane.net) wrote:
 I agree, and indeed that was something like my first reaction to
 hearing about this development - FDW seems like a very odd way to
 handle this. But the notion of builtin columnar storage suggests to
 me that we really need first to tackle how various storage engines
 might be incorporated into Postgres. I know this has been a bugbear
 for many years, but maybe now with serious proposals for alternative
 storage engines on the horizon we can no longer afford to put off
 the evil day when we grapple with it.

Agreed, and it goes beyond just columnar stores- I could see IOTs being
implemented using this notion of a different 'storage engine', but
calling it a 'storage engine' makes it sound like we want to change how
we access files and I don't think we really want to change that but
rather come up with a way to have an alternative heap..  Columnar or
IOTs would still be page-based and go through shared buffers, etc, I'd
think..

Thanks,

Stephen




Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Josh Berkus
On 04/22/2014 06:39 AM, Andrew Dunstan wrote:
 I agree, and indeed that was something like my first reaction to hearing
 about this development - FDW seems like a very odd way to handle this.
 But the notion of builtin columnar storage suggests to me that we really
 need first to tackle how various storage engines might be incorporated
 into Postgres. I know this has been a bugbear for many years, but maybe
 now with serious proposals for alternative storage engines on the
 horizon we can no longer afford to put off the evil day when we grapple
 with it.

Yes.  *IF* PostgreSQL already supported alternate storage, then the
Citus folks might have released their CStore as a storage plugin instead
of an FDW.  However, if they'd waited for pluggable storage, they'd
still be waiting.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Pavel Stehule
2014-04-22 19:02 GMT+02:00 Josh Berkus j...@agliodbs.com:

 On 04/22/2014 06:39 AM, Andrew Dunstan wrote:
  I agree, and indeed that was something like my first reaction to hearing
  about this development - FDW seems like a very odd way to handle this.
  But the notion of builtin columnar storage suggests to me that we really
  need first to tackle how various storage engines might be incorporated
  into Postgres. I know this has been a bugbear for many years, but maybe
  now with serious proposals for alternative storage engines on the
  horizon we can no longer afford to put off the evil day when we grapple
  with it.

 Yes.  *IF* PostgreSQL already supported alternate storage, then the
 Citus folks might have released their CStore as a storage plugin instead
 of an FDW.  However, if they'd waited for pluggable storage, they'd
 still be waiting.


I am sceptical. From what I know about OLAP column store databases, they
need a very different planner, so just an engine or storage layer is not
enough. VectorWise has been trying to merge Ingres with the Monet engine
for more than four years, and still has some issues.

Our extensibility is probably a major barrier to fast OLAP. The most
realistic way forward, as I see it, is better partitioning support and
moving toward higher parallelism and distribution, and maybe map/reduce
support.

At GoodData we successfully use Postgres for BI projects up to 20G with
fast response times. The biggest pain points are the missing MERGE, the
missing fault-tolerant COPY, I/O-expensive updates of large tables with
lots of indexes, and the lack of simple massive partitioning. On the other
hand, Postgres works perfectly on thousands of databases with thousands of
tables, without errors, and is terribly simple to deploy in a cloud
environment.

Regards

Pavel






Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-21 Thread Josh Berkus
On 04/21/2014 03:41 PM, Simon Riggs wrote:
 Storage Efficiency
 * Compression
 * Column Orientation

You might look at turning this:

http://citusdata.github.io/cstore_fdw/

... into a more integrated part of Postgres.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

