We have just released the PGSpider extension (pgspider_ext).<br>
<br>
This extension lets you build a high-performance SQL cluster engine for
distributed big data.<br>
PGSpider enables PostgreSQL to access a number of data sources through Foreign
Data Wrappers (FDW) and retrieves the distributed data vertically, that is, as
rows combined into one result set.
The main features are:<br>
- Node partitioned table<br>
Users can retrieve records from multiple tables on several data sources with a
single SQL statement.
<br>
For example, suppose two data sources hold the following records:<br>
SELECT * FROM t1_node1; -- @node1
i | t
----+---
10 | a
11 | b
(2 rows)
SELECT * FROM t1_node2; -- @node2
i | t
----+---
20 | c
21 | d
(2 rows)
PGSpider can collect these records together with a node identifier column, like this:
SELECT * FROM t1;
i | t | node
----+---+-------
10 | a | node1
11 | b | node1
20 | c | node2
21 | d | node2
(4 rows)
- Parallel processing<br>
PGSpider can fetch results from data sources in parallel.
- Pushdown<br>
PGSpider can push down WHERE clauses and aggregate functions to the data
sources.<br>
Whether an expression can be shipped depends on the FDW of each data source.<br>
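To make the node partitioned table result above more concrete, the following is a minimal sketch in plain PostgreSQL of what the combined result corresponds to. It is not PGSpider syntax and does not show how pgspider_ext is configured: local tables stand in for the remote tables, and the view name t1_emulated is a hypothetical name used only for illustration.

```sql
-- Plain PostgreSQL sketch, NOT PGSpider syntax.
-- Assumption: local tables stand in for the remote tables on node1 and node2.
CREATE TABLE t1_node1 (i int, t text);
CREATE TABLE t1_node2 (i int, t text);

INSERT INTO t1_node1 VALUES (10, 'a'), (11, 'b');
INSERT INTO t1_node2 VALUES (20, 'c'), (21, 'd');

-- Hypothetical view name; it emulates the role that t1 plays in the example.
-- Each branch tags its rows with a constant node identifier.
CREATE VIEW t1_emulated AS
    SELECT i, t, 'node1'::text AS node FROM t1_node1
    UNION ALL
    SELECT i, t, 'node2'::text AS node FROM t1_node2;

-- All rows from both sources, each labelled with its origin.
SELECT * FROM t1_emulated ORDER BY i;

-- The node column can be used in a filter like any other column.
SELECT i, t FROM t1_emulated WHERE node = 'node2' ORDER BY i;
```

With PGSpider, the per-node queries would be sent to the remote data sources through their FDWs rather than run against local tables as in this sketch.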
<br>
pgspider_ext is developed by Toshiba Software Engineering & Technology Center.<br>
Source repository: https://github.com/pgspider/pgspider_ext<br>
<br>
Best Regards,<br>
Mototaka Kanematsu