We have just released the PGSpider extension (pgspider_ext).<br>
<br>
It is an extension for building a high-performance SQL cluster engine for 
distributed big data.<br>
PGSpider enables PostgreSQL to access a number of data sources through Foreign 
Data Wrappers (FDW) and to retrieve the distributed data vertically.

The main features are:<br>
* Node partitioned table<br>
Users can retrieve records from multiple tables on several data sources with a single SQL query.
<br>
Suppose there are two data sources that hold the following records:<br>

        SELECT * FROM t1_node1; -- @node1
          i | t
        ----+---
         10 | a
         11 | b
        (2 rows)
 
        SELECT * FROM t1_node2; -- @node2
          i | t
        ----+---
         20 | c
         21 | d
        (2 rows)

PGSpider can collect these records together with a node identifier column, like this:

        SELECT * FROM t1;
          i | t | node
        ----+---+-------
         10 | a | node1
         11 | b | node1
         20 | c | node2
         21 | d | node2
        (4 rows)
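
To make the shape of that result concrete, here is a minimal sketch in plain PostgreSQL that produces the same rows by hand. It is not how pgspider_ext is configured or implemented: the two temporary tables are local stand-ins for the remote tables, and the view name `t1_all` is a hypothetical name chosen for this illustration.

```sql
-- Local stand-ins for the tables that live on the two data sources.
CREATE TEMP TABLE t1_node1 (i int, t text);
CREATE TEMP TABLE t1_node2 (i int, t text);
INSERT INTO t1_node1 VALUES (10, 'a'), (11, 'b');
INSERT INTO t1_node2 VALUES (20, 'c'), (21, 'd');

-- Hypothetical view: each branch tags its rows with the node they came from.
CREATE TEMP VIEW t1_all AS
    SELECT i, t, 'node1'::text AS node FROM t1_node1
    UNION ALL
    SELECT i, t, 'node2'::text AS node FROM t1_node2;

-- Returns the four rows shown above, each with its node identifier.
SELECT * FROM t1_all ORDER BY i;
```

With a node partitioned table the user does not have to maintain such a view by hand, and the node column can be used in a WHERE clause like any other column.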

* Parallel processing<br>
PGSpider can fetch results from the data sources in parallel.

* Pushdown<br>
PGSpider can push down WHERE clauses and aggregate functions to the data 
sources.<br>
Whether an expression can be shipped depends on the FDW of each data source.<br>
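
As a conceptual illustration of what pushdown saves, the following plain PostgreSQL query compares filtering and counting centrally with filtering and counting per node. This is only a sketch under assumptions: the CTEs are local stand-ins for the two data sources, the column aliases are hypothetical, and the actual plans PGSpider generates may look different.

```sql
WITH t1_node1(i, t) AS (VALUES (10, 'a'), (11, 'b')),
     t1_node2(i, t) AS (VALUES (20, 'c'), (21, 'd'))
SELECT
    -- Without pushdown: all rows are gathered first, then filtered and counted.
    (SELECT count(*)
       FROM (SELECT i FROM t1_node1
             UNION ALL
             SELECT i FROM t1_node2) AS all_rows
      WHERE i > 10) AS counted_centrally,
    -- With pushdown: each source filters and counts its own rows,
    -- and only one partial count per source has to be combined.
    (SELECT sum(c)
       FROM (SELECT count(*) AS c FROM t1_node1 WHERE i > 10
             UNION ALL
             SELECT count(*) AS c FROM t1_node2 WHERE i > 10) AS partials
    ) AS counted_per_node;
```

Both columns return 3, but in the second form only one number per data source needs to cross the network instead of every row.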

<br>
PGSpider is developed by Toshiba Software Engineering & Technology Center.<br>
Source repository: https://github.com/pgspider/pgspider_ext

Best Regards,<br>
Mototaka Kanematsu
