Hello devs -
I'm doing some reading about HAWQ tablespaces here:
http://hdb.docs.pivotal.io/212/hawq/ddl/ddl-tablespace.html
I want to understand the flow of things, please correct me on the following
assumptions:
1) Create a filesystem (not *really* supported after HAWQ init) - the
default is obviously HDFS, via libhdfs3:
SELECT * FROM pg_filesystem;
2) Create a filespace, referencing the above file system:
CREATE FILESPACE testfs ON hdfs
('localhost:8020/fs/testfs') WITH (NUMREPLICA = 1);
3) Create a tablespace, referencing the above filespace:
CREATE TABLESPACE fastspace FILESPACE testfs;
4) Create objects referencing the above tablespace, or set it as the
database's default:
CREATE DATABASE testdb WITH TABLESPACE=fastspace;
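Putting the steps above together, the end-to-end flow would be something
like the sketch below (the names testfs/fastspace and the HDFS URL are
from the steps above; the final CREATE TABLE is just my own illustration
of per-object placement):

```sql
-- 1) Inspect the registered filesystems (HDFS via libhdfs3 by default):
SELECT * FROM pg_filesystem;

-- 2) Create a filespace on that filesystem:
CREATE FILESPACE testfs ON hdfs
  ('localhost:8020/fs/testfs') WITH (NUMREPLICA = 1);

-- 3) Create a tablespace backed by the filespace:
CREATE TABLESPACE fastspace FILESPACE testfs;

-- 4) Use the tablespace, either as a database default or per object:
CREATE DATABASE testdb WITH TABLESPACE = fastspace;
CREATE TABLE foo (id int) TABLESPACE fastspace;  -- illustrative only
```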
Given this set of steps, is it true (*in theory*) an arbitrary filesystem
(i.e. storage backend) could be added to HAWQ using *existing* APIs?
I realize the nuances of this are significant, but conceptually I'd like to
gather some details, mainly in support of this
<https://issues.apache.org/jira/browse/HAWQ-1270> ongoing JIRA discussion.
I'm daydreaming about whether this neat tool:
https://github.com/s3fs-fuse/s3fs-fuse could be useful for an S3 spike
(which also seems to kind of work on Google Cloud, when interoperability
<https://github.com/s3fs-fuse/s3fs-fuse/issues/109#issuecomment-286222694>
mode is enabled). By its Linux FUSE nature, it implements the lion's share
of the required pg_filesystem functions; in fact, maybe we could even use
glibc system calls directly in this situation (roughly as described here:
<http://www.linux-mag.com/id/7814/>).
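For a concrete picture of what an S3 spike might look like, mounting a
bucket with s3fs-fuse is roughly the following (bucket name, mount point,
and credentials file are placeholders; the GCS invocation is based on the
interoperability-mode issue comment linked above, so treat the exact
options as an assumption):

```shell
# Credentials for s3fs (placeholder values):
echo "ACCESS_KEY_ID:SECRET_ACCESS_KEY" > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs

# Mount an S3 bucket at a hypothetical mount point:
s3fs mybucket /mnt/hawq-s3 -o passwd_file=~/.passwd-s3fs

# Google Cloud Storage in S3 interoperability mode (per the linked issue):
s3fs mybucket /mnt/hawq-s3 -o passwd_file=~/.passwd-s3fs \
  -o url=https://storage.googleapis.com -o sigv2
```

Once mounted, ordinary POSIX calls (open/read/write/close) work against
/mnt/hawq-s3 like any local path, which is why delegating the
pg_filesystem hooks to glibc seems at least plausible.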
Curious to get some feedback.
Thanks,
Kyle
--
*Kyle Dunn | Data Engineering | Pivotal*
Direct: 303.905.3171 | Email: [email protected]