[
https://issues.apache.org/jira/browse/HAWQ-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901627#comment-15901627
]
Kyle R Dunn edited comment on HAWQ-1270 at 3/14/17 12:59 AM:
-------------------------------------------------------------
From what I can tell,
[this|https://github.com/apache/incubator-hawq/blob/master/src/bin/gpfilesystem/hdfs/gpfshdfs.c]
IS the interface.
When you look at the {{pg_filesystem}} table, it lists the exact functions
required for a new backend:
{code}
SELECT * FROM pg_filesystem ;
-[ RECORD 1 ]------+--------------------------
fsysname | hdfs
fsysconnfn | gpfs_hdfs_connect
fsysdisconnfn | gpfs_hdfs_disconnect
fsysopenfn | gpfs_hdfs_openfile
fsysclosefn | gpfs_hdfs_closefile
fsysseekfn | gpfs_hdfs_seek
fsystellfn | gpfs_hdfs_tell
fsysreadfn | gpfs_hdfs_read
fsyswritefn | gpfs_hdfs_write
fsysflushfn | gpfs_hdfs_sync
fsysdeletefn | gpfs_hdfs_delete
fsyschmodfn | gpfs_hdfs_chmod
fsysmkdirfn | gpfs_hdfs_createdirectory
fsystruncatefn | gpfs_hdfs_truncate
fsysgetpathinfofn | gpfs_hdfs_getpathinfo
fsysfreefileinfofn | gpfs_hdfs_freefileinfo
fsyslibfile | $libdir/gpfshdfs.so
fsysowner | 10
fsystrusted | f
fsysacl |
{code}
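To turn that record into a checklist for a new backend, here is a minimal sketch. It assumes only what the query output above shows: every column whose name ends in "fn" names one callback. The {{gpfs_ceph_}} prefix and all names derived from it are hypothetical placeholders, not existing HAWQ symbols.

```python
# Sketch only: column names and hdfs function names are copied from the
# pg_filesystem record above; the "ceph" names produced below are hypothetical.

HDFS_RECORD = """\
-[ RECORD 1 ]------+--------------------------
fsysname | hdfs
fsysconnfn | gpfs_hdfs_connect
fsysdisconnfn | gpfs_hdfs_disconnect
fsysopenfn | gpfs_hdfs_openfile
fsysclosefn | gpfs_hdfs_closefile
fsysseekfn | gpfs_hdfs_seek
fsystellfn | gpfs_hdfs_tell
fsysreadfn | gpfs_hdfs_read
fsyswritefn | gpfs_hdfs_write
fsysflushfn | gpfs_hdfs_sync
fsysdeletefn | gpfs_hdfs_delete
fsyschmodfn | gpfs_hdfs_chmod
fsysmkdirfn | gpfs_hdfs_createdirectory
fsystruncatefn | gpfs_hdfs_truncate
fsysgetpathinfofn | gpfs_hdfs_getpathinfo
fsysfreefileinfofn | gpfs_hdfs_freefileinfo
fsyslibfile | $libdir/gpfshdfs.so
fsysowner | 10
fsystrusted | f
fsysacl |
"""


def parse_record(text):
    """Parse psql expanded-output lines of the form 'column | value'."""
    record = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skips the '-[ RECORD 1 ]' separator line
        column, _, value = line.partition("|")
        record[column.strip()] = value.strip()
    return record


def callback_slots(record):
    """Return the columns that name a callback (those ending in 'fn')."""
    return [col for col in record if col.endswith("fn")]


def rename_for_backend(record, old="hdfs", new="ceph"):
    """Map each callback column to a placeholder name for a new backend."""
    return {
        col: record[col].replace(f"gpfs_{old}_", f"gpfs_{new}_", 1)
        for col in callback_slots(record)
    }


if __name__ == "__main__":
    record = parse_record(HDFS_RECORD)
    for column, name in rename_for_backend(record).items():
        print(f"{column:20} {name}")
```

The printed list is only a naming checklist; what each callback must actually do is defined by the corresponding function in gpfshdfs.c.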
> Plugged storage back-ends for HAWQ
> ----------------------------------
>
> Key: HAWQ-1270
> URL: https://issues.apache.org/jira/browse/HAWQ-1270
> Project: Apache HAWQ
> Issue Type: Improvement
> Reporter: Dmitry Buzolin
> Assignee: Ed Espino
>
> Since HAWQ only depends on Hadoop and Parquet for columnar format support, I
> would like to propose a pluggable storage backend design for HAWQ. Hadoop is
> already supported, but there is also Ceph, a distributed storage system that
> offers a standard POSIX-compliant file system as well as object and block
> storage. Ceph is also data-location aware, is written in C++, and is a more
> sophisticated storage backend than Hadoop at this time. It provides replicated
> and erasure-coded storage pools. Other great features of Ceph are snapshots
> and an algorithmic approach to mapping data to nodes rather than relying on
> centrally managed namenodes. I don't think HDFS offers any of these features.
> In terms of performance, Ceph should be faster than HDFS since it is written
> in C++ and because it doesn't have scalability limitations when mapping data
> to storage pools, compared to Hadoop, where the namenode is a point of
> contention.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)