RE: Different storage backends in HAWQ?

2017-01-13 Thread Dmitry Buzolin
Created HAWQ-1270 for this. I got some feedback on this from Greenplum folks - 
they also find this idea very interesting.

-----Original Message-----
From: Yandong Yao [mailto:y...@pivotal.io]
Sent: Thursday, January 12, 2017 8:54 PM
To: dev@hawq.incubator.apache.org
Subject: Re: Different storage backends in HAWQ?

WARNING - External email; exercise caution


Supporting Ceph will be very interesting. Could you please create a JIRA?
Would be great if there is any patch.

On Thu, Jan 12, 2017 at 10:19 PM, Dmitry Buzolin <dmitry.buzo...@theice.com>
wrote:

> Also, besides supporting three different storage interfaces, Ceph is a more
> sophisticated storage backend than Hadoop at this time.
> For example, in addition to replicated pools, Ceph supports erasure-coded
> pools (a kind of host-based RAID), which require a lot less storage than
> the former.
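
To put rough numbers on the storage comparison, here is a minimal sketch of the raw capacity needed by a replicated pool versus an erasure-coded pool. The 3x replication factor and the k=4, m=2 erasure-code profile are illustrative assumptions, not values taken from this thread.

```python
def replicated_raw_bytes(logical_bytes, replicas=3):
    """Raw capacity used when every object is stored `replicas` times."""
    return logical_bytes * replicas


def erasure_coded_raw_bytes(logical_bytes, k=4, m=2):
    """Raw capacity used when data is split into k data chunks plus m coding chunks."""
    return logical_bytes * (k + m) / k


TB = 10**12
logical = 100 * TB  # assumed data set size, for illustration only

print(replicated_raw_bytes(logical) / TB)     # 300.0
print(erasure_coded_raw_bytes(logical) / TB)  # 150.0
```

Under these assumed settings the erasure-coded pool needs half the raw capacity of the replicated one, while still tolerating the loss of any two chunks.
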
> Other great features of Ceph are snapshots and an algorithmic approach to
> mapping data to nodes, rather than relying on centrally managed namenodes.
> I don't think HDFS offers any of these features. In terms of performance,
> Ceph should be faster than HDFS since it is written in C++ and because it
> doesn't have scalability limitations when mapping data to storage pools,
> compared to Hadoop, where the name node is a point of contention.
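
The idea of computing data placement instead of looking it up in a central service can be sketched as follows. This uses rendezvous (highest-random-weight) hashing as a simple stand-in; it is not Ceph's actual CRUSH algorithm, and the node and object names are hypothetical.

```python
import hashlib


def place(object_name, nodes, replicas=2):
    """Pick `replicas` nodes for an object using only a hash function.

    Rendezvous hashing: every node gets a score derived from the
    (node, object) pair, and the highest-scoring nodes hold the object.
    """
    def score(node):
        digest = hashlib.sha256(f"{node}:{object_name}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=score, reverse=True)[:replicas]


nodes = ["osd.0", "osd.1", "osd.2", "osd.3"]  # hypothetical node names

# Any client with the same node list computes the same placement,
# so no central metadata server has to be consulted on the data path.
print(place("table1/segment0", nodes))
```

A useful property of this scheme is that removing a node which does not hold a given object leaves that object's placement unchanged.
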
>
>
> -----Original Message-----
> From: Paul Guo [mailto:paul...@gmail.com]
> Sent: Wednesday, January 11, 2017 9:15 PM
> To: dev@hawq.incubator.apache.org
> Subject: Re: Different storage backends in HAWQ?
>
>
> HAWQ also supports a row-oriented format (AO tables). For block storage and
> POSIX file systems, I suspect you could use gpdb, since gpdb uses local
> storage, though a bit of additional work is probably needed. For object
> storage (e.g. an S3 interface), this is an interesting topic. There is
> already a JIRA for this, though there is some debate about better solutions.
>
> HAWQ-823 Amazon S3 External Table Support
> https://issues.apache.org/jira/browse/HAWQ-823
>
>
> 2017-01-11 23:14 GMT+08:00 Dmitry Buzolin <dmitry.buzo...@theice.com>:
>
> > Since HAWQ only depends on Hadoop, and on Parquet as a columnar format, I
> > would like to propose that HAWQ have pluggable storage backends. Hadoop
> > is already supported, but there is also the Ceph storage backend, which
> > offers a standard POSIX-compliant file system, object storage, and block
> > storage. Ceph is also location aware and written in C++. I believe this
> > is more important than porting HAWQ to Windows. Your thoughts?
> >
> > Thanks,
> > Dmitry.
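
As a rough illustration of what "pluggable storage backends" could look like, here is a minimal sketch in which a backend is chosen by URI scheme. The `StorageBackend` interface, the `backend_for` helper, and the scheme names are all hypothetical; none of them is an existing HAWQ API, and an in-memory class stands in for real HDFS or Ceph clients so the example runs on its own.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class StorageBackend(ABC):
    """Hypothetical backend interface (an assumption, not HAWQ code)."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class InMemoryBackend(StorageBackend):
    """Stand-in for a real backend such as HDFS or Ceph."""

    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = data


# Registry keyed by URI scheme; the scheme names are illustrative only.
BACKENDS = {"hdfs": InMemoryBackend(), "ceph": InMemoryBackend()}


def backend_for(uri: str) -> StorageBackend:
    """Return the backend registered for the URI's scheme."""
    scheme = urlparse(uri).scheme
    if scheme not in BACKENDS:
        raise ValueError(f"no storage backend registered for {scheme!r}")
    return BACKENDS[scheme]


uri = "ceph://pool/table/segment0"
backend_for(uri).write(urlparse(uri).path, b"rows")
print(backend_for(uri).read(urlparse(uri).path))
```

The point of the design is that the rest of the engine would only talk to the interface, so adding a new store would mean registering one more implementation.
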
> >
> > 
> >
> > This message may contain confidential information and is intended for
> > specific recipients unless explicitly noted otherwise. If you have reason
> > to believe you are not an intended recipient of this message, please
> > delete it and notify the sender. This message may not represent the
> > opinion of Intercontinental Exchange, Inc. (ICE), its subsidiaries or
> > affiliates, and does not constitute a contract or guarantee. Unencrypted
> > electronic mail is not secure and the recipient of this message is
> > expected to provide safeguards from viruses and pursue alternate means of
> > communication where privacy or a binding message is desired.
> >
>
> 
>
>



--
Best Regards,
Yandong





Re: Different storage backends in HAWQ?

2017-01-12 Thread Yandong Yao
Supporting Ceph will be very interesting. Could you please create a JIRA?
Would be great if there is any patch.




-- 
Best Regards,
Yandong