Thank you for your response. My question was about the kernel-level file system underneath HDFS: whether only local file systems (ext4, xfs) can be used.
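To make sure I understand the abstraction involved: as far as I can tell, Hadoop clients only ever go through the org.apache.hadoop.fs.FileSystem API, so a backend that emulates the NameNode RPC (like the GPFS connector you linked) should look identical to application code. A minimal sketch of that client-side view, where namenode.example.com:8020 is just a placeholder endpoint, not a real cluster:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FsAbstractionDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder endpoint; with an RPC-compatible connector this
            // would point at the connector's NameNode-like service instead.
            conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");

            FileSystem fs = FileSystem.get(conf);

            // Write and read back a file; this code is the same whether the
            // blocks land on ext4/xfs DataNode disks or another backend.
            Path p = new Path("/tmp/fs-demo.txt");
            try (FSDataOutputStream out = fs.create(p, true)) {
                out.write("hello from the FileSystem API\n"
                        .getBytes(StandardCharsets.UTF_8));
            }
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(fs.open(p), StandardCharsets.UTF_8))) {
                System.out.println(r.readLine());
            }
            fs.delete(p, false);
        }
    }

If I understand correctly, nothing in this client code changes whether the DataNodes sit on ext4/xfs, a SAN LUN, or a GPFS-backed layer; my question is specifically about that lower, kernel-level layer.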
Thank you,
Daegyu

On Sat, Aug 17, 2019 at 7:28 PM, Wei-Chiu Chuang <weic...@apache.org> wrote:

> Not familiar with GPFS, but looking at IBM's website, GPFS has a client
> that emulates Hadoop RPC:
>
> https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adv_Overview.htm
>
> So you can just use GPFS like HDFS. It may be the quickest way to approach
> this use case and is supported. Not sure about the performance, though.
>
> Looking at Cloudera's user doc:
> https://www.cloudera.com/documentation/other/reference-architecture/PDF/cloudera_ref_arch_stg_dev_accept_criteria.pdf
>
> *High-throughput Storage Area Network (SAN) and other shared storage
> solutions can present remote block devices to virtual machines in a
> flexible and performant manner that is often indistinguishable from a
> local disk. An Apache Hadoop workload provides a uniquely challenging IO
> profile to these storage solutions, and this can have a negative impact on
> the utility and stability of the Cloudera Enterprise cluster, and to other
> work that is utilizing the same storage backend.*
>
> *Warning: Running CDH on storage platforms other than direct-attached
> physical disks can provide suboptimal performance. Cloudera Enterprise and
> the majority of the Hadoop platform are optimized to provide high
> performance by distributing work across a cluster that can utilize data
> locality and fast local I/O.*
>
> On Sat, Aug 17, 2019 at 2:12 AM Daegyu Han <hdg9...@gmail.com> wrote:
>
>> Hi all,
>>
>> As far as I know, HDFS is designed to target local file systems like
>> ext4 or xfs.
>>
>> Is it a bad approach to use SAN technology as storage for HDFS?
>>
>> Thank you,
>> Daegyu