Thank you for your response. My question was about the kernel-level file system underneath HDFS: whether only local file systems (ext4, xfs) can be used.
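To make sure I understand the abstraction involved: as far as I can tell, Hadoop clients only ever go through the org.apache.hadoop.fs.FileSystem API, so a backend that emulates the NameNode RPC (like the GPFS connector you linked) should look identical to application code. A minimal sketch of that client-side view, where namenode.example.com:8020 is just a placeholder endpoint, not a real cluster:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FsAbstractionDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder endpoint; with an RPC-compatible connector this
            // would point at the connector's NameNode-like service instead.
            conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");

            FileSystem fs = FileSystem.get(conf);

            // Write and read back a file; this code is the same whether the
            // blocks land on ext4/xfs DataNode disks or another backend.
            Path p = new Path("/tmp/fs-demo.txt");
            try (FSDataOutputStream out = fs.create(p, true)) {
                out.write("hello from the FileSystem API\n"
                        .getBytes(StandardCharsets.UTF_8));
            }
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(fs.open(p), StandardCharsets.UTF_8))) {
                System.out.println(r.readLine());
            }
            fs.delete(p, false);
        }
    }

If I understand correctly, nothing in this client code changes whether the DataNodes sit on ext4/xfs, a SAN LUN, or a GPFS-backed layer; my question is specifically about that lower, kernel-level layer.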
Thank you,
Daegyu

On Sat, Aug 17, 2019 at 7:28 PM, Wei-Chiu Chuang <weic...@apache.org> wrote:

> Not familiar with GPFS, but looking at IBM's website, GPFS has a client
> that emulates Hadoop RPC:
>
> https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adv_Overview.htm
>
> So you can just use GPFS like HDFS. It may be the quickest way to approach
> this use case and is supported. Not sure about the performance, though.
>
> Looking at Cloudera's user doc:
> https://www.cloudera.com/documentation/other/reference-architecture/PDF/cloudera_ref_arch_stg_dev_accept_criteria.pdf
>
> *High-throughput Storage Area Network (SAN) and other shared storage
> solutions can present remote block devices to virtual machines in a
> flexible and performant manner that is often indistinguishable from a
> local disk. An Apache Hadoop workload provides a uniquely challenging IO
> profile to these storage solutions, and this can have a negative impact on
> the utility and stability of the Cloudera Enterprise cluster, and to other
> work that is utilizing the same storage backend.*
>
> *Warning: Running CDH on storage platforms other than direct-attached
> physical disks can provide suboptimal performance. Cloudera Enterprise and
> the majority of the Hadoop platform are optimized to provide high
> performance by distributing work across a cluster that can utilize data
> locality and fast local I/O.*
>
> On Sat, Aug 17, 2019 at 2:12 AM Daegyu Han <hdg9...@gmail.com> wrote:
>
>> Hi all,
>>
>> As far as I know, HDFS is designed to target local file systems like
>> ext4 or xfs.
>>
>> Is it a bad approach to use SAN technology as storage for HDFS?
>>
>> Thank you,
>> Daegyu