This is an automated email from the ASF dual-hosted git repository.

lamberken pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 183aac0  [MINOR] fix typo in comparison document (#1588)
183aac0 is described below

commit 183aac0cbc186b14b557ebe3b678c320bd6fef91
Author: wanglisheng81 <37138788+wanglishen...@users.noreply.github.com>
AuthorDate: Wed May 6 16:08:03 2020 +0800

    [MINOR] fix typo in comparison document (#1588)
---
 docs/_docs/1_5_comparison.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/_docs/1_5_comparison.md b/docs/_docs/1_5_comparison.md
index 78f2be2..32b73c6 100644
--- a/docs/_docs/1_5_comparison.md
+++ b/docs/_docs/1_5_comparison.md
@@ -18,7 +18,7 @@ Consequently, Kudu does not support incremental pulling (as of early 2017), some
 
 Kudu diverges from a distributed file system abstraction and HDFS altogether, with its own set of storage servers talking to each  other via RAFT.
 Hudi, on the other hand, is designed to work with an underlying Hadoop compatible filesystem (HDFS,S3 or Ceph) and does not have its own fleet of storage servers,
-instead relying on Apache Spark to do the heavy-lifting. Thu, Hudi can be scaled easily, just like other Spark jobs, while Kudu would require hardware
+instead relying on Apache Spark to do the heavy-lifting. Thus, Hudi can be scaled easily, just like other Spark jobs, while Kudu would require hardware
 & operational support, typical to datastores like HBase or Vertica. We have not at this point, done any head to head benchmarks against Kudu (given RTTable is WIP).
 But, if we were to go with results shared by [CERN](https://db-blog.web.cern.ch/blog/zbigniew-baranowski/2017-01-performance-comparison-different-file-formats-and-storage-engines) ,
 we expect Hudi to positioned at something that ingests parquet with superior performance.
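
For context on the patched passage: it states that Hudi leans on Apache Spark as its processing engine over a Hadoop-compatible filesystem rather than running its own storage servers. A minimal sketch of what that looks like in practice, assuming the Hudi Spark bundle is on the classpath; the table name, paths and column names (trips, s3a://my-bucket/..., trip_id, ts, date) are illustrative only and not from the document:

    // Sketch only: write any Spark DataFrame as a Hudi table on a
    // Hadoop-compatible filesystem (HDFS, S3, ...). Names and paths below
    // are made up for illustration.
    import org.apache.spark.sql.{SaveMode, SparkSession}

    val spark = SparkSession.builder().appName("hudi-write-sketch").getOrCreate()

    // Input can be any DataFrame; plain parquet is read here as an example.
    val df = spark.read.parquet("s3a://my-bucket/raw/trips")

    df.write.format("hudi")                                            // Hudi Spark datasource
      .option("hoodie.table.name", "trips")                            // target table (example)
      .option("hoodie.datasource.write.recordkey.field", "trip_id")    // record key column (example)
      .option("hoodie.datasource.write.precombine.field", "ts")        // pick latest record on upsert (example)
      .option("hoodie.datasource.write.partitionpath.field", "date")   // partition column (example)
      .mode(SaveMode.Append)
      .save("s3a://my-bucket/hudi/trips")                              // storage is just a filesystem path

Because the write runs as an ordinary Spark job against a filesystem path, scaling it is a matter of giving the job more resources, which is the point the comparison document is making against a dedicated storage-server fleet.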
