This is an automated email from the ASF dual-hosted git repository.
vinoth pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 206e549 Travis CI build asf-site
206e549 is described below
commit 206e549637101c6fe49aaed915c41f21ad884e81
Author: CI <[email protected]>
AuthorDate: Wed May 6 08:42:27 2020 +0000
Travis CI build asf-site
---
content/docs/comparison.html | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/content/docs/comparison.html b/content/docs/comparison.html
index 3919cbd..223e9f7 100644
--- a/content/docs/comparison.html
+++ b/content/docs/comparison.html
@@ -349,7 +349,7 @@ Consequently, Kudu does not support incremental pulling (as of early 2017), some
 <p>Kudu diverges from a distributed file system abstraction and HDFS altogether, with its own set of storage servers talking to each other via RAFT.
 Hudi, on the other hand, is designed to work with an underlying Hadoop compatible filesystem (HDFS,S3 or Ceph) and does not have its own fleet of storage servers,
-instead relying on Apache Spark to do the heavy-lifting. Thu, Hudi can be scaled easily, just like other Spark jobs, while Kudu would require hardware
+instead relying on Apache Spark to do the heavy-lifting. Thus, Hudi can be scaled easily, just like other Spark jobs, while Kudu would require hardware
 & operational support, typical to datastores like HBase or Vertica. We have not at this point, done any head to head benchmarks against Kudu (given RTTable is WIP).
 But, if we were to go with results shared by <a href="https://db-blog.web.cern.ch/blog/zbigniew-baranowski/2017-01-performance-comparison-different-file-formats-and-storage-engines">CERN</a> ,
 we expect Hudi to positioned at something that ingests parquet with superior performance.</p>