This is an automated email from the ASF dual-hosted git repository.

xushiyan pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 84bc8cd60f [HUDI-1570] Add "average record size in a commit" to FAQ (#7072)
84bc8cd60f is described below

commit 84bc8cd60f1540eaf77262df0f6065c73d96fc68
Author: Jon Vexler <[email protected]>
AuthorDate: Thu Oct 27 11:59:07 2022 -0700

    [HUDI-1570] Add "average record size in a commit" to FAQ (#7072)
---
 website/docs/faq.md                          | 5 +++++
 website/versioned_docs/version-0.12.1/faq.md | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/website/docs/faq.md b/website/docs/faq.md
index 793b459c8d..ae24d86ecc 100644
--- a/website/docs/faq.md
+++ b/website/docs/faq.md
@@ -632,6 +632,11 @@ Cloudera CDP stack, causing the conflict.  To get around the RuntimeException, y
 `hbase.defaults.for.version.skip` to `true` in the `hbase-site.xml` configuration file, e.g., overwriting the config
 within the Cloudera manager.
 
+### How can I find the average record size in a commit?
+The `commit showpartitions` command in [Hudi CLI](https://hudi.apache.org/docs/cli) will show both "bytes written" and
+"records inserted". Divide the bytes written by the records inserted to find the average record size. This estimate assumes that
+metadata overhead is negligible, which does not hold for a small dataset (such as 5 columns, 100 records).
+
 ## Contributing to FAQ
 
 A good and usable FAQ should be community-driven and crowd source questions/thoughts across everyone.
diff --git a/website/versioned_docs/version-0.12.1/faq.md b/website/versioned_docs/version-0.12.1/faq.md
index ac8c2aec6e..65dee20ce1 100644
--- a/website/versioned_docs/version-0.12.1/faq.md
+++ b/website/versioned_docs/version-0.12.1/faq.md
@@ -627,6 +627,11 @@ Cloudera CDP stack, causing the conflict.  To get around the RuntimeException, y
 `hbase.defaults.for.version.skip` to `true` in the `hbase-site.xml` configuration file, e.g., overwriting the config
 within the Cloudera manager.
 
+### How can I find the average record size in a commit?
+The `commit showpartitions` command in [Hudi CLI](https://hudi.apache.org/docs/cli) will show both "bytes written" and
+"records inserted". Divide the bytes written by the records inserted to find the average record size. This estimate assumes that
+metadata overhead is negligible, which does not hold for a small dataset (such as 5 columns, 100 records).
+
 ## Contributing to FAQ
 
 A good and usable FAQ should be community-driven and crowd source questions/thoughts across everyone.
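
The division described in the new FAQ entry can be sketched in a few lines of Python. This is only an illustration: the `PartitionStats` class, its field names, and the numbers below are hypothetical, and the values are assumed to have been copied by hand from the per-partition "bytes written" and "records inserted" figures that the CLI reports. The script does not call Hudi itself.

```python
from dataclasses import dataclass


@dataclass
class PartitionStats:
    """One per-partition row of commit stats (hypothetical field names)."""
    partition_path: str
    bytes_written: int
    records_inserted: int


def average_record_size(rows):
    """Return total bytes written divided by total records inserted."""
    total_bytes = sum(r.bytes_written for r in rows)
    total_records = sum(r.records_inserted for r in rows)
    if total_records == 0:
        raise ValueError("no records inserted; average size is undefined")
    return total_bytes / total_records


# Made-up example values, not output from a real commit.
rows = [
    PartitionStats("2022/10/26", 4_000_000, 20_000),
    PartitionStats("2022/10/27", 6_000_000, 30_000),
]
print(f"{average_record_size(rows):.1f} bytes per record")  # 200.0 bytes per record
```

Summing across partitions before dividing gives the commit-wide average; averaging the per-partition averages would weight small partitions too heavily. As the FAQ entry notes, the result includes metadata overhead, so it overstates the record size for very small datasets.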
