This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new c986463  Travis CI build asf-site
c986463 is described below

commit c986463cc160757db53ab3f2e04404aff382cb96
Author: CI <[email protected]>
AuthorDate: Thu May 13 18:48:34 2021 +0000

    Travis CI build asf-site
---
 content/docs/0.6.0-configurations.html |  5 +++++
 content/docs/0.7.0-configurations.html | 13 +++++++++----
 content/docs/0.8.0-configurations.html | 10 ++++++++++
 content/docs/configurations.html       | 18 ++++++++++++++----
 4 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/content/docs/0.6.0-configurations.html 
b/content/docs/0.6.0-configurations.html
index 45533f8..ee34c9f 100644
--- a/content/docs/0.6.0-configurations.html
+++ b/content/docs/0.6.0-configurations.html
@@ -464,6 +464,11 @@ This is useful to store checkpointing information, in a 
consistent way with the
 <p>Property: <code 
class="highlighter-rouge">hoodie.datasource.write.insert.drop.duplicates</code>,
 Default: <code class="highlighter-rouge">false</code> <br />
  <span style="color:grey">If set to true, filters out all duplicate records 
from the incoming DataFrame during insert operations.</span></p>
 
+<h4 id="ENABLE_ROW_WRITER_OPT_KEY">ENABLE_ROW_WRITER_OPT_KEY</h4>
+<p>Property: <code 
class="highlighter-rouge">hoodie.datasource.write.row.writer.enable</code>, 
Default: <code class="highlighter-rouge">false</code> <br />
+<span style="color:grey">When set to true, write operations are performed 
directly using the Spark-native <code class="highlighter-rouge">Row</code>
+representation, which is expected to be 20 to 30% faster than regular 
bulk_insert.</span></p>
+
 <h4 id="HIVE_SYNC_ENABLED_OPT_KEY">HIVE_SYNC_ENABLED_OPT_KEY</h4>
 <p>Property: <code 
class="highlighter-rouge">hoodie.datasource.hive_sync.enable</code>, Default: 
<code class="highlighter-rouge">false</code> <br />
  <span style="color:grey">When set to true, registers/syncs the table to the 
Apache Hive metastore.</span></p>
diff --git a/content/docs/0.7.0-configurations.html 
b/content/docs/0.7.0-configurations.html
index 0cac848..1881d86 100644
--- a/content/docs/0.7.0-configurations.html
+++ b/content/docs/0.7.0-configurations.html
@@ -445,6 +445,11 @@ This is useful to store checkpointing information, in a 
consistent way with the
 <p>Property: <code 
class="highlighter-rouge">hoodie.datasource.write.insert.drop.duplicates</code>,
 Default: <code class="highlighter-rouge">false</code> <br />
  <span style="color:grey">If set to true, filters out all duplicate records 
from the incoming DataFrame during insert operations.</span></p>
 
+<h4 id="ENABLE_ROW_WRITER_OPT_KEY">ENABLE_ROW_WRITER_OPT_KEY</h4>
+<p>Property: <code 
class="highlighter-rouge">hoodie.datasource.write.row.writer.enable</code>, 
Default: <code class="highlighter-rouge">false</code> <br />
+<span style="color:grey">When set to true, write operations are performed 
directly using the Spark-native <code class="highlighter-rouge">Row</code>
+representation, which is expected to be 20 to 30% faster than regular 
bulk_insert.</span></p>
+
 <h4 id="HIVE_SYNC_ENABLED_OPT_KEY">HIVE_SYNC_ENABLED_OPT_KEY</h4>
 <p>Property: <code 
class="highlighter-rouge">hoodie.datasource.hive_sync.enable</code>, Default: 
<code class="highlighter-rouge">false</code> <br />
  <span style="color:grey">When set to true, registers/syncs the table to the 
Apache Hive metastore.</span></p>
@@ -593,10 +598,6 @@ HoodieWriteConfig can be built using a builder pattern as 
below.</p>
 <p>Property: <code class="highlighter-rouge">hoodie.auto.commit</code><br />
 <span style="color:grey">Should HoodieWriteClient autoCommit after insert and 
upsert. The client can choose to turn off auto-commit and commit on a “defined 
success condition”</span></p>
 
-<h4 
id="withAssumeDatePartitioning">withAssumeDatePartitioning(assumeDatePartitioning
 = false)</h4>
-<p>Property: <code 
class="highlighter-rouge">hoodie.assume.date.partitioning</code><br />
-<span style="color:grey">Should HoodieWriteClient assume the data is 
partitioned by dates, i.e., three levels from the base path. This is a stop-gap 
to support tables created by versions &lt; 0.3.1. It will be removed 
eventually.</span></p>
-
 <h4 id="withConsistencyCheckEnabled">withConsistencyCheckEnabled(enabled = 
false)</h4>
 <p>Property: <code 
class="highlighter-rouge">hoodie.consistency.check.enabled</code><br />
 <span style="color:grey">Should HoodieWriteClient perform additional checks to 
ensure written files are listable on the underlying filesystem/storage. Set 
this to true to work around S3’s eventual consistency model and ensure all data 
written as part of a commit is faithfully available for queries.</span></p>
@@ -909,6 +910,10 @@ with keys/footers, avoiding full cost of rewriting the 
dataset. <code class="hig
 <p>Property: <code 
class="highlighter-rouge">hoodie.metadata.keep.min.commits</code>, <code 
class="highlighter-rouge">hoodie.metadata.keep.max.commits</code> <br />
 <span style="color:grey"> Controls the archival of the metadata table’s 
timeline </span></p>
 
+<h4 
id="withAssumeDatePartitioning">withAssumeDatePartitioning(assumeDatePartitioning
 = false)</h4>
+<p>Property: <code 
class="highlighter-rouge">hoodie.assume.date.partitioning</code><br />
+<span style="color:grey">Should HoodieWriteClient assume the data is 
partitioned by dates, i.e., three levels from the base path. This is a stop-gap 
to support tables created by versions &lt; 0.3.1. It will be removed 
eventually.</span></p>
+
 <h3 id="clustering-configs">Clustering Configs</h3>
 <p>Controls clustering operations in Hudi. Each clustering operation has to be 
configured with its strategy and config params, which these settings control.</p>
 
diff --git a/content/docs/0.8.0-configurations.html 
b/content/docs/0.8.0-configurations.html
index ed0305d..c763f18 100644
--- a/content/docs/0.8.0-configurations.html
+++ b/content/docs/0.8.0-configurations.html
@@ -453,6 +453,11 @@ This is useful to store checkpointing information, in a 
consistent way with the
 <p>Property: <code 
class="highlighter-rouge">hoodie.datasource.write.insert.drop.duplicates</code>,
 Default: <code class="highlighter-rouge">false</code> <br />
  <span style="color:grey">If set to true, filters out all duplicate records 
from the incoming DataFrame during insert operations.</span></p>
 
+<h4 id="ENABLE_ROW_WRITER_OPT_KEY">ENABLE_ROW_WRITER_OPT_KEY</h4>
+<p>Property: <code 
class="highlighter-rouge">hoodie.datasource.write.row.writer.enable</code>, 
Default: <code class="highlighter-rouge">false</code> <br />
+<span style="color:grey">When set to true, write operations are performed 
directly using the Spark-native <code class="highlighter-rouge">Row</code>
+representation, which is expected to be 20 to 30% faster than regular 
bulk_insert.</span></p>
+
 <h4 id="HIVE_SYNC_ENABLED_OPT_KEY">HIVE_SYNC_ENABLED_OPT_KEY</h4>
 <p>Property: <code 
class="highlighter-rouge">hoodie.datasource.hive_sync.enable</code>, Default: 
<code class="highlighter-rouge">false</code> <br />
  <span style="color:grey">When set to true, registers/syncs the table to the 
Apache Hive metastore.</span></p>
@@ -820,6 +825,11 @@ HoodieWriteConfig can be built using a builder pattern as 
below.</p>
 <p>Property: <code 
class="highlighter-rouge">hoodie.combine.before.delete</code><br />
 <span style="color:grey">Flag which first combines the input RDD and merges 
multiple partial records into a single record before deleting in DFS</span></p>
 
+<h4 
id="withMergeAllowDuplicateOnInserts">withMergeAllowDuplicateOnInserts(mergeAllowDuplicateOnInserts
 = false)</h4>
+<p>Property: <code 
class="highlighter-rouge">hoodie.merge.allow.duplicate.on.inserts</code> <br />
+<span style="color:grey"> When enabled, new records are routed as inserts and 
are not merged with existing records.
+The result may contain duplicate entries. </span></p>
+
 <h4 id="withWriteStatusStorageLevel">withWriteStatusStorageLevel(level = 
MEMORY_AND_DISK_SER)</h4>
 <p>Property: <code 
class="highlighter-rouge">hoodie.write.status.storage.level</code><br />
 <span style="color:grey">HoodieWriteClient.insert and HoodieWriteClient.upsert 
return a persisted RDD[WriteStatus], because the Client can choose to 
inspect the WriteStatus and decide whether to commit based on the failures. 
This configures the storage level for that RDD.</span></p>
diff --git a/content/docs/configurations.html b/content/docs/configurations.html
index 4bae5c5..494c857 100644
--- a/content/docs/configurations.html
+++ b/content/docs/configurations.html
@@ -477,6 +477,11 @@ This is useful to store checkpointing information, in a 
consistent way with the
 <p>Property: <code 
class="highlighter-rouge">hoodie.datasource.write.insert.drop.duplicates</code>,
 Default: <code class="highlighter-rouge">false</code> <br />
  <span style="color:grey">If set to true, filters out all duplicate records 
from the incoming DataFrame during insert operations.</span></p>
 
+<h4 id="ENABLE_ROW_WRITER_OPT_KEY">ENABLE_ROW_WRITER_OPT_KEY</h4>
+<p>Property: <code 
class="highlighter-rouge">hoodie.datasource.write.row.writer.enable</code>, 
Default: <code class="highlighter-rouge">false</code> <br />
+  <span style="color:grey">When set to true, write operations are performed 
directly using the Spark-native <code class="highlighter-rouge">Row</code> 
+  representation, which is expected to be 20 to 30% faster than regular 
bulk_insert.</span></p>
+
 <h4 id="HIVE_SYNC_ENABLED_OPT_KEY">HIVE_SYNC_ENABLED_OPT_KEY</h4>
 <p>Property: <code 
class="highlighter-rouge">hoodie.datasource.hive_sync.enable</code>, Default: 
<code class="highlighter-rouge">false</code> <br />
  <span style="color:grey">When set to true, registers/syncs the table to the 
Apache Hive metastore.</span></p>
@@ -1039,6 +1044,11 @@ HoodieWriteConfig can be built using a builder pattern 
as below.</p>
 <p>Property: <code 
class="highlighter-rouge">hoodie.combine.before.delete</code><br />
 <span style="color:grey">Flag which first combines the input RDD and merges 
multiple partial records into a single record before deleting in DFS</span></p>
 
+<h4 
id="withMergeAllowDuplicateOnInserts">withMergeAllowDuplicateOnInserts(mergeAllowDuplicateOnInserts
 = false)</h4>
+<p>Property: <code 
class="highlighter-rouge">hoodie.merge.allow.duplicate.on.inserts</code> <br />
+<span style="color:grey"> When enabled, new records are routed as inserts and 
are not merged with existing records.
+The result may contain duplicate entries. </span></p>
+
 <h4 id="withWriteStatusStorageLevel">withWriteStatusStorageLevel(level = 
MEMORY_AND_DISK_SER)</h4>
 <p>Property: <code 
class="highlighter-rouge">hoodie.write.status.storage.level</code><br />
 <span style="color:grey">HoodieWriteClient.insert and HoodieWriteClient.upsert 
return a persisted RDD[WriteStatus], because the Client can choose to 
inspect the WriteStatus and decide whether to commit based on the failures. 
This configures the storage level for that RDD.</span></p>
@@ -1047,10 +1057,6 @@ HoodieWriteConfig can be built using a builder pattern 
as below.</p>
 <p>Property: <code class="highlighter-rouge">hoodie.auto.commit</code><br />
 <span style="color:grey">Should HoodieWriteClient autoCommit after insert and 
upsert. The client can choose to turn off auto-commit and commit on a “defined 
success condition”</span></p>
 
-<h4 
id="withAssumeDatePartitioning">withAssumeDatePartitioning(assumeDatePartitioning
 = false)</h4>
-<p>Property: <code 
class="highlighter-rouge">hoodie.assume.date.partitioning</code><br />
-<span style="color:grey">Should HoodieWriteClient assume the data is 
partitioned by dates, i.e., three levels from the base path. This is a stop-gap 
to support tables created by versions &lt; 0.3.1. It will be removed 
eventually.</span></p>
-
 <h4 id="withConsistencyCheckEnabled">withConsistencyCheckEnabled(enabled = 
false)</h4>
 <p>Property: <code 
class="highlighter-rouge">hoodie.consistency.check.enabled</code><br />
 <span style="color:grey">Should HoodieWriteClient perform additional checks to 
ensure written files are listable on the underlying filesystem/storage. Set 
this to true to work around S3’s eventual consistency model and ensure all data 
written as part of a commit is faithfully available for queries.</span></p>
@@ -1367,6 +1373,10 @@ with keys/footers, avoiding full cost of rewriting the 
dataset. <code class="hig
 <p>Property: <code 
class="highlighter-rouge">hoodie.metadata.keep.min.commits</code>, <code 
class="highlighter-rouge">hoodie.metadata.keep.max.commits</code> <br />
 <span style="color:grey"> Controls the archival of the metadata table’s 
timeline </span></p>
 
+<h4 
id="withAssumeDatePartitioning">withAssumeDatePartitioning(assumeDatePartitioning
 = false)</h4>
+<p>Property: <code 
class="highlighter-rouge">hoodie.assume.date.partitioning</code><br />
+<span style="color:grey">Should HoodieWriteClient assume the data is 
partitioned by dates, i.e., three levels from the base path. This is a stop-gap 
to support tables created by versions &lt; 0.3.1. It will be removed 
eventually.</span></p>
+
 <h3 id="clustering-configs">Clustering Configs</h3>
 <p>Controls clustering operations in Hudi. Each clustering operation has to be 
configured with its strategy and config params, which these settings control.</p>
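
For readers wanting to try the options documented in this build, here is a minimal sketch of an option map one might pass to the Spark datasource writer. Only the property names come from the docs above; the values, table path, and DataFrame are hypothetical, and the write itself needs a SparkSession with the Hudi bundle on the classpath (shown only as a comment):

```python
# Hypothetical option map exercising the configs documented in this build.
# Property names are the ones from the docs above; values are illustrative.
hudi_options = {
    # bulk_insert is the operation the row writer speeds up.
    "hoodie.datasource.write.operation": "bulk_insert",
    # Write directly with Spark's native Row representation,
    # expected to be 20 to 30% faster than regular bulk_insert.
    "hoodie.datasource.write.row.writer.enable": "true",
    # Route new records as inserts without merging with existing records;
    # leaving this false avoids duplicate entries in the result.
    "hoodie.merge.allow.duplicate.on.inserts": "false",
}

# With a live SparkSession and a DataFrame `df` (not shown), the options
# would be applied roughly as:
#   df.write.format("hudi").options(**hudi_options) \
#       .mode("append").save("/path/to/table")
```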
 
