This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new b3d0f15  Travis CI build asf-site
b3d0f15 is described below

commit b3d0f15d447212ca72b0fda79a687a618196f6f5
Author: CI <[email protected]>
AuthorDate: Thu Aug 13 06:43:57 2020 +0000

    Travis CI build asf-site
---
 content/docs/writing_data.html | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/content/docs/writing_data.html b/content/docs/writing_data.html
index e229106..d18be96 100644
--- a/content/docs/writing_data.html
+++ b/content/docs/writing_data.html
@@ -368,6 +368,7 @@
             <ul class="toc__menu">
   <li><a href="#write-operations">Write Operations</a></li>
   <li><a href="#deltastreamer">DeltaStreamer</a></li>
+  <li><a href="#multitabledeltastreamer">MultiTableDeltaStreamer</a></li>
   <li><a href="#datasource-writer">Datasource Writer</a></li>
   <li><a href="#syncing-to-hive">Syncing to Hive</a></li>
   <li><a href="#deletes">Deletes</a></li>
@@ -541,6 +542,39 @@ provided under <code 
class="highlighter-rouge">hudi-utilities/src/test/resources
 
 <p>In some cases, you may want to migrate your existing table into Hudi 
beforehand. Please refer to <a href="/docs/migration_guide.html">migration 
guide</a>.</p>
 
+<h2 id="multitabledeltastreamer">MultiTableDeltaStreamer</h2>
+
+<p><code class="highlighter-rouge">HoodieMultiTableDeltaStreamer</code>, a 
wrapper on top of <code class="highlighter-rouge">HoodieDeltaStreamer</code>, 
enables one to ingest multiple tables at a single go into hudi datasets. 
Currently it only supports sequential processing of tables to be ingested and 
COPY_ON_WRITE storage type. The command line options for <code 
class="highlighter-rouge">HoodieMultiTableDeltaStreamer</code> are pretty much 
similar to <code class="highlighter-rouge">Hoo [...]
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  <span class="o">*</span> <span 
class="o">--</span><span class="n">config</span><span class="o">-</span><span 
class="n">folder</span>
+    <span class="n">the</span> <span class="n">path</span> <span 
class="n">to</span> <span class="n">the</span> <span class="n">folder</span> 
<span class="n">which</span> <span class="n">contains</span> <span 
class="n">all</span> <span class="n">the</span> <span class="n">table</span> 
<span class="n">wise</span> <span class="n">config</span> <span 
class="n">files</span>
+    <span class="o">--</span><span class="n">base</span><span 
class="o">-</span><span class="n">path</span><span class="o">-</span><span 
class="n">prefix</span>
+    <span class="k">this</span> <span class="n">is</span> <span 
class="n">added</span> <span class="n">to</span> <span class="n">enable</span> 
<span class="n">users</span> <span class="n">to</span> <span 
class="n">create</span> <span class="n">all</span> <span class="n">the</span> 
<span class="n">hudi</span> <span class="n">datasets</span> <span 
class="k">for</span> <span class="n">related</span> <span 
class="n">tables</span> <span class="n">under</span> <span class="n">one</span> 
<span  [...]
+</code></pre></div></div>
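As an illustration of `--base-path-prefix`, a minimal sketch: the prefix, database, and table names below (`/tmp/multi`, `db1.table1`, `db1.table2`) are all invented for this example, and the `&lt;prefix&gt;/&lt;database&gt;/&lt;table&gt;` layout is an assumption about where related datasets land under the common prefix.

```shell
# Hypothetical sketch: with --base-path-prefix /tmp/multi and tables
# db1.table1 and db1.table2 (names invented here), each dataset is
# assumed to land under <prefix>/<database>/<table>.
mkdir -p /tmp/multi/db1/table1 /tmp/multi/db1/table2

# List the per-table dataset directories under the common prefix.
find /tmp/multi -mindepth 2 -type d | sort
```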
+
+<p>The following properties need to be set correctly to ingest data using <code class="highlighter-rouge">HoodieMultiTableDeltaStreamer</code>.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code><span class="n">hoodie</span><span 
class="o">.</span><span class="na">deltastreamer</span><span 
class="o">.</span><span class="na">ingestion</span><span 
class="o">.</span><span class="na">tablesToBeIngested</span>
+  <span class="n">comma</span> <span class="n">separated</span> <span 
class="n">names</span> <span class="n">of</span> <span class="n">tables</span> 
<span class="n">to</span> <span class="n">be</span> <span 
class="n">ingested</span> <span class="n">in</span> <span class="n">the</span> 
<span class="n">format</span> <span class="o">&lt;</span><span 
class="n">database</span><span class="o">&gt;.&lt;</span><span 
class="n">table</span><span class="o">&gt;,</span> <span class="k">for</span> 
<s [...]
+<span class="n">hoodie</span><span class="o">.</span><span 
class="na">deltastreamer</span><span class="o">.</span><span 
class="na">ingestion</span><span class="o">.</span><span 
class="na">targetBasePath</span>
+  <span class="k">if</span> <span class="n">you</span> <span 
class="n">wish</span> <span class="n">to</span> <span class="n">ingest</span> 
<span class="n">a</span> <span class="n">particular</span> <span 
class="n">table</span> <span class="n">in</span> <span class="n">a</span> <span 
class="n">separate</span> <span class="n">path</span><span class="o">,</span> 
<span class="n">you</span> <span class="n">can</span> <span 
class="n">mention</span> <span class="n">that</span> <span class="n">p [...]
+<span class="n">hoodie</span><span class="o">.</span><span 
class="na">deltastreamer</span><span class="o">.</span><span 
class="na">ingestion</span><span class="o">.&lt;</span><span 
class="n">database</span><span class="o">&gt;.&lt;</span><span 
class="n">table</span><span class="o">&gt;.</span><span 
class="na">configFile</span>
+  <span class="n">path</span> <span class="n">to</span> <span 
class="n">the</span> <span class="n">config</span> <span class="n">file</span> 
<span class="n">in</span> <span class="n">dedicated</span> <span 
class="n">config</span> <span class="n">folder</span> <span 
class="n">which</span> <span class="n">contains</span> <span 
class="n">table</span> <span class="n">overridden</span> <span 
class="n">properties</span> <span class="k">for</span> <span 
class="n">the</span> <span class="n">part [...]
+</code></pre></div></div>
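Putting the three properties together, a minimal sketch: every table name and path below is hypothetical, and placing `targetBasePath` in the table-level override file is an assumption (the text above only says it can be set for a particular table).

```shell
# Hypothetical wiring of the properties above. Table names
# (db1.table1, db1.table2) and all file paths are invented.
mkdir -p /tmp/hudi-ingestion-config

# Common properties file, passed via --props.
cat > /tmp/hudi-ingestion-config/source.properties <<'EOF'
# tables to ingest, comma separated, in <database>.<table> form
hoodie.deltastreamer.ingestion.tablesToBeIngested=db1.table1,db1.table2
# per-table override files inside the --config-folder
hoodie.deltastreamer.ingestion.db1.table1.configFile=file:///tmp/hudi-ingestion-config/db1_table1_config.properties
hoodie.deltastreamer.ingestion.db1.table2.configFile=file:///tmp/hudi-ingestion-config/db1_table2_config.properties
EOF

# Table-level override file; targetBasePath here (an assumption about
# where it belongs) would send this one table to its own path instead
# of the common --base-path-prefix.
cat > /tmp/hudi-ingestion-config/db1_table1_config.properties <<'EOF'
hoodie.deltastreamer.ingestion.targetBasePath=file:///tmp/custom/db1/table1
EOF

# Count the ingestion properties written to the common file.
grep -c '^hoodie.deltastreamer.ingestion' /tmp/hudi-ingestion-config/source.properties  # prints 3
```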
+
+<p>Sample config files for table-wise overridden properties can be found under <code class="highlighter-rouge">hudi-utilities/src/test/resources/delta-streamer-config</code>. The command to run <code class="highlighter-rouge">HoodieMultiTableDeltaStreamer</code> is also similar to how you run <code class="highlighter-rouge">HoodieDeltaStreamer</code>.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code><span class="o">[</span><span 
class="n">hoodie</span><span class="o">]</span><span class="err">$</span> <span 
class="n">spark</span><span class="o">-</span><span class="n">submit</span> 
<span class="o">--</span><span class="kd">class</span> <span 
class="nc">org</span><span class="o">.</span><span 
class="na">apache</span><span class="o">.</span><span 
class="na">hudi</span><span class="o">.</sp [...]
+  <span class="o">--</span><span class="n">props</span> <span 
class="nl">file:</span><span 
class="c1">//${PWD}/hudi-utilities/src/test/resources/delta-streamer-config/kafka-source.properties
 \</span>
+  <span class="o">--</span><span class="n">config</span><span 
class="o">-</span><span class="n">folder</span> <span 
class="nl">file:</span><span class="c1">//tmp/hudi-ingestion-config \</span>
+  <span class="o">--</span><span class="n">schemaprovider</span><span 
class="o">-</span><span class="kd">class</span> <span 
class="nc">org</span><span class="o">.</span><span 
class="na">apache</span><span class="o">.</span><span 
class="na">hudi</span><span class="o">.</span><span 
class="na">utilities</span><span class="o">.</span><span 
class="na">schema</span><span class="o">.</span><span 
class="na">SchemaRegistryProvider</span> <span class="err">\</span>
+  <span class="o">--</span><span class="n">source</span><span 
class="o">-</span><span class="kd">class</span> <span 
class="nc">org</span><span class="o">.</span><span 
class="na">apache</span><span class="o">.</span><span 
class="na">hudi</span><span class="o">.</span><span 
class="na">utilities</span><span class="o">.</span><span 
class="na">sources</span><span class="o">.</span><span 
class="na">AvroKafkaSource</span> <span class="err">\</span>
+  <span class="o">--</span><span class="n">source</span><span 
class="o">-</span><span class="n">ordering</span><span class="o">-</span><span 
class="n">field</span> <span class="n">impresssiontime</span> <span 
class="err">\</span>
+  <span class="o">--</span><span class="n">base</span><span 
class="o">-</span><span class="n">path</span><span class="o">-</span><span 
class="n">prefix</span> <span class="nl">file:</span><span 
class="err">\</span><span class="o">/</span><span class="err">\</span><span 
class="o">/</span><span class="err">\</span><span class="o">/</span><span 
class="n">tmp</span><span class="o">/</span><span class="n">hudi</span><span 
class="o">-</span><span class="n">deltastreamer</span><span class="o">- [...]
+  <span class="o">--</span><span class="n">target</span><span 
class="o">-</span><span class="n">table</span> <span class="n">uber</span><span 
class="o">.</span><span class="na">impressions</span> <span class="err">\</span>
+  <span class="o">--</span><span class="n">op</span> <span 
class="no">BULK_INSERT</span>
+</code></pre></div></div>
+
 <h2 id="datasource-writer">Datasource Writer</h2>
 
 <p>The <code class="highlighter-rouge">hudi-spark</code> module offers the 
DataSource API to write (and read) a Spark DataFrame into a Hudi table. There 
are a number of options available:</p>
