This is an automated email from the ASF dual-hosted git repository.
eskabetxe pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/bahir-website.git
The following commit(s) were added to refs/heads/master by this push:
new 62bce83 [BAHIR-293] Fix flink documentation
62bce83 is described below
commit 62bce83563e9d14722ba533e33aedc0b543f3e11
Author: Joao Boto <[email protected]>
AuthorDate: Tue Dec 7 11:33:58 2021 +0100
[BAHIR-293] Fix flink documentation
---
site/Gemfile | 3 +++
site/docs/flink/current/documentation.md | 4 ++++
site/docs/flink/current/flink-streaming-kudu.md | 25 ++++++++++++------------
site/docs/flink/current/flink-streaming-pinot.md | 9 ++++++++-
4 files changed, 28 insertions(+), 13 deletions(-)
diff --git a/site/Gemfile b/site/Gemfile
index c13c470..f49e038 100644
--- a/site/Gemfile
+++ b/site/Gemfile
@@ -14,6 +14,9 @@
# limitations under the License.
#
source 'https://rubygems.org'
+ruby "2.7.0"
+gem 'jekyll', '= 3.9.0'
+
gem 'github-pages'
gem 'rouge'
gem 'jekyll-oembed', :require => 'jekyll_oembed'
diff --git a/site/docs/flink/current/documentation.md b/site/docs/flink/current/documentation.md
index 1af2a3c..969c97f 100644
--- a/site/docs/flink/current/documentation.md
+++ b/site/docs/flink/current/documentation.md
@@ -39,8 +39,12 @@ limitations under the License.
[InfluxDB connector](../flink-streaming-influxdb)
+[InfluxDB2 connector](../flink-streaming-influxdb2)
{:height="36px" width="36px"}
+
[Kudu connector](../flink-streaming-kudu)
[Netty connector](../flink-streaming-netty)
+[Pinot connector](../flink-streaming-pinot)
{:height="36px" width="36px"}
+
[Redis connector](../flink-streaming-redis)
diff --git a/site/docs/flink/current/flink-streaming-kudu.md b/site/docs/flink/current/flink-streaming-kudu.md
index 2af5e9a..28dec2c 100644
--- a/site/docs/flink/current/flink-streaming-kudu.md
+++ b/site/docs/flink/current/flink-streaming-kudu.md
@@ -181,18 +181,19 @@ The example uses lambda expressions to implement the functional interfaces.
Read more about Kudu schema design in the [Kudu docs](https://kudu.apache.org/docs/schema_design.html).
### Supported data types
-| Flink/SQL | Kudu |
-| ------------- |:-------------:|
-| STRING | STRING |
-| BOOLEAN | BOOL |
-| TINYINT | INT8 |
-| SMALLINT | INT16 |
-| INT | INT32 |
-| BIGINT | INT64 |
-| FLOAT | FLOAT |
-| DOUBLE | DOUBLE |
-| BYTES | BINARY |
-| TIMESTAMP(3) | UNIXTIME_MICROS |
+
+| Flink/SQL | Kudu |
+|----------------------|:-----------------------:|
+| `STRING` | STRING |
+| `BOOLEAN` | BOOL |
+| `TINYINT` | INT8 |
+| `SMALLINT` | INT16 |
+| `INT` | INT32 |
+| `BIGINT` | INT64 |
+| `FLOAT` | FLOAT |
+| `DOUBLE` | DOUBLE |
+| `BYTES` | BINARY |
+| `TIMESTAMP(3)` | UNIXTIME_MICROS |
Note:
* `TIMESTAMP`s are fixed to a precision of 3, and the corresponding Java conversion class is `java.sql.Timestamp`
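For illustration, here is a minimal sketch of the `TIMESTAMP(3)` to `UNIXTIME_MICROS` mapping, using only plain JDK types (this is not the connector's internal code):

```java
import java.sql.Timestamp;

public class TimestampMapping {
    public static void main(String[] args) {
        // Flink TIMESTAMP(3) values are exchanged as java.sql.Timestamp,
        // which carries millisecond precision here
        Timestamp flinkValue = Timestamp.valueOf("2021-12-07 11:33:58.123");

        // Kudu's UNIXTIME_MICROS stores microseconds since the Unix epoch,
        // so the millisecond value is scaled by 1000
        long kuduMicros = flinkValue.getTime() * 1000L;
        System.out.println(kuduMicros);
    }
}
```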
diff --git a/site/docs/flink/current/flink-streaming-pinot.md b/site/docs/flink/current/flink-streaming-pinot.md
index b5a705a..b4c9a7b 100644
--- a/site/docs/flink/current/flink-streaming-pinot.md
+++ b/site/docs/flink/current/flink-streaming-pinot.md
@@ -44,12 +44,14 @@ See how to link with them for cluster execution [here](https://ci.apache.org/pro
The sink class is called `PinotSink`.
## Architecture
+
The Pinot sink stores elements from upstream Flink tasks in an Apache Pinot table.
We support two execution modes:
* `RuntimeExecutionMode.BATCH`
* `RuntimeExecutionMode.STREAMING`, which requires checkpointing to be enabled (see the sketch below).
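A minimal sketch of the `STREAMING` setup (standard Flink API; the checkpoint interval is only illustrative):

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRuntimeMode(RuntimeExecutionMode.STREAMING);
// STREAMING mode requires checkpointing to be enabled
env.enableCheckpointing(10_000L); // illustrative 10 second interval
```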
### PinotSinkWriter
+
Elements arriving from upstream tasks are received by an instance of the `PinotSinkWriter`.
The `PinotSinkWriter` holds a list of `PinotWriterSegment`s where each `PinotWriterSegment` is capable of storing `maxRowsPerSegment` elements.
As long as a `PinotWriterSegment` has not yet reached its maximum number of elements, it is considered active.
@@ -60,7 +62,8 @@ Once the maximum number of elements to hold was reached, an active `PinotWriterS
Thus, there is always one active `PinotWriterSegment` that new incoming elements will go to.
Over time, the list of `PinotWriterSegment`s per `PinotSinkWriter` grows until a checkpoint is created.
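A simplified sketch of this buffering behaviour (a hypothetical class for illustration, not the connector's actual implementation; only the `maxRowsPerSegment` name is taken from the text above):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the segment buffering described above
class SegmentBuffer<T> {
    private final int maxRowsPerSegment;
    private final List<List<T>> segments = new ArrayList<>();

    SegmentBuffer(int maxRowsPerSegment) {
        this.maxRowsPerSegment = maxRowsPerSegment;
    }

    void write(T element) {
        // The last segment in the list is the active one; once it is full,
        // it becomes inactive and a new active segment is started
        if (segments.isEmpty()
                || segments.get(segments.size() - 1).size() >= maxRowsPerSegment) {
            segments.add(new ArrayList<>());
        }
        segments.get(segments.size() - 1).add(element);
    }
}
```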
-**Checkpointing**
+**Checkpointing**
+
On checkpoint creation, `PinotSinkWriter.prepareCommit` gets called by the Flink environment.
This triggers the creation of `PinotSinkCommittable`s where each inactive `PinotWriterSegment` creates exactly one `PinotSinkCommittable`.
@@ -72,6 +75,7 @@ A `PinotSinkCommittables` then holds the path to the data file on the shared fil
### PinotSinkGlobalCommitter
+
To follow the guidelines for Pinot segment naming, the minimum and maximum timestamps of a segment's elements need to be included in its metadata and in its name.
The minimum and maximum timestamps of all elements between two checkpoints are determined at a parallelism of 1 in the `PinotSinkGlobalCommitter`.
This procedure allows recovering from failures by deleting previously uploaded segments, which prevents duplicate segments in the Pinot table.
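As an illustration of the metadata involved, a hedged sketch of what a committable could carry (a hypothetical record; the connector's actual committable class may differ):

```java
// Hypothetical sketch of a committable as described above: the path to the
// segment's data file on the shared filesystem plus the minimum and maximum
// timestamps of the elements it contains
record SegmentCommittable(String dataFilePath, long minTimestamp, long maxTimestamp) {}
```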
@@ -90,10 +94,12 @@ When finally committing a `PinotSinkGlobalCommittable` the following procedure i
## Delivery Guarantees
+
As a result of the architecture described above, the `PinotSink` provides an at-least-once delivery guarantee.
While the failure recovery mechanism prevents duplicate segments, there might be temporary inconsistencies in the Pinot table, which can result in downstream tasks receiving an element multiple times.
## Options
+
| Option                | Description                                                                       |
| --------------------- | --------------------------------------------------------------------------------- |
| `pinotControllerHost` | Host of the Pinot controller                                                      |
@@ -108,6 +114,7 @@ While the failure recovery mechanism ensures that duplicate segments are prevent
| `numCommitThreads`    | Number of threads used in the `PinotSinkGlobalCommitter` for committing segments  |
## Usage
+
```java
StreamExecutionEnvironment env = ...
// Checkpointing needs to be enabled when executing in STREAMING mode