This is an automated email from the ASF dual-hosted git repository. arvid pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/flink-web.git
commit 1606a59de6bad3211f0179e8b80e6106b1162b02 Author: Arvid Heise <[email protected]> AuthorDate: Thu Jan 7 16:37:58 2021 +0100 Generate page --- content/2021/01/07/pulsar-flink-connector-270.html | 423 +++++++++++++++++++++ content/blog/feed.xml | 379 ++++++++---------- content/blog/index.html | 36 +- content/blog/page10/index.html | 40 +- content/blog/page11/index.html | 40 +- content/blog/page12/index.html | 39 +- content/blog/page13/index.html | 42 +- content/blog/page14/index.html | 28 ++ content/blog/page2/index.html | 38 +- content/blog/page3/index.html | 38 +- content/blog/page4/index.html | 36 +- content/blog/page5/index.html | 36 +- content/blog/page6/index.html | 36 +- content/blog/page7/index.html | 38 +- content/blog/page8/index.html | 40 +- content/blog/page9/index.html | 40 +- .../pulsar-flink-batch-stream.png | Bin 0 -> 281084 bytes .../2021-01-07-pulsar-flink/pulsar-key-shared.png | Bin 0 -> 109552 bytes content/index.html | 8 +- content/zh/index.html | 8 +- 20 files changed, 933 insertions(+), 412 deletions(-) diff --git a/content/2021/01/07/pulsar-flink-connector-270.html b/content/2021/01/07/pulsar-flink-connector-270.html new file mode 100644 index 0000000..c34ad57 --- /dev/null +++ b/content/2021/01/07/pulsar-flink-connector-270.html @@ -0,0 +1,423 @@ +<!DOCTYPE html> +<html lang="en"> + <head> + <meta charset="utf-8"> + <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> + <title>Apache Flink: What's New in the Pulsar Flink Connector 2.7.0</title> + <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon"> + <link rel="icon" href="/favicon.ico" type="image/x-icon"> + + <!-- Bootstrap --> + <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.4.1/css/bootstrap.min.css"> + <link rel="stylesheet" href="/css/flink.css"> + <link rel="stylesheet" href="/css/syntax.css"> + + <!-- Blog RSS feed --> + <link href="/blog/feed.xml" rel="alternate" type="application/rss+xml" title="Apache Flink Blog: RSS feed" /> + + <!-- jQuery (necessary for Bootstrap's JavaScript plugins) --> + <!-- We need to load Jquery in the header for custom google analytics event tracking--> + <script src="/js/jquery.min.js"></script> + + <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries --> + <!-- WARNING: Respond.js doesn't work if you view the page via file:// --> + <!--[if lt IE 9]> + <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script> + <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script> + <![endif]--> + </head> + <body> + + + <!-- Main content. --> + <div class="container"> + <div class="row"> + + + <div id="sidebar" class="col-sm-3"> + + +<!-- Top navbar. --> + <nav class="navbar navbar-default"> + <!-- The logo. --> + <div class="navbar-header"> + <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1"> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <div class="navbar-logo"> + <a href="/"> + <img alt="Apache Flink" src="/img/flink-header-logo.svg" width="147px" height="73px"> + </a> + </div> + </div><!-- /.navbar-header --> + + <!-- The navigation links. --> + <div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1"> + <ul class="nav navbar-nav navbar-main"> + + <!-- First menu section explains visitors what Flink is --> + + <!-- What is Stream Processing? --> + <!-- + <li><a href="/streamprocessing1.html">What is Stream Processing?</a></li> + --> + + <!-- What is Flink? --> + <li><a href="/flink-architecture.html">What is Apache Flink?</a></li> + + + <ul class="nav navbar-nav navbar-subnav"> + <li > + <a href="/flink-architecture.html">Architecture</a> + </li> + <li > + <a href="/flink-applications.html">Applications</a> + </li> + <li > + <a href="/flink-operations.html">Operations</a> + </li> + </ul> + + + <!-- What is Stateful Functions? --> + + <li><a href="/stateful-functions.html">What is Stateful Functions?</a></li> + + <!-- Use cases --> + <li><a href="/usecases.html">Use Cases</a></li> + + <!-- Powered by --> + <li><a href="/poweredby.html">Powered By</a></li> + + + + <!-- Second menu section aims to support Flink users --> + + <!-- Downloads --> + <li><a href="/downloads.html">Downloads</a></li> + + <!-- Getting Started --> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Getting Started<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://ci.apache.org/projects/flink/flink-docs-release-1.12/try-flink/index.html" target="_blank">With Flink <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + <li><a href="https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/getting-started/project-setup.html" target="_blank">With Flink Stateful Functions <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + <li><a href="/training.html">Training Course</a></li> + </ul> + </li> + + <!-- Documentation --> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Documentation<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://ci.apache.org/projects/flink/flink-docs-release-1.12" target="_blank">Flink 1.12 (Latest stable release) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + <li><a href="https://ci.apache.org/projects/flink/flink-docs-master" target="_blank">Flink Master (Latest Snapshot) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + <li><a href="https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2" target="_blank">Flink Stateful Functions 2.2 (Latest stable release) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + <li><a href="https://ci.apache.org/projects/flink/flink-statefun-docs-master" target="_blank">Flink Stateful Functions Master (Latest Snapshot) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + </ul> + </li> + + <!-- getting help --> + <li><a href="/gettinghelp.html">Getting Help</a></li> + + <!-- Blog --> + <li><a href="/blog/"><b>Flink Blog</b></a></li> + + + <!-- Flink-packages --> + <li> + <a href="https://flink-packages.org" target="_blank">flink-packages.org <small><span class="glyphicon glyphicon-new-window"></span></small></a> + </li> + + + <!-- Third menu section aim to support community and contributors --> + + <!-- Community --> + <li><a href="/community.html">Community & Project Info</a></li> + + <!-- Roadmap --> + <li><a href="/roadmap.html">Roadmap</a></li> + + <!-- Contribute --> + <li><a href="/contributing/how-to-contribute.html">How to Contribute</a></li> + + + <!-- GitHub --> + <li> + <a href="https://github.com/apache/flink" target="_blank">Flink on GitHub <small><span class="glyphicon glyphicon-new-window"></span></small></a> + </li> + + + + <!-- Language Switcher --> + <li> + + + <a href="/zh/2021/01/07/pulsar-flink-connector-270.html">中文版</a> + + + </li> + + </ul> + + <style> + .smalllinks:link { + display: inline-block !important; background: none; padding-top: 0px; padding-bottom: 0px; padding-right: 0px; min-width: 75px; + } + </style> + + <ul class="nav navbar-nav navbar-bottom"> + <hr /> + + <!-- Twitter --> + <li><a href="https://twitter.com/apacheflink" target="_blank">@ApacheFlink <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + + <!-- Visualizer --> + <li class=" hidden-md hidden-sm"><a href="/visualizer/" target="_blank">Plan Visualizer <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + + <li > + <a href="/security.html">Flink Security</a> + </li> + + <hr /> + + <li><a href="https://apache.org" target="_blank">Apache Software Foundation <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + + <li> + + <a class="smalllinks" href="https://www.apache.org/licenses/" target="_blank">License</a> <small><span class="glyphicon glyphicon-new-window"></span></small> + + <a class="smalllinks" href="https://www.apache.org/security/" target="_blank">Security</a> <small><span class="glyphicon glyphicon-new-window"></span></small> + + <a class="smalllinks" href="https://www.apache.org/foundation/sponsorship.html" target="_blank">Donate</a> <small><span class="glyphicon glyphicon-new-window"></span></small> + + <a class="smalllinks" href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a> <small><span class="glyphicon glyphicon-new-window"></span></small> + </li> + + </ul> + </div><!-- /.navbar-collapse --> + </nav> + + </div> + <div class="col-sm-9"> + <div class="row-fluid"> + <div class="col-sm-12"> + <div class="row"> + <h1>What's New in the Pulsar Flink Connector 2.7.0</h1> + <p><i></i></p> + + <article> + <p>07 Jan 2021 Jianyun Zhao (<a href="https://twitter.com/yihy8023">@yihy8023</a>) & Jennifer Huang (<a href="https://twitter.com/Jennife06125739">@Jennife06125739</a>)</p> + +<h2 id="about-the-pulsar-flink-connector">About the Pulsar Flink Connector</h2> +<p>In order for companies to access real-time data insights, they need unified batch and streaming capabilities. Apache Flink unifies batch and stream processing into one single computing engine with “streams” as the unified data representation. Although developers have done extensive work at the computing and API layers, very little work has been done at the data messaging and storage layers. In reality, data is segregated into data silos, created by various storage and messaging techno [...] + +<p>The <a href="https://github.com/streamnative/pulsar-flink/">Pulsar Flink connector</a> provides elastic data processing with <a href="https://pulsar.apache.org/">Apache Pulsar</a> and <a href="https://flink.apache.org/">Apache Flink</a>, allowing Apache Flink to read/write data from/to Apache Pulsar. The Pulsar Flink Connector enables you to concentrate on your business logic without worrying about the storage details.</p> + +<h2 id="challenges">Challenges</h2> +<p>When we first developed the Pulsar Flink Connector, it received wide adoption from both the Flink and Pulsar communities. Leveraging the Pulsar Flink connector, <a href="https://www.hpe.com/us/en/home.html">Hewlett Packard Enterprise (HPE)</a> built a real-time computing platform, <a href="https://www.bigo.sg/">BIGO</a> built a <a href="https://pulsar-summit.org/en/event/asia-2020/sessions/how-bigo-builds-real-time-message-system-with-apache-pulsar-and-flink">real-time message process [...] + +<p>With more users adopting the Pulsar Flink Connector, it became clear that one of the common issues was evolving around data formats and specifically performing serialization and deserialization. While the Pulsar Flink connector leverages the Pulsar serialization, the previous connector versions did not support the Flink data format. As a result, users had to manually configure their setup in order to use the connector for real-time computing scenarios.</p> + +<p>To improve the user experience and make the Pulsar Flink connector easier-to-use, we built the capabilities to fully support the Flink data format, so users of the connector do not spend time on manual tuning and configuration.</p> + +<h2 id="whats-new-in-the-pulsar-flink-connector-270">What’s New in the Pulsar Flink Connector 2.7.0?</h2> +<p>The Pulsar Flink Connector 2.7.0 supports features in Apache Pulsar 2.7.0 and Apache Flink 1.12 and is fully compatible with the Flink connector and Flink message format. With the latest version, you can use important features in Flink, such as exactly-once sink, upsert Pulsar mechanism, Data Definition Language (DDL) computed columns, watermarks, and metadata. You can also leverage the Key-Shared subscription in Pulsar, and conduct serialization and deserialization without much confi [...] + +<p>Below, we provide more details about the key features in the Pulsar Flink Connector 2.7.0.</p> + +<h3 id="ordered-message-queue-with-high-performance">Ordered message queue with high-performance</h3> +<p>When users needed to strictly guarantee the ordering of messages, only one consumer was allowed to consume them. This had a severe impact on throughput. To address this, we designed a Key_Shared subscription model in Pulsar that guarantees the ordering of messages and improves throughput by adding a Key to each message and routes messages with the same Key Hash to one consumer.</p> + +<p><br /></p> +<div class="row front-graphic"> + <img src="/img/blog/2021-01-07-pulsar-flink/pulsar-key-shared.png" width="640px" alt="Apache Pulsar Key-Shared Subscription" /> +</div> + +<p>Pulsar Flink Connector 2.7.0 supports the Key_Shared subscription model. You can enable this feature by setting <code>enable-key-hash-range</code> to <code>true</code>. The Key Hash range processed by each consumer is decided by the parallelism of tasks.</p> + +<h3 id="introducing-exactly-once-semantics-for-pulsar-sink-based-on-the-pulsar-transaction">Introducing exactly-once semantics for Pulsar sink (based on the Pulsar transaction)</h3> +<p>In previous versions, sink operators only supported at-least-once semantics, which could not fully meet requirements for end-to-end consistency. To deduplicate messages, users had to do some dirty work, which was not user-friendly.</p> + +<p>Transactions are supported in Pulsar 2.7.0, which greatly improves the fault tolerance capability of the Flink sink. In the Pulsar Flink Connector 2.7.0, we designed exactly-once semantics for sink operators based on Pulsar transactions. Flink uses the two-phase commit protocol to implement TwoPhaseCommitSinkFunction. The main life cycle methods are beginTransaction(), preCommit(), commit(), abort(), recoverAndCommit(), recoverAndAbort().</p> + +<p>You can flexibly select semantics when creating a sink operator while the internal logic changes are transparent. Pulsar transactions are similar to the two-phase commit protocol in Flink, which greatly improves the reliability of the Connector Sink.</p> + +<p>It’s easy to implement beginTransaction and preCommit. You only need to start a Pulsar transaction and persist the TID of the transaction after the checkpoint. In the preCommit phase, you need to ensure that all messages are flushed to Pulsar, while any pre-committed messages will be committed eventually.</p> + +<p>We focus on recoverAndCommit and recoverAndAbort in implementation. Limited by Kafka features, Kafka connector adopts hack styles for recoverAndCommit. Pulsar transactions do not rely on the specific Producer, so it’s easy for you to commit and abort transactions based on TID.</p> + +<p>Pulsar transactions are highly efficient and flexible. Taking advantages of Pulsar and Flink, the Pulsar Flink connector is even more powerful. We will continue to improve transactional sink in the Pulsar Flink connector.</p> + +<h3 id="introducing-upsert-pulsar-connector">Introducing upsert-pulsar connector</h3> + +<p>Users in the Flink community expressed their needs for the upsert Pulsar. After looking through mailing lists and issues, we’ve summarized the following three reasons.</p> + +<ul> + <li>Interpret Pulsar topic as a changelog stream that interprets records with keys as upsert (aka insert/update) events.</li> + <li>As a part of the real time pipeline, join multiple streams for enrichment and store results into a Pulsar topic for further calculation later. However, the result may contain update events.</li> + <li>As a part of the real time pipeline, aggregate on data streams and store results into a Pulsar topic for further calculation later. However, the result may contain update events.</li> +</ul> + +<p>Based on the requirements, we add support for Upsert Pulsar. The upsert-pulsar connector allows for reading data from and writing data to Pulsar topics in the upsert fashion.</p> + +<ul> + <li> + <p>As a source, the upsert-pulsar connector produces a changelog stream, where each data record represents an update or delete event. More precisely, the value in a data record is interpreted as an UPDATE of the last value for the same key, if any (if a corresponding key does not exist yet, the update will be considered an INSERT). Using the table analogy, a data record in a changelog stream is interpreted as an UPSERT (aka INSERT/UPDATE) because any existing row with the same key is [...] + </li> + <li> + <p>As a sink, the upsert-pulsar connector can consume a changelog stream. It will write INSERT/UPDATE_AFTER data as normal Pulsar message values and write DELETE data as Pulsar message with null values (indicate tombstone for the key). Flink will guarantee the message ordering on the primary key by partitioning data on the values of the primary key columns, so the update/deletion messages on the same key will fall into the same partition.</p> + </li> +</ul> + +<h3 id="support-new-source-interface-and-table-api-introduced-in-flip-27httpscwikiapacheorgconfluencedisplayflinkflip-273arefactorsourceinterfaceflip27refactorsourceinterface-batchandstreamingunification-and-flip-95httpscwikiapacheorgconfluencedisplayflinkflip-953anewtablesourceandtablesinkinterfaces">Support new source interface and Table API introduced in <a href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface#FLIP27:RefactorSourceInterface-Batch [...] +<p>This feature unifies the source of the batch stream and optimizes the mechanism for task discovery and data reading. It is also the cornerstone of our implementation of Pulsar batch and streaming unification. The new Table API supports DDL computed columns, watermarks and metadata.</p> + +<h3 id="support-sql-read-and-write-metadata-as-described-in-flip-107httpscwikiapacheorgconfluencedisplayflinkflip-1073ahandlingofmetadatainsqlconnectors">Support SQL read and write metadata as described in <a href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-107%3A+Handling+of+metadata+in+SQL+connectors">FLIP-107</a></h3> +<p>FLIP-107 enables users to access connector metadata as a metadata column in table definitions. In real-time computing, users normally need additional information, such as eventTime, or customized fields. The Pulsar Flink connector supports SQL read and write metadata, so it is flexible and easy for users to manage metadata of Pulsar messages in the Pulsar Flink Connector 2.7.0. For details on the configuration, refer to <a href="https://github.com/streamnative/pulsar-flink#pulsar-mess [...] + +<h3 id="add-flink-format-type-atomic-to-support-pulsar-primitive-types">Add Flink format type <code>atomic</code> to support Pulsar primitive types</h3> +<p>In the Pulsar Flink Connector 2.7.0, we add Flink format type <code>atomic</code> to support Pulsar primitive types. When processing with Flink requires a Pulsar primitive type, you can use <code>atomic</code> as the connector format. You can find more information on Pulsar primitive types <a href="https://pulsar.apache.org/docs/en/schema-understand/">here</a>.</p> + +<h2 id="migration">Migration</h2> +<p>If you’re using the previous Pulsar Flink Connector version, you need to adjust your SQL and API parameters accordingly. Below we provide details on each.</p> + +<h2 id="sql">SQL</h2> +<p>In SQL, we’ve changed the Pulsar configuration parameters in the DDL declaration. The name of some parameters are changed, but the values are not.<br /> +- Remove the <code>connector.</code> prefix from the parameter names. +- Change the name of the <code>connector.type</code> parameter into <code>connector</code>. +- Change the startup mode parameter name from <code>connector.startup-mode</code> into <code>scan.startup.mode</code>. +- Adjust Pulsar properties as <code>properties.pulsar.reader.readername=testReaderName</code>.</p> + +<p>If you use SQL in the Pulsar Flink Connector, you need to adjust your SQL configuration accordingly when migrating to Pulsar Flink Connector 2.7.0. The following sample shows the differences between previous versions and the 2.7.0 version for SQL.</p> + +<p>SQL in previous versions:</p> + +<div class="highlight"><pre><code>create table topic1( + `rip` VARCHAR, + `rtime` VARCHAR, + `uid` bigint, + `client_ip` VARCHAR, + `day` as TO_DATE(rtime), + `hour` as date_format(rtime,'HH') +) with ( + 'connector.type' ='pulsar', + 'connector.version' = '1', + 'connector.topic' ='persistent://public/default/test_flink_sql', + 'connector.service-url' ='pulsar://xxx', + 'connector.admin-url' ='http://xxx', + 'connector.startup-mode' ='earliest', + 'connector.properties.0.key' ='pulsar.reader.readerName', + 'connector.properties.0.value' ='testReaderName', + 'format.type' ='json', + 'update-mode' ='append' +); +</code></pre></div> + +<p>SQL in Pulsar Flink Connector 2.7.0:</p> + +<div class="highlight"><pre><code>create table topic1( + `rip` VARCHAR, + `rtime` VARCHAR, + `uid` bigint, + `client_ip` VARCHAR, + `day` as TO_DATE(rtime), + `hour` as date_format(rtime,'HH') +) with ( + 'connector' ='pulsar', + 'topic' ='persistent://public/default/test_flink_sql', + 'service-url' ='pulsar://xxx', + 'admin-url' ='http://xxx', + 'scan.startup.mode' ='earliest', + 'properties.pulsar.reader.readername' = 'testReaderName', + 'format' ='json' +); +</code></pre></div> + +<h2 id="api">API</h2> +<p>From an API perspective, we adjusted some classes and enabled easier customization.</p> + +<ul> + <li>To solve serialization issues, we changed the signature of the construction method <code>FlinkPulsarSink</code>, and added <code>PulsarSerializationSchema</code>.</li> + <li>We removed inappropriate classes related to row, such as <code>FlinkPulsarRowSink</code>, <code>FlinkPulsarRowSource</code>. If you need to deal with Row formats, you can use Apache Flink’s Row related serialization components.</li> +</ul> + +<p>You can build <code>PulsarSerializationSchema</code> by using <code>PulsarSerializationSchemaWrapper.Builder</code>. <code>TopicKeyExtractor</code> is moved into <code>PulsarSerializationSchemaWrapper</code>. When you adjust your API, you can take the following sample as reference.</p> + +<div class="highlight"><pre><code>new PulsarSerializationSchemaWrapper.Builder<>(new SimpleStringSchema()) + .setTopicExtractor(str -> getTopic(str)) + .build(); +</code></pre></div> + +<h2 id="future-plan">Future Plan</h2> +<p>Future plans involve the design of a batch and stream solution integrated with Pulsar Source, based on the new Flink Source API (FLIP-27). The new solution will overcome the limitations of the current streaming source interface (SourceFunction) and simultaneously unify the source interfaces between the batch and streaming APIs.</p> + +<p>Pulsar offers a hierarchical architecture where data is divided into streaming, batch, and cold data, which enables Pulsar to provide infinite capacity. This makes Pulsar an ideal solution for unified batch and streaming.</p> + +<p>The batch and stream solution based on the new Flink Source API is divided into two simple parts: SplitEnumerator and Reader. SplitEnumerator discovers and assigns partitions, and Reader reads data from the partition.</p> + +<p><br /></p> +<div class="row front-graphic"> + <img src="/img/blog/2021-01-07-pulsar-flink/pulsar-flink-batch-stream.png" width="640px" alt="Batch and Stream Solution with Apache Pulsar and Apache Flink" /> +</div> + +<p>Apache Pulsar stores messages in the ledger block for users to locate the ledgers through Pulsar admin, and then provide broker partition, BookKeeper partition, Offloader partition, and other information through different partitioning policies. For more details, you can refer <a href="https://github.com/streamnative/pulsar-flink/issues/187">here</a>.</p> + +<h2 id="conclusion">Conclusion</h2> +<p>The latest version of the Pulsar Flink Connector is now available and we encourage everyone to use/upgrade to the Pulsar Flink Connector 2.7.0. The new version provides significant user enhancements, enabled by various features in Pulsar 2.7 and Flink 1.12. We will be contributing the Pulsar Flink Connector 2.7.0 to the <a href="https://github.com/apache/flink/">Apache Flink repository</a> soon. If you have any questions or concerns about the Pulsar Flink Connector, feel free to open [...] + + </article> + </div> + + <div class="row"> + <div id="disqus_thread"></div> + <script type="text/javascript"> + /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ + var disqus_shortname = 'stratosphere-eu'; // required: replace example with your forum shortname + + /* * * DON'T EDIT BELOW THIS LINE * * */ + (function() { + var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; + dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js'; + (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); + })(); + </script> + </div> + </div> +</div> + </div> + </div> + + <hr /> + + <div class="row"> + <div class="footer text-center col-sm-12"> + <p>Copyright © 2014-2019 <a href="http://apache.org">The Apache Software Foundation</a>. All Rights Reserved.</p> + <p>Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.</p> + <p><a href="/privacy-policy.html">Privacy Policy</a> · <a href="/blog/feed.xml">RSS feed</a></p> + </div> + </div> + </div><!-- /.container --> + + <!-- Include all compiled plugins (below), or include individual files as needed --> + <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js"></script> + <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery.matchHeight/0.7.0/jquery.matchHeight-min.js"></script> + <script src="/js/codetabs.js"></script> + <script src="/js/stickysidebar.js"></script> + + <!-- Google Analytics --> + <script> + (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ + (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), + m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) + })(window,document,'script','//www.google-analytics.com/analytics.js','ga'); + + ga('create', 'UA-52545728-1', 'auto'); + ga('send', 'pageview'); + </script> + </body> +</html> diff --git a/content/blog/feed.xml b/content/blog/feed.xml index fba8370..e78798a 100644 --- a/content/blog/feed.xml +++ b/content/blog/feed.xml @@ -7,6 +7,170 @@ <atom:link href="https://flink.apache.org/blog/feed.xml" rel="self" type="application/rss+xml" /> <item> +<title>What's New in the Pulsar Flink Connector 2.7.0</title> +<description><h2 id="about-the-pulsar-flink-connector">About the Pulsar Flink Connector</h2> +<p>In order for companies to access real-time data insights, they need unified batch and streaming capabilities. Apache Flink unifies batch and stream processing into one single computing engine with “streams” as the unified data representation. Although developers have done extensive work at the computing and API layers, very little work has been done at the data messaging and storage layers. In reality, data is segregated into data silos, created by various storage and messaging [...] + +<p>The <a href="https://github.com/streamnative/pulsar-flink/">Pulsar Flink connector</a> provides elastic data processing with <a href="https://pulsar.apache.org/">Apache Pulsar</a> and <a href="https://flink.apache.org/">Apache Flink</a>, allowing Apache Flink to read/write data from/to Apache Pulsar. The Pulsar Flink Connector enables you to concentrate on your business logic without worrying about the storage det [...] + +<h2 id="challenges">Challenges</h2> +<p>When we first developed the Pulsar Flink Connector, it received wide adoption from both the Flink and Pulsar communities. Leveraging the Pulsar Flink connector, <a href="https://www.hpe.com/us/en/home.html">Hewlett Packard Enterprise (HPE)</a> built a real-time computing platform, <a href="https://www.bigo.sg/">BIGO</a> built a <a href="https://pulsar-summit.org/en/event/asia-2020/sessions/how-bigo-builds-real-time-message-syst [...] + +<p>With more users adopting the Pulsar Flink Connector, it became clear that one of the common issues was evolving around data formats and specifically performing serialization and deserialization. While the Pulsar Flink connector leverages the Pulsar serialization, the previous connector versions did not support the Flink data format. As a result, users had to manually configure their setup in order to use the connector for real-time computing scenarios.</p> + +<p>To improve the user experience and make the Pulsar Flink connector easier-to-use, we built the capabilities to fully support the Flink data format, so users of the connector do not spend time on manual tuning and configuration.</p> + +<h2 id="whats-new-in-the-pulsar-flink-connector-270">What’s New in the Pulsar Flink Connector 2.7.0?</h2> +<p>The Pulsar Flink Connector 2.7.0 supports features in Apache Pulsar 2.7.0 and Apache Flink 1.12 and is fully compatible with the Flink connector and Flink message format. With the latest version, you can use important features in Flink, such as exactly-once sink, upsert Pulsar mechanism, Data Definition Language (DDL) computed columns, watermarks, and metadata. You can also leverage the Key-Shared subscription in Pulsar, and conduct serialization and deserialization without much [...] + +<p>Below, we provide more details about the key features in the Pulsar Flink Connector 2.7.0.</p> + +<h3 id="ordered-message-queue-with-high-performance">Ordered message queue with high-performance</h3> +<p>When users needed to strictly guarantee the ordering of messages, only one consumer was allowed to consume them. This had a severe impact on throughput. To address this, we designed a Key_Shared subscription model in Pulsar that guarantees the ordering of messages and improves throughput by adding a Key to each message and routes messages with the same Key Hash to one consumer.</p> + +<p><br /></p> +<div class="row front-graphic"> + <img src="/img/blog/2021-01-07-pulsar-flink/pulsar-key-shared.png" width="640px" alt="Apache Pulsar Key-Shared Subscription" /> +</div> + +<p>Pulsar Flink Connector 2.7.0 supports the Key_Shared subscription model. You can enable this feature by setting <code>enable-key-hash-range</code> to <code>true</code>. The Key Hash range processed by each consumer is decided by the parallelism of tasks.</p> + +<h3 id="introducing-exactly-once-semantics-for-pulsar-sink-based-on-the-pulsar-transaction">Introducing exactly-once semantics for Pulsar sink (based on the Pulsar transaction)</h3> +<p>In previous versions, sink operators only supported at-least-once semantics, which could not fully meet requirements for end-to-end consistency. To deduplicate messages, users had to do some dirty work, which was not user-friendly.</p> + +<p>Transactions are supported in Pulsar 2.7.0, which greatly improves the fault tolerance capability of the Flink sink. In the Pulsar Flink Connector 2.7.0, we designed exactly-once semantics for sink operators based on Pulsar transactions. Flink uses the two-phase commit protocol to implement TwoPhaseCommitSinkFunction. The main life cycle methods are beginTransaction(), preCommit(), commit(), abort(), recoverAndCommit(), recoverAndAbort().</p> + +<p>You can flexibly select semantics when creating a sink operator while the internal logic changes are transparent. Pulsar transactions are similar to the two-phase commit protocol in Flink, which greatly improves the reliability of the Connector Sink.</p> + +<p>It’s easy to implement beginTransaction and preCommit. You only need to start a Pulsar transaction and persist the TID of the transaction after the checkpoint. In the preCommit phase, you need to ensure that all messages are flushed to Pulsar, while any pre-committed messages will be committed eventually.</p> + +<p>We focus on recoverAndCommit and recoverAndAbort in implementation. Limited by Kafka features, Kafka connector adopts hack styles for recoverAndCommit. Pulsar transactions do not rely on the specific Producer, so it’s easy for you to commit and abort transactions based on TID.</p> + +<p>Pulsar transactions are highly efficient and flexible. Taking advantages of Pulsar and Flink, the Pulsar Flink connector is even more powerful. We will continue to improve transactional sink in the Pulsar Flink connector.</p> + +<h3 id="introducing-upsert-pulsar-connector">Introducing upsert-pulsar connector</h3> + +<p>Users in the Flink community expressed their needs for the upsert Pulsar. After looking through mailing lists and issues, we’ve summarized the following three reasons.</p> + +<ul> + <li>Interpret Pulsar topic as a changelog stream that interprets records with keys as upsert (aka insert/update) events.</li> + <li>As a part of the real time pipeline, join multiple streams for enrichment and store results into a Pulsar topic for further calculation later. However, the result may contain update events.</li> + <li>As a part of the real time pipeline, aggregate on data streams and store results into a Pulsar topic for further calculation later. However, the result may contain update events.</li> +</ul> + +<p>Based on the requirements, we add support for Upsert Pulsar. The upsert-pulsar connector allows for reading data from and writing data to Pulsar topics in the upsert fashion.</p> + +<ul> + <li> + <p>As a source, the upsert-pulsar connector produces a changelog stream, where each data record represents an update or delete event. More precisely, the value in a data record is interpreted as an UPDATE of the last value for the same key, if any (if a corresponding key does not exist yet, the update will be considered an INSERT). Using the table analogy, a data record in a changelog stream is interpreted as an UPSERT (aka INSERT/UPDATE) because any existing row with the same [...] + </li> + <li> + <p>As a sink, the upsert-pulsar connector can consume a changelog stream. It will write INSERT/UPDATE_AFTER data as normal Pulsar message values and write DELETE data as Pulsar message with null values (indicate tombstone for the key). Flink will guarantee the message ordering on the primary key by partitioning data on the values of the primary key columns, so the update/deletion messages on the same key will fall into the same partition.</p> + </li> +</ul> + +<h3 id="support-new-source-interface-and-table-api-introduced-in-flip-27httpscwikiapacheorgconfluencedisplayflinkflip-273arefactorsourceinterfaceflip27refactorsourceinterface-batchandstreamingunification-and-flip-95httpscwikiapacheorgconfluencedisplayflinkflip-953anewtablesourceandtablesinkinterfaces">Support new source interface and Table API introduced in <a href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface#FLIP27:Refac [...] +<p>This feature unifies the source of the batch stream and optimizes the mechanism for task discovery and data reading. It is also the cornerstone of our implementation of Pulsar batch and streaming unification. The new Table API supports DDL computed columns, watermarks and metadata.</p> + +<h3 id="support-sql-read-and-write-metadata-as-described-in-flip-107httpscwikiapacheorgconfluencedisplayflinkflip-1073ahandlingofmetadatainsqlconnectors">Support SQL read and write metadata as described in <a href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-107%3A+Handling+of+metadata+in+SQL+connectors">FLIP-107</a></h3> +<p>FLIP-107 enables users to access connector metadata as a metadata column in table definitions. In real-time computing, users normally need additional information, such as eventTime, or customized fields. The Pulsar Flink connector supports SQL read and write metadata, so it is flexible and easy for users to manage metadata of Pulsar messages in the Pulsar Flink Connector 2.7.0. For details on the configuration, refer to <a href="https://github.com/streamnative/pulsar-fli [...] + +<h3 id="add-flink-format-type-atomic-to-support-pulsar-primitive-types">Add Flink format type <code>atomic</code> to support Pulsar primitive types</h3> +<p>In the Pulsar Flink Connector 2.7.0, we add Flink format type <code>atomic</code> to support Pulsar primitive types. When processing with Flink requires a Pulsar primitive type, you can use <code>atomic</code> as the connector format. You can find more information on Pulsar primitive types <a href="https://pulsar.apache.org/docs/en/schema-understand/">here</a>.</p> + +<h2 id="migration">Migration</h2> +<p>If you’re using the previous Pulsar Flink Connector version, you need to adjust your SQL and API parameters accordingly. Below we provide details on each.</p> + +<h2 id="sql">SQL</h2> +<p>In SQL, we’ve changed the Pulsar configuration parameters in the DDL declaration. The name of some parameters are changed, but the values are not.<br /> +- Remove the <code>connector.</code> prefix from the parameter names. +- Change the name of the <code>connector.type</code> parameter into <code>connector</code>. +- Change the startup mode parameter name from <code>connector.startup-mode</code> into <code>scan.startup.mode</code>. +- Adjust Pulsar properties as <code>properties.pulsar.reader.readername=testReaderName</code>.</p> + +<p>If you use SQL in the Pulsar Flink Connector, you need to adjust your SQL configuration accordingly when migrating to Pulsar Flink Connector 2.7.0. The following sample shows the differences between previous versions and the 2.7.0 version for SQL.</p> + +<p>SQL in previous versions:</p> + +<div class="highlight"><pre><code>create table topic1( + `rip` VARCHAR, + `rtime` VARCHAR, + `uid` bigint, + `client_ip` VARCHAR, + `day` as TO_DATE(rtime), + `hour` as date_format(rtime,'HH') +) with ( + 'connector.type' ='pulsar', + 'connector.version' = '1', + 'connector.topic' ='persistent://public/default/test_flink_sql', + 'connector.service-url' ='pulsar://xxx', + 'connector.admin-url' ='http://xxx', + 'connector.startup-mode' ='earliest', + 'connector.properties.0.key' ='pulsar.reader.readerName', + 'connector.properties.0.value' ='testReaderName', + 'format.type' ='json', + 'update-mode' ='append' +); +</code></pre></div> + +<p>SQL in Pulsar Flink Connector 2.7.0:</p> + +<div class="highlight"><pre><code>create table topic1( + `rip` VARCHAR, + `rtime` VARCHAR, + `uid` bigint, + `client_ip` VARCHAR, + `day` as TO_DATE(rtime), + `hour` as date_format(rtime,'HH') +) with ( + 'connector' ='pulsar', + 'topic' ='persistent://public/default/test_flink_sql', + 'service-url' ='pulsar://xxx', + 'admin-url' ='http://xxx', + 'scan.startup.mode' ='earliest', + 'properties.pulsar.reader.readername' = 'testReaderName', + 'format' ='json' +); +</code></pre></div> + +<h2 id="api">API</h2> +<p>From an API perspective, we adjusted some classes and enabled easier customization.</p> + +<ul> + <li>To solve serialization issues, we changed the signature of the construction method <code>FlinkPulsarSink</code>, and added <code>PulsarSerializationSchema</code>.</li> + <li>We removed inappropriate classes related to row, such as <code>FlinkPulsarRowSink</code>, <code>FlinkPulsarRowSource</code>. If you need to deal with Row formats, you can use Apache Flink’s Row related serialization components.</li> +</ul> + +<p>You can build <code>PulsarSerializationSchema</code> by using <code>PulsarSerializationSchemaWrapper.Builder</code>. <code>TopicKeyExtractor</code> is moved into <code>PulsarSerializationSchemaWrapper</code>. When you adjust your API, you can take the following sample as reference.</p> + +<div class="highlight"><pre><code>new PulsarSerializationSchemaWrapper.Builder&lt;&gt;(new SimpleStringSchema()) + .setTopicExtractor(str -&gt; getTopic(str)) + .build(); +</code></pre></div> + +<h2 id="future-plan">Future Plan</h2> +<p>Future plans involve the design of a batch and stream solution integrated with Pulsar Source, based on the new Flink Source API (FLIP-27). The new solution will overcome the limitations of the current streaming source interface (SourceFunction) and simultaneously unify the source interfaces between the batch and streaming APIs.</p> + +<p>Pulsar offers a hierarchical architecture where data is divided into streaming, batch, and cold data, which enables Pulsar to provide infinite capacity. This makes Pulsar an ideal solution for unified batch and streaming.</p> + +<p>The batch and stream solution based on the new Flink Source API is divided into two simple parts: SplitEnumerator and Reader. SplitEnumerator discovers and assigns partitions, and Reader reads data from the partition.</p> + +<p><br /></p> +<div class="row front-graphic"> + <img src="/img/blog/2021-01-07-pulsar-flink/pulsar-flink-batch-stream.png" width="640px" alt="Batch and Stream Solution with Apache Pulsar and Apache Flink" /> +</div> + +<p>Apache Pulsar stores messages in the ledger block for users to locate the ledgers through Pulsar admin, and then provide broker partition, BookKeeper partition, Offloader partition, and other information through different partitioning policies. For more details, you can refer <a href="https://github.com/streamnative/pulsar-flink/issues/187">here</a>.</p> + +<h2 id="conclusion">Conclusion</h2> +<p>The latest version of the Pulsar Flink Connector is now available and we encourage everyone to use/upgrade to the Pulsar Flink Connector 2.7.0. The new version provides significant user enhancements, enabled by various features in Pulsar 2.7 and Flink 1.12. We will be contributing the Pulsar Flink Connector 2.7.0 to the <a href="https://github.com/apache/flink/">Apache Flink repository</a> soon. If you have any questions or concerns about the Pulsar Flink C [...] +</description> +<pubDate>Thu, 07 Jan 2021 09:00:00 +0100</pubDate> +<link>https://flink.apache.org/2021/01/07/pulsar-flink-connector-270.html</link> +<guid isPermaLink="true">/2021/01/07/pulsar-flink-connector-270.html</guid> +</item> + +<item> <title>Stateful Functions 2.2.2 Release Announcement</title> <description><p>The Apache Flink community released the second bugfix release of the Stateful Functions (StateFun) 2.2 series, version 2.2.2.</p> @@ -18507,220 +18671,5 @@ If you have, for example, a flatMap() operator that keeps a running aggregate pe <guid isPermaLink="true">/news/2017/02/06/release-1.2.0.html</guid> </item> -<item> -<title>Apache Flink 1.1.4 Released</title> -<description><p>The Apache Flink community released the next bugfix version of the Apache Flink 1.1 series.</p> - -<p>This release includes major robustness improvements for checkpoint cleanup on failures and consumption of intermediate streams. We highly recommend all users to upgrade to Flink 1.1.4.</p> - -<div class="highlight"><pre><code class="language-xml"><span class="nt">&lt;dependency&gt;</span> - <span class="nt">&lt;groupId&gt;</span>org.apache.flink<span class="nt">&lt;/groupId&gt;</span> - <span class="nt">&lt;artifactId&gt;</span>flink-java<span class="nt">&lt;/artifactId&gt;</span> - <span class="nt">&lt;version&gt;</span>1.1.4<span class="nt">&lt;/version&gt;</span> -<span class="nt">&lt;/dependency&gt;</span> -<span class="nt">&lt;dependency&gt;</span> - <span class="nt">&lt;groupId&gt;</span>org.apache.flink<span class="nt">&lt;/groupId&gt;</span> - <span class="nt">&lt;artifactId&gt;</span>flink-streaming-java_2.10<span class="nt">&lt;/artifactId&gt;</span> - <span class="nt">&lt;version&gt;</span>1.1.4<span class="nt">&lt;/version&gt;</span> -<span class="nt">&lt;/dependency&gt;</span> -<span class="nt">&lt;dependency&gt;</span> - <span class="nt">&lt;groupId&gt;</span>org.apache.flink<span class="nt">&lt;/groupId&gt;</span> - <span class="nt">&lt;artifactId&gt;</span>flink-clients_2.10<span class="nt">&lt;/artifactId&gt;</span> - <span class="nt">&lt;version&gt;</span>1.1.4<span class="nt">&lt;/version&gt;</span> -<span class="nt">&lt;/dependency&gt;</span></code></pre></div> - -<p>You can find the binaries on the updated <a href="http://flink.apache.org/downloads.html">Downloads page</a>.</p> - -<h2 id="note-for-rocksdb-backend-users">Note for RocksDB Backend Users</h2> - -<p>We updated Flink’s RocksDB dependency version from <code>4.5.1</code> to <code>4.11.2</code>. Between these versions some of RocksDB’s internal configuration defaults changed that would affect the memory footprint of running Flink with RocksDB. Therefore, we manually reset them to the previous defaults. If you want to run with the new Rocks 4.11.2 defaults, you can do this via:</p> - -<div class="highlight"><pre><code class="language-java"><span class="n">RocksDBStateBackend</span> <span class="n">backend</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">RocksDBStateBackend</span><span class="o">(</span><span class="s">&quot;...&quot;</span><s [...] -<span class="c1">// Use the new default options. Otherwise, the default for RocksDB 4.5.1</span> -<span class="c1">// `PredefinedOptions.DEFAULT_ROCKS_4_5_1` will be used.</span> -<span class="n">backend</span><span class="o">.</span><span class="na">setPredefinedOptions</span><span class="o">(</span><span class="n">PredefinedOptions</span><span class="o">.</span><span class="na">DEFAULT</span><span class="o">);</span></code></pre></div> - -<h2 id="release-notes---flink---version-114">Release Notes - Flink - Version 1.1.4</h2> - -<h3 id="sub-task">Sub-task</h3> -<ul> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4510">FLINK-4510</a>] - Always create CheckpointCoordinator -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4984">FLINK-4984</a>] - Add Cancellation Barriers to BarrierTracker and BarrierBuffer -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4985">FLINK-4985</a>] - Report Declined/Canceled Checkpoints to Checkpoint Coordinator -</li> -</ul> - -<h3 id="bug">Bug</h3> -<ul> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-2662">FLINK-2662</a>] - CompilerException: &quot;Bug: Plan generation for Unions picked a ship strategy between binary plan operators.&quot; -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-3680">FLINK-3680</a>] - Remove or improve (not set) text in the Job Plan UI -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-3813">FLINK-3813</a>] - YARNSessionFIFOITCase.testDetachedMode failed on Travis -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4108">FLINK-4108</a>] - NPE in Row.productArity -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4506">FLINK-4506</a>] - CsvOutputFormat defaults allowNullValues to false, even though doc and declaration says true -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4581">FLINK-4581</a>] - Table API throws &quot;No suitable driver found for jdbc:calcite&quot; -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4586">FLINK-4586</a>] - NumberSequenceIterator and Accumulator threading issue -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4619">FLINK-4619</a>] - JobManager does not answer to client when restore from savepoint fails -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4727">FLINK-4727</a>] - Kafka 0.9 Consumer should also checkpoint auto retrieved offsets even when no data is read -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4862">FLINK-4862</a>] - NPE on EventTimeSessionWindows with ContinuousEventTimeTrigger -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4932">FLINK-4932</a>] - Don&#39;t let ExecutionGraph fail when in state Restarting -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4933">FLINK-4933</a>] - ExecutionGraph.scheduleOrUpdateConsumers can fail the ExecutionGraph -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4977">FLINK-4977</a>] - Enum serialization does not work in all cases -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4991">FLINK-4991</a>] - TestTask hangs in testWatchDogInterruptsTask -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4998">FLINK-4998</a>] - ResourceManager fails when num task slots &gt; Yarn vcores -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5013">FLINK-5013</a>] - Flink Kinesis connector doesn&#39;t work on old EMR versions -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5028">FLINK-5028</a>] - Stream Tasks must not go through clean shutdown logic on cancellation -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5038">FLINK-5038</a>] - Errors in the &quot;cancelTask&quot; method prevent closeables from being closed early -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5039">FLINK-5039</a>] - Avro GenericRecord support is broken -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5040">FLINK-5040</a>] - Set correct input channel types with eager scheduling -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5050">FLINK-5050</a>] - JSON.org license is CatX -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5057">FLINK-5057</a>] - Cancellation timeouts are picked from wrong config -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5058">FLINK-5058</a>] - taskManagerMemory attribute set wrong value in FlinkShell -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5063">FLINK-5063</a>] - State handles are not properly cleaned up for declined or expired checkpoints -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5073">FLINK-5073</a>] - ZooKeeperCompleteCheckpointStore executes blocking delete operation in ZooKeeper client thread -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5075">FLINK-5075</a>] - Kinesis consumer incorrectly determines shards as newly discovered when tested against Kinesalite -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5082">FLINK-5082</a>] - Pull ExecutionService lifecycle management out of the JobManager -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5085">FLINK-5085</a>] - Execute CheckpointCoodinator&#39;s state discard calls asynchronously -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5114">FLINK-5114</a>] - PartitionState update with finished execution fails -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5142">FLINK-5142</a>] - Resource leak in CheckpointCoordinator -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5149">FLINK-5149</a>] - ContinuousEventTimeTrigger doesn&#39;t fire at the end of the window -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5154">FLINK-5154</a>] - Duplicate TypeSerializer when writing RocksDB Snapshot -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5158">FLINK-5158</a>] - Handle ZooKeeperCompletedCheckpointStore exceptions in CheckpointCoordinator -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5172">FLINK-5172</a>] - In RocksDBStateBackend, set flink-core and flink-streaming-java to &quot;provided&quot; -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5173">FLINK-5173</a>] - Upgrade RocksDB dependency -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5184">FLINK-5184</a>] - Error result of compareSerialized in RowComparator class -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5193">FLINK-5193</a>] - Recovering all jobs fails completely if a single recovery fails -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5197">FLINK-5197</a>] - Late JobStatusChanged messages can interfere with running jobs -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5214">FLINK-5214</a>] - Clean up checkpoint files when failing checkpoint operation on TM -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5215">FLINK-5215</a>] - Close checkpoint streams upon cancellation -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5216">FLINK-5216</a>] - CheckpointCoordinator&#39;s &#39;minPauseBetweenCheckpoints&#39; refers to checkpoint start rather then checkpoint completion -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5218">FLINK-5218</a>] - Eagerly close checkpoint streams on cancellation -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5228">FLINK-5228</a>] - LocalInputChannel re-trigger request and release deadlock -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5229">FLINK-5229</a>] - Cleanup StreamTaskStates if a checkpoint operation of a subsequent operator fails -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5246">FLINK-5246</a>] - Don&#39;t discard unknown checkpoint messages in the CheckpointCoordinator -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5248">FLINK-5248</a>] - SavepointITCase doesn&#39;t catch savepoint restore failure -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5274">FLINK-5274</a>] - LocalInputChannel throws NPE if partition reader is released -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5275">FLINK-5275</a>] - InputChanelDeploymentDescriptors throws misleading Exception if producer failed/cancelled -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5276">FLINK-5276</a>] - ExecutionVertex archiving can throw NPE with many previous attempts -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5285">FLINK-5285</a>] - CancelCheckpointMarker flood when using at least once mode -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5326">FLINK-5326</a>] - IllegalStateException: Bug in Netty consumer logic: reader queue got notified by partition about available data, but none was available -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5352">FLINK-5352</a>] - Restore RocksDB 1.1.3 memory behavior -</li> -</ul> - -<h3 id="improvement">Improvement</h3> -<ul> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-3347">FLINK-3347</a>] - TaskManager (or its ActorSystem) need to restart in case they notice quarantine -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-3787">FLINK-3787</a>] - Yarn client does not report unfulfillable container constraints -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4445">FLINK-4445</a>] - Ignore unmatched state when restoring from savepoint -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4715">FLINK-4715</a>] - TaskManager should commit suicide after cancellation failure -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4894">FLINK-4894</a>] - Don&#39;t block on buffer request after broadcastEvent -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4975">FLINK-4975</a>] - Add a limit for how much data may be buffered during checkpoint alignment -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4996">FLINK-4996</a>] - Make CrossHint @Public -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5046">FLINK-5046</a>] - Avoid redundant serialization when creating the TaskDeploymentDescriptor -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5123">FLINK-5123</a>] - Add description how to do proper shading to Flink docs. -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5169">FLINK-5169</a>] - Make consumption of input channels fair -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5192">FLINK-5192</a>] - Provide better log config templates -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5194">FLINK-5194</a>] - Log heartbeats on TRACE level -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5196">FLINK-5196</a>] - Don&#39;t log InputChannelDescriptor -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5198">FLINK-5198</a>] - Overwrite TaskState toString -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5199">FLINK-5199</a>] - Improve logging of submitted job graph actions in HA case -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5201">FLINK-5201</a>] - Promote loaded config properties to INFO -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5207">FLINK-5207</a>] - Decrease HadoopFileSystem logging -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5249">FLINK-5249</a>] - description of datastream rescaling doesn&#39;t match the figure -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5259">FLINK-5259</a>] - wrong execution environment in retry delays example -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-5278">FLINK-5278</a>] - Improve Task and checkpoint logging -</li> -</ul> - -<h3 id="new-feature">New Feature</h3> -<ul> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4976">FLINK-4976</a>] - Add a way to abort in flight checkpoints -</li> -</ul> - -<h3 id="task">Task</h3> -<ul> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-4778">FLINK-4778</a>] - Update program example in /docs/setup/cli.md due to the change in FLINK-2021 -</li> -</ul> - -</description> -<pubDate>Wed, 21 Dec 2016 10:00:00 +0100</pubDate> -<link>https://flink.apache.org/news/2016/12/21/release-1.1.4.html</link> -<guid isPermaLink="true">/news/2016/12/21/release-1.1.4.html</guid> -</item> - </channel> </rss> diff --git a/content/blog/index.html b/content/blog/index.html index 9e36160..0579f4d 100644 --- a/content/blog/index.html +++ b/content/blog/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></h2> + + <p>07 Jan 2021 + Jianyun Zhao (<a href="https://twitter.com/yihy8023">@yihy8023</a>) & Jennifer Huang (<a href="https://twitter.com/Jennife06125739">@Jennife06125739</a>)</p> + + <p>With the unification of batch and streaming regarded as the future in data processing, the Pulsar Flink Connector provides an ideal solution for unified batch and stream processing with Apache Pulsar and Apache Flink. The Pulsar Flink Connector 2.7.0 supports features in Pulsar 2.7 and Flink 1.12 and is fully compatible with Flink's data format. The Pulsar Flink Connector 2.7.0 will be contributed to the Flink repository soon and the contribution process is ongoing.</p> + + <p><a href="/2021/01/07/pulsar-flink-connector-270.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></h2> <p>02 Jan 2021 @@ -331,19 +344,6 @@ as well as increased observability for operational purposes.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2020/09/04/community-update.html">Flink Community Update - August'20</a></h2> - - <p>04 Sep 2020 - Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> - - <p>Ah, so much for a quiet August month. This time around, we bring you some new Flink Improvement Proposals (FLIPs), a preview of the upcoming Flink Stateful Functions 2.2 release and a look into how far Flink has come in comparison to 2019.</p> - - <p><a href="/news/2020/09/04/community-update.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -376,6 +376,16 @@ as well as increased observability for operational purposes.</p> <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page10/index.html b/content/blog/page10/index.html index 04d57cf..6d689b8 100644 --- a/content/blog/page10/index.html +++ b/content/blog/page10/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2017/08/05/release-1.3.2.html">Apache Flink 1.3.2 Released</a></h2> + + <p>05 Aug 2017 + </p> + + <p><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.3 series.</p> + +</p> + + <p><a href="/news/2017/08/05/release-1.3.2.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/features/2017/07/04/flink-rescalable-state.html">A Deep Dive into Rescalable State in Apache Flink</a></h2> <p>04 Jul 2017 by Stefan Richter (<a href="https://twitter.com/">@StefanRRichter</a>) @@ -328,21 +343,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/news/2016/12/21/release-1.1.4.html">Apache Flink 1.1.4 Released</a></h2> - - <p>21 Dec 2016 - </p> - - <p><p>The Apache Flink community released the next bugfix version of the Apache Flink 1.1 series.</p> - -</p> - - <p><a href="/news/2016/12/21/release-1.1.4.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -375,6 +375,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page11/index.html b/content/blog/page11/index.html index fabeb40..dc0871b 100644 --- a/content/blog/page11/index.html +++ b/content/blog/page11/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2016/12/21/release-1.1.4.html">Apache Flink 1.1.4 Released</a></h2> + + <p>21 Dec 2016 + </p> + + <p><p>The Apache Flink community released the next bugfix version of the Apache Flink 1.1 series.</p> + +</p> + + <p><a href="/news/2016/12/21/release-1.1.4.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2016/12/19/2016-year-in-review.html">Apache Flink in 2016: Year in Review</a></h2> <p>19 Dec 2016 by Mike Winters @@ -332,21 +347,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/news/2016/04/14/flink-forward-announce.html">Flink Forward 2016 Call for Submissions Is Now Open</a></h2> - - <p>14 Apr 2016 by Aljoscha Krettek (<a href="https://twitter.com/">@aljoscha</a>) - </p> - - <p><p>We are happy to announce that the call for submissions for Flink Forward 2016 is now open! The conference will take place September 12-14, 2016 in Berlin, Germany, bringing together the open source stream processing community. Most Apache Flink committers will attend the conference, making it the ideal venue to learn more about the project and its roadmap and connect with the community.</p> - -</p> - - <p><a href="/news/2016/04/14/flink-forward-announce.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -379,6 +379,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page12/index.html b/content/blog/page12/index.html index 660d1a4..b69d74a 100644 --- a/content/blog/page12/index.html +++ b/content/blog/page12/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2016/04/14/flink-forward-announce.html">Flink Forward 2016 Call for Submissions Is Now Open</a></h2> + + <p>14 Apr 2016 by Aljoscha Krettek (<a href="https://twitter.com/">@aljoscha</a>) + </p> + + <p><p>We are happy to announce that the call for submissions for Flink Forward 2016 is now open! The conference will take place September 12-14, 2016 in Berlin, Germany, bringing together the open source stream processing community. Most Apache Flink committers will attend the conference, making it the ideal venue to learn more about the project and its roadmap and connect with the community.</p> + +</p> + + <p><a href="/news/2016/04/14/flink-forward-announce.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2016/04/06/cep-monitoring.html">Introducing Complex Event Processing (CEP) with Apache Flink</a></h2> <p>06 Apr 2016 by Till Rohrmann (<a href="https://twitter.com/">@stsffap</a>) @@ -328,20 +343,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/news/2015/09/16/off-heap-memory.html">Off-heap Memory in Apache Flink and the curious JIT compiler</a></h2> - - <p>16 Sep 2015 by Stephan Ewen (<a href="https://twitter.com/">@stephanewen</a>) - </p> - - <p><p>Running data-intensive code in the JVM and making it well-behaved is tricky. Systems that put billions of data objects naively onto the JVM heap face unpredictable OutOfMemoryErrors and Garbage Collection stalls. Of course, you still want to to keep your data in memory as much as possible, for speed and responsiveness of the processing applications. In that context, "off-heap" has become almost something like a magic word to solve these problems.</p> -<p>In this blog post, we will look at how Flink exploits off-heap memory. The feature is part of the upcoming release, but you can try it out with the latest nightly builds. We will also give a few interesting insights into the behavior for Java's JIT compiler for highly optimized methods and loops.</p></p> - - <p><a href="/news/2015/09/16/off-heap-memory.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -374,6 +375,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page13/index.html b/content/blog/page13/index.html index 3e99720..a58cf6f 100644 --- a/content/blog/page13/index.html +++ b/content/blog/page13/index.html @@ -201,6 +201,20 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2015/09/16/off-heap-memory.html">Off-heap Memory in Apache Flink and the curious JIT compiler</a></h2> + + <p>16 Sep 2015 by Stephan Ewen (<a href="https://twitter.com/">@stephanewen</a>) + </p> + + <p><p>Running data-intensive code in the JVM and making it well-behaved is tricky. Systems that put billions of data objects naively onto the JVM heap face unpredictable OutOfMemoryErrors and Garbage Collection stalls. Of course, you still want to to keep your data in memory as much as possible, for speed and responsiveness of the processing applications. In that context, "off-heap" has become almost something like a magic word to solve these problems.</p> +<p>In this blog post, we will look at how Flink exploits off-heap memory. The feature is part of the upcoming release, but you can try it out with the latest nightly builds. We will also give a few interesting insights into the behavior for Java's JIT compiler for highly optimized methods and loops.</p></p> + + <p><a href="/news/2015/09/16/off-heap-memory.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2015/09/03/flink-forward.html">Announcing Flink Forward 2015</a></h2> <p>03 Sep 2015 @@ -342,24 +356,6 @@ release is a preview release that contains known issues.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2015/03/02/february-2015-in-flink.html">February 2015 in the Flink community</a></h2> - - <p>02 Mar 2015 - </p> - - <p><p>February might be the shortest month of the year, but this does not -mean that the Flink community has not been busy adding features to the -system and fixing bugs. Here’s a rundown of the activity in the Flink -community last month.</p> - -</p> - - <p><a href="/news/2015/03/02/february-2015-in-flink.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -392,6 +388,16 @@ community last month.</p> <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page14/index.html b/content/blog/page14/index.html index 84681e0..e31fcbc 100644 --- a/content/blog/page14/index.html +++ b/content/blog/page14/index.html @@ -201,6 +201,24 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2015/03/02/february-2015-in-flink.html">February 2015 in the Flink community</a></h2> + + <p>02 Mar 2015 + </p> + + <p><p>February might be the shortest month of the year, but this does not +mean that the Flink community has not been busy adding features to the +system and fixing bugs. Here’s a rundown of the activity in the Flink +community last month.</p> + +</p> + + <p><a href="/news/2015/03/02/february-2015-in-flink.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2015/02/09/streaming-example.html">Introducing Flink Streaming</a></h2> <p>09 Feb 2015 @@ -374,6 +392,16 @@ academic and open source project that Flink originates from.</p> <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page2/index.html b/content/blog/page2/index.html index f1ff66c..30826ee 100644 --- a/content/blog/page2/index.html +++ b/content/blog/page2/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2020/09/04/community-update.html">Flink Community Update - August'20</a></h2> + + <p>04 Sep 2020 + Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> + + <p>Ah, so much for a quiet August month. This time around, we bring you some new Flink Improvement Proposals (FLIPs), a preview of the upcoming Flink Stateful Functions 2.2 release and a look into how far Flink has come in comparison to 2019.</p> + + <p><a href="/news/2020/09/04/community-update.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/2020/09/01/flink-1.11-memory-management-improvements.html">Memory Management improvements for Flink’s JobManager in Apache Flink 1.11</a></h2> <p>01 Sep 2020 @@ -321,21 +334,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/2020/07/23/catalogs.html">Sharing is caring - Catalogs in Flink SQL</a></h2> - - <p>23 Jul 2020 - Dawid Wysakowicz (<a href="https://twitter.com/dwysakowicz">@dwysakowicz</a>)</p> - - <p><p>With an ever-growing number of people working with data, it’s a common practice for companies to build self-service platforms with the goal of democratizing their access across different teams and — especially — to enable users from any background to be independent in their data needs. In such environments, metadata management becomes a crucial aspect. Without it, users often work blindly, spending too much time searching for datasets and their location, figuring out data for [...] - -</p> - - <p><a href="/2020/07/23/catalogs.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -368,6 +366,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page3/index.html b/content/blog/page3/index.html index b86d642..8285769 100644 --- a/content/blog/page3/index.html +++ b/content/blog/page3/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2020/07/23/catalogs.html">Sharing is caring - Catalogs in Flink SQL</a></h2> + + <p>23 Jul 2020 + Dawid Wysakowicz (<a href="https://twitter.com/dwysakowicz">@dwysakowicz</a>)</p> + + <p><p>With an ever-growing number of people working with data, it’s a common practice for companies to build self-service platforms with the goal of democratizing their access across different teams and — especially — to enable users from any background to be independent in their data needs. In such environments, metadata management becomes a crucial aspect. Without it, users often work blindly, spending too much time searching for datasets and their location, figuring out data for [...] + +</p> + + <p><a href="/2020/07/23/catalogs.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2020/07/21/release-1.11.1.html">Apache Flink 1.11.1 Released</a></h2> <p>21 Jul 2020 @@ -337,19 +352,6 @@ and provide a tutorial for running Streaming ETL with Flink on Zeppelin.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2020/05/04/season-of-docs.html">Applying to Google Season of Docs 2020</a></h2> - - <p>04 May 2020 - Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> - - <p>The Flink community is thrilled to share that the project is applying again to Google Season of Docs (GSoD) this year! If you’re unfamiliar with the program, GSoD is a great initiative organized by Google Open Source to pair technical writers with mentors to work on documentation for open source projects. Does working shoulder to shoulder with the Flink community on documentation sound exciting? We’d love to hear from you!</p> - - <p><a href="/news/2020/05/04/season-of-docs.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -382,6 +384,16 @@ and provide a tutorial for running Streaming ETL with Flink on Zeppelin.</p> <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page4/index.html b/content/blog/page4/index.html index 599f53f..8f73a86 100644 --- a/content/blog/page4/index.html +++ b/content/blog/page4/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2020/05/04/season-of-docs.html">Applying to Google Season of Docs 2020</a></h2> + + <p>04 May 2020 + Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> + + <p>The Flink community is thrilled to share that the project is applying again to Google Season of Docs (GSoD) this year! If you’re unfamiliar with the program, GSoD is a great initiative organized by Google Open Source to pair technical writers with mentors to work on documentation for open source projects. Does working shoulder to shoulder with the Flink community on documentation sound exciting? We’d love to hear from you!</p> + + <p><a href="/news/2020/05/04/season-of-docs.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2020/04/24/release-1.9.3.html">Apache Flink 1.9.3 Released</a></h2> <p>24 Apr 2020 @@ -324,19 +337,6 @@ This release marks a big milestone: Stateful Functions 2.0 is not only an API up <hr> - <article> - <h2 class="blog-title"><a href="/news/2020/02/20/ddl.html">No Java Required: Configuring Sources and Sinks in SQL</a></h2> - - <p>20 Feb 2020 - Seth Wiesman (<a href="https://twitter.com/sjwiesman">@sjwiesman</a>)</p> - - <p>This post discusses the efforts of the Flink community as they relate to end to end applications with SQL in Apache Flink.</p> - - <p><a href="/news/2020/02/20/ddl.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -369,6 +369,16 @@ This release marks a big milestone: Stateful Functions 2.0 is not only an API up <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page5/index.html b/content/blog/page5/index.html index 531f6ae..84055fc 100644 --- a/content/blog/page5/index.html +++ b/content/blog/page5/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2020/02/20/ddl.html">No Java Required: Configuring Sources and Sinks in SQL</a></h2> + + <p>20 Feb 2020 + Seth Wiesman (<a href="https://twitter.com/sjwiesman">@sjwiesman</a>)</p> + + <p>This post discusses the efforts of the Flink community as they relate to end to end applications with SQL in Apache Flink.</p> + + <p><a href="/news/2020/02/20/ddl.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2020/02/11/release-1.10.0.html">Apache Flink 1.10.0 Release Announcement</a></h2> <p>11 Feb 2020 @@ -325,19 +338,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/feature/2019/09/13/state-processor-api.html">The State Processor API: How to Read, write and modify the state of Flink applications</a></h2> - - <p>13 Sep 2019 - Seth Wiesman (<a href="https://twitter.com/sjwiesman">@sjwiesman</a>) & Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p> - - <p>This post explores the State Processor API, introduced with Flink 1.9.0, why this feature is a big step for Flink, what you can use it for, how to use it and explores some future directions that align the feature with Apache Flink's evolution into a system for unified batch and stream processing.</p> - - <p><a href="/feature/2019/09/13/state-processor-api.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -370,6 +370,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page6/index.html b/content/blog/page6/index.html index 6e85b8a..63b7e8a 100644 --- a/content/blog/page6/index.html +++ b/content/blog/page6/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/feature/2019/09/13/state-processor-api.html">The State Processor API: How to Read, write and modify the state of Flink applications</a></h2> + + <p>13 Sep 2019 + Seth Wiesman (<a href="https://twitter.com/sjwiesman">@sjwiesman</a>) & Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p> + + <p>This post explores the State Processor API, introduced with Flink 1.9.0, why this feature is a big step for Flink, what you can use it for, how to use it and explores some future directions that align the feature with Apache Flink's evolution into a system for unified batch and stream processing.</p> + + <p><a href="/feature/2019/09/13/state-processor-api.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2019/09/11/release-1.8.2.html">Apache Flink 1.8.2 Released</a></h2> <p>11 Sep 2019 @@ -324,19 +337,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/2019/05/03/pulsar-flink.html">When Flink & Pulsar Come Together</a></h2> - - <p>03 May 2019 - Sijie Guo (<a href="https://twitter.com/sijieg">@sijieg</a>)</p> - - <p>Apache Flink and Apache Pulsar are distributed data processing systems. When combined, they offer elastic data processing at large scale. This post describes how Pulsar and Flink can work together to provide a seamless developer experience.</p> - - <p><a href="/2019/05/03/pulsar-flink.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -369,6 +369,16 @@ <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page7/index.html b/content/blog/page7/index.html index e5a8759..aac7ac0 100644 --- a/content/blog/page7/index.html +++ b/content/blog/page7/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2019/05/03/pulsar-flink.html">When Flink & Pulsar Come Together</a></h2> + + <p>03 May 2019 + Sijie Guo (<a href="https://twitter.com/sijieg">@sijieg</a>)</p> + + <p>Apache Flink and Apache Pulsar are distributed data processing systems. When combined, they offer elastic data processing at large scale. This post describes how Pulsar and Flink can work together to provide a seamless developer experience.</p> + + <p><a href="/2019/05/03/pulsar-flink.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2019/04/17/sod.html">Apache Flink's Application to Season of Docs</a></h2> <p>17 Apr 2019 @@ -331,21 +344,6 @@ for more details.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2018/12/22/release-1.6.3.html">Apache Flink 1.6.3 Released</a></h2> - - <p>22 Dec 2018 - </p> - - <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.6 series.</p> - -</p> - - <p><a href="/news/2018/12/22/release-1.6.3.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -378,6 +376,16 @@ for more details.</p> <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page8/index.html b/content/blog/page8/index.html index ac5a1f3..ab728e1 100644 --- a/content/blog/page8/index.html +++ b/content/blog/page8/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2018/12/22/release-1.6.3.html">Apache Flink 1.6.3 Released</a></h2> + + <p>22 Dec 2018 + </p> + + <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.6 series.</p> + +</p> + + <p><a href="/news/2018/12/22/release-1.6.3.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2018/12/21/release-1.7.1.html">Apache Flink 1.7.1 Released</a></h2> <p>21 Dec 2018 @@ -337,21 +352,6 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa <hr> - <article> - <h2 class="blog-title"><a href="/news/2018/07/12/release-1.5.1.html">Apache Flink 1.5.1 Released</a></h2> - - <p>12 Jul 2018 - </p> - - <p><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.5 series.</p> - -</p> - - <p><a href="/news/2018/07/12/release-1.5.1.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -384,6 +384,16 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/blog/page9/index.html b/content/blog/page9/index.html index f380a00..5d6f865 100644 --- a/content/blog/page9/index.html +++ b/content/blog/page9/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2018/07/12/release-1.5.1.html">Apache Flink 1.5.1 Released</a></h2> + + <p>12 Jul 2018 + </p> + + <p><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.5 series.</p> + +</p> + + <p><a href="/news/2018/07/12/release-1.5.1.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2018/05/25/release-1.5.0.html">Apache Flink 1.5.0 Release Announcement</a></h2> <p>25 May 2018 @@ -334,21 +349,6 @@ what’s coming in Flink 1.4.0 as well as a preview of what the Flink community <hr> - <article> - <h2 class="blog-title"><a href="/news/2017/08/05/release-1.3.2.html">Apache Flink 1.3.2 Released</a></h2> - - <p>05 Aug 2017 - </p> - - <p><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.3 series.</p> - -</p> - - <p><a href="/news/2017/08/05/release-1.3.2.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -381,6 +381,16 @@ what’s coming in Flink 1.4.0 as well as a preview of what the Flink community <ul id="markdown-toc"> + <li><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></li> + + + + + + + + + <li><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></li> diff --git a/content/img/blog/2021-01-07-pulsar-flink/pulsar-flink-batch-stream.png b/content/img/blog/2021-01-07-pulsar-flink/pulsar-flink-batch-stream.png new file mode 100755 index 0000000..2f167a8 Binary files /dev/null and b/content/img/blog/2021-01-07-pulsar-flink/pulsar-flink-batch-stream.png differ diff --git a/content/img/blog/2021-01-07-pulsar-flink/pulsar-key-shared.png b/content/img/blog/2021-01-07-pulsar-flink/pulsar-key-shared.png new file mode 100755 index 0000000..e07027c Binary files /dev/null and b/content/img/blog/2021-01-07-pulsar-flink/pulsar-key-shared.png differ diff --git a/content/index.html b/content/index.html index ad0aa3e..0665bb9 100644 --- a/content/index.html +++ b/content/index.html @@ -573,6 +573,9 @@ <dl> + <dt> <a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></dt> + <dd>With the unification of batch and streaming regarded as the future in data processing, the Pulsar Flink Connector provides an ideal solution for unified batch and stream processing with Apache Pulsar and Apache Flink. The Pulsar Flink Connector 2.7.0 supports features in Pulsar 2.7 and Flink 1.12 and is fully compatible with Flink's data format. The Pulsar Flink Connector 2.7.0 will be contributed to the Flink repository soon and the contribution process is ongoing.</dd> + <dt> <a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></dt> <dd><p>The Apache Flink community released the second bugfix release of the Stateful Functions (StateFun) 2.2 series, version 2.2.2.</p> @@ -588,11 +591,6 @@ <dt> <a href="/news/2020/12/10/release-1.12.0.html">Apache Flink 1.12.0 Release Announcement</a></dt> <dd>The Apache Flink community is excited to announce the release of Flink 1.12.0! Close to 300 contributors worked on over 1k threads to bring significant improvements to usability as well as new features to Flink users across the whole API stack. We're particularly excited about adding efficient batch execution to the DataStream API, Kubernetes HA as an alternative to ZooKeeper, support for upsert mode in the Kafka SQL connector and the new Python DataStream API! Read on for al [...] - - <dt> <a href="/news/2020/11/11/release-statefun-2.2.1.html">Stateful Functions 2.2.1 Release Announcement</a></dt> - <dd><p>The Apache Flink community released the first bugfix release of the Stateful Functions (StateFun) 2.2 series, version 2.2.1.</p> - -</dd> </dl> diff --git a/content/zh/index.html b/content/zh/index.html index 00ef0a3..27a1c37 100644 --- a/content/zh/index.html +++ b/content/zh/index.html @@ -570,6 +570,9 @@ <dl> + <dt> <a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></dt> + <dd>With the unification of batch and streaming regarded as the future in data processing, the Pulsar Flink Connector provides an ideal solution for unified batch and stream processing with Apache Pulsar and Apache Flink. The Pulsar Flink Connector 2.7.0 supports features in Pulsar 2.7 and Flink 1.12 and is fully compatible with Flink's data format. The Pulsar Flink Connector 2.7.0 will be contributed to the Flink repository soon and the contribution process is ongoing.</dd> + <dt> <a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></dt> <dd><p>The Apache Flink community released the second bugfix release of the Stateful Functions (StateFun) 2.2 series, version 2.2.2.</p> @@ -585,11 +588,6 @@ <dt> <a href="/news/2020/12/10/release-1.12.0.html">Apache Flink 1.12.0 Release Announcement</a></dt> <dd>The Apache Flink community is excited to announce the release of Flink 1.12.0! Close to 300 contributors worked on over 1k threads to bring significant improvements to usability as well as new features to Flink users across the whole API stack. We're particularly excited about adding efficient batch execution to the DataStream API, Kubernetes HA as an alternative to ZooKeeper, support for upsert mode in the Kafka SQL connector and the new Python DataStream API! Read on for al [...] - - <dt> <a href="/news/2020/11/11/release-statefun-2.2.1.html">Stateful Functions 2.2.1 Release Announcement</a></dt> - <dd><p>The Apache Flink community released the first bugfix release of the Stateful Functions (StateFun) 2.2 series, version 2.2.1.</p> - -</dd> </dl>
