Huanli-Meng commented on a change in pull request #12085:
URL: https://github.com/apache/pulsar/pull/12085#discussion_r713938784



##########
File path: site2/website-next/docs/txn-how.md
##########
@@ -0,0 +1,154 @@
+---
+id: txn-how
+title: How transactions work?
+sidebar_label: How transactions work?
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+This section describes transaction components and how the components work 
together. For the complete design details, see [PIP-31: Transactional 
Streaming](https://docs.google.com/document/d/145VYp09JKTw9jAT-7yNyFU255FptB2_B2Fye100ZXDI/edit#heading=h.bm5ainqxosrx).
+
+## Key concept
+
+It is important to know the following key concepts, which is a prerequisite 
for understanding how transactions work.
+
+### Transaction coordinator
+
+The transaction coordinator (TC) is a module running inside a Pulsar broker. 
+
+* It maintains the entire life cycle of transactions and prevents a 
transaction from getting into an incorrect status. 
+  
+* It handles transaction timeout, and ensures that the transaction is aborted 
after a transaction timeout.
+
+### Transaction log
+
+All the transaction metadata persists in the transaction log. The transaction 
log is backed by a Pulsar topic. If the transaction coordinator crashes, it can 
restore the transaction metadata from the transaction log.
+
+The transaction log stores the transaction status rather than actual messages 
in the transaction (the actual messages are stored in the actual topic 
partitions). 
+
+### Transaction buffer
+
+Messages produced to a topic partition within a transaction are stored in the 
transaction buffer (TB) of that topic partition. The messages in the 
transaction buffer are not visible to consumers until the transactions are 
committed. The messages in the transaction buffer are discarded when the 
transactions are aborted. 
+
+Transaction buffer stores all ongoing and aborted transactions in memory. All 
messages are sent to the actual partitioned Pulsar topics.  After transactions 
are committed, the messages in the transaction buffer are materialized 
(visible) to consumers. When the transactions are aborted, the messages in the 
transaction buffer are discarded.
+
+### Transaction ID
+
+Transaction ID (TxnID) identifies a unique transaction in Pulsar. The 
transaction ID is 128-bit. The highest 16 bits are reserved for the ID of the 
transaction coordinator, and the remaining bits are used for monotonically 
increasing numbers in each transaction coordinator. It is easy to locate the 
transaction crash with the TxnID.

Review comment:
       ```suggestion
   The transaction ID (TxnID) identifies a unique transaction in Pulsar. The 
transaction ID is 128-bit. The highest 16 bits are reserved for the transaction 
coordinator ID, and the remaining bits are used for monotonically increasing 
numbers in each transaction coordinator. It is easy to locate the transaction 
crash with the TxnID.
   ```
   One more comment on expressions such as "transaction coordinator (TC)":
   First, I think the first letter of each word should be capitalized.
   Second, since you introduce an abbreviation for the phrase, you should
   use TC in the rest of the docs.
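
   To make the bit layout in this paragraph easier to check, here is a minimal sketch that follows the wording of the doc only (128 bits in total, the highest 16 bits for the transaction coordinator ID, the remaining 112 bits for the per-coordinator sequence number). The names `pack_txn_id` and `unpack_txn_id` are hypothetical and are not part of any Pulsar API; the actual client representation may differ.

```python
from typing import Tuple

# Layout as described in the doc text (an assumption, not the Pulsar source):
# [ 16 bits: transaction coordinator ID | 112 bits: sequence number ]
TC_ID_BITS = 16
SEQUENCE_BITS = 128 - TC_ID_BITS  # 112


def pack_txn_id(tc_id: int, sequence: int) -> int:
    """Combine a coordinator ID and a sequence number into one 128-bit value."""
    if not 0 <= tc_id < (1 << TC_ID_BITS):
        raise ValueError("tc_id must fit in 16 bits")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence must fit in 112 bits")
    return (tc_id << SEQUENCE_BITS) | sequence


def unpack_txn_id(txn_id: int) -> Tuple[int, int]:
    """Split a 128-bit value back into (coordinator ID, sequence number)."""
    return txn_id >> SEQUENCE_BITS, txn_id & ((1 << SEQUENCE_BITS) - 1)
```

   With this layout, the coordinator that owns a transaction can be read directly from the top bits of the TxnID, which is presumably what the last sentence of the paragraph is getting at.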

##########
File path: site2/website-next/docs/txn-how.md
##########
@@ -0,0 +1,154 @@
+---
+id: txn-how
+title: How transactions work?
+sidebar_label: How transactions work?
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+This section describes transaction components and how the components work 
together. For the complete design details, see [PIP-31: Transactional 
Streaming](https://docs.google.com/document/d/145VYp09JKTw9jAT-7yNyFU255FptB2_B2Fye100ZXDI/edit#heading=h.bm5ainqxosrx).
+
+## Key concept
+
+It is important to know the following key concepts, which is a prerequisite 
for understanding how transactions work.
+
+### Transaction coordinator
+
+The transaction coordinator (TC) is a module running inside a Pulsar broker. 
+
+* It maintains the entire life cycle of transactions and prevents a 
transaction from getting into an incorrect status. 
+  
+* It handles transaction timeout, and ensures that the transaction is aborted 
after a transaction timeout.
+
+### Transaction log
+
+All the transaction metadata persists in the transaction log. The transaction 
log is backed by a Pulsar topic. If the transaction coordinator crashes, it can 
restore the transaction metadata from the transaction log.
+
+The transaction log stores the transaction status rather than actual messages 
in the transaction (the actual messages are stored in the actual topic 
partitions). 
+
+### Transaction buffer
+
+Messages produced to a topic partition within a transaction are stored in the 
transaction buffer (TB) of that topic partition. The messages in the 
transaction buffer are not visible to consumers until the transactions are 
committed. The messages in the transaction buffer are discarded when the 
transactions are aborted. 
+
+Transaction buffer stores all ongoing and aborted transactions in memory. All 
messages are sent to the actual partitioned Pulsar topics.  After transactions 
are committed, the messages in the transaction buffer are materialized 
(visible) to consumers. When the transactions are aborted, the messages in the 
transaction buffer are discarded.

Review comment:
       ```suggestion
   The transaction buffer stores all ongoing and aborted transactions in 
memory. All messages are sent to the actual partitioned Pulsar topics. After 
transactions are committed, the messages in the transaction buffer are 
materialized (visible) to consumers. When the transactions are aborted, the 
messages in the transaction buffer are discarded.
   ```
   BTW, I think the following sentences repeat what the previous paragraph
   already says:
   "After transactions are committed, the messages in the transaction buffer
   are materialized (visible) to consumers. When the transactions are aborted,
   the messages in the transaction buffer are discarded."
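
   For reference, the visibility rule that both paragraphs describe can be sketched as a toy model. This is only an illustration of the semantics in the doc text (invisible until commit, discarded on abort); `ToyTransactionBuffer` and its methods are hypothetical names, and the sketch deliberately ignores that the real messages are stored in the topic partitions.

```python
class ToyTransactionBuffer:
    """Toy model of the commit/abort visibility rule; not Pulsar code."""

    def __init__(self):
        self._pending = {}   # txn_id -> messages produced in an open transaction
        self._visible = []   # messages that consumers are allowed to read

    def append(self, txn_id, message):
        # Messages of an open transaction are buffered, not yet visible.
        self._pending.setdefault(txn_id, []).append(message)

    def commit(self, txn_id):
        # On commit, the buffered messages become visible to consumers.
        self._visible.extend(self._pending.pop(txn_id, []))

    def abort(self, txn_id):
        # On abort, the buffered messages are discarded.
        self._pending.pop(txn_id, None)

    def visible_messages(self):
        return list(self._visible)
```

   Stating the rule once, in the first paragraph, should be enough for readers.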

##########
File path: site2/website-next/docs/txn-how.md
##########
@@ -0,0 +1,154 @@
+---
+id: txn-how
+title: How transactions work?
+sidebar_label: How transactions work?
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+This section describes transaction components and how the components work 
together. For the complete design details, see [PIP-31: Transactional 
Streaming](https://docs.google.com/document/d/145VYp09JKTw9jAT-7yNyFU255FptB2_B2Fye100ZXDI/edit#heading=h.bm5ainqxosrx).
+
+## Key concept
+
+It is important to know the following key concepts, which is a prerequisite 
for understanding how transactions work.
+
+### Transaction coordinator
+
+The transaction coordinator (TC) is a module running inside a Pulsar broker. 
+
+* It maintains the entire life cycle of transactions and prevents a 
transaction from getting into an incorrect status. 
+  
+* It handles transaction timeout, and ensures that the transaction is aborted 
after a transaction timeout.
+
+### Transaction log
+
+All the transaction metadata persists in the transaction log. The transaction 
log is backed by a Pulsar topic. If the transaction coordinator crashes, it can 
restore the transaction metadata from the transaction log.
+
+The transaction log stores the transaction status rather than actual messages 
in the transaction (the actual messages are stored in the actual topic 
partitions). 
+
+### Transaction buffer
+
+Messages produced to a topic partition within a transaction are stored in the 
transaction buffer (TB) of that topic partition. The messages in the 
transaction buffer are not visible to consumers until the transactions are 
committed. The messages in the transaction buffer are discarded when the 
transactions are aborted. 
+
+Transaction buffer stores all ongoing and aborted transactions in memory. All 
messages are sent to the actual partitioned Pulsar topics.  After transactions 
are committed, the messages in the transaction buffer are materialized 
(visible) to consumers. When the transactions are aborted, the messages in the 
transaction buffer are discarded.
+
+### Transaction ID
+
+Transaction ID (TxnID) identifies a unique transaction in Pulsar. The 
transaction ID is 128-bit. The highest 16 bits are reserved for the ID of the 
transaction coordinator, and the remaining bits are used for monotonically 
increasing numbers in each transaction coordinator. It is easy to locate the 
transaction crash with the TxnID.
+
+### Pending acknowledge state
+
+Pending acknowledge state maintains message acknowledgments within a 
transaction before a transaction completes. If a message is in the pending 
acknowledge state, the message cannot be acknowledged by other transactions 
until the message is removed from the pending acknowledge state.
+
+The pending acknowledge state is persisted to the pending acknowledge log 
(cursor ledger). A new broker can restore the state from the pending 
acknowledge log to ensure the acknowledgement is not lost.    
+
+## Data flow
+
+At a high level, the data flow can be split into several steps:
+
+1. Begin a transaction.
+   
+2. Publish messages with a transaction.
+   
+3. Acknowledge messages with a transaction.
+   
+4. End a transaction.
+
+To help you debug or tune the transaction for better performance, review the 
following diagrams and descriptions. 
+
+### 1. Begin a transaction
+
+Before introducing the transaction in Pulsar, a producer is created and then 
messages are sent to brokers and stored in data logs. 
+
+![](/assets/txn-3.png)
+
+Let’s walk through the steps for _beginning a transaction_.
+
+| Step  |  Description  | 

Review comment:
       1. Could we use an ordered list instead of putting the steps in a table?
   The same comment applies to the following sections.
   2. Update "transaction ID" to TxnID, since you have already introduced the
   abbreviation. Check the whole doc and update it accordingly.
   

##########
File path: site2/website-next/docs/txn-how.md
##########
@@ -0,0 +1,154 @@
+---
+id: txn-how
+title: How transactions work?
+sidebar_label: How transactions work?
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+This section describes transaction components and how the components work 
together. For the complete design details, see [PIP-31: Transactional 
Streaming](https://docs.google.com/document/d/145VYp09JKTw9jAT-7yNyFU255FptB2_B2Fye100ZXDI/edit#heading=h.bm5ainqxosrx).
+
+## Key concept
+
+It is important to know the following key concepts, which is a prerequisite 
for understanding how transactions work.
+
+### Transaction coordinator
+
+The transaction coordinator (TC) is a module running inside a Pulsar broker. 
+
+* It maintains the entire life cycle of transactions and prevents a 
transaction from getting into an incorrect status. 
+  
+* It handles transaction timeout, and ensures that the transaction is aborted 
after a transaction timeout.
+
+### Transaction log
+
+All the transaction metadata persists in the transaction log. The transaction 
log is backed by a Pulsar topic. If the transaction coordinator crashes, it can 
restore the transaction metadata from the transaction log.
+
+The transaction log stores the transaction status rather than actual messages 
in the transaction (the actual messages are stored in the actual topic 
partitions). 
+
+### Transaction buffer
+
+Messages produced to a topic partition within a transaction are stored in the 
transaction buffer (TB) of that topic partition. The messages in the 
transaction buffer are not visible to consumers until the transactions are 
committed. The messages in the transaction buffer are discarded when the 
transactions are aborted. 
+
+Transaction buffer stores all ongoing and aborted transactions in memory. All 
messages are sent to the actual partitioned Pulsar topics.  After transactions 
are committed, the messages in the transaction buffer are materialized 
(visible) to consumers. When the transactions are aborted, the messages in the 
transaction buffer are discarded.
+
+### Transaction ID
+
+Transaction ID (TxnID) identifies a unique transaction in Pulsar. The 
transaction ID is 128-bit. The highest 16 bits are reserved for the ID of the 
transaction coordinator, and the remaining bits are used for monotonically 
increasing numbers in each transaction coordinator. It is easy to locate the 
transaction crash with the TxnID.
+
+### Pending acknowledge state
+
+Pending acknowledge state maintains message acknowledgments within a 
transaction before a transaction completes. If a message is in the pending 
acknowledge state, the message cannot be acknowledged by other transactions 
until the message is removed from the pending acknowledge state.
+
+The pending acknowledge state is persisted to the pending acknowledge log 
(cursor ledger). A new broker can restore the state from the pending 
acknowledge log to ensure the acknowledgement is not lost.    
+
+## Data flow
+
+At a high level, the data flow can be split into several steps:
+
+1. Begin a transaction.

Review comment:
       Start?

##########
File path: site2/website-next/sidebars.json
##########
@@ -81,6 +85,17 @@
         "tiered-storage-azure",
         "tiered-storage-aliyun"
       ]
+    },
+    {
+      "type": "category",
+      "label": "Transactions",
+      "items": [
+        "txn-why",

Review comment:
       I think it might be more appropriate to put "What are transactions"
   first, then "Why transactions", and so on.

##########
File path: site2/website-next/docs/txn-how.md
##########
@@ -0,0 +1,154 @@
+---
+id: txn-how
+title: How transactions work?
+sidebar_label: How transactions work?
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+This section describes transaction components and how the components work 
together. For the complete design details, see [PIP-31: Transactional 
Streaming](https://docs.google.com/document/d/145VYp09JKTw9jAT-7yNyFU255FptB2_B2Fye100ZXDI/edit#heading=h.bm5ainqxosrx).
+
+## Key concept
+
+It is important to know the following key concepts, which is a prerequisite 
for understanding how transactions work.
+
+### Transaction coordinator
+
+The transaction coordinator (TC) is a module running inside a Pulsar broker. 
+
+* It maintains the entire life cycle of transactions and prevents a 
transaction from getting into an incorrect status. 
+  
+* It handles transaction timeout, and ensures that the transaction is aborted 
after a transaction timeout.
+
+### Transaction log
+
+All the transaction metadata persists in the transaction log. The transaction 
log is backed by a Pulsar topic. If the transaction coordinator crashes, it can 
restore the transaction metadata from the transaction log.
+
+The transaction log stores the transaction status rather than actual messages 
in the transaction (the actual messages are stored in the actual topic 
partitions). 
+
+### Transaction buffer
+
+Messages produced to a topic partition within a transaction are stored in the 
transaction buffer (TB) of that topic partition. The messages in the 
transaction buffer are not visible to consumers until the transactions are 
committed. The messages in the transaction buffer are discarded when the 
transactions are aborted. 
+
+Transaction buffer stores all ongoing and aborted transactions in memory. All 
messages are sent to the actual partitioned Pulsar topics.  After transactions 
are committed, the messages in the transaction buffer are materialized 
(visible) to consumers. When the transactions are aborted, the messages in the 
transaction buffer are discarded.
+
+### Transaction ID
+
+Transaction ID (TxnID) identifies a unique transaction in Pulsar. The 
transaction ID is 128-bit. The highest 16 bits are reserved for the ID of the 
transaction coordinator, and the remaining bits are used for monotonically 
increasing numbers in each transaction coordinator. It is easy to locate the 
transaction crash with the TxnID.
+
+### Pending acknowledge state
+
+Pending acknowledge state maintains message acknowledgments within a 
transaction before a transaction completes. If a message is in the pending 
acknowledge state, the message cannot be acknowledged by other transactions 
until the message is removed from the pending acknowledge state.
+
+The pending acknowledge state is persisted to the pending acknowledge log 
(cursor ledger). A new broker can restore the state from the pending 
acknowledge log to ensure the acknowledgement is not lost.    

Review comment:
       pending acknowledgment log?
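
   As a side note, the rule in the first paragraph of this section (a message in the pending acknowledge state cannot be acknowledged by other transactions until it is removed from that state) can be sketched as follows. This is a toy model under my reading of the doc text; `ToyPendingAckState`, `ack`, and `end` are hypothetical names, not the broker implementation.

```python
class ToyPendingAckState:
    """Toy model of the pending acknowledge rule; not Pulsar code."""

    def __init__(self):
        self._pending = {}   # message_id -> txn_id currently holding the ack
        self._acked = set()  # message_ids whose ack was committed

    def ack(self, txn_id, message_id):
        """Return True if txn_id may hold the ack for message_id."""
        if message_id in self._acked:
            return False  # already acknowledged by a committed transaction
        owner = self._pending.get(message_id)
        if owner is not None and owner != txn_id:
            return False  # another open transaction holds the pending ack
        self._pending[message_id] = txn_id
        return True

    def end(self, txn_id, committed):
        """Remove the transaction's messages from the pending state."""
        for message_id in [m for m, t in self._pending.items() if t == txn_id]:
            del self._pending[message_id]
            if committed:
                self._acked.add(message_id)
```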

##########
File path: site2/website-next/docs/txn-use.md
##########
@@ -0,0 +1,96 @@
+---
+id: txn-use
+title: How to use transactions?
+sidebar_label: How to use transactions?
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+## Transaction API
+
+The transaction feature is primarily a server-side and protocol-level feature. 
You can use the transaction feature via the [transaction 
API](https://pulsar.apache.org/api/admin/), which is available in **Pulsar 
2.8.0 or later**. 

Review comment:
       ```suggestion
   The transaction feature is primarily a server-side and protocol-level 
feature. You can use the transaction feature via the [transaction 
API](https://pulsar.apache.org/api/admin/), which is available in **Pulsar 
2.8.0 or higher**. 
   ```
   One more comment: as far as I know, transactions are not a stable feature
   in Pulsar 2.8.0. Please confirm with engineering whether the Pulsar version
   should be changed.

##########
File path: site2/website-next/docs/txn-use.md
##########
@@ -0,0 +1,96 @@
+---
+id: txn-use
+title: How to use transactions?
+sidebar_label: How to use transactions?
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+## Transaction API
+
+The transaction feature is primarily a server-side and protocol-level feature. 
You can use the transaction feature via the [transaction 
API](https://pulsar.apache.org/api/admin/), which is available in **Pulsar 
2.8.0 or later**. 
+
+To use the transaction API, you do not need any additional settings in the 
Pulsar client. **By default**, transactions is **disabled**. 
+
+Currently, transaction API is only available for **Java** clients. Support for 
other language clients will be added in the future releases.
+
+## Quick start
+
+This section provides an example of how to use the transaction API to send and 
receive messages in a Java client. 
+
+1. Start Pulsar 2.8.0 or later. 
+
+2. Enable transaction. 
+
+    Change the configuration in the `broker.conf` file.

Review comment:
       Can we also change the configuration in the `standalone.conf` file? I
   noticed that for batch messages, we can update the transaction configs in
   either the `broker.conf` or the `standalone.conf` file.

##########
File path: site2/website-next/docs/txn-what.md
##########
@@ -0,0 +1,63 @@
+---
+id: txn-what
+title: What are transactions?
+sidebar_label: What are transactions?
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+Transactions strengthen the message delivery semantics of Apache Pulsar and 
[processing guarantees of Pulsar 
Functions](https://pulsar.apache.org/docs/en/next/functions-overview/#processing-guarantees).
 The Pulsar Transaction API supports atomic writes and acknowledgments across 
multiple topics. 
+
+Transactions allow:
+
+- A producer to send a batch of messages to multiple topics where all messages 
in the batch are eventually visible to any consumer, or none are ever visible 
to consumers. 
+
+- End-to-end exactly-once semantics (execute a `consume-process-produce` 
operation exactly once).
+
+## Transaction semantics
+
+Pulsar transactions have the following semantics: 
+
+* All operations within a transaction are committed as a single unit.
+   
+  * Either all messages are committed, or none of them are. 
+
+  * Each message is written or processed exactly once, without data loss or 
duplicates (even in the event of failures). 
+
+  * If a transaction is aborted, all the writes and acknowledgments in this 
transaction rollback.

Review comment:
       ```suggestion
     * If a transaction is aborted, all the writes and acknowledgments in this 
transaction roll back.
   ```

##########
File path: site2/website-next/versioned_docs/version-2.7.3/transactions-api.md
##########
@@ -0,0 +1,155 @@
+---
+id: transactions-api
+title: Transactions API (Developer Preview)
+sidebar_label: Transactions API
+original_id: transactions-api
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+All messages in a transaction is available only to consumers after the 
transaction is committed. If a transaction is aborted, all the writes and 
acknowledgments in this transaction rollback. 
+
+Currently, Pulsar transaction is a developer preview feature. It is disabled 
by default. You can enable the feature and use transactions in your application 
in development environment.
+
+## Prerequisites
+1. To enable transactions in Pulsar, you need to configure the parameter in 
the `broker.conf` file.
+
+```

Review comment:
       The code should be indented. Same comment for the following step.

##########
File path: site2/website-next/docs/txn-what.md
##########
@@ -0,0 +1,63 @@
+---
+id: txn-what
+title: What are transactions?
+sidebar_label: What are transactions?
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+Transactions strengthen the message delivery semantics of Apache Pulsar and 
[processing guarantees of Pulsar 
Functions](https://pulsar.apache.org/docs/en/next/functions-overview/#processing-guarantees).
 The Pulsar Transaction API supports atomic writes and acknowledgments across 
multiple topics. 
+
+Transactions allow:
+
+- A producer to send a batch of messages to multiple topics where all messages 
in the batch are eventually visible to any consumer, or none are ever visible 
to consumers. 
+
+- End-to-end exactly-once semantics (execute a `consume-process-produce` 
operation exactly once).
+
+## Transaction semantics
+
+Pulsar transactions have the following semantics: 
+
+* All operations within a transaction are committed as a single unit.
+   
+  * Either all messages are committed, or none of them are. 
+
+  * Each message is written or processed exactly once, without data loss or 
duplicates (even in the event of failures). 
+
+  * If a transaction is aborted, all the writes and acknowledgments in this 
transaction rollback.
+  
+* A group of messages in a transaction can be received from, produced to, and 
acknowledged by multiple partitions.
+  
+  * Consumers are only allowed to read committed (acked) messages. In other 
words, the broker does not deliver transactional messages which are part of an 
open transaction or messages which are part of an aborted transaction.
+    
+  * Message writes across multiple partitions are atomic.
+    
+  * Message acks across multiple subscriptions are atomic. A message is acked 
successfully only once by a consumer under the subscription when acknowledging 
the message with the transaction ID.
+
+## Transactions and stream processing
+
+Stream processing on Pulsar is a `consume-process-produce` operation on Pulsar 
topics:
+
+* `Consume`: a source operator that runs a Pulsar consumer reads messages from 
one or multiple Pulsar topics.
+  
+* `Process`: a processing operator transforms the messages. 
+  
+* `Produce`: a sink operator that runs a Pulsar producer writes the resulting 
messages to one or multiple Pulsar topics.
+
+![](/assets/txn-2.png)
+
+Pulsar transactions support end-to-end exactly-once stream processing, which 
means messages are not lost from a source operator and messages are not 
duplicated to a sink operator.
+
+## Use case
+
+Prior to Pulsar 2.8.0, there was no easy way to build stream processing 
applications with Pulsar to achieve exactly-once processing guarantees. With 
the transaction introduced in Pulsar 2.8.0, the following services support 
exactly-once semantics:
+
+* [Pulsar Flink 
connector](https://flink.apache.org/2021/01/07/pulsar-flink-connector-270.html)
+
+    Prior to Pulsar 2.8.0, if you want to build stream applications using 
Pulsar and Flink, the Pulsar Flink connector only supported exactly-once source 
connector and at-least-once sink connector, which means the highest processing 
guarantee for end-to-end was at-least-once, there was possibility that the 
resulting messages from streaming applications produce duplicated messages to 
the resulting topics in Pulsar.

Review comment:
       ```suggestion
       Prior to Pulsar 2.8.0, if you wanted to build stream applications using 
Pulsar and Flink, the Pulsar Flink connector only supported an exactly-once source 
connector and an at-least-once sink connector, which means the highest end-to-end 
processing guarantee was at-least-once. There was a possibility that streaming 
applications produced duplicated messages to the resulting topics in Pulsar.
   ```

##########
File path: site2/website-next/docs/txn-use.md
##########
@@ -0,0 +1,96 @@
+---
+id: txn-use
+title: How to use transactions?
+sidebar_label: How to use transactions?
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+## Transaction API
+
+The transaction feature is primarily a server-side and protocol-level feature. 
You can use the transaction feature via the [transaction 
API](https://pulsar.apache.org/api/admin/), which is available in **Pulsar 
2.8.0 or later**. 
+
+To use the transaction API, you do not need any additional settings in the 
Pulsar client. **By default**, transactions is **disabled**. 
+
+Currently, transaction API is only available for **Java** clients. Support for 
other language clients will be added in the future releases.
+
+## Quick start
+
+This section provides an example of how to use the transaction API to send and 
receive messages in a Java client. 
+
+1. Start Pulsar 2.8.0 or later. 
+
+2. Enable transaction. 
+
+    Change the configuration in the `broker.conf` file.
+
+    ```
+    transactionCoordinatorEnabled=true
+    ```
+
+    If you want to enable batch messages in transactions, follow the steps 
below.

Review comment:
       ```suggestion
       If you want to enable batch messages in transactions, set 
`acknowledgmentAtBatchIndexLevelEnabled` to `true` in the `broker.conf` or 
`standalone.conf` file.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


