This is an automated email from the ASF dual-hosted git repository.
urfree pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/pulsar-site.git
The following commit(s) were added to refs/heads/main by this push:
new e117fd9e29e Docs sync done from apache/pulsar(#5d0eb9b)
e117fd9e29e is described below
commit e117fd9e29eaa2528f22403a89118f627b37d07e
Author: Pulsar Site Updater <[email protected]>
AuthorDate: Fri Jul 29 06:01:34 2022 +0000
Docs sync done from apache/pulsar(#5d0eb9b)
---
.../blog/2022-07-27-Apache-Pulsar-2-9-3.md | 82 ++++++++++++++++++++++
site2/website-next/docs/client-libraries-python.md | 75 +++-----------------
2 files changed, 91 insertions(+), 66 deletions(-)
diff --git a/site2/website-next/blog/2022-07-27-Apache-Pulsar-2-9-3.md
b/site2/website-next/blog/2022-07-27-Apache-Pulsar-2-9-3.md
new file mode 100644
index 00000000000..196883a739a
--- /dev/null
+++ b/site2/website-next/blog/2022-07-27-Apache-Pulsar-2-9-3.md
@@ -0,0 +1,82 @@
+---
+title: "What’s New in Apache Pulsar 2.9.3"
+date: 2022-07-27
+author: "mattisonchao, momo-jun"
+---
+
+The Apache Pulsar community releases version 2.9.3! 50 contributors provided
improvements and bug fixes that delivered 200+ commits. Thanks for all your
contributions.
+
+The highlight of the 2.9.3 release is 30+ transaction fixes and
improvements. Early adopters of Pulsar transactions have documented
long-term use in their production environments and reported valuable findings
from real applications. Their feedback gave the Pulsar community the
opportunity to make a difference.
+
+This blog walks through the most noteworthy changes. For the complete list
including all feature enhancements and bug fixes, check out the [Pulsar 2.9.3
Release Notes](https://pulsar.apache.org/release-notes/versioned/pulsar-2.9.3/).
+
+
+### Enabled cursor data compression to reduce persistent cursor data size.
[14542](https://github.com/apache/pulsar/pull/14542)
+
+#### Issue
+The cursor data is managed by the ZooKeeper/Etcd metadata store. When the data
size increases, it may take too much time to pull the data, and brokers may end
up writing large chunks of data to the ZooKeeper/Etcd metadata store.
+
+#### Resolution
+Provide the ability to enable compression mechanisms to reduce cursor data
size and the pulling time.
+
+
+### Reduced the memory occupied by `metadataPositions` and avoided OOM.
[15137](https://github.com/apache/pulsar/pull/15137)
+
+#### Issue
+The map `metadataPositions` in MLPendingAckStore is used to clear useless data
in PendingAck, where the key is the position that is persistent in PendingAck
and the value is the max position acked by an operation. The max position is
compared with the subscription cursor’s `markDeletePosition`: if the max
position is smaller, the log cursor marks the position for deletion. This
causes two main issues:
+* In normal cases, this map stores all transaction ack operations. This is a
waste of memory and CPU.
+* If a transaction that has stayed uncommitted for a long time acks a message
at a later position, the map is never cleaned up, which eventually leads to OOM
(out-of-memory).
+
+#### Resolution
+Regularly store a small amount of data according to certain rules. For more
detailed implementation, refer to
[PIP-153](https://github.com/apache/pulsar/issues/15073).
+
+
+### Checked `lowWaterMark` before appending transaction entries to Transaction
Buffer. [15424](https://github.com/apache/pulsar/pull/15424)
+
+#### Issue
+When a client sends messages using a previously committed transaction, these
messages are visible to consumers unexpectedly.
+
+#### Resolution
+Add a map to store the `lowWaterMark` of Transaction Coordinator in
Transaction Buffer, and check `lowWaterMark` before appending transaction
entries to Transaction Buffer. When a client sends messages using an invalid
transaction, it receives `NotAllowedException`.
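
As a rough illustration of the check described above, the following Python
sketch models a buffer that keeps one low-water mark per Transaction
Coordinator and rejects entries from transactions at or below it. The names
`TransactionBufferSketch`, `update_low_water_mark`, and `NotAllowedError` are
hypothetical, and the "at or below" comparison is an assumption made for the
sketch; none of this is the broker's Java implementation.

```python
class NotAllowedError(Exception):
    """Hypothetical stand-in for the broker's NotAllowedException."""


class TransactionBufferSketch:
    """Toy model: one low-water mark per Transaction Coordinator ID."""

    def __init__(self):
        # coordinator id -> highest transaction sequence known to be closed
        self.low_water_marks = {}
        self.entries = []

    def update_low_water_mark(self, tc_id, low_water_mark):
        # The mark only ever moves forward.
        current = self.low_water_marks.get(tc_id, -1)
        self.low_water_marks[tc_id] = max(current, low_water_mark)

    def append(self, tc_id, txn_seq, payload):
        # Assumption: a transaction at or below the mark is already closed,
        # so its entries must not be appended.
        if txn_seq <= self.low_water_marks.get(tc_id, -1):
            raise NotAllowedError(
                f"transaction ({tc_id},{txn_seq}) is already closed")
        self.entries.append((tc_id, txn_seq, payload))
```

The point of the design is that the check happens before the append, so a
stale transaction never reaches the buffer in the first place.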
+
+
+### Fixed the consumption performance regression.
[PR-15162](https://github.com/apache/pulsar/pull/15162)
+
+#### Issue
+This performance regression was introduced in 2.10.0, 2.9.1, and 2.8.3. You
may find a significant performance drop with message listeners when using the
Java client. The root cause is that each message triggers a thread switch from
the external thread pool to the internal thread pool and then back to the
external thread pool.
+
+#### Resolution
+Avoid the thread switching for each message to improve consumption throughput.
+
+
+### Fixed a deadlock issue of topic creation.
[PR-15570](https://github.com/apache/pulsar/pull/15570)
+
+#### Issue
+This deadlock occurred during topic creation when a thread tried to re-acquire
a `StampedLock` that it already held during removal. The topic stops serving
for a long time and ultimately fails the deduplication or geo-replication
check. The workaround is to restart the broker.
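
Java's `StampedLock` is not reentrant, which is why re-acquiring it on the same
thread blocks. The Python sketch below reproduces the same pattern with
`threading.Lock`, which is also non-reentrant. It is only an analogy: the
function names `create_topic` and `remove_topic` are hypothetical, and a
timeout is used so that the demo returns instead of hanging.

```python
import threading

lock = threading.Lock()  # non-reentrant, like the lock described above


def remove_topic():
    # Tries to take the lock again on the same thread. Without the timeout
    # this call would block forever, because the caller still holds the lock.
    acquired = lock.acquire(timeout=0.1)
    if acquired:
        lock.release()
    return acquired


def create_topic():
    with lock:
        # The cleanup path runs while the lock is still held.
        return remove_topic()


reacquired = create_topic()
print(reacquired)  # False: the second acquire never succeeds
```

Replacing `threading.Lock()` with `threading.RLock()` makes the second acquire
succeed, which shows that the hang comes from the lock being non-reentrant.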
+
+
+### Optimized the memory usage of brokers.
+
+#### Issue
+Pulsar has some internal data structures, such as
`ConcurrentLongLongPairHashMap` and `ConcurrentLongPairHashMap`, which reduce
memory usage by avoiding boxed types. However, in earlier versions, these data
structures could not shrink even after data was removed, which wasted memory
in certain situations.
+
+**Pull requests**
+* https://github.com/apache/pulsar/pull/15354
+* https://github.com/apache/pulsar/pull/15342
+* https://github.com/apache/pulsar/pull/14663
+* https://github.com/apache/pulsar/pull/14515
+* https://github.com/apache/pulsar/pull/14497
+
+#### Resolution
+Support the shrinking of the internal data structures, such as
`ConcurrentSortedLongPairSet`, `ConcurrentOpenHashMap`, and so on.
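
To make the idea of shrinking concrete, here is a small Python sketch of a hash
table that halves its capacity once it becomes mostly empty. It is not the
Pulsar implementation: the class name `ShrinkableMap`, the 0.75 and 0.25
thresholds, and the minimum capacity of 8 are all assumptions chosen for
illustration, and the sketch ignores concurrency and primitive-typed keys.

```python
class ShrinkableMap:
    """Toy chained hash table that resizes both up and down."""

    MIN_CAPACITY = 8  # hypothetical floor; the table never shrinks below it

    def __init__(self):
        self.capacity = self.MIN_CAPACITY
        self.size = 0
        self.buckets = [[] for _ in range(self.capacity)]

    def _resize(self, new_capacity):
        # Rehash every entry into a freshly allocated bucket array.
        items = [kv for bucket in self.buckets for kv in bucket]
        self.capacity = new_capacity
        self.buckets = [[] for _ in range(new_capacity)]
        for key, value in items:
            self.buckets[hash(key) % new_capacity].append((key, value))

    def put(self, key, value):
        bucket = self.buckets[hash(key) % self.capacity]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.size += 1
        if self.size > self.capacity * 0.75:
            self._resize(self.capacity * 2)

    def remove(self, key):
        bucket = self.buckets[hash(key) % self.capacity]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self.size -= 1
                break
        else:
            return False
        # Shrink when the table is mostly empty, so memory is given back.
        if self.capacity > self.MIN_CAPACITY and self.size < self.capacity * 0.25:
            self._resize(self.capacity // 2)
        return True
```

Without the shrink step in `remove`, the bucket array would stay at its peak
size after a burst of inserts, which is the waste described in the issue.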
+
+
+# What’s Next?
+
+If you are interested in learning more about Pulsar 2.9.3, you can
[download](https://pulsar.apache.org/versions/) and try it out now!
+
+**Pulsar Summit San Francisco 2022** will take place on August 18th, 2022.
[Register now](https://pulsar-summit.org/) and help us make it an even bigger
success by spreading the word on social media!
+
+For more information about the Apache Pulsar project and current progress,
visit
+the [Pulsar website](https://pulsar.apache.org), follow the project on Twitter
+[@apache_pulsar](https://twitter.com/apache_pulsar), and join [Pulsar
Slack](https://apache-pulsar.herokuapp.com/)!
\ No newline at end of file
diff --git a/site2/website-next/docs/client-libraries-python.md
b/site2/website-next/docs/client-libraries-python.md
index 66c690404db..11aa73d590e 100644
--- a/site2/website-next/docs/client-libraries-python.md
+++ b/site2/website-next/docs/client-libraries-python.md
@@ -34,16 +34,14 @@ $ pip install pulsar-client==@pulsar:version_number@
If you install the client libraries on Linux to support services like Pulsar
functions or Avro serialization, you can install optional components alongside
the `pulsar-client` library.
```shell
-
# avro serialization
-$ pip install pulsar-client[avro]=='@pulsar:version_number@'
+$ pip install 'pulsar-client[avro]==@pulsar:version_number@'
# functions runtime
-$ pip install pulsar-client[functions]=='@pulsar:version_number@'
+$ pip install 'pulsar-client[functions]==@pulsar:version_number@'
# all optional components
-$ pip install pulsar-client[all]=='@pulsar:version_number@'
-
+$ pip install 'pulsar-client[all]==@pulsar:version_number@'
```
Installation via PyPI is available for the following Python versions:
@@ -61,11 +59,9 @@ To install the `pulsar-client` library by building from
source, follow [instruct
To install the built Python bindings:
```shell
-
$ git clone https://github.com/apache/pulsar
$ cd pulsar/pulsar-client-cpp/python
$ sudo python setup.py install
-
```
## API Reference
@@ -81,7 +77,6 @@ You can find a variety of Python code examples for the
`pulsar-client` library.
The following example creates a Python producer for the `my-topic` topic and
sends 10 messages on that topic:
```python
-
import pulsar
client = pulsar.Client('pulsar://localhost:6650')
@@ -92,7 +87,6 @@ for i in range(10):
producer.send(('Hello-%d' % i).encode('utf-8'))
client.close()
-
```
### Consumer example
@@ -100,7 +94,6 @@ client.close()
The following example creates a consumer with the `my-subscription`
subscription name on the `my-topic` topic, receives incoming messages, prints
the content and ID of messages that arrive, and acknowledges each message to
the Pulsar broker.
```python
-
import pulsar
client = pulsar.Client('pulsar://localhost:6650')
@@ -118,13 +111,11 @@ while True:
consumer.negative_acknowledge(msg)
client.close()
-
```
This example shows how to configure negative acknowledgement.
```python
-
from pulsar import Client, schema
client = Client('pulsar://localhost:6650')
consumer =
client.subscribe('negative_acks','test',schema=schema.StringSchema())
@@ -147,7 +138,6 @@ try:
except:
print("no more msg")
pass
-
```
### Reader interface example
@@ -155,7 +145,6 @@ except:
You can use the Pulsar Python API to use the Pulsar [reader
interface](concepts-clients.md#reader-interface). Here's an example:
```python
-
# MessageId taken from a previously fetched message
msg_id = msg.message_id()
@@ -165,7 +154,6 @@ while True:
msg = reader.read_next()
print("Received message '{}' id='{}'".format(msg.data(), msg.message_id()))
# No acknowledgment
-
```
### Multi-topic subscriptions
@@ -175,7 +163,6 @@ In addition to subscribing a consumer to a single Pulsar
topic, you can also sub
The following is an example:
```python
-
import re
consumer = client.subscribe(re.compile('persistent://public/default/topic-*'),
'my-subscription')
while True:
@@ -188,7 +175,6 @@ while True:
# Message failed to be processed
consumer.negative_acknowledge(msg)
client.close()
-
```
### Create a Python client with multiple advertised listeners
@@ -197,11 +183,9 @@ To ensure clients in both internal and external networks
can connect to a Pulsar
The following example creates a Python client using multiple advertised
listeners:
```python
-
import pulsar
client = pulsar.Client('pulsar://localhost:6650', listener_name='external')
-
```
## Schema
@@ -254,19 +238,16 @@ When adding a field, you can use these parameters in the
constructor.
##### Simple definition
```python
-
class Example(Record):
a = String()
b = Integer()
c = Array(String())
i = Map(String())
-
```
##### Using enums
```python
-
from enum import Enum
class Color(Enum):
@@ -277,13 +258,11 @@ class Color(Enum):
class Example(Record):
name = String()
color = Color
-
```
##### Complex types
```python
-
class MySubRecord(Record):
x = Integer()
y = Long()
@@ -292,7 +271,6 @@ class MySubRecord(Record):
class Example(Record):
a = String()
sub = MySubRecord()
-
```
##### Set namespace for Avro schema
@@ -300,25 +278,21 @@ class Example(Record):
Set the namespace for Avro Record schema using the special field
`_avro_namespace`.
```python
-
class NamespaceDemo(Record):
_avro_namespace = 'xxx.xxx.xxx'
x = String()
y = Integer()
-
```
The schema definition is like this.
-```
-
+```json
{
- 'name': 'NamespaceDemo', 'namespace': 'xxx.xxx.xxx', 'type': 'record',
'fields': [
- {'name': 'x', 'type': ['null', 'string']},
- {'name': 'y', 'type': ['null', 'int']}
+ "name": "NamespaceDemo", "namespace": "xxx.xxx.xxx", "type": "record",
"fields": [
+ {"name": "x", "type": ["null", "string"]},
+ {"name": "y", "type": ["null", "int"]}
]
}
-
```
### Declare and validate schema
@@ -334,7 +308,6 @@ Similarly, for a consumer or reader, the consumer returns
an object (which is an
**Example**
```python
-
consumer = client.subscribe(
topic='my-topic',
subscription_name='my-subscription',
@@ -350,7 +323,6 @@ while True:
except Exception:
# Message failed to be processed
consumer.negative_acknowledge(msg)
-
```
````mdx-code-block
@@ -365,7 +337,6 @@ You can send byte data using a `BytesSchema`.
**Example**
```python
-
producer = client.create_producer(
'bytes-schema-topic',
schema=BytesSchema())
@@ -377,7 +348,6 @@ consumer = client.subscribe(
schema=BytesSchema())
msg = consumer.receive()
data = msg.value()
-
```
</TabItem>
@@ -388,7 +358,6 @@ You can send string data using a `StringSchema`.
**Example**
```python
-
producer = client.create_producer(
'string-schema-topic',
schema=StringSchema())
@@ -400,7 +369,6 @@ consumer = client.subscribe(
schema=StringSchema())
msg = consumer.receive()
str = msg.value()
-
```
</TabItem>
@@ -417,7 +385,6 @@ class variables.
**Example**
```python
-
class Example(Record):
a = Integer()
b = Integer()
@@ -434,7 +401,6 @@ consumer = client.subscribe(
schema=AvroSchema(Example))
msg = consumer.receive()
e = msg.value()
-
```
#### Method 2: JSON definition
@@ -446,7 +412,6 @@ You can declare an `AvroSchema` using JSON. In this case,
Avro schemas are defin
Below is an `AvroSchema` defined using a JSON file (_company.avsc_).
```json
-
{
"doc": "this is doc",
"namespace": "example.avro",
@@ -466,7 +431,6 @@ Below is an `AvroSchema` defined using a JSON file
(_company.avsc_).
{"name": "labels", "type": ["null", {"type": "map", "values":
"string"}]}
]
}
-
```
You can load a schema definition from file by using
[`avro.schema`](http://avro.apache.org/docs/current/gettingstartedpython.html)
or
[`fastavro.schema`](https://fastavro.readthedocs.io/en/latest/schema.html#fastavro._schema_py.load_schema).
@@ -479,8 +443,7 @@ If you use the "JSON definition" method to declare an
`AvroSchema`, pay attentio
**Example**
-```
-
+```python
from fastavro.schema import load_schema
from pulsar.schema import *
schema_definition = load_schema("examples/company.avsc")
@@ -507,7 +470,6 @@ producer.send(company)
msg = consumer.receive()
# Users could get a dict object by `value()` method.
msg.value()
-
```
</TabItem>
@@ -518,8 +480,7 @@ msg.value()
You can declare a `JsonSchema` by passing a class that inherits
from `pulsar.schema.Record` and defines the fields as class variables. This is
similar to using `AvroSchema`. The only difference is to use `JsonSchema`
instead of `AvroSchema` when defining schema type as shown below. For how to
use `AvroSchema` via record, see [here](client-libraries-python.md#method-1-record).
-```
-
+```python
producer = client.create_producer(
'avro-schema-topic',
schema=JsonSchema(Example))
@@ -528,7 +489,6 @@ consumer = client.subscribe(
'avro-schema-topic',
'sub',
schema=JsonSchema(Example))
-
```
</TabItem>
@@ -545,10 +505,8 @@ consumer = client.subscribe(
To use the end-to-end encryption feature in the Python client, you need to
configure `publicKeyPath` for producer and `privateKeyPath` for consumer.
```
-
publicKeyPath: "./public.pem"
privateKeyPath: "./private.pem"
-
```
### Tutorial
@@ -566,10 +524,8 @@ This section provides step-by-step instructions on how to
use the end-to-end enc
**Input**
```shell
-
openssl genrsa -out private.pem 2048
openssl rsa -in private.pem -pubout -out public.pem
-
```
2. Create a producer to send encrypted messages.
@@ -577,7 +533,6 @@ This section provides step-by-step instructions on how to
use the end-to-end enc
**Input**
```python
-
import pulsar
publicKeyPath = "./public.pem"
@@ -589,7 +544,6 @@ This section provides step-by-step instructions on how to
use the end-to-end enc
print('sent message')
producer.close()
client.close()
-
```
3. Create a consumer to receive encrypted messages.
@@ -597,7 +551,6 @@ This section provides step-by-step instructions on how to
use the end-to-end enc
**Input**
```python
-
import pulsar
publicKeyPath = ""
@@ -609,7 +562,6 @@ This section provides step-by-step instructions on how to
use the end-to-end enc
print("Received msg '{}' id = '{}'".format(msg.data(), msg.message_id()))
consumer.close()
client.close()
-
```
4. Run the consumer to receive encrypted messages.
@@ -617,9 +569,7 @@ This section provides step-by-step instructions on how to
use the end-to-end enc
**Input**
```shell
-
python consumer.py
-
```
5. In a new terminal tab, run the producer to produce encrypted messages.
@@ -627,9 +577,7 @@ This section provides step-by-step instructions on how to
use the end-to-end enc
**Input**
```shell
-
python producer.py
-
```
Now you can see the producer sends messages and the consumer receives
messages successfully.
@@ -639,16 +587,11 @@ This section provides step-by-step instructions on how to
use the end-to-end enc
This is from the producer side.
```
-
sent message
-
```
This is from the consumer side.
```
-
Received msg 'encryption message' id = '(0,0,-1,-1)'
-
```
-