asafm commented on code in PR #20455: URL: https://github.com/apache/pulsar/pull/20455#discussion_r1222978068
##########
pip/pip-272.md:
##########
@@ -0,0 +1,108 @@
+# Background knowledge
+
+In Pulsar, a pulsar function support storing state, such as a `WordCount` function which stores the state of its counters.
+
+```python
+from pulsar import Function
+
+class WordCount(Function):
+    def process(self, item, context):
+        for word in item.split():
+            context.incr_counter(word, 1)
+```
+
+Currently, Pulsar uses Bookkeeper as the default state storage interface. We can also use other state stores, which can be configured in the `conf/functions_worker.yml` using the field: `stateStorageProviderImplementation`, this YAML file will be parsed and loaded in Pulsar as the `WorkerConfig`.
+
+## WorkerConfig
+
+The `WorkerConfig` is used to configure the Pulsar functions worker, it's parsed from a YAML file.
+
+The `PulsarBrokerStarter`(and `PulsarStandalone`) has an optional argument `--functions-worker-conf` or `-fwc` in short to specify the path of this yaml file, if not specified, the default path: `conf/functions_worker.yml` will be used.
+
+By using YAML as the config file for Pulsar functions worker, operators can provide complicated fields such as `List` and `Map` to the `WorkerConfig`, which is more convenient than using JSON.
+
+Currently, the `WorkerConfig` has two fields related to the state store:
+
+1. `stateStorageProviderImplementation`: The implementation class for the state store which should implement the interface `StateStoreProvider`, such as `org.apache.pulsar.functions.instance.state.BKStateStoreProviderImpl`
+
+2. `stateStorageServiceUrl`: The service URL of state storage, such as: `bk://localhost:4181`
+
+## `Runtime` and `RuntimeFactory`:
+
+Pulsar Function supports three kinds of runtime and, correspondingly, has three related `RuntimeFactory` to create them,

Review Comment:
   ```suggestion
   Pulsar Function supports three kinds of runtime and, correspondingly, has three related `RuntimeFactory` to create them:
   ```

##########
pip/pip-272.md:
##########
@@ -0,0 +1,108 @@
+# Background knowledge
+
+In Pulsar, a pulsar function support storing state, such as a `WordCount` function which stores the state of its counters.
+
+```python
+from pulsar import Function
+
+class WordCount(Function):
+    def process(self, item, context):
+        for word in item.split():
+            context.incr_counter(word, 1)
+```
+
+Currently, Pulsar uses Bookkeeper as the default state storage interface. We can also use other state stores, which can be configured in the `conf/functions_worker.yml` using the field: `stateStorageProviderImplementation`, this YAML file will be parsed and loaded in Pulsar as the `WorkerConfig`.

Review Comment:
   ```suggestion
   Currently, Pulsar uses Bookkeeper as the default state storage interface. We can also use other state stores, which can be configured in the `conf/functions_worker.yml` using the field `stateStorageProviderImplementation` (this YAML file is parsed and loaded in Pulsar as the `WorkerConfig` - see below)
   ```

##########
pip/pip-272.md:
##########
@@ -42,11 +15,11 @@ Currently, Pulsar uses Bookkeeper as the default state storage interface. We can
 The `WorkerConfig` is used to configure the Pulsar functions worker and has two fields related to the state store:

-1. `stateStorageProviderImplementation`: The implementation class for the state store which should implement the interface`StateStoreProvider`, such as `org.apache.pulsar.functions.instance.state.BKStateStoreProviderImpl`
+1. `stateStorageProviderImplementation`: The implementation class for the state store which should implement the interface `StateStoreProvider`, such as `org.apache.pulsar.functions.instance.state.BKStateStoreProviderImpl`

 2. `stateStorageServiceUrl`: The service URL of state storage, such as: `bk://localhost:4181`

-`Runtime` and `RuntimeFactory`:
+## `Runtime` and `RuntimeFactory`:

Review Comment:
   @jiangpengcheng
