This is an automated email from the ASF dual-hosted git repository.

fanng pushed a commit to branch lineage_doc
in repository https://gitbox.apache.org/repos/asf/gravitino.git


The following commit(s) were added to refs/heads/lineage_doc by this push:
     new e7c29d28a4 update doc
e7c29d28a4 is described below

commit e7c29d28a4020a266a8ff3fb348d1b240feb7383
Author: fanng <[email protected]>
AuthorDate: Wed Apr 16 11:11:56 2025 +0800

    update doc
---
 docs/lineage/gravitino-server-lineage.md | 7 +++----
 docs/lineage/gravitino-spark-lineage.md  | 6 ++----
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/docs/lineage/gravitino-server-lineage.md b/docs/lineage/gravitino-server-lineage.md
index 0cb40179bc..09f7b8fc75 100644
--- a/docs/lineage/gravitino-server-lineage.md
+++ b/docs/lineage/gravitino-server-lineage.md
@@ -13,10 +13,10 @@ Gravitino server provides a pluginable lineage framework to receive, process, an
 
 | Configuration item                            | Description                                                                                                                                                                                           | Default value                                          | Required | Since Version |
 |-----------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------|----------|---------------|
-| `gravitino.lineage.source`                    | The name of lineage event source. The default `http` event source will .                                                                                                                              | http                                                   | No       | 0.9.0         |
+| `gravitino.lineage.source`                    | The name of lineage event source.                                                                                                                                                                     | http                                                   | No       | 0.9.0         |
 | `gravitino.lineage.${sourceName}.sourceClass` | The name of the lineage source class which should implement `org.apache.gravitino.lineage.source.LineageSource` interface.                                                                            | (none)                                                 | No       | 0.9.0         |
 | `gravitino.lineage.processorClass`            | The name of the lineage processor class which should implement `org.apache.gravitino.lineage.processor.LineageProcessor` interface. The default noop processor will do nothing about the run event.   | `org.apache.gravitino.lineage.processor.NoopProcessor` | No       | 0.9.0         |
-| `gravitino.lineage.sinks`                     | The name of lineage event sinks.                                                                                                                                                                      | log                                                    | No       | 0.9.0         |
+| `gravitino.lineage.sinks`                     | The Lineage event sink names (support multiple sinks separated by commas).                                                                                                                            | log                                                    | No       | 0.9.0         |
 | `gravitino.lineage.${sinkName}.sinkClass`     | The name of the lineage sink class which should implement `org.apache.gravitino.lineage.sink.LineageSink` interface.                                                                                  | (none)                                                 | No       | 0.9.0         |
 | `gravitino.lineage.queueCapacity`             | The total capacity of lineage event queues. If there are multi lineage sinks, the sinks will use an isolated event queue with the capacity of `gravitino.lineage.queueCapacity` div the num of sinks. | 10000                                                  | No       | 0.9.0         |
 
@@ -52,5 +52,4 @@ Log sink will print the log in a separate log file `gravitino_lineage.log`, you
 
 ## High watermark status
 
-If the lineage sink is slow, the lineage event will heap in the async queue, the lineage system will enter high watermark status if the queue size is larger than the capability*0.9. In high watermark status, the lineage source should implement appropriate retry/logging mechanisms for rejected events to prevent
-system overload. For `http` source, it will return http status code `429` to the client side.
\ No newline at end of file
+When the lineage sink operates slowly, lineage events accumulate in the async queue. Once the queue size exceeds 90% of its capacity (high watermark threshold), the lineage system enters a high watermark status. In this state, the lineage source must implement retry and logging mechanisms for rejected events to prevent system overload. For the HTTP source, it will return the `429 Too Many Requests` status code to the client.
\ No newline at end of file
diff --git a/docs/lineage/gravitino-spark-lineage.md b/docs/lineage/gravitino-spark-lineage.md
index a2c539f0bb..d26811585e 100644
--- a/docs/lineage/gravitino-spark-lineage.md
+++ b/docs/lineage/gravitino-spark-lineage.md
@@ -20,10 +20,8 @@ By leveraging OpenLineage Spark plugin, Gravitino provides a separate Spark plug
 
 The Gravitino OpenLineage Spark plugin transforms the Gravitino metalake name into the dataset namespace. The dataset name varies by dataset type when generating lineage information.
 
-If you are using to access the table managed by Gravitino, the dataset name is as follows:
 When using the [Gravitino Spark connector](/spark-connector/spark-connector.md) to access tables managed by Gravitino, the dataset name follows this format:
 
-
 | Dataset Type    | Dataset name                                   | Example                    | Since Version |
 |-----------------|------------------------------------------------|----------------------------|---------------|
 | Hive catalog    | `$GravitinoCatalogName.$schemaName.$tableName` | `hive_catalog.db.student`  | 0.9.0         |
@@ -47,7 +45,7 @@ When accessing datasets by location (e.g., `SELECT * FROM parquet.$dataset_path`
 | GVFS location  | `$GravitinoCatalogName.$schemaName.$filesetName` | `fileset_catalog.schema.fileset_a`    | 0.9.0         |
 | Other location | location path                                    | `hdfs://127.0.0.1:9000/tmp/a/student` | 0.9.0         |
 
-For fileset dataset, the plugin add `fileset-location` facets which contains the location path.
+For GVFS location, the plugin add `fileset-location` facets which contains the location path.
 
 ```json
 "fileset-location" :
@@ -61,7 +59,7 @@ For fileset dataset, the plugin add `fileset-location` facets which contains the
 ## How to use 
 
 1. Download Gravitino OpenLineage plugin jar and place it to the classpath of Spark.
-2. Add configuration to the Spark to enable lineage collect.
+2. Add configuration to the Spark to enable lineage collection.
 
 Configuration example For Spark shell:
 
