yihua commented on code in PR #18876:
URL: https://github.com/apache/hudi/pull/18876#discussion_r3322269255


##########
website/docs/lance_file_format.md:
##########
@@ -91,35 +90,18 @@ export LANCE_BUNDLE_JAR=/path/to/lance-spark-bundle-3.5_2.12-0.4.0.jar
 spark-shell --jars $HUDI_BUNDLE_JAR,$LANCE_BUNDLE_JAR
 ```
 
-## How Hudi + Lance Work Together
+## Layering
 
-Hudi manages the table layer — transactions, schema, timeline, table services — while Lance handles the
-file-level storage:
+Hudi manages the table layer (timeline, metadata, schema, file groups, table services). Lance is the
+on-disk file format for base files. Log files for MOR tables remain Avro.
 
-```
-┌───────────────────────────────────┐
-│         Hudi Table Layer          │
-│  Timeline, Metadata, Indexing     │
-│  Transactions, Schema Evolution   │
-├───────────────────────────────────┤
-│     File Group / File Slice       │
-│  (same Hudi concepts as Parquet)  │
-├───────────────────────────────────┤
-│     Lance Data Files (.lance)     │
-│  Columnar storage                 │
-│  Fragment-based layout            │
-├───────────────────────────────────┤
-│   Storage (S3, GCS, HDFS, FS)    │
-└───────────────────────────────────┘
-```
-
-All Hudi table services work with Lance-backed tables:
+Table-service behavior on Lance-backed tables:
 
-- **Compaction** — merges log files into Lance base files
-- **Clustering** — reorganizes Lance files for better data locality
-- **Cleaning** — removes old Lance file versions
-- **Metadata indexing** — bloom filters work across Lance files; column stats and partition stats are
-  **automatically disabled** for Lance tables
+- **Compaction** — merges Avro log files into Lance base files.
+- **Clustering** — reorganizes records into new Lance files.
+- **Cleaning** — removes obsolete Lance file slices.
+- **Metadata indexing** — bloom filter indexing is supported. Column-stats and partition-stats
+  indices are automatically disabled for Lance base files.

Review Comment:
   This is removed



