stevenzwu commented on code in PR #9728:
URL: https://github.com/apache/iceberg/pull/9728#discussion_r1493074229
##########
format/spec.md:
##########
@@ -1237,17 +1237,36 @@ Content file (data or delete) is serialized as a JSON object according to the fo
| **`equality-ids`** |`JSON list of int: Field ids used to determine row equality in equality delete files`|`[1]`|
| **`sort-order-id`** |`JSON int`|`1`|
-### File Scan Task Serialization
-
-File scan task is serialized as a JSON object according to the following table.
-
-| Metadata field |JSON representation|Example|
-|--------------------------|--- |--- |
-| **`schema`** |`JSON object`|`See above, read schemas instead`|
-| **`spec`** |`JSON object`|`See above, read partition specs instead`|
-| **`data-file`** |`JSON object`|`See above, read content file instead`|
-| **`delete-files`** |`JSON list of objects`|`See above, read content file instead`|
-| **`residual-filter`** |`JSON object: residual filter expression`|`{"type":"eq","term":"id","value":1}`|
+### Task Serialization
Review Comment:
   Flink source checkpoints pending splits of `FileScanTask`. The standardized JSON serialization of scan tasks is used both by the REST OpenAPI spec and by Flink (for checkpointing and for job manager -> task manager split assignment). I would imagine that if Spark streaming were to checkpoint pending splits (scan tasks), it would probably also prefer JSON serialization over Java serialization.
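
   For illustration only, here is a minimal sketch of how an engine-side split serializer could round-trip a scan task through JSON instead of Java serialization. It assumes a parser utility along the lines of Iceberg's `FileScanTaskParser` with `toJson`/`fromJson` helpers; the class name, method signatures, and the serializer class itself are assumptions for the sketch, not part of this spec change.

   ```java
   import java.nio.charset.StandardCharsets;

   import org.apache.iceberg.FileScanTask;
   import org.apache.iceberg.FileScanTaskParser;

   // Hypothetical split serializer that persists pending FileScanTask splits
   // as JSON, e.g. for a checkpoint or a split-assignment message.
   public class JsonFileScanTaskSerializer {

     // Serialize a scan task to UTF-8 JSON bytes.
     public byte[] serialize(FileScanTask task) {
       String json = FileScanTaskParser.toJson(task);
       return json.getBytes(StandardCharsets.UTF_8);
     }

     // Restore the scan task from previously written JSON bytes.
     public FileScanTask deserialize(byte[] bytes, boolean caseSensitive) {
       String json = new String(bytes, StandardCharsets.UTF_8);
       return FileScanTaskParser.fromJson(json, caseSensitive);
     }
   }
   ```

   The point of the sketch is that a JSON-based format defined in the spec lets different engines (Flink, Spark streaming, a REST client) exchange or persist the same scan task representation without relying on Java serialization compatibility.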