[GitHub] [druid] techdocsmith commented on a change in pull request #10748: reword for clarity
techdocsmith commented on a change in pull request #10748:
URL: https://github.com/apache/druid/pull/10748#discussion_r571158173

## File path: docs/ingestion/native-batch.md ##
@@ -192,7 +192,8 @@ that range if there's some stray data with unexpected timestamps.
 ||---|---|-|
 |type|The task type, this should always be `index_parallel`.|none|yes|
 |inputFormat|[`inputFormat`](./data-formats.md#input-format) to specify how to parse input data.|none|yes|
-|appendToExisting|Creates segments as additional shards of the latest version, effectively appending to the segment set instead of replacing it. The current limitation is that you can append to any datasources regardless of their original partitioning scheme, but the appended segments should be partitioned using the `dynamic` partitionsSpec.|false|no|
+|appendToExisting|Creates segments as additional shards of the latest version, effectively appending to the segment set instead of replacing it. This means that you can append new segments to any datasource regardless of its original partitioning scheme.
+`appendToExisting` is incompatible with `forceGuaranteedRollup` and its associated partitioning types, for example `hashed`. Therefore, you must use the `dynamic` partitioning type for the appended segments. Otherwise the task fails with the following error: "forceGuaranteedRollup must be set".|false|no|

Review comment:
Thanks for the detailed review, @jihoonson. I removed the line break and the mention of the error. I think your info can go into a troubleshooting topic on the forum.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [druid] techdocsmith commented on a change in pull request #10748: reword for clarity
techdocsmith commented on a change in pull request #10748:
URL: https://github.com/apache/druid/pull/10748#discussion_r570675734

## File path: docs/ingestion/native-batch.md ##
@@ -192,7 +192,7 @@ that range if there's some stray data with unexpected timestamps.
 ||---|---|-|
 |type|The task type, this should always be `index_parallel`.|none|yes|
 |inputFormat|[`inputFormat`](./data-formats.md#input-format) to specify how to parse input data.|none|yes|
-|appendToExisting|Creates segments as additional shards of the latest version, effectively appending to the segment set instead of replacing it. The current limitation is that you can append to any datasources regardless of their original partitioning scheme, but the appended segments should be partitioned using the `dynamic` partitionsSpec.|false|no|
+|appendToExisting|Creates segments as additional shards of the latest version, effectively appending to the segment set instead of replacing it. This means that you can append new segments to any datasource regardless of its original partitioning scheme. However, you must use the `dynamic` partitioning type for the appended segments .|false|no|

Review comment:
@jihoonson PTAL. I clarified that the task fails if you use another partitioning type, and why. I included the error text in case someone needs to search for it. Thanks.
[GitHub] [druid] techdocsmith commented on a change in pull request #10748: reword for clarity
techdocsmith commented on a change in pull request #10748:
URL: https://github.com/apache/druid/pull/10748#discussion_r556188376

## File path: docs/ingestion/native-batch.md ##
@@ -192,7 +192,7 @@ that range if there's some stray data with unexpected timestamps.
 ||---|---|-|
 |type|The task type, this should always be `index_parallel`.|none|yes|
 |inputFormat|[`inputFormat`](./data-formats.md#input-format) to specify how to parse input data.|none|yes|
-|appendToExisting|Creates segments as additional shards of the latest version, effectively appending to the segment set instead of replacing it. The current limitation is that you can append to any datasources regardless of their original partitioning scheme, but the appended segments should be partitioned using the `dynamic` partitionsSpec.|false|no|
+|appendToExisting|Creates segments as additional shards of the latest version, effectively appending to the segment set instead of replacing it. This means that you can append new segments to any datasource regardless of its original partitioning scheme. However, you must use the `dynamic` partitioning type for the appended segments .|false|no|

Review comment:
I am still a noob, but I tried setting the partitionsSpec to `hashed` and `appendToExisting` to `true`. The task succeeded, but when I checked the payload, it looked like `appendToExisting` had been set to `false`:

```
"ioConfig": {
  "type": "index_parallel",
  "inputSource": {
    "type": "http",
    "uris": [
      "https://static.imply.io/data/wikipedia.json.gz"
    ],
    "httpAuthenticationUsername": null,
    "httpAuthenticationPassword": null
  },
  "inputFormat": {
    "type": "json",
    "flattenSpec": {
      "useFieldDiscovery": true,
      "fields": []
    },
    "featureSpec": {}
  },
  "appendToExisting": false
```

I may need to try again and follow up with someone with deeper insight to verify how this is supposed to work in this case.
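
The constraint discussed in this thread can be sketched as a small pre-submission check. This is only an illustration of the wording proposed in the diff (appended segments must use the `dynamic` partitioning type, and `appendToExisting` is incompatible with `forceGuaranteedRollup`); it is not Druid's own validation logic, and one comment in the thread reports different observed behavior (the task succeeded and `appendToExisting` showed as `false` in the payload). The helper name `check_append_spec` is hypothetical. It is also an assumption that `partitionsSpec` and `forceGuaranteedRollup` sit under `tuningConfig`, and that a missing `partitionsSpec` means `dynamic`; only the `ioConfig.appendToExisting` location appears in the payload quoted above.

```python
from typing import List


def check_append_spec(spec: dict) -> List[str]:
    """Return a list of conflicts; an empty list means none were found.

    Hypothetical helper: `spec` is a dict holding `ioConfig` and,
    by assumption, `tuningConfig` sections of an index_parallel task.
    """
    io_config = spec.get("ioConfig") or {}
    tuning = spec.get("tuningConfig") or {}

    # appendToExisting defaults to false per the quoted table row,
    # in which case the constraint does not apply.
    if not io_config.get("appendToExisting", False):
        return []

    problems = []

    # Assumption: an absent partitionsSpec is treated as `dynamic`.
    partitions_spec = tuning.get("partitionsSpec") or {}
    partition_type = partitions_spec.get("type", "dynamic")
    if partition_type != "dynamic":
        problems.append(
            "appendToExisting requires the dynamic partitioning type, "
            "found: " + str(partition_type)
        )

    if tuning.get("forceGuaranteedRollup", False):
        problems.append(
            "appendToExisting is incompatible with forceGuaranteedRollup"
        )

    return problems


if __name__ == "__main__":
    # The combination tried in the thread: hashed + appendToExisting.
    tried = {
        "ioConfig": {"type": "index_parallel", "appendToExisting": True},
        "tuningConfig": {"partitionsSpec": {"type": "hashed"}},
    }
    for problem in check_append_spec(tried):
        print(problem)
```

A check like this would run before submitting the task, so a conflicting spec is flagged up front instead of being discovered from the task payload afterwards.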