jihoonson opened a new issue #9352: Broken feature: appending linearly partitioned segments into a hash partitioned datasource
URL: https://github.com/apache/druid/issues/9352

### Affected Version

0.16, 0.17, master

### Description

Before 0.16, Druid allowed you to create a datasource with the `HashedPartitionsSpec` and then run a task that appends to that datasource with linear partitioning (using `maxRowsPerSegment`). This was possible because segments created with `HashedPartitionsSpec` have the `HashBasedNumberedShardSpec`, which extends `NumberedShardSpec`, which in turn is the shard spec used for linearly partitioned segments (see https://github.com/apache/druid/blob/0.15.1-incubating/server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java#L691-L700).

This feature was broken in #7547, and the breakage should be considered a bug. However, I'm wondering whether we really want to support this in the future, for the following reasons.

- Allowing mixed partitioning methods for one datasource is confusing and not very useful.
- This feature introduces an ambiguous concept of the "core partitions". Only a hash partitioned datasource has core partitions, which are the set of segments created by the initial task. All segments in the core partitions should have the same `HashBasedNumberedShardSpec`, but other segments should have the `NumberedShardSpec`. In timeline management, a hash partitioned datasource is regarded as visible in brokers once all segments in the core partitions become available in historicals, no matter how many segments are left in the non-core partitions. I think this concept is not that useful and makes things complicated.
- This feature allows you to append only _linearly_ partitioned segments to a _hash_ partitioned datasource. Other combinations or directions are not allowed.
- Finally, https://github.com/apache/druid/issues/9241 was recently proposed, which seems more promising.

I would like to promote https://github.com/apache/druid/issues/9241 rather than fix this bug. Any thoughts are welcome.
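
To make the "core partitions" rule described above easier to discuss, here is a minimal Java sketch of it. This is not Druid's code: `LinearSpec`, `HashSpec`, and `isVisible` are hypothetical stand-ins for `NumberedShardSpec`, `HashBasedNumberedShardSpec`, and the timeline's visibility check, and the model assumes that core partitions are numbered from 0 and identified only by having the hash-based spec.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CorePartitionsSketch {

    /** Hypothetical stand-in for the shard spec of a linearly partitioned (appended) segment. */
    static class LinearSpec {
        final int partitionNum;

        LinearSpec(int partitionNum) {
            this.partitionNum = partitionNum;
        }
    }

    /** Hypothetical stand-in for the hash-based spec; it extends the linear one, as in the issue. */
    static class HashSpec extends LinearSpec {
        final int numCorePartitions;

        HashSpec(int partitionNum, int numCorePartitions) {
            super(partitionNum);
            this.numCorePartitions = numCorePartitions;
        }
    }

    /**
     * Returns true once every core partition 0..numCorePartitions-1 is available.
     * Appended (non-core) segments are ignored, so they never affect visibility.
     */
    static boolean isVisible(int numCorePartitions, List<LinearSpec> available) {
        Set<Integer> coreSeen = new HashSet<>();
        for (LinearSpec spec : available) {
            if (spec instanceof HashSpec) {
                coreSeen.add(spec.partitionNum);
            }
        }
        for (int i = 0; i < numCorePartitions; i++) {
            if (!coreSeen.contains(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Both core partitions loaded; the appended segment (partition 2) is still missing.
        List<LinearSpec> coreOnly = Arrays.asList(new HashSpec(0, 2), new HashSpec(1, 2));
        System.out.println(isVisible(2, coreOnly));

        // Appended segments loaded, but core partition 1 is missing.
        List<LinearSpec> coreIncomplete =
            Arrays.asList(new HashSpec(0, 2), new LinearSpec(2), new LinearSpec(3));
        System.out.println(isVisible(2, coreIncomplete));
    }
}
```

In this simplified model the first case counts as visible even though an appended segment has not loaded, while the second does not, regardless of how many appended segments are present. That asymmetry is the part that seems complicated for little benefit.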
