jihoonson commented on pull request #11507: URL: https://github.com/apache/druid/pull/11507#issuecomment-891402587
> > Seems like only subTaskId and loadSpec are in use in this class which confuses me why this class needs other fields. I suggest to add a new interface for PartitionLocation and let DeepStoragePartitionLocation implement it instead of extending GenericPartitionLocation because GenericPartitionLocation is designed for local storage for shuffle. Since we already have an abstract class of the same name of PartitionLocation, you will have to rename it or use another name for the new interface. > > @jihoonson I moved PartitionLocation to be an interface but realized there are other abstract classes like `PartialSegmentMergeTask`, `PartialSegmentMergeIOConfig`, `PartialSegmentMergeIngestionSpec` similar to `PartitionLocation` which will ideally also need to be made interfaces and each having two implementations based on `PartitionLocation` class. However I don't see what will be gained from interfacing these as the logic for these classes will be same for any PartitionLocation as of now, seems like generalizing too much. I think we can keep `DeepStoragePartitionLocation` an extension of `GenericPartitionLocation` to avoid this problem but some of the fields would be redundant. What do you think ? @pjain1 I wanted to understand what the difficulty is, so took a stab and implemented what I suggested in my previous comment. The code is available in my [branch](https://github.com/jihoonson/druid/tree/partition-location-interface). I haven't tested my change, but at least it compiles successfully. `PartialSegmentMergeTask`, `PartialSegmentMergeIOConfig`, and `PartialSegmentMergeIngestionSpec` still remain as abstract classes today because of a historical reason (I created them as abstract but haven't cleaned them up). So I propomted `PartialSegmentMergeIOConfig` to be non-abstract which now accepts a list of `PartitionLocation`s instead of a list of extensions of `PartitionLocation`. `ShuffleClient` now needs to know what type of `PartitionLocation` to use. `PartialSegmentMergeIngestionSpec` is also no longer an abstract class. This way, you don't have to add new `PartialSegmentMergeTask`, `PartialSegmentMergeIOConfig`, and `PartialSegmentMergeIngestionSpec` implementations corresponding to `DeepStoragePartitionLocation`. I haven't touched `PartitionStat` in my branch, but think a similar technique can apply. What do you think? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
