C0urante commented on code in PR #13945: URL: https://github.com/apache/kafka/pull/13945#discussion_r1261413255
########## connect/file/src/main/java/org/apache/kafka/connect/file/FileStreamSourceConnector.java: ########## @@ -101,4 +105,50 @@ public ExactlyOnceSupport exactlyOnceSupport(Map<String, String> props) { : ExactlyOnceSupport.UNSUPPORTED; } + @Override + public boolean alterOffsets(Map<String, String> connectorConfig, Map<Map<String, ?>, Map<String, ?>> offsets) { + AbstractConfig config = new AbstractConfig(CONFIG_DEF, connectorConfig); + String filename = config.getString(FILE_CONFIG); + if (filename == null || filename.isEmpty()) { + // If the 'file' configuration is unspecified, stdin is used and no offsets are tracked + throw new ConnectException("Offsets cannot be modified if the '" + FILE_CONFIG + "' configuration is unspecified. " + + "This is because stdin is used for input and offsets are not tracked."); + } + + // This connector makes use of a single source partition at a time which represents the file that it is configured to read from. + // However, there could also be source partitions from previous configurations of the connector. + for (Map.Entry<Map<String, ?>, Map<String, ?>> partitionOffset : offsets.entrySet()) { + Map<String, ?> partition = partitionOffset.getKey(); + if (partition == null) { + throw new ConnectException("Partition objects cannot be null"); + } + + if (!partition.containsKey(FILENAME_FIELD)) { + throw new ConnectException("Partition objects should contain the key '" + FILENAME_FIELD + "'"); + } + + Map<String, ?> offset = partitionOffset.getValue(); + // null offsets are allowed and represent a deletion of offsets for a partition + if (offset == null) { + return true; + } + + if (!offset.containsKey(POSITION_FIELD)) { + throw new ConnectException("Offset objects should either be null or contain the key '" + POSITION_FIELD + "'"); + } + + // The 'position' in the offset represents the position in the file's byte stream and should be a non-negative long value + try { + long offsetPosition = Long.parseLong(String.valueOf(offset.get(POSITION_FIELD))); Review Comment: Ah, nice catch! I noticed the discrepancy in numeric types while working on [KAFKA-15177](https://issues.apache.org/jira/browse/KAFKA-15177) but hadn't even considered the possibility of aligning the types across invocations of `alterOffsets` and `OffsetStorageReader::offset`/`OffsetStorageReader::offsets`. I think re-deserializing the offsets before passing them to `alterOffsets` is a great idea. Unless the request body is gigantic there shouldn't be serious performance concerns, and it also acts as a nice preflight check to ensure that the offsets can be successfully propagated to the connector's tasks through the offsets topic. I still don't love permitting string types for the connector's `position` offset values--it doesn't seem like a great endorsement of our API if we have to implement workarounds in the file connectors, which are the first example of the connector API that many developers see. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org