djouallah commented on issue #695: URL: https://github.com/apache/arrow-rs-object-store/issues/695#issuecomment-4293523636
Follow-up: confirmed the issue is **OneLake-specific**, not a general regression against Azure storage accounts.

Tested against a standard ADLS Gen2 account (`*.dfs.core.windows.net`) with the same `object_store = "=0.13.2"` binary.

Test table `_delta_log/` listing (3 files):

- `_delta_log/00000000000000000000.crc`
- `_delta_log/00000000000000000000.json`
- `_delta_log/_autostats/00000000000000000000.0000000000.stats.json`

`list_with_offset` results on **standard ADLS Gen2** (`dataiceberg.dfs.core.windows.net`):

| offset | files > offset (expected) | returned |
|---|---|---|
| `_delta_log/00000000000000000000` | 3 (all files; `.crc`, `.json`, `_autostats` all lex > bare offset) | 3 ✓ |
| `_delta_log/00000000000000000001` | 1 (`_autostats` only; `_` 0x5F > `0` 0x30, while `00...00.crc/json` < `...01`) | 1 ✓ |
| `_delta_log/00000000000000000099` | 1 (`_autostats` only; same reasoning) | 1 ✓ |

So `startFrom` does exactly what #623 intends on standard ADLS Gen2: a proper lexicographic server-side cursor.

Against OneLake (same binary, same auth flow, a `_delta_log/` with 11 files including `00000000000000000010.checkpoint.parquet`, `.../00000000000000000011.json` etc.), `list_with_offset(prefix, _delta_log/00000000000000000010)` returns **0** files instead of the expected 6.

So the OneLake endpoint either:

- Rejects or mis-parses the `startFrom` parameter entirely, OR
- Interprets it as an inclusive prefix boundary rather than an exclusive lex cursor

Either way, the OneLake list-blobs REST surface is not compatible with the `startFrom` contract that #623 relies on. Standard ADLS Gen2 users are unaffected.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
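For reference, the "expected" counts in the ADLS Gen2 results follow from plain byte-wise comparison of the paths against the offset. The sketch below only models that exclusive-cursor rule on the client side; it does not call `object_store` or any Azure endpoint, and the names `files_after_offset` and `ADLS_LISTING` are hypothetical helpers introduced for illustration.

```rust
/// The three-file `_delta_log/` listing from the standard ADLS Gen2 test.
const ADLS_LISTING: [&str; 3] = [
    "_delta_log/00000000000000000000.crc",
    "_delta_log/00000000000000000000.json",
    "_delta_log/_autostats/00000000000000000000.0000000000.stats.json",
];

/// Keep only the paths that sort strictly after `offset` in byte-wise
/// lexicographic order, i.e. treat `offset` as an exclusive cursor.
/// (Hypothetical helper, not part of the `object_store` API.)
fn files_after_offset<'a>(paths: &[&'a str], offset: &str) -> Vec<&'a str> {
    paths
        .iter()
        .copied()
        // Byte slices compare lexicographically: a longer path that starts
        // with `offset` is greater, and `_` (0x5F) sorts after `0` (0x30).
        .filter(|p| p.as_bytes() > offset.as_bytes())
        .collect()
}
```

Calling `files_after_offset(&ADLS_LISTING, "_delta_log/00000000000000000001")` keeps only the `_autostats` entry, which matches the second row of the results.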
