avarnon opened a new issue, #6868:
URL: https://github.com/apache/arrow-rs/issues/6868
**Is your feature request related to a problem or challenge? Please describe
what you are trying to do.**
My team found that `object_store`'s `AzureClient.put_block()` uses an
incrementing counter to calculate the `content_id` and `block_id`. We use
`object_store` inside a Rust-based web service, which means that multiple
streams _could_ attempt to write to the same blob path in parallel. We believe
this could corrupt the file, because stream `a` could upload a block with the
same content/block ID as stream `b`.
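To illustrate the concern, here is a minimal sketch. It assumes the ID is derived only from the part index; `counter_block_id` and `ids_for_writer` are hypothetical stand-ins, not the actual `object_store` code, and the Base64 step is omitted to keep the example dependency-free.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a counter-derived ID: it depends only on the
/// part index, not on which writer produced it.
fn counter_block_id(part_idx: usize) -> String {
    // Fixed-width, space-padded decimal rendering of the index.
    format!("{part_idx:20}")
}

/// IDs a single writer would generate for `parts` sequential blocks.
fn ids_for_writer(parts: usize) -> Vec<String> {
    (0..parts).map(counter_block_id).collect()
}

fn main() {
    // Two independent streams writing to the same blob path.
    let writer_a = ids_for_writer(3);
    let writer_b = ids_for_writer(3);

    // Count how many of writer B's IDs were also produced by writer A.
    let seen: HashSet<&String> = writer_a.iter().collect();
    let overlap = writer_b.iter().filter(|id| seen.contains(id)).count();
    println!("{overlap} of {} block IDs collide", writer_b.len());
}
```

Under that assumption, every ID from the second writer matches one from the first, because nothing in the ID distinguishes the two streams.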
**Describe the solution you'd like**
My team would like to see `AzureClient.put_block()` use a randomized
content/block ID to prevent collisions.
**Describe alternatives you've considered**
We considered using Azure blob leases to prevent concurrent writes but
determined that this would be cost-prohibitive.
**Additional context**
```rust
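use base64::prelude::*;
use rand::Rng;
// Random 128-bit index; width 40 keeps IDs fixed-length (u128 has at most 39 digits).
let part_idx = u128::from_be_bytes(rand::thread_rng().gen());
let content_id = format!("{part_idx:40}");
let block_id = BASE64_STANDARD.encode(&content_id);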
```