Tom-Newton opened a new issue, #40057:
URL: https://github.com/apache/arrow/issues/40057

   ### Describe the enhancement requested
   
   Child of https://github.com/apache/arrow/issues/18014
   
   Manually modifying https://github.com/apache/arrow/pull/40021 to run the 
Python tests against a real blob storage account caused a failure when 
attempting to write the metadata `{'Content-Type': 'x-pyarrow/test'}`.
   
   It turns out that real Azure storage doesn't allow `-` in metadata keys. 
This behaviour is the same on real flat namespace and hierarchical namespace 
accounts, but Azurite accepts it. 
   
   Apparently the keys (names) of metadata must conform to the naming rules 
for [C# 
identifiers](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference); 
see 
https://learn.microsoft.com/en-us/rest/api/storageservices/naming-and-referencing-containers--blobs--and-metadata#metadata-names
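
For illustration, a rough pre-flight check for such keys could look like the 
sketch below. It is an ASCII-only approximation of the C# identifier rules 
(a letter or underscore first, then letters, digits or underscores); the real 
rules also admit some Unicode characters, and the function name 
`is_valid_metadata_name` is hypothetical, not part of Arrow or any Azure SDK.

```python
import re

# ASCII-only approximation of a C# identifier (assumption: Unicode letters,
# which C# also allows, are deliberately ignored here).
_METADATA_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_metadata_name(name: str) -> bool:
    """Return True if `name` looks like an acceptable metadata key."""
    return _METADATA_NAME.fullmatch(name) is not None
```

With this check, `Content-Type` is rejected because of the `-`, while 
`content_type` passes.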
   
   We need to decide what we want to do here. I think we either have to accept 
these limitations or encode the metadata keys before writing them to Azure and 
decode them when reading back. A quick Google search came up with these options 
for potential encodings: 
https://stackoverflow.com/questions/32037525/encode-to-alphanumeric-in-javascript#:~:text=To%20encode%20to%20an%20alphanumeric,in%20a%20shorter%20encoded%20string.
   
   The downside of encoding the metadata would be that other Azure clients 
won't know to decode it. 
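
To make the encode/decode option concrete, here is a minimal sketch of one 
possible reversible encoding. It is only an assumption about what such a scheme 
could look like, not a proposal taken from the linked answer: every UTF-8 byte 
that is not an ASCII letter or digit (including `_` itself, which acts as the 
escape character) becomes `_` followed by two hex digits, and a leading digit is 
escaped as well so the result starts with a letter or underscore. The function 
names are hypothetical.

```python
import string

# Characters that can be written unchanged; "_" is reserved as the escape.
_SAFE = set(string.ascii_letters + string.digits)


def encode_metadata_key(key: str) -> str:
    """Encode an arbitrary key using only ASCII letters, digits and "_"."""
    if not key:
        raise ValueError("metadata key must not be empty")
    out = []
    for i, byte in enumerate(key.encode("utf-8")):
        ch = chr(byte)
        # A leading digit is escaped too, since identifiers cannot start
        # with a digit.
        if ch in _SAFE and not (i == 0 and ch.isdigit()):
            out.append(ch)
        else:
            out.append(f"_{byte:02x}")
    return "".join(out)


def decode_metadata_key(encoded: str) -> str:
    """Invert encode_metadata_key."""
    raw = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] == "_":
            # "_" is always followed by exactly two hex digits.
            raw.append(int(encoded[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(encoded[i]))
            i += 1
    return raw.decode("utf-8")
```

Under this scheme `Content-Type` would be stored as `Content_2dType`, which is 
exactly the form another Azure client would see if it read the metadata without 
knowing about the encoding.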
   
   ### Component(s)
   
   C++


