Hi, I've learned that Azure has released a new Java SDK for blob storage that replaces the SDK originally used to create AzureDataStore. The new SDK is not backwards compatible with the original, but it contains a key fix for the bug behind OAK-8013.
I'd like to have a discussion about whether we should update AzureDataStore to use this latest Azure SDK. Please take some time to read and weigh in.

Question 1 - Why move from the old SDK to the new SDK?

The old SDK has a bug that prevents a fix for OAK-8013 (see also OAK-8104). In the current state, Oak does not properly support direct download of binaries with special characters in the filename. The way to fix this issue is to move away from the old SDK.

Question 2 - Why is moving to the new SDK a big deal?

The new SDK is completely different from the old SDK. While it introduces new classes and so on, the primary difference is a new paradigm: a more fluent, event-driven, async-style programming model. Using the new SDK will require AzureDataStore to do some tricks to perform the async operations in synchronous ways, manage conversions from byte buffers to streams, and so on. So not only is the new SDK not backward compatible, it also takes a different approach altogether. This will result in substantial changes to AzureDataStore, with significant accompanying risk.

In addition, I've been experimenting with the new SDK over the past few days, and I have concerns about the SDK itself. A very basic sample application, nearly a verbatim copy of their online sample, prints warnings to the console when it is run:

> WARNING: An illegal reflective access operation has occurred
> WARNING: Illegal reflective access by com.microsoft.rest.v2.Validator to field java.util.HashMap.serialVersionUID
> WARNING: Please consider reporting this to the maintainers of com.microsoft.rest.v2.Validator
> WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
> WARNING: All illegal access operations will be denied in a future release

I've seen other issues, like unhandled exceptions, in other sample apps I've created, even in code that otherwise performs the desired tasks correctly.

Question 3 - What are our options?

I see three:

1. Stay with the current, deprecated Azure SDK. We would probably be unable to fix OAK-8013/OAK-8105 correctly in that case, at least for Azure, which would mean direct downloads of files with special characters in the filename would not work. (It is theoretically possible that Microsoft would implement a fix in the deprecated SDK, but since the bug is already fixed in the new SDK I think that is unlikely.)

2. Update AzureDataStore to use the latest SDK. I expect this to be a significant effort - probably several weeks at least, given the many unknowns and the need to work the errors and exceptions I've seen out of the code.

3. Rip out the Azure SDK dependencies altogether and instead implement AzureDataStore directly against the Azure REST endpoints.

The last option is the one I'm strongly considering. Moving away from the SDK is perhaps not ideal at first, but it avoids this problem in the future, and we wouldn't have to accommodate an asynchronous API in our synchronous access model. I don't expect the work to be any greater than option 2. My primary concern is whether we can rely on backwards compatibility in the REST APIs going forward; I'm trying to find that out.

What does everyone else think? What questions do I need to get answered? Which option sounds best, or is there a better option I didn't list?

-MR
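P.S. To make the "tricks" under Question 2 concrete, here is a minimal sketch of the two pieces of glue code I mean: blocking on an async result, and turning byte-buffer chunks into the InputStream our synchronous callers expect. This is only an illustration; CompletableFuture stands in for the SDK's actual reactive types, and every name here is hypothetical, not the real SDK API.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.ByteBuffer;
import java.util.List;
import java.util.Vector;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class AsyncBridgeSketch {

    // Block on an async result so it fits a synchronous call path.
    // CompletableFuture is a stand-in for the SDK's reactive types.
    static <T> T awaitSync(CompletableFuture<T> future, long timeoutSeconds) throws Exception {
        return future.get(timeoutSeconds, TimeUnit.SECONDS);
    }

    // Convert the ByteBuffer chunks an async download typically emits into
    // the single InputStream a DataStore caller expects.
    static InputStream toInputStream(List<ByteBuffer> chunks) {
        Vector<InputStream> parts = new Vector<>();
        for (ByteBuffer chunk : chunks) {
            byte[] bytes = new byte[chunk.remaining()];
            chunk.get(bytes);
            parts.add(new ByteArrayInputStream(bytes));
        }
        return new SequenceInputStream(parts.elements());
    }
}
```

Even this toy version shows the cost: every synchronous entry point needs a timeout policy, and every download buffers or re-wraps data before our existing stream-based code can touch it.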
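P.P.S. For option 3, the documented Azure Blob endpoint format is https://{account}.blob.core.windows.net/{container}/{blob}. A sketch of building such a URL ourselves follows; the account, container, and blob names are made up, and a real request would still need authentication, which is omitted here. The point is that explicit percent-encoding of the blob name is exactly the control over special characters that OAK-8013 needs.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class AzureRestSketch {

    // Build the REST URL for a blob, percent-encoding the blob name so that
    // spaces, '#', etc. survive as a valid path segment.
    static String blobUrl(String account, String container, String blobName) {
        // URLEncoder produces form encoding; swap '+' for "%20" in a path segment.
        String encoded = URLEncoder.encode(blobName, StandardCharsets.UTF_8)
                .replace("+", "%20");
        return "https://" + account + ".blob.core.windows.net/" + container + "/" + encoded;
    }
}
```

A real Put Blob request would also need the x-ms-blob-type and x-ms-version headers plus an Authorization header (shared key signature or SAS token); the long-term stability of those pieces is the backwards-compatibility question I'm trying to get answered.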
