Hi All,
Perhaps we can wrap up the HDF5 and REST plugin PRs in their current form.
Making small improvements seems to work better than trying to do too much in
any one PR.
Over time, we can use the V2 JSON reader in the REST plugin, but we should do
so step-by-step after making sure the basics work. Since Vitalii graciously ran
all the tests with the V2 reader enabled, I've got a list of things to fix. The
REST plugin PR should not be blocked on that work.
The REST plugin is one that Charles picked up from a previous contributor (I
believe.) It uses the same copy-and-paste model as our other plugins. Most of
the filter push-down implementations even have the same bugs.
Over time, we should fix these, but again we can do it step-by-step. Let's
commit what we have, then we Charles and I can work together to convert the
REST plugin to use the new "Base" framework. Doing so will probably reveal
changes I need to make to that framework.
In short, I'd suggest that we wrap up the in-flight PRs, then plot out the next
round of improvements that builds on this set.
All that said, I agree with Arina in another e-mail chain: we should enforce
high standards on whatever code we do commit. So we do need to get both plugins
to a high level of quality. I'm hoping we can do that with the existing code.
I'll make a point to review the PRs in question to help move them along.
Thanks,
- Paul
On Friday, January 3, 2020, 9:10:18 AM PST, Charles Givre
<[email protected]> wrote:
Hi Arina,
Thanks for getting back with me. We have 2 separate PRs...
API Storage:
1. From the conversation, Paul seemed to be ok with going ahead without
waiting for those PRs. (@Paul, please correct me if I'm wrong, but that was
how I read your comment)
2. From what I've seen, the V2 JSON reader isn't going to be 100% ready, even
when the PR is committed, and there will be issues such as the complex data
type handling that aren't going to be fully ready.
3. To get the API plugin to work, I had to make a minor mod to the V1 reader
to enable it to accept InputStreams as input not just files. The V2 reader
does not support this yet either.
4. I removed (for the time being) the filter pushdown which was the major
advantage of the Base Storage PR. My with this plugin is to get an MVP
committed. If there is interest (and need) I'll add filter pushdown (using
Paul's Base Storage PR) as well as OAUTH2 authentication and other features as
needed. There's been a lot of interest in this at my firm, so I want to get it
committed so we can start using it.
5. I'm fine with committing to updating this to the V2 reader once it is 100%
ready.
HDF5 Format Plugin:
1. I believe we were just waiting on Drill 1.17 to be released for this.
2. I've addressed the review comments (and thank you for taking the time to
review this!) I know it is a very lengthy and complicated plugin and I really
appreciate the effort. If I should ever be in Kiev, I'd definitely buy you a
beer or beverage or two of your choice ;-)
3. The HDF5 library (which this plugin uses), how shall we say, is not the
most usable. Some of the code is less than ideal because I had to work around
the HDF5 library. For example, the HDF5 library does not currently accept
InputStreams as input. You have to pass it a File object. However, to make
this work in Drill, I had to copy the InputStream from Drill to a file in the
temp directory and then open that. I've already spoken with the team that
maintains the HDF5 libraries and they are working on adding InputStream
support, and if and when they make that available, I will update the Drill
plugin accordingly.
Thanks!
-- C
> On Jan 3, 2020, at 4:48 AM, Arina Yelchiyeva <[email protected]>
> wrote:
>
> I though we agreed in the PR that API storage plugin won’t be submitted until
> #1914 and #1913 are committed.
> Since it would required API plugin code rewrite. I think there is no need
> spending time reviewing it until final changes are done.
>
> Kind regards,
> Arina
>
>> On Jan 2, 2020, at 7:08 PM, Charles Givre <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> Hello all,
>> Now that we've released Drill 1.17, I wanted to ask if you could please take
>> a look at the HDF5 format plugin (https://github.com/apache/drill/pull/1778
>> <https://github.com/apache/drill/pull/1778>) I submitted as well as the API
>> storage plugin (https://github.com/apache/drill/pull/1892
>> <https://github.com/apache/drill/pull/1892>).
>> Thanks,
>> -- C
>>
>>
>>
>