My two cents.

I have used it for testing and prototyping little things, for example a
Twitter firehose datasource or even a generic JDBC wrapper. It makes for cool
demos, but it is not something one would use for data-intensive workloads. It
definitely has issues: defining and extracting a schema is tedious, and it
does not parallelize, though that is generally a hard problem. I do think it
would be cool to document it better and see whether the community comes up
with fun datasources. It's one feature that SparkSQL and Drill kind of do
well and that I'd like to see better supported in Impala. If it is not too
much overhead to maintain, it might be worth keeping.

On Wed, Feb 7, 2018 at 8:48 AM Daniel Hecht <[email protected]> wrote:

> As it is implemented today, it doesn't have much value. It never really
> passed the prototype stage in terms of functionality.  For instance, it's
> not parallelized -- it runs on a single node only.
>
> On Tue, Feb 6, 2018 at 8:47 PM, Jim Apple <[email protected]> wrote:
>
>> Is there an argument for documenting it and keeping it? Did it not meet
>> the need it was added for in the first place, or has that need decreased in
>> importance?
>>
>> On Tue, Feb 6, 2018 at 7:29 PM Philip Zeyliger <[email protected]>
>> wrote:
>>
>>> Hi folks,
>>>
>>> I want to bring your attention to http://gerrit.cloudera.org:8080/9192,
>>> "IMPALA-6204: Remove external DataSource". This is functionality that was
>>> never publicly documented and, to my knowledge, is not in use by anyone.
>>> We'd like to remove it to reduce complexity.
>>>
>>> Please let me know if you've got concerns!
>>>
>>> Thanks,
>>>
>>> -- Philip
>>>
>>
>
