The HBaseInputOperator class seems to be quite old. HBaseStore appears to
provide all the functionality of HBaseInputOperator and more
(including Kerberos authentication).

It would be a good idea to avoid the usage of HBaseInputOperator going
forward and use HBaseStore instead.

I will also work on abstracting out the HBase input functionality in the
HBaseInputOperator, which can be extended by concrete implementations.

-Bhupesh

On Wed, Dec 23, 2015 at 7:47 PM, Bhupesh Chawda <[email protected]>
wrote:

> Thanks for the inputs.
> As an input operator, I am targeting just the Scan operation. The Get
> operation may be better supported by a generic operator (such as a query
> operator), which I can take up later.
>
> -Bhupesh
>
> On Tue, Dec 22, 2015 at 3:48 PM, Mohit Jotwani <[email protected]>
> wrote:
>
>> +1
>>
>> Regards,
>> Mohit
>>
>> On Tue, Dec 22, 2015 at 11:21 AM, Chinmay Kolhatkar <
>> [email protected]
>> > wrote:
>>
>> > +1 for above.
>> > I see that there is an HbaseGetOperator, but it is abstract and I cannot
>> > find a concrete implementation of it.
>> > Are you going to implement that too?
>> >
>> > Maybe the concrete implementation of HbaseGetOperator should have this.
>> >
>> > Also, I want to mention one thing about scans from my previous
>> > experience with HBase. The HBase client is synchronous.
>> > This means that when you fire a scan call, the function blocks until a
>> > certain number of records are received at the client end.
>> > This causes a lot of problems in the current thread, as it might get
>> > blocked for a long period of time.
>> > Plus, there is always network-related latency to add to the problem.
>> >
>> > Usually the way to deal with this is to fire scan-like queries on a
>> > separate thread and then consume the results in the main thread.
>> >
>> > Please take care of this scenario while implementing the scan operator.
>> >
>> > -Chinmay.
>> >
>> >
>> > ~ Chinmay.
>> >
>> > On Tue, Dec 22, 2015 at 11:08 AM, Sandeep Deshmukh <
>> > [email protected]>
>> > wrote:
>> >
>> > > +1 for this Bhupesh.
>> > >
>> > > Additionally, I would suggest adding support for:
>> > > 1. Point query
>> > > 2. Returning any row version
>> > >
>> > > The above two are key features of HBase and should be supported.
>> > >
>> > > Regards,
>> > > Sandeep
>> > >
>> > > On Fri, Dec 18, 2015 at 4:39 PM, Bhupesh Chawda <
>> [email protected]
>> > >
>> > > wrote:
>> > >
>> > > > Hi All,
>> > > >
>> > > > The current HBasePOJOInputOperator has the following limitations:
>> > > >
>> > > >    1. It does not let us specify a set of "column family: column"
>> > > >    pairs and fetch data only for those columns.
>> > > >    2. The output format is currently a POJO. We need other output
>> > > >    formats that support a "columnFamily:column" representation.
>> > > >    Map / CSV are some of the options.
>> > > >    3. It does not let us specify an "end row-key" to stop scanning
>> > > >    a table.
>> > > >    4. It has no metrics.
>> > > >
>> > > > I am planning to add the above functionality to the HBase input
>> > > > operators. These features may go into the HBaseScanOperator /
>> > > > HBasePOJOInputOperator.
>> > > >
>> > > > Please let me know your comments.
>> > > >
>> > > > Thanks.
>> > > >
>> > > > Bhupesh
>> > > >
>> > >
>> >
>>
>
>
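
Chinmay's suggestion in the thread above (run the blocking scan on a
separate thread and consume the results in the main thread) could look
roughly like the sketch below. This is only an illustration under stated
assumptions: the class and method names (AsyncScanSketch, BackgroundScanner,
drain, isFinished) are hypothetical, and a plain java.util.Iterator stands in
for the blocking HBase scanner, so no HBase API is used here. A bounded queue
sits between the two threads so that the scan cannot run arbitrarily far
ahead of the consumer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class AsyncScanSketch {

  /**
   * Hypothetical helper: pulls rows from a (possibly blocking) source on a
   * background thread and lets the caller collect them without blocking.
   */
  static final class BackgroundScanner<T> implements AutoCloseable {
    private final BlockingQueue<T> buffer;
    private final Thread worker;
    private volatile boolean done = false;

    BackgroundScanner(Iterator<T> blockingSource, int capacity) {
      this.buffer = new ArrayBlockingQueue<>(capacity);
      this.worker = new Thread(() -> {
        try {
          // Only this worker thread ever waits on the slow source.
          while (blockingSource.hasNext()) {
            // put() waits when the buffer is full, which throttles the scan.
            buffer.put(blockingSource.next());
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // asked to stop early
        } finally {
          done = true;
        }
      }, "scan-worker");
      worker.setDaemon(true);
      worker.start();
    }

    /** Returns up to max rows that are already buffered; never blocks. */
    List<T> drain(int max) {
      List<T> batch = new ArrayList<>();
      buffer.drainTo(batch, max);
      return batch;
    }

    /** True once the source is exhausted and every row has been drained. */
    boolean isFinished() {
      return done && buffer.isEmpty();
    }

    @Override
    public void close() {
      worker.interrupt();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // Stand-in for rows returned by a table scan.
    Iterator<String> rows =
        Arrays.asList("row1", "row2", "row3", "row4", "row5").iterator();
    List<String> emitted = new ArrayList<>();
    try (BackgroundScanner<String> scanner = new BackgroundScanner<>(rows, 2)) {
      // The main thread picks up whatever is ready and moves on.
      while (!scanner.isFinished()) {
        emitted.addAll(scanner.drain(10));
        Thread.sleep(1);
      }
    }
    System.out.println(emitted);
  }
}
```

In an operator, the drain call would presumably sit wherever tuples are
emitted, so that the operator thread only hands over rows that have already
arrived and never waits on the network itself.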
