+1 for the above.
I see that there is an HBaseGetOperator, but it is abstract and I cannot find
a concrete implementation of it.
Are you going to implement that too?

Maybe the concrete implementation of HBaseGetOperator should have this.

Also, I want to mention one thing about scans from my previous experience
with HBase. The HBase client is synchronous.
This means that when you fire a scan call, the function blocks until a
certain number of records have been received at the client end.
This can cause a lot of problems, as the calling thread might be blocked for
a long period of time.
On top of that, network latency always adds to the problem.

Usually the way to deal with this is to fire scan-like queries on a
separate thread and then consume the results in the main thread.

Please take care of this scenario while implementing the scan operator.
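
To make the idea concrete, here is a minimal Java sketch of that pattern. It
is not operator code and does not use the HBase API: the class name
AsyncScanSketch, the methods pollAvailable and drainAll, and the slowSource
iterator (which stands in for a blocking scanner) are all made up for
illustration. A fetcher thread does the blocking reads and fills a bounded
queue, and the main thread only drains what is already buffered.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class AsyncScanSketch {
    // Unique sentinel object; compared by identity so it cannot clash with real rows.
    private static final String END_OF_SCAN = new String("end-of-scan");

    private final BlockingQueue<String> buffer;
    private final Thread fetcher;
    private boolean finished = false; // touched only by the consuming thread

    // blockingSource is a hypothetical stand-in for a scanner whose next() may block.
    AsyncScanSketch(Iterator<String> blockingSource, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        this.buffer = queue;
        this.fetcher = new Thread(() -> {
            try {
                while (blockingSource.hasNext()) {
                    queue.put(blockingSource.next()); // waits while the buffer is full
                }
                queue.put(END_OF_SCAN);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop quietly when asked to
            }
            // Error propagation from the source is left out of this sketch.
        }, "scan-fetcher");
        this.fetcher.setDaemon(true);
    }

    void start() { fetcher.start(); }

    void stop() { fetcher.interrupt(); }

    boolean isFinished() { return finished; }

    // Never blocks: returns at most maxRows rows that are already buffered.
    List<String> pollAvailable(int maxRows) {
        List<String> rows = new ArrayList<>();
        while (!finished && rows.size() < maxRows) {
            String row = buffer.poll();
            if (row == null) {
                break; // nothing buffered right now
            }
            if (row == END_OF_SCAN) {
                finished = true;
            } else {
                rows.add(row);
            }
        }
        return rows;
    }

    // Simulated slow scan: each next() call sleeps to mimic network latency.
    static Iterator<String> slowSource(int rows, long delayMillis) {
        return new Iterator<String>() {
            private int next = 0;
            public boolean hasNext() { return next < rows; }
            public String next() {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return "row-" + next++;
            }
        };
    }

    // Demo consumer loop; a real operator would poll once per emit call instead.
    static List<String> drainAll(Iterator<String> source) {
        AsyncScanSketch scan = new AsyncScanSketch(source, 4);
        scan.start();
        List<String> all = new ArrayList<>();
        while (!scan.isFinished()) {
            List<String> batch = scan.pollAvailable(10);
            all.addAll(batch);
            if (batch.isEmpty() && !scan.isFinished()) {
                try {
                    Thread.sleep(1);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    scan.stop();
                    break;
                }
            }
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println(drainAll(slowSource(5, 5)));
    }
}
```

The bounded queue is a deliberate choice in this sketch: if the main thread
falls behind, the fetcher waits instead of buffering the whole table in
memory.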

-Chinmay.

On Tue, Dec 22, 2015 at 11:08 AM, Sandeep Deshmukh <[email protected]>
wrote:

> +1 for this Bhupesh.
>
> Additionally, I would suggest adding support for:
> 1. Point query
> 2. Returning any row version
>
> The above two are key features of HBase and should be supported.
>
> Regards,
> Sandeep
>
> On Fri, Dec 18, 2015 at 4:39 PM, Bhupesh Chawda <[email protected]>
> wrote:
>
> > Hi All,
> >
> > The current HBasePOJOInputOperator has the following limitations:
> >
> >    1. It does not allow us to specify a set of "columnFamily:column"
> >    pairs and fetch data only for those columns.
> >    2. The output format is currently always a POJO. We need other output
> >    formats that support the "columnFamily:column" representation.
> >    Map and CSV are some of the options.
> >    3. It does not allow specifying an "end row-key" to stop scanning a
> >    table.
> >    4. It provides no metrics.
> >
> > I am planning to add the above functionality to the HBase input
> > operators. These features may go into the HBaseScanOperator /
> > HBasePOJOInputOperator.
> >
> > Please let me know your comments.
> >
> > Thanks.
> >
> > Bhupesh
> >
>
