Nick,

Thanks for your kind suggestions.
I am not sure of the exact use case yet; I am just experimenting. The current idea is to join HBase data with data from an MPP database by running a program from the MPP database on each HBase node, so that instead of first collecting all the data, the join can happen at the region server level. Actually, a join may not be a good example here. The idea is to access data at the region server level while still being able to leverage HBase filters.

Demai on the run

On Aug 19, 2014, at 7:39 PM, Nick Dimiduk <[email protected]> wrote:

> A coprocessor is certainly possible. You haven't shared your motivation,
> only a specific implementation, so I cannot assist further.
>
>
> On Tue, Aug 19, 2014 at 6:28 PM, Demai Ni <[email protected]> wrote:
>
>> Nick,
>>
>> Thanks for the quick response. I will definitely look into Hadoop
>> Streaming.
>>
>> What do you think about AggregationClient? It is carried out at the
>> region/region server level. Maybe, instead of doing a count/min/avg, a
>> method could be used to write the data out to the local file system?
>>
>> Demai on the run
>>
>> On Aug 19, 2014, at 5:04 PM, Nick Dimiduk <[email protected]> wrote:
>>
>>> This sounds an awful lot like a map-only MR job... With Hadoop Streaming,
>>> you should be able to achieve your goal of piping to an arbitrary process.
>>>
>>>
>>> On Tue, Aug 19, 2014 at 4:26 PM, Demai Ni <[email protected]> wrote:
>>>
>>>> Dear experts,
>>>>
>>>> I understand that I can run a simple command like:
>>>>
>>>> echo "scan 'table1'" | hbase shell > myoutput
>>>>
>>>> The scenario I am thinking of is to:
>>>> 1) output to the local file system (like Linux) instead of HDFS
>>>> 2) have each region server output only its own data to its node's file system
>>>>
>>>> To elaborate on 2) a bit: basically, this would be like exporting HBase data
>>>> to the local file system without going through the network. On each node,
>>>> one file would be created.
>>>>
>>>> Is there a way to achieve this? Actually, the receiving side of 1) doesn't
>>>> have to be a file system; it can be another process that processes the data.
>>>> But let's use the file system to simplify the scenario for now.
>>>>
>>>> Thanks
>>>>
>>>> Demai on the run
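
For reference, the map-only Hadoop Streaming idea suggested in the thread might look roughly like the mapper below. It is only a sketch under stated assumptions, not a tested recipe: the function name dump_to_local, the LOCAL_DUMP_DIR variable and the /tmp/hbase_dump default are hypothetical, and the sketch assumes each row already arrives on stdin as one line of text (how the table is fed to the job is not covered here). It also assumes Streaming exposes the task attempt id as the environment variable mapreduce_task_attempt_id. Note that running a map task on the node that hosts a region is a scheduling preference, not a guarantee, so some rows may still cross the network.

```python
import os
import socket
import sys


def dump_to_local(lines, out_dir, task_id="0"):
    """Append every input record to one file on the local disk.

    Returns the file path and the number of records written.
    """
    os.makedirs(out_dir, exist_ok=True)
    # One file per host and task, so concurrent tasks on a node do not collide.
    path = os.path.join(out_dir, f"{socket.gethostname()}-{task_id}.txt")
    count = 0
    with open(path, "a", encoding="utf-8") as out:
        for line in lines:
            out.write(line.rstrip("\n") + "\n")
            count += 1
    return path, count


def main():
    # Hypothetical setting: where on the node's local file system to write.
    out_dir = os.environ.get("LOCAL_DUMP_DIR", "/tmp/hbase_dump")
    # Assumption: Streaming exports job configuration as environment
    # variables with dots replaced by underscores; fall back to the PID.
    task_id = os.environ.get("mapreduce_task_attempt_id", str(os.getpid()))
    path, count = dump_to_local(sys.stdin, out_dir, task_id)
    # Nothing is written to stdout, so the job itself produces no output.
    sys.stderr.write(f"wrote {count} records to {path}\n")
```

To use this as a Streaming mapper, call main() at the bottom of the script and run the job with the number of reducers set to zero. Instead of writing a file, the loop in dump_to_local could hand each line to another local process, which matches the "receiving side can be another process" variant of the question.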
