I don't think Phoenix will solve his problem. 

He also needs to explain more about his use case before we can start to think 
about a solution. 


On Apr 25, 2013, at 4:54 PM, lars hofhansl <[email protected]> wrote:

> You might want to have a look at Phoenix 
> (https://github.com/forcedotcom/phoenix), which does that and more, and gives 
> a SQL/JDBC interface.
> 
> -- Lars
> 
> 
> 
> ________________________________
> From: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) <[email protected]>
> To: [email protected] 
> Sent: Thursday, April 25, 2013 2:44 PM
> Subject: Coprocessors
> 
> 
> Folks:
> 
> This is my first post on the HBase user mailing list. 
> 
> I have the following scenario:
> I have an HBase table of up to a billion keys. I'm looking to support an 
> application where, on some user action, I'd need to fetch multiple columns for 
> up to 250K keys and do some sort of aggregation on them. Fetching all that data 
> and doing the aggregation in my application takes about a minute.
> 
> I'm looking to co-locate the aggregation logic with the region servers to
> a. Distribute the aggregation
> b. Avoid having to fetch large amounts of data over the network (this could 
> potentially be cross-datacenter)
> 
> Neither observers nor aggregation endpoints work for this use case. Observers 
> don't return data back to the client, while aggregation endpoints work in the 
> context of scans, not a multi-get. (Are these correct assumptions?)
> 
> I'm looking to write a service that runs alongside the region servers and 
> acts as a proxy between my application and the region servers. 
> 
> I plan to use the logic in the HBase client's HConnectionManager to segment my 
> request of 1M rowkeys into sub-requests per region server. These are sent 
> over to the proxy, which fetches the data from the region server, aggregates 
> locally and sends data back. Does this sound reasonable, or even a useful 
> thing to pursue?
> 
> Regards,
> -sudarshan
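
For reference, the segmentation step described above (splitting a batch of row keys into one sub-request per region server) might be sketched roughly as follows. This is only an illustration and does not use HBase's actual client API: RowKeyPartitioner, addRegion, serverFor and partition are hypothetical names, the region boundaries and server names are made up, and row keys are modeled as ASCII strings rather than byte arrays. The one thing it relies on is that a row key belongs to the region with the greatest start key that is less than or equal to it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: groups row keys by the server hosting their region. */
class RowKeyPartitioner {
    // Region start key -> hosting server. The first region starts at "".
    // In a real client this mapping would come from the region location cache.
    private final TreeMap<String, String> regionStartToServer = new TreeMap<>();

    void addRegion(String startKey, String server) {
        regionStartToServer.put(startKey, server);
    }

    /** A key falls in the region with the greatest start key <= the key. */
    String serverFor(String rowKey) {
        Map.Entry<String, String> region = regionStartToServer.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("no region covers key: " + rowKey);
        }
        return region.getValue();
    }

    /** Splits one large multi-get into one list of keys per server. */
    Map<String, List<String>> partition(List<String> rowKeys) {
        Map<String, List<String>> byServer = new TreeMap<>();
        for (String key : rowKeys) {
            byServer.computeIfAbsent(serverFor(key), s -> new ArrayList<>()).add(key);
        }
        return byServer;
    }

    public static void main(String[] args) {
        RowKeyPartitioner p = new RowKeyPartitioner();
        p.addRegion("", "rs1");   // made-up layout: three regions on two servers
        p.addRegion("m", "rs2");
        p.addRegion("t", "rs1");
        System.out.println(p.partition(List.of("apple", "mango", "zebra")));
    }
}
```

Each per-server list would then go to the proxy on that server, which fetches and aggregates locally and returns only the partial aggregate. Note that regions can move or split between the lookup and the fetch, so a real implementation would need to handle stale locations.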
