Hi,

The following is the approach to load HBase data into Ignite:
1. Create a cluster-wide singleton distributed custom service.
2. Get all region information in the init() method of the custom service.
3. Broadcast the regions using ignite.compute().call() in the execute() method of the custom service.
4. Scan a particular region and load the cache.

Note: node failure during the cache load needs to be handled, as the distributed service is deployed on some other node.

How does ignite compute() handle a failure partway through a broadcast job? Is the job rescheduled or ignored? Please clarify.

Please let me know if you see any anti-patterns from an Ignite point of view.

Thanks.

On 11 October 2016 at 20:49, Anil <[email protected]> wrote:

> Thank you Vladislav and Andrey. I will look at the document and give it a
> try.
>
> Thanks again.
>
> On 11 October 2016 at 20:47, Andrey Gura <[email protected]> wrote:
>
>> Hi,
>>
>> HBase regions don't map to Ignite nodes due to architectural
>> differences. Each HBase region contains rows in some range of keys that
>> are sorted lexicographically, while the distribution of keys in Ignite
>> depends on the affinity function and the key hash code. Also, how do you
>> remap regions to nodes when a region is split?
>>
>> Of course you can get the node ID in the cluster for a given key, but
>> because HBase keeps rows sorted lexicographically by key, you would have
>> to perform a full scan of the HBase table. So the simplest way to
>> parallelize data loading from HBase to Ignite is to scan regions
>> concurrently and stream all rows to one or more DataStreamers.
>>
>>
>> On Tue, Oct 11, 2016 at 4:11 PM, Anil <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> We have around 18 M records in HBase which need to be loaded into an
>>> Ignite cluster.
>>>
>>> I was looking at
>>>
>>> http://apacheignite.gridgain.org/v1.7/docs/data-loading
>>>
>>> https://github.com/apache/ignite/tree/master/examples
>>>
>>> Is there any approach where each Ignite node loads the data of one HBase
>>> region?
>>>
>>> Do you have any recommendations?
>>>
>>> Thanks.
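P.S. To make the approach at the top of this message concrete, here is a rough sketch of the "scan regions concurrently and push every row into one sink" shape, with a simple per-region retry. It uses only the JDK so it can be run as is. The names Region, RegionScanner, scanTable, loadAll and loadRegion are made up for illustration: in a real setup the scanner would presumably be an HBase scan bounded by the region's start and end keys, and the sink would be a DataStreamer. It does not answer how Ignite compute() itself handles failover; the retry here is done by hand.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.BiConsumer;

public class RegionLoadSketch {
    /** Hypothetical stand-in for one region: keys in [startKey, endKey); "" means unbounded. */
    public static final class Region {
        public final String startKey, endKey;
        public Region(String startKey, String endKey) { this.startKey = startKey; this.endKey = endKey; }
    }

    /** Hypothetical stand-in for a bounded scan over one region. */
    public interface RegionScanner {
        Iterable<Map.Entry<String, String>> scan(Region r) throws Exception;
    }

    /** Fake "table scan" over a sorted in-memory map, used only for the demo. */
    public static Iterable<Map.Entry<String, String>> scanTable(NavigableMap<String, String> table, Region r) {
        NavigableMap<String, String> tail = r.startKey.isEmpty() ? table : table.tailMap(r.startKey, true);
        return (r.endKey.isEmpty() ? tail : tail.headMap(r.endKey, false)).entrySet();
    }

    /** Scans all regions in parallel and returns the number of rows sent by the successful attempts. */
    public static int loadAll(List<Region> regions, RegionScanner scanner,
                              BiConsumer<String, String> sink, int threads, int maxAttempts) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Region r : regions) {
                futures.add(pool.submit(() -> loadRegion(r, scanner, sink, maxAttempts)));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // rethrows if a region exhausted its attempts
            }
            return total;
        } catch (ExecutionException e) {
            throw new IllegalStateException("region load failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Retries the whole region on failure, so the sink must tolerate repeated puts of the same key. */
    static int loadRegion(Region r, RegionScanner scanner,
                          BiConsumer<String, String> sink, int maxAttempts) throws Exception {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                int rows = 0;
                for (Map.Entry<String, String> row : scanner.scan(r)) {
                    sink.accept(row.getKey(), row.getValue());
                    rows++;
                }
                return rows;
            } catch (Exception e) {
                last = e; // remember the failure and rescan this region from its start key
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (String k : new String[] {"a1", "a2", "m1", "m2", "x1", "x2"}) table.put(k, "v-" + k);
        List<Region> regions = Arrays.asList(new Region("", "m"), new Region("m", "x"), new Region("x", ""));
        Map<String, String> cache = new ConcurrentHashMap<>(); // stands in for the Ignite cache
        int rows = loadAll(regions, r -> scanTable(table, r), cache::put, 3, 2);
        System.out.println("rows streamed: " + rows + ", cache size: " + cache.size());
    }
}
```

Because a failed region is rescanned from its start key, some rows may be sent twice; that is harmless only if the sink overwrites by key. If regions can split while the load is running, the region list would have to be re-read, which this sketch does not attempt.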
