Dear N Keywal,

Thanks so much for your reply!

The total amount of data is about 110M. The available memory is enough, 2G.

In Java, I just set a collection to NULL to collect garbage. Do you think it is fine?

Best regards,
Bing

On Wed, Aug 29, 2012 at 11:22 PM, N Keywal <[email protected]> wrote:

> Hi Bing,
>
> You should expect HBase to be slower in the generic case:
> 1) it writes much more data (see hbase data model), with extra columns
> qualifiers, timestamps & so on.
> 2) the data is written multiple times: once in the write-ahead-log, once
> per replica on datanode & so on again.
> 3) there are inter process calls & inter machine calls on the critical
> path.
>
> This is the cost of the atomicity, reliability and scalability features.
> With these features in mind, HBase is reasonably fast to save data on a
> cluster.
>
> On your specific case (without the points 2 & 3 above), the performance
> seems to be very bad.
>
> You should first look at:
> - how much is spent in the put vs. preparing the list
> - do you have garbage collection going on? even swap?
> - what's the size of your final Array vs. the available memory?
>
> Cheers,
>
> N.
>
> On Wed, Aug 29, 2012 at 4:08 PM, Bing Li <[email protected]> wrote:
>
>> Dear all,
>>
>> By the way, my HBase is in the pseudo-distributed mode. Thanks!
>>
>> Best regards,
>> Bing
>>
>> On Wed, Aug 29, 2012 at 10:04 PM, Bing Li <[email protected]> wrote:
>>
>> > Dear all,
>> >
>> > According to my experiences, it is very slow for HBase to save data? Am I
>> > right?
>> >
>> > For example, today I need to save data in a HashMap to HBase. It took
>> > about more than three hours. However when saving the same HashMap in a file
>> > in the text format with the redirected System.out, it took only 4.5 seconds!
>> >
>> > Why is HBase so slow? It is indexing?
>> >
>> > My code to save data in HBase is as follows. I think the code must be
>> > correct.
>> >
>> > ......
>> > public synchronized void AddVirtualOutgoingHHNeighbors(
>> >         ConcurrentHashMap<String, ConcurrentHashMap<String, Set<String>>> hhOutNeighborMap,
>> >         int timingScale)
>> > {
>> >     List<Put> puts = new ArrayList<Put>();
>> >
>> >     String hhNeighborRowKey;
>> >     Put hubKeyPut;
>> >     Put groupKeyPut;
>> >     Put topGroupKeyPut;
>> >     Put timingScalePut;
>> >     Put nodeKeyPut;
>> >     Put hubNeighborTypePut;
>> >
>> >     for (Map.Entry<String, ConcurrentHashMap<String, Set<String>>> sourceHubGroupNeighborEntry : hhOutNeighborMap.entrySet())
>> >     {
>> >         for (Map.Entry<String, Set<String>> groupNeighborEntry : sourceHubGroupNeighborEntry.getValue().entrySet())
>> >         {
>> >             for (String neighborKey : groupNeighborEntry.getValue())
>> >             {
>> >                 hhNeighborRowKey = NeighborStructure.HUB_HUB_NEIGHBOR_ROW +
>> >                     Tools.GetAHash(sourceHubGroupNeighborEntry.getKey() +
>> >                     groupNeighborEntry.getKey() + timingScale + neighborKey);
>> >
>> >                 hubKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 hubKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_HUB_KEY_COLUMN),
>> >                     Bytes.toBytes(sourceHubGroupNeighborEntry.getKey()));
>> >                 puts.add(hubKeyPut);
>> >
>> >                 groupKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 groupKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_GROUP_KEY_COLUMN),
>> >                     Bytes.toBytes(groupNeighborEntry.getKey()));
>> >                 puts.add(groupKeyPut);
>> >
>> >                 topGroupKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 topGroupKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TOP_GROUP_KEY_COLUMN),
>> >                     Bytes.toBytes(GroupRegistry.WWW().GetParentGroupKey(groupNeighborEntry.getKey())));
>> >                 puts.add(topGroupKeyPut);
>> >
>> >                 timingScalePut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 timingScalePut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TIMING_SCALE_COLUMN),
>> >                     Bytes.toBytes(timingScale));
>> >                 puts.add(timingScalePut);
>> >
>> >                 nodeKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 nodeKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_NODE_KEY_COLUMN),
>> >                     Bytes.toBytes(neighborKey));
>> >                 puts.add(nodeKeyPut);
>> >
>> >                 hubNeighborTypePut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 hubNeighborTypePut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TYPE_COLUMN),
>> >                     Bytes.toBytes(SocialRole.VIRTUAL_NEIGHBOR));
>> >                 puts.add(hubNeighborTypePut);
>> >             }
>> >         }
>> >     }
>> >
>> >     try
>> >     {
>> >         this.neighborTable.put(puts);
>> >     }
>> >     catch (IOException e)
>> >     {
>> >         e.printStackTrace();
>> >     }
>> > }
>> > ......
>> >
>> > Thanks so much!
>> >
>> > Best regards,
>> > Bing
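
The checks N suggests in the quoted reply (time spent in the put vs. preparing the list, whether garbage collection is running, and heap use vs. available memory) can probably be measured with the standard JDK alone. Below is a minimal sketch, not a definitive implementation. The class PutPhaseTimer and its methods are hypothetical names made up for illustration, and both phases in main are stand-ins: the first imitates the nested loops that build the List<Put>, the second marks where the real this.neighborTable.put(puts) call would go.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: times one phase of a job and
// reports how much GC time the JVM accumulated while that phase ran.
class PutPhaseTimer {

    /** Wall-clock and GC figures for one measured phase. */
    static final class PhaseStats {
        final long elapsedMillis;
        final long gcMillis;

        PhaseStats(long elapsedMillis, long gcMillis) {
            this.elapsedMillis = elapsedMillis;
            this.gcMillis = gcMillis;
        }
    }

    /** Accumulated collection time over all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Runs the phase once and records elapsed time and GC time during it. */
    static PhaseStats measure(Runnable phase) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        phase.run();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        return new PhaseStats(elapsedMillis, totalGcMillis() - gcBefore);
    }

    /** Heap currently in use, in bytes. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        final List<byte[]> prepared = new ArrayList<byte[]>();

        // Phase 1: stand-in for the nested loops that build the List<Put>.
        PhaseStats prepare = measure(() -> {
            for (int i = 0; i < 100_000; i++) {
                prepared.add(("row-" + i).getBytes(StandardCharsets.UTF_8));
            }
        });

        // Phase 2: stand-in for this.neighborTable.put(puts);
        // replace the loop body with the real call when measuring.
        PhaseStats put = measure(() -> {
            long bytes = 0;
            for (byte[] row : prepared) {
                bytes += row.length;
            }
        });

        long mb = 1024L * 1024L;
        System.out.println("prepare: " + prepare.elapsedMillis + " ms, GC " + prepare.gcMillis + " ms");
        System.out.println("put:     " + put.elapsedMillis + " ms, GC " + put.gcMillis + " ms");
        System.out.println("heap used: " + (usedHeapBytes() / mb) + " MB of max "
                + (Runtime.getRuntime().maxMemory() / mb) + " MB");
    }
}
```

If the GC figure is a large share of the elapsed time, or used heap sits close to the maximum, memory pressure is a likely suspect. Note that setting a collection reference to null only makes the collection eligible for garbage collection; it does not force a collection to run. Swapping is not visible from inside the JVM and would have to be checked with operating-system tools.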
