In my experience, keep inserts under 15k/s per region server to avoid GC and compaction problems.
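
For what it's worth, here is a rough sketch of one way a client could hold itself under a rate like that: buffer rows, hand them over in fixed-size batches, and pause between batches. The class name `ThrottledBatchWriter`, the batch size and the `sink` callback are all hypothetical; in a real client the sink would presumably be a call such as `table.put(batch)`. It is a sketch under those assumptions, not a tuned implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Buffers rows and passes them to a sink in fixed-size batches, spacing the
 * batches so the average rate stays at or below roughly maxRowsPerSecond.
 * Hypothetical helper; the sink stands in for the real table write.
 */
class ThrottledBatchWriter<T> {
    private final Consumer<List<T>> sink;
    private final int batchSize;
    private final long minNanosPerBatch;
    private final List<T> buffer;
    private long lastFlushNanos;
    private boolean flushedOnce = false;

    ThrottledBatchWriter(Consumer<List<T>> sink, int batchSize, int maxRowsPerSecond) {
        if (batchSize <= 0 || maxRowsPerSecond <= 0) {
            throw new IllegalArgumentException("batchSize and maxRowsPerSecond must be positive");
        }
        this.sink = sink;
        this.batchSize = batchSize;
        this.minNanosPerBatch = minNanosPerBatch(batchSize, maxRowsPerSecond);
        this.buffer = new ArrayList<>(batchSize);
    }

    /** Minimum spacing between two batch starts for the given rate cap. */
    static long minNanosPerBatch(int batchSize, int maxRowsPerSecond) {
        return batchSize * 1_000_000_000L / maxRowsPerSecond;
    }

    /** Adds one row and flushes automatically when the batch is full. */
    void add(T row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Writes whatever is buffered, sleeping first if the previous batch was too recent. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        if (flushedOnce) {
            long waitNanos = minNanosPerBatch - (System.nanoTime() - lastFlushNanos);
            if (waitNanos > 0) {
                try {
                    Thread.sleep(waitNanos / 1_000_000L, (int) (waitNanos % 1_000_000L));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt visible to callers
                }
            }
        }
        lastFlushNanos = System.nanoTime();
        flushedOnce = true;
        sink.accept(new ArrayList<>(buffer)); // copy, so clearing the buffer cannot affect the sink
        buffer.clear();
    }
}
```

With a batch size of 1000 and a cap of 15000 rows per second, batches would be spaced about 67 ms apart. Call `flush()` once at the end to write the last partial batch.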

On Thu, Aug 30, 2012 at 1:45 AM, Bing Li <[email protected]> wrote:

> Dear Cristofer,
>
> Thanks so much for the reminder!
>
> Best regards,
> Bing
>
> On Thu, Aug 30, 2012 at 12:32 AM, Cristofer Weber <[email protected]> wrote:
>
> > There are also a lot of conversions of the same values to their byte array
> > representation, e.g., your NeighborStructure constants. You should do this
> > conversion only once to save time, since you are doing it inside 3 nested
> > loops. Not sure how much this can improve things, but you should try it
> > as well.
> >
> > Best regards,
> > Cristofer
> >
> > -----Original Message-----
> > From: Bing Li [mailto:[email protected]]
> > Sent: Wednesday, August 29, 2012 13:07
> > To: [email protected]
> > Cc: [email protected]
> > Subject: Re: HBase Is So Slow To Save Data?
> >
> > I see. Thanks so much!
> >
> > Bing
> >
> >
> > On Wed, Aug 29, 2012 at 11:59 PM, N Keywal <[email protected]> wrote:
> >
> > > It's not useful here: if you have a memory issue, it's when you're using
> > > the list, not when you have finished with it and set it to null.
> > > You need to monitor the memory consumption of the JVM, on both the client
> > > & the server.
> > > Google around these keywords, there are many examples on the web.
> > > Google ArrayList initialization as well.
> > >
> > > Note as well that what matters is not the memory size of the
> > > structure on disk but the size of the "List<Put> puts = new
> > > ArrayList<Put>();" before the table put.
> > >
> > > On Wed, Aug 29, 2012 at 5:42 PM, Bing Li <[email protected]> wrote:
> > >
> > > > Dear N Keywal,
> > > >
> > > > Thanks so much for your reply!
> > > >
> > > > The total amount of data is about 110M. The available memory is
> > > > enough, 2G.
> > > >
> > > > In Java, I just set a collection to NULL to collect garbage. Do you
> > > > think it is fine?
> > > >
> > > > Best regards,
> > > > Bing
> > > >
> > > >
> > > > On Wed, Aug 29, 2012 at 11:22 PM, N Keywal <[email protected]> wrote:
> > > >
> > > >> Hi Bing,
> > > >>
> > > >> You should expect HBase to be slower in the generic case:
> > > >> 1) it writes much more data (see the HBase data model), with extra
> > > >> column qualifiers, timestamps & so on.
> > > >> 2) the data is written multiple times: once in the write-ahead log,
> > > >> once per replica on the datanodes & so on again.
> > > >> 3) there are inter-process calls & inter-machine calls on the
> > > >> critical path.
> > > >>
> > > >> This is the cost of the atomicity, reliability and scalability features.
> > > >> With these features in mind, HBase is reasonably fast to save data
> > > >> on a cluster.
> > > >>
> > > >> In your specific case (without points 2 & 3 above), the
> > > >> performance seems to be very bad.
> > > >>
> > > >> You should first look at:
> > > >> - how much time is spent in the put vs. preparing the list
> > > >> - do you have garbage collection going on? even swap?
> > > >> - what's the size of your final Array vs. the available memory?
> > > >>
> > > >> Cheers,
> > > >>
> > > >> N.
> > > >>
> > > >>
> > > >> On Wed, Aug 29, 2012 at 4:08 PM, Bing Li <[email protected]> wrote:
> > > >>
> > > >>> Dear all,
> > > >>>
> > > >>> By the way, my HBase is in the pseudo-distributed mode. Thanks!
> > > >>>
> > > >>> Best regards,
> > > >>> Bing
> > > >>>
> > > >>> On Wed, Aug 29, 2012 at 10:04 PM, Bing Li <[email protected]> wrote:
> > > >>>
> > > >>> > Dear all,
> > > >>> >
> > > >>> > In my experience, HBase is very slow to save data. Am I right?
> > > >>> >
> > > >>> > For example, today I needed to save the data in a HashMap to HBase. It
> > > >>> > took more than three hours. However, when saving the same HashMap to a
> > > >>> > file in text format with the redirected System.out, it took only
> > > >>> > 4.5 seconds!
> > > >>> >
> > > >>> > Why is HBase so slow? Is it indexing?
> > > >>> >
> > > >>> > My code to save data in HBase is as follows. I think the code
> > > >>> > must be correct.
> > > >>> >
> > > >>> > ......
> > > >>> > public synchronized void AddVirtualOutgoingHHNeighbors(ConcurrentHashMap<String, ConcurrentHashMap<String, Set<String>>> hhOutNeighborMap, int timingScale)
> > > >>> > {
> > > >>> >     List<Put> puts = new ArrayList<Put>();
> > > >>> >
> > > >>> >     String hhNeighborRowKey;
> > > >>> >     Put hubKeyPut;
> > > >>> >     Put groupKeyPut;
> > > >>> >     Put topGroupKeyPut;
> > > >>> >     Put timingScalePut;
> > > >>> >     Put nodeKeyPut;
> > > >>> >     Put hubNeighborTypePut;
> > > >>> >
> > > >>> >     for (Map.Entry<String, ConcurrentHashMap<String, Set<String>>> sourceHubGroupNeighborEntry : hhOutNeighborMap.entrySet())
> > > >>> >     {
> > > >>> >         for (Map.Entry<String, Set<String>> groupNeighborEntry : sourceHubGroupNeighborEntry.getValue().entrySet())
> > > >>> >         {
> > > >>> >             for (String neighborKey : groupNeighborEntry.getValue())
> > > >>> >             {
> > > >>> >                 hhNeighborRowKey = NeighborStructure.HUB_HUB_NEIGHBOR_ROW + Tools.GetAHash(sourceHubGroupNeighborEntry.getKey() + groupNeighborEntry.getKey() + timingScale + neighborKey);
> > > >>> >
> > > >>> >                 hubKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
> > > >>> >                 hubKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY), Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_HUB_KEY_COLUMN), Bytes.toBytes(sourceHubGroupNeighborEntry.getKey()));
> > > >>> >                 puts.add(hubKeyPut);
> > > >>> >
> > > >>> >                 groupKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
> > > >>> >                 groupKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY), Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_GROUP_KEY_COLUMN), Bytes.toBytes(groupNeighborEntry.getKey()));
> > > >>> >                 puts.add(groupKeyPut);
> > > >>> >
> > > >>> >                 topGroupKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
> > > >>> >                 topGroupKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY), Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TOP_GROUP_KEY_COLUMN), Bytes.toBytes(GroupRegistry.WWW().GetParentGroupKey(groupNeighborEntry.getKey())));
> > > >>> >                 puts.add(topGroupKeyPut);
> > > >>> >
> > > >>> >                 timingScalePut = new Put(Bytes.toBytes(hhNeighborRowKey));
> > > >>> >                 timingScalePut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY), Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TIMING_SCALE_COLUMN), Bytes.toBytes(timingScale));
> > > >>> >                 puts.add(timingScalePut);
> > > >>> >
> > > >>> >                 nodeKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
> > > >>> >                 nodeKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY), Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_NODE_KEY_COLUMN), Bytes.toBytes(neighborKey));
> > > >>> >                 puts.add(nodeKeyPut);
> > > >>> >
> > > >>> >                 hubNeighborTypePut = new Put(Bytes.toBytes(hhNeighborRowKey));
> > > >>> >                 hubNeighborTypePut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY), Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TYPE_COLUMN), Bytes.toBytes(SocialRole.VIRTUAL_NEIGHBOR));
> > > >>> >                 puts.add(hubNeighborTypePut);
> > > >>> >             }
> > > >>> >         }
> > > >>> >     }
> > > >>> >
> > > >>> >     try
> > > >>> >     {
> > > >>> >         this.neighborTable.put(puts);
> > > >>> >     }
> > > >>> >     catch (IOException e)
> > > >>> >     {
> > > >>> >         e.printStackTrace();
> > > >>> >     }
> > > >>> > }
> > > >>> > ......
> > > >>> >
> > > >>> > Thanks so much!
> > > >>> >
> > > >>> > Best regards,
> > > >>> > Bing
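
To make Cristofer's suggestion in the thread above concrete, here is a minimal sketch of converting the constant names to byte arrays once instead of inside the nested loops. It deliberately avoids the HBase classes so it runs on its own: `CellWrite` is a hypothetical stand-in for a cell added to a `Put`, `utf8` stands in for `Bytes.toBytes(String)`, and the family, column and row key strings are placeholders, not the real NeighborStructure values.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for one cell write; not the HBase Put class. */
class CellWrite {
    final byte[] row;
    final byte[] family;
    final byte[] qualifier;
    final byte[] value;

    CellWrite(byte[] row, byte[] family, byte[] qualifier, byte[] value) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.value = value;
    }
}

class HoistedConstants {
    // Converted once, when the class is loaded, rather than once per loop iteration.
    // The string values are placeholders for the real constants.
    static final byte[] FAMILY = utf8("neighbor_family");
    static final byte[] HUB_KEY_COLUMN = utf8("hub_key");
    static final byte[] NODE_KEY_COLUMN = utf8("node_key");

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static List<CellWrite> buildCells(Map<String, Set<String>> neighborsByHub) {
        List<CellWrite> cells = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : neighborsByHub.entrySet()) {
            // The hub key does not change in the inner loop, so convert it once per hub.
            byte[] hubBytes = utf8(entry.getKey());
            for (String neighbor : entry.getValue()) {
                // Placeholder row key; the row bytes are computed once and reused for every cell.
                byte[] row = utf8(entry.getKey() + ":" + neighbor);
                cells.add(new CellWrite(row, FAMILY, HUB_KEY_COLUMN, hubBytes));
                cells.add(new CellWrite(row, FAMILY, NODE_KEY_COLUMN, utf8(neighbor)));
            }
        }
        return cells;
    }
}
```

Only the values that really differ per row are converted inside the inner loop. As Cristofer says, how much this saves is unclear, so it is worth measuring before and after.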

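N Keywal's first checks in the thread (time spent preparing the list vs. in the put, and memory use) could be measured with something like the sketch below. `PhaseTimer` and its methods are hypothetical names, the `write` callback stands in for the real table put, and the heap figure is only a coarse client-side reading that says nothing about the region server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

class PhaseTimer {
    /** Runs prepare, then write, and returns {prepareNanos, writeNanos}. */
    static <T> long[] time(Supplier<List<T>> prepare, Consumer<List<T>> write) {
        long start = System.nanoTime();
        List<T> batch = prepare.get();
        long prepared = System.nanoTime();
        write.accept(batch);
        long written = System.nanoTime();
        return new long[] { prepared - start, written - prepared };
    }

    /** Builds the list with its capacity set up front, so it never has to grow. */
    static List<String> prepare(int expectedSize) {
        List<String> rows = new ArrayList<>(expectedSize);
        for (int i = 0; i < expectedSize; i++) {
            rows.add("row-" + i);
        }
        return rows;
    }

    /** Heap currently in use by this JVM, in megabytes (approximate). */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
    }
}
```

Logging the two durations and `usedHeapMb()` before and after the prepare step should show whether the time goes into building the list or into the write itself.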