Shahab:
How does your application deal with a KeyValue whose value is empty?
Can you insert rows with an empty value whose keys correspond to the splits?

Cheers

On Tue, Aug 19, 2014 at 1:29 PM, Shahab Yunus <[email protected]> wrote:

> So the situation here is that we are trying to bulk load data into a
> table. But each load of data has such a range of keys that it will go to a
> specific continuous chunk of the region servers.
>
> In other words, at each bulk load we face hot-spotting, not at the end
> as in the conventional case, but anywhere in between the row-key range
> of our table.
>
> Please note that the split point that I am trying to split on does not
> exist in the table yet. I am trying to prepare the existing table, which
> already has data, by splitting it into regions into which I will then bulk
> import my new data, to avoid hotspotting on one region server.
>
> The proof-of-concept code is below. Trying to split data into 16 regions
> ('0' to 'f' of the guid, since each row in this current load shares the
> same value for the first 2 fields of the row key).
>
> Key is:
> data_source + time-in-long + 32-bytes-random-guid
>
> /*****/
>
> byte[][] splits = new byte[16][];
> byte[] dataSourceId = Bytes.toBytes(dataSource.getDataSourceID());
> byte[] loadTime = Bytes.toBytes(batchLoadTime);
> byte[] guidPrefix = null;
>
> for(int i=0; i<splitPointsPrefixes.length; i++) {
>
>   guidPrefix = Bytes.toBytes(splitPointsPrefixes[i]);
>   splits[i] = new byte[dataSourceId.length + loadTime.length + guidPrefix.length];
>   ByteBuffer splitBuffer = ByteBuffer.wrap(splits[i]);
>   splitBuffer.put(dataSourceId);
>   splitBuffer.put(loadTime);
>   splitBuffer.put(guidPrefix);
> }
>
> byte[] tableNameInBytes = Bytes.toBytes(tableName);
> HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create(getConf()));
>
> for(byte[] split : splits) {
>   //This is asynchronous. Should I wait here after each split to move onto next one?
>   admin.split(tableNameInBytes, split);
> }
> /*****/
>
> Regards,
> Shahab
>
>
> On Tue, Aug 19, 2014 at 4:13 PM, Jean-Marc Spaggiari <
> [email protected]> wrote:
>
> > Hi Shahab,
> >
> > can you share your code? Seems that the RS you reached did not have the
> > expected region. How is your table status in the web interface?
> >
> > JM
> >
> >
> > 2014-08-19 16:11 GMT-04:00 Shahab Yunus <[email protected]>:
> >
> > > I have a table already created and with some data. I want to split it
> > > through code using the HBaseAdmin API into multiple regions, while
> > > specifying keys that do not exist in the table.
> > >
> > > I am getting the exception below, which makes sense because the key
> > > doesn't exist yet. But at the time of creation of the table we can
> > > indeed pre-split it using keys that don't exist.
> > >
> > > Is it possible to do it for a table that already exists and has data?
> > >
> > > *Caused by:
> > > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException):
> > > org.apache.hadoop.hbase.NotServingRegionException: *
> > >
> > >
> > > Using Hbase: 0.98.1-cdh5.1.0
> > >
> > > Thanks a lot.
> > >
> > > Regards,
> > > Shahab
> > >
> >
>
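
For reference, the split-key construction in the quoted proof of concept can be tried on its own, without a cluster. The following is only a sketch under stated assumptions: the class name SplitKeySketch and the method buildSplitKeys are hypothetical, the data source id is assumed to be already available as bytes, and the load time is written as an 8-byte big-endian long, which is the layout Bytes.toBytes(long) produces. It uses plain java.nio instead of the HBase Bytes helper so it runs without HBase on the classpath.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: builds the 16 candidate split keys
// (data_source + time-in-long + one hex character of the guid).
final class SplitKeySketch {

    // Guid prefixes '0'..'f', as in the quoted message; ASCII order keeps the keys sorted.
    private static final String HEX_PREFIXES = "0123456789abcdef";

    static byte[][] buildSplitKeys(byte[] dataSourceId, long loadTime) {
        byte[][] splits = new byte[HEX_PREFIXES.length()][];
        for (int i = 0; i < splits.length; i++) {
            byte[] guidPrefix = String.valueOf(HEX_PREFIXES.charAt(i))
                    .getBytes(StandardCharsets.UTF_8);
            ByteBuffer buf = ByteBuffer.allocate(
                    dataSourceId.length + Long.BYTES + guidPrefix.length);
            // ByteBuffer is big-endian by default, so putLong writes the
            // most significant byte of the timestamp first.
            buf.put(dataSourceId).putLong(loadTime).put(guidPrefix);
            splits[i] = buf.array();
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed sample values, for illustration only.
        byte[] dataSourceId = "ds1".getBytes(StandardCharsets.UTF_8);
        for (byte[] key : buildSplitKeys(dataSourceId, 1408480000000L)) {
            StringBuilder hex = new StringBuilder();
            for (byte b : key) {
                hex.append(String.format("%02x", b));
            }
            System.out.println(hex);
        }
    }
}
```

Each element of the returned array could then be handed to admin.split(tableNameInBytes, split) exactly as in the quoted loop; whether to wait between those asynchronous calls is the open question from the thread and is not addressed by this sketch.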
