> Is it possible that the put method call on HTable does not actually put the record in the database while also not throwing an exception?
You can. Implement a region coprocessor (a class implementing RegionObserver) and override prePut(). In it you can bypass the operation using ObserverContext#bypass(): the core will then neither throw an exception nor add the data.

-Anoop-

On Fri, Aug 22, 2014 at 10:23 PM, Ted Yu <[email protected]> wrote:

> bq. the result from the RowCounter program is far fewer records than I
> expected.
>
> Can you give more detailed information about the gap?
>
> Which hbase release are you running?
>
> Cheers
>
>
> On Fri, Aug 22, 2014 at 9:26 AM, Magana-zook, Steven Alan <
> [email protected]> wrote:
>
> > Hello,
> >
> > I have written a program in Java that is supposed to update rows in an
> > HBase table that do not yet have a value in a certain column (blob
> > values of between 5k and 50k). The program keeps track of how many puts
> > have been added to the table along with how long it has been running.
> > These pieces of information are used to calculate an ingestion speed
> > (records per second). After running the program for multiple days, and
> > based on the average speed reported, the RowCounter program reports far
> > fewer records than I expected. The essential parts of the code are
> > shown below (error handling and other potentially unimportant code
> > omitted), along with the command I use to see how many rows have been
> > updated.
> >
> > Is it possible that the put method call on HTable does not actually put
> > the record in the database while also not throwing an exception?
> > Could the output of RowCounter be incorrect?
> > Am I doing something below that is obviously incorrect?
> >
> > The row counter command (which frequently reports
> > OutOfOrderScannerNextException during execution):
> >
> >   hbase org.apache.hadoop.hbase.mapreduce.RowCounter mytable cf:BLOBDATACOLUMN
> >
> > Code that is essentially what I am doing in my program:
> >
> > ...
> > Scan scan = new Scan();
> > scan.setCaching(200);
> >
> > HTable targetTable = new HTable(hbaseConfiguration,
> >     Bytes.toBytes(tblTarget));
> > ResultScanner resultScanner = targetTable.getScanner(scan);
> >
> > int batchSize = 10;
> > Date startTime = new Date();
> > numFilesSent = 0;
> >
> > Result[] rows = resultScanner.next(batchSize);
> > while (rows != null && rows.length > 0) {
> >     for (Result row : rows) {
> >         byte[] rowKey = row.getRow();
> >         byte[] byteArrayBlobData = getFileContentsForRow(rowKey);
> >
> >         Put put = new Put(rowKey);
> >         put.add(COLUMN_FAMILY, BLOB_COLUMN, byteArrayBlobData);
> >         targetTable.put(put); // Auto-flush is on by default
> >         numFilesSent++;
> >
> >         float elapsedSeconds =
> >             (new Date().getTime() - startTime.getTime()) / 1000.0f;
> >         float speed = numFilesSent / elapsedSeconds;
> >         // routinely says from 80 to 200+
> >         System.out.println("Speed(rows/sec): " + speed);
> >     }
> >     rows = resultScanner.next(batchSize);
> > }
> > ...
> >
> > Thanks,
> > Steven
> >
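
Anoop's RegionObserver suggestion above could look roughly like this. This is only a sketch: it assumes the 0.98-era coprocessor API (BaseRegionObserver with no-op defaults and a four-argument prePut()), and the class name SilentDropObserver is hypothetical. It would be deployed on the region server, not run from the client.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

// Hypothetical example: a region coprocessor that silently drops every Put.
// The client's HTable.put() returns normally and no exception is thrown,
// but no data is ever written to the region.
public class SilentDropObserver extends BaseRegionObserver {

    @Override
    public void prePut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                       Put put, WALEdit edit, Durability durability)
            throws IOException {
        // bypass() tells the framework to skip the default put processing,
        // so the mutation is never applied to the MemStore or the WAL.
        ctx.bypass();
    }
}
```

So yes: if such a coprocessor (or any other prePut() hook that calls bypass()) is loaded on the table, puts can be dropped without any client-side error, which is one way the symptom in the original question can occur.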
