Thanks for taking a look, Anoop. I've just filed HBASE-14630.

On Fri, Oct 16, 2015 at 6:34 AM, Anoop John <[email protected]> wrote:
> I believe the issue is the order in which the per-cell TTL calculation
> (which skips expired cells) and the version control are applied. When the
> scan happens after the TTL time after the second put, there will still be
> 2 cells in the system. The 2nd one will not come out as it is TTL expired.
> But the 1st one as such is not expired. If the version check (select only
> the latest one) happened 1st, and then the TTL check, you would have got
> the desired behavior.
> Mind raising a JIRA? We can discuss there how/whether to solve it.
>
> -Anoop-
>
> On Wed, Oct 14, 2015 at 9:43 AM, Emre Colak <[email protected]> wrote:
>
> > Yes, I'm trying to use the per cell TTL feature. I've tried releases
> > 1.0.2 and 1.1.2.
> >
> > Here's some Scala code that I've written:
> > ===============================
> >
> > def makePut(rowKey: Array[Byte], cf: Array[Byte], qual: Array[Byte], value: Array[Byte]): Put = {
> >   val put = new Put(rowKey)
> >   put.addColumn(cf, qual, value)
> >   put
> > }
> >
> > def getIndex(table: Table, indexName: Array[Byte], cfName: Array[Byte]): Seq[(String, Array[Byte], Long)] = {
> >   val result = MutableList[(String, Array[Byte], Long)]()
> >
> >   val queryResult = table.get(new Get(indexName))
> >   val cellScanner: CellScanner = queryResult.cellScanner()
> >   while (cellScanner.advance()) {
> >     val cell = cellScanner.current()
> >
> >     if (CellUtil.matchingFamily(cell, cfName)) {
> >       val tuple = (Bytes.toStringBinary(cell.getQualifierArray, cell.getQualifierOffset, cell.getQualifierLength),
> >         Bytes.copy(cell.getValueArray, cell.getValueOffset, cell.getValueLength),
> >         cell.getTimestamp)
> >       result += tuple
> >     }
> >   }
> >
> >   result
> > }
> >
> > def printIndices(table: Table, indexName: Array[Byte], cfName: Array[Byte]): Unit = {
> >   getIndex(table, indexName, cfName).foreach {
> >     case (q, v, ts) => {
> >       println("qualifier: %s, value: %s, ts: %d".format(q, v, ts))
> >     }
> >   }
> > }
> >
> > // Establish connection
> >
> > println("Inserting indices into the database")
> > val table = connection.getTable(TableName.valueOf(tableName))
> > table.put(makePut(rowKeyBytes, cfBytes, Bytes.toBytes("idx1"), Array[Byte](0,0,0,0,1)))
> > table.put(makePut(rowKeyBytes, cfBytes, Bytes.toBytes("idx2"), Array[Byte](0,0,0,1,0)))
> > table.put(makePut(rowKeyBytes, cfBytes, Bytes.toBytes("idx3"), Array[Byte](0,0,1,0,0)))
> >
> > println("Indices in the database: ")
> > val putList = MutableList[Put]()
> > getIndex(table, rowKeyBytes, cfBytes).foreach {
> >   case (q, v, ts) => {
> >     println("qualifier: %s, value: %s, ts: %d".format(q, v, ts))
> >
> >     val put = makePut(rowKeyBytes, cfBytes, Bytes.toBytes(q), v)
> >     put.setTTL(30000) // 30 second TTL
> >     putList += put
> >   }
> >   putList += makePut(rowKeyBytes, cfBytes, Bytes.toBytes("idxMerged"), Array[Byte](0,0,1,1,1))
> > }
> >
> > println("Merging existing cells and setting TTLs")
> > table.put(putList)
> >
> > println("Table contents right after the merge: ")
> > printIndices(table, rowKeyBytes, cfBytes)
> >
> > Thread.sleep(10000)
> >
> > println("Table contents 10 seconds after the merge: ")
> > printIndices(table, rowKeyBytes, cfBytes)
> >
> > Thread.sleep(30000)
> >
> > println("Table contents 40 seconds after the merge: ")
> > printIndices(table, rowKeyBytes, cfBytes)
> >
> > // close table and connection
> >
> > And here's what it prints out:
> > =========================
> >
> > Inserting indices into the database
> > Indices in the database:
> > key: idx1, value: 0,0,0,0,1, ts: 1444791952201
> > key: idx2, value: 0,0,0,1,0, ts: 1444791952214
> > key: idx3, value: 0,0,1,0,0, ts: 1444791952218
> > Merging existing cells and setting TTLs
> > Table contents right after the merge:
> > key: idxMerged, value: 0,0,1,1,1, ts: 1444791952341
> > key: idx1, value: 0,0,0,0,1, ts: 1444791952341
> > key: idx2, value: 0,0,0,1,0, ts: 1444791952341
> > key: idx3, value: 0,0,1,0,0, ts: 1444791952341
> > Table contents 10 seconds after the merge:
> > key: idxMerged, value: 0,0,1,1,1, ts: 1444791952341
> > key: idx1, value: 0,0,0,0,1, ts: 1444791952341
> > key: idx2, value: 0,0,0,1,0, ts: 1444791952341
> > key: idx3, value: 0,0,1,0,0, ts: 1444791952341
> > Table contents 40 seconds after the merge:
> > key: idxMerged, value: 0,0,1,1,1, ts: 1444791952341
> > key: idx1, value: 0,0,0,0,1, ts: 1444791952201
> > key: idx2, value: 0,0,0,1,0, ts: 1444791952214
> > key: idx3, value: 0,0,1,0,0, ts: 1444791952218
> >
> >
> > On Tue, Oct 13, 2015 at 8:25 PM, Ted Yu <[email protected]> wrote:
> >
> > > Looks like you are using the per cell TTL feature.
> > >
> > > Which HBase release are you using?
> > >
> > > Can you formulate your description with either a sequence of shell
> > > commands or a unit test?
> > >
> > > Thanks
> > >
> > > On Tue, Oct 13, 2015 at 8:13 PM, Colak, Emre <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I have an HBase table with the following description:
> > > >
> > > > {NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
> > > > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE',
> > > > MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE',
> > > > BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
> > > >
> > > > I put some values in it and then set a TTL (30s) on those values with
> > > > another put operation. The first thing I notice is that the timestamps
> > > > of the cells get updated after the 2nd put. And 30 seconds later, when
> > > > I do a scan on the table, I still see those cells in the table, however
> > > > this time with their timestamps reverted to the original timestamps.
> > > >
> > > > I understand that these cells won't necessarily be deleted until a
> > > > compaction, but why do they still come up in my scan even though the
> > > > TTL that I set on them has expired?
> > > >
> > > > Best,
> > > >
> > > > Emre
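
For anyone reading this thread later, the ordering effect Anoop describes can be illustrated with a small toy model. This is only a sketch and not HBase code: `Cell`, `isExpired`, `ttlThenVersion` and `versionThenTtl` are hypothetical names made up for the illustration, and the expiry rule (`now - ts > ttl`) is an assumption about how a per-cell TTL is evaluated. It models a column with VERSIONS => 1 holding two puts for the same qualifier, where only the second put carries a 30 second TTL.

```scala
// Toy model only, not HBase internals. All names here are hypothetical.
final case class Cell(qualifier: String, ts: Long, ttlMs: Option[Long])

object TtlOrderSketch {
  // Assumed rule: a cell expires once its age exceeds its per-cell TTL.
  // Cells without a TTL never expire.
  def isExpired(c: Cell, now: Long): Boolean =
    c.ttlMs.exists(ttl => now - c.ts > ttl)

  // Order suspected in the thread: drop expired cells first,
  // then keep the newest remaining version per qualifier.
  def ttlThenVersion(cells: Seq[Cell], now: Long): Seq[Cell] =
    cells.filterNot(isExpired(_, now))
      .groupBy(_.qualifier).values.map(_.maxBy(_.ts)).toSeq

  // Order suggested in the thread: keep only the newest version per
  // qualifier first, then drop it if its TTL has expired.
  def versionThenTtl(cells: Seq[Cell], now: Long): Seq[Cell] =
    cells.groupBy(_.qualifier).values.map(_.maxBy(_.ts)).toSeq
      .filterNot(isExpired(_, now))

  def main(args: Array[String]): Unit = {
    val first  = Cell("idx1", ts = 1444791952201L, ttlMs = None)         // original put
    val second = Cell("idx1", ts = 1444791952341L, ttlMs = Some(30000L)) // put with TTL
    val now    = second.ts + 40000L                                      // 40 seconds later

    println(ttlThenVersion(Seq(first, second), now)) // the original cell resurfaces
    println(versionThenTtl(Seq(first, second), now)) // nothing is returned
  }
}
```

With the first ordering the older, non-expiring cell becomes visible again once the newer one expires, which matches the output reported 40 seconds after the merge. With the second ordering the qualifier disappears entirely, which is the behavior the original question expected.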
