[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191023#comment-16191023
 ] 

Chia-Ping Tsai commented on HBASE-18752:
----------------------------------------

bq.  after this change the min and max timeRange both will be same?
No. This patch corrects the {{TimeRange}} that is written to the hfile.
See {{TestHStore#testTimeRangeIfSomeCellsAreDroppedInFlush}}
{code}
+  @Test
+  public void testTimeRangeIfSomeCellsAreDroppedInFlush() throws IOException {
+    init(this.name.getMethodName(), TEST_UTIL.getConfiguration(),
+        ColumnFamilyDescriptorBuilder.newBuilder(family).setMaxVersions(1).build());
+    long currentTs = 100;
+    final long minTs = currentTs;
+    // this cell won't be flushed to disk
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    // this cell won't be flushed to disk
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    flushStore(store, id++);
+
+    Collection<HStoreFile> files = store.getStorefiles();
+    assertEquals(1, files.size());
+    HStoreFile f = files.iterator().next();
+    f.initReader();
+    StoreFileReader reader = f.getReader();
+    assertEquals(currentTs - 1, reader.timeRange.getMin());
+    assertEquals(currentTs - 1, reader.timeRange.getMax());
+  }
{code}
Before this change, the min of the time range is the initial {{currentTs}}
({{minTs}}), but the cell with that timestamp is not stored in the hfile
because it is dropped during the flush. This bug prevents us from filtering
out unnecessary files before starting to read the data blocks. After this
patch, we get the correct min of the time range.
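
To make the effect on file filtering concrete, here is a minimal standalone sketch. It does not use HBase classes: {{TimeRangeSketch}}, {{Range}}, {{survivors}}, {{rangeOf}} and {{overlaps}} are hypothetical names, all cells are assumed to share one row and column, and the range is treated as a closed interval for simplicity.

```java
import java.util.Arrays;

public class TimeRangeSketch {
    /** Hypothetical stand-in for the time range stored with a file (closed interval). */
    public static final class Range {
        public final long min;
        public final long max;

        public Range(long min, long max) {
            this.min = min;
            this.max = max;
        }

        /** True if this range intersects the query interval [from, to]. */
        public boolean overlaps(long from, long to) {
            return min <= to && from <= max;
        }
    }

    /** Keeps only the newest maxVersions timestamps, mimicking the version drop in a flush. */
    public static long[] survivors(long[] timestamps, int maxVersions) {
        long[] sorted = timestamps.clone();
        Arrays.sort(sorted);
        int keep = Math.min(maxVersions, sorted.length);
        return Arrays.copyOfRange(sorted, sorted.length - keep, sorted.length);
    }

    /** Computes the min and max over the given timestamps. */
    public static Range rangeOf(long[] timestamps) {
        if (timestamps.length == 0) {
            throw new IllegalArgumentException("no timestamps");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long ts : timestamps) {
            min = Math.min(min, ts);
            max = Math.max(max, ts);
        }
        return new Range(min, max);
    }

    public static void main(String[] args) {
        long[] added = {100L, 101L, 102L};

        // Range taken from everything added to the snapshot (the inaccurate case).
        Range fromSnapshot = rangeOf(added);
        // Range recalculated from the cells that survive with max versions = 1.
        Range recalculated = rangeOf(survivors(added, 1));

        // A read that only asks for timestamps 100..101.
        System.out.println("snapshot range matches: " + fromSnapshot.overlaps(100L, 101L));
        System.out.println("recalculated range matches: " + recalculated.overlaps(100L, 101L));
    }
}
```

With the range taken from the snapshot, a read for timestamps 100 to 101 still matches the file even though no such cell was written. With the recalculated range, the file can be skipped.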


> Recalculate the TimeRange in flushing snapshot to store file
> ------------------------------------------------------------
>
>                 Key: HBASE-18752
>                 URL: https://issues.apache.org/jira/browse/HBASE-18752
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>             Fix For: 2.0.0-beta-1
>
>         Attachments: HBASE-18752.v0.patch
>
>
> We drop superfluous cells in flushing, hence the TimeRange from snapshot is 
> inaccurate for the storefile. We should recalculate the TimeRange for the 
> storefile, but the side-effect is the extra cost - we need to extract the 
> timestamp from cell (ByteBufferCell).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
