[jira] [Commented] (HBASE-6861) HFileOutputFormat set TIMERANGE wrongly
[ https://issues.apache.org/jira/browse/HBASE-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585367#comment-13585367 ]

Anoop Sam John commented on HBASE-6861:
---
[~emorozov] Please check against the latest trunk code. HBASE-3776 also fixes this issue.
[~stack], [~yuzhih...@gmail.com] Can we close this issue?

HFileOutputFormat set TIMERANGE wrongly
---
Key: HBASE-6861
URL: https://issues.apache.org/jira/browse/HBASE-6861
Project: HBase
Issue Type: Bug
Reporter: Eugene Morozov
Priority: Minor
Attachments: diff

If the timestamps for KeyValues are specified differently for different column families, the TIMERANGEs of both HFiles will be wrong.

Example (in pseudocode): my reducer has a condition

    if ( condition ) {
        keyValue = new KeyValue(.., CF1, .., timestamp, ..);
    } else {
        keyValue = new KeyValue(.., CF2, .., ..); // no timestamp
    }
    context.write( keyValue );

These two KeyValues would be written into two different HFiles. But the code that actually writes them does the following:

    // we now have the proper HLog writer. full steam ahead
    kv.updateLatestStamp(this.now);
    trt.includeTimestamp(kv);
    wl.writer.append(kv);

Basically, the two HFiles share the same instance of trt (TimeRangeTracker), so both end up with the same TIMERANGE. That is definitely incorrect: the first HFile must have TIMERANGE=timestamp...timestamp, because we do not write any other timestamps there, and the other HFile must have TIMERANGE=now...now for the same reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
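The shared-tracker behaviour described in the report can be reproduced without HBase. The sketch below is only an illustration under stated assumptions: `Tracker` is a hypothetical stand-in that records a minimum and maximum timestamp, not HBase's `TimeRangeTracker`, and the class and method names are invented for this example. It is not the patch attached to the issue, nor the change made in HBASE-3776. The timestamp 100 plays the role of the explicit timestamp on CF1, and 5000 plays the role of "now" on CF2.

```java
import java.util.Map;
import java.util.TreeMap;

public class TimeRangeSketch {
    /** Hypothetical stand-in for a time range tracker: remembers min and max timestamps. */
    static final class Tracker {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;

        void include(long ts) {
            min = Math.min(min, ts);
            max = Math.max(max, ts);
        }

        @Override
        public String toString() {
            return min + "..." + max;
        }
    }

    /** Behaviour from the report: one tracker instance is shared by every column family. */
    static Map<String, String> sharedTracker(String[] families, long[] timestamps) {
        Tracker shared = new Tracker();
        for (long ts : timestamps) {
            shared.include(ts);
        }
        Map<String, String> ranges = new TreeMap<>();
        for (String family : families) {
            ranges.put(family, shared.toString());
        }
        return ranges;
    }

    /** Expected behaviour: each column family (and so each HFile) has its own tracker. */
    static Map<String, String> perFamilyTrackers(String[] families, long[] timestamps) {
        Map<String, Tracker> trackers = new TreeMap<>();
        for (int i = 0; i < families.length; i++) {
            trackers.computeIfAbsent(families[i], key -> new Tracker()).include(timestamps[i]);
        }
        Map<String, String> ranges = new TreeMap<>();
        trackers.forEach((family, tracker) -> ranges.put(family, tracker.toString()));
        return ranges;
    }

    public static void main(String[] args) {
        String[] families = {"CF1", "CF2"};
        long[] timestamps = {100L, 5000L}; // 100 = explicit timestamp, 5000 = stand-in for "now"
        System.out.println("shared:     " + sharedTracker(families, timestamps));
        System.out.println("per family: " + perFamilyTrackers(families, timestamps));
    }
}
```

With the shared tracker both families report 100...5000, which is the wrong TIMERANGE the report describes. With one tracker per family, CF1 reports 100...100 and CF2 reports 5000...5000, matching the expectation stated in the issue.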
[jira] [Commented] (HBASE-6861) HFileOutputFormat set TIMERANGE wrongly
[ https://issues.apache.org/jira/browse/HBASE-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466897#comment-13466897 ]

Ted Yu commented on HBASE-6861:
---
@Eugene: Consider using the formatter from HBASE-5961.
[jira] [Commented] (HBASE-6861) HFileOutputFormat set TIMERANGE wrongly
[ https://issues.apache.org/jira/browse/HBASE-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465475#comment-13465475 ]

Eugene Morozov commented on HBASE-6861:
---
Basically yes, but it looks more like a hack than a patch. I believe it would not be correct to commit it to trunk. Do you still want me to share it?
[jira] [Commented] (HBASE-6861) HFileOutputFormat set TIMERANGE wrongly
[ https://issues.apache.org/jira/browse/HBASE-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465751#comment-13465751 ]

stack commented on HBASE-6861:
---
I'd suggest adding the patch here anyways. It could inspire a patch that would get committed to trunk. Thanks.
[jira] [Commented] (HBASE-6861) HFileOutputFormat set TIMERANGE wrongly
[ https://issues.apache.org/jira/browse/HBASE-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465032#comment-13465032 ]

stack commented on HBASE-6861:
---
You have a patch Eugene? Thanks.