[jira] [Commented] (HBASE-3646) When mapper writes multiple values for a key keep chronological order of values
[ https://issues.apache.org/jira/browse/HBASE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13469703#comment-13469703 ]

Lars Hofhansl commented on HBASE-3646:

Any update on this?

When mapper writes multiple values for a key keep chronological order of values

Key: HBASE-3646
URL: https://issues.apache.org/jira/browse/HBASE-3646
Project: HBase
Issue Type: New Feature
Components: Client
Affects Versions: 0.90.1
Environment: Cloudera 3.5 VM
    TableMapper<ImmutableBytesWritable, IntWritable>
    TableReducer<ImmutableBytesWritable, IntWritable, ImmutableBytesWritable>
Reporter: Bob Cummins
Priority: Minor

When a mapper writes multiple values for a key, the underlying collection class maps each of the values to the key, but not always in chronological order. If chronological order were guaranteed for the values mapped to a key, each value could be understood as a specific and different parameter passed between the mapper and the reducer.

I've done little tricks like having the mapper flag one of the values by making it a negative number, which the reducer recognizes and can write to HBase as a unique column value. This is a kluge workaround which it would be nice to not have to do.

Used to formulate this suggestion:
    TableMapper<ImmutableBytesWritable, IntWritable>
    TableReducer<ImmutableBytesWritable, IntWritable, ImmutableBytesWritable>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
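For readers trying to picture the request, here is a minimal sketch of one way to avoid the negative-number trick without any change to Hadoop or HBase: the mapper packs its own emit position into each value, and the reducer sorts on that position. This is not from the ticket. The class `SequencedValues` and its methods are hypothetical, the code is plain Java with no Hadoop dependency, and it assumes the job could carry a 64-bit value (for example `LongWritable`) instead of `IntWritable`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper (not an HBase or Hadoop class): tags values with their emit position. */
final class SequencedValues {
    private SequencedValues() {}

    /** Mapper side: pack the emit position into the high 32 bits, the value into the low 32. */
    static long tag(int sequence, int value) {
        return ((long) sequence << 32) | (value & 0xFFFFFFFFL);
    }

    /** Extracts the emit position from a tagged value. */
    static int sequenceOf(long tagged) {
        return (int) (tagged >>> 32);
    }

    /** Extracts the original (possibly negative) value from a tagged value. */
    static int valueOf(long tagged) {
        return (int) tagged;
    }

    /** Reducer side: restore emit order no matter in which order the values arrived. */
    static List<Integer> inEmitOrder(Iterable<Long> taggedValues) {
        List<Long> sorted = new ArrayList<>();
        for (long t : taggedValues) {
            sorted.add(t);
        }
        sorted.sort(Comparator.comparingInt(SequencedValues::sequenceOf));
        List<Integer> out = new ArrayList<>();
        for (long t : sorted) {
            out.add(valueOf(t));
        }
        return out;
    }
}
```

With this shape the reducer can treat position 0, 1, 2, ... as distinct parameters, which seems to be what the description is after. The cost is buffering all values for a key in the reducer before sorting them.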
[jira] [Commented] (HBASE-3646) When mapper writes multiple values for a key keep chronological order of values
[ https://issues.apache.org/jira/browse/HBASE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13469720#comment-13469720 ]

Jesse Yates commented on HBASE-3646:

Nope - I have no idea what Bob was after with this. I'm okay if we want to close this as won't fix/can't reproduce.
[jira] [Commented] (HBASE-3646) When mapper writes multiple values for a key keep chronological order of values
[ https://issues.apache.org/jira/browse/HBASE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109420#comment-13109420 ]

Bob Cummins commented on HBASE-3646:

Jesse, I'll get back to you soon.

Thanks,
Bob

Robert T. Cummins, Jr.
CEH, LPIC-1, CREA, CPT, GSEC, Network+
[jira] [Commented] (HBASE-3646) When mapper writes multiple values for a key keep chronological order of values
[ https://issues.apache.org/jira/browse/HBASE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107210#comment-13107210 ]

Jesse Yates commented on HBASE-3646:

@Bob (or @stack) what is meant by 'chronological order' - the timestamp from HBase or the time that the key is written into the context? Also, in what context would the index argument be used? Would you be mapping values back to a row?

When passing ImmutableBytesWritable to the reducer, are you just passing the Row returned from TableRecordReader or your own version? Sending the KeyValue pairs from the Result seems to make more sense, at least at first blush, to me. I'm looking at this issue sight unseen (rather than actually having the problem myself), so I'm a little in the dark.

I've been thinking about this for a while (and dug into the code a bit), and I'm thinking this may be a straight patch to org.apache.hadoop.mapreduce.Mapper.Context (est. in the first comment), if we need to do anything at all. Given that, do we even need to maintain this ticket? Is it just going to be used to track the change we would need to make in Hadoop-core?

It seems like if we were going to add something to HBase, it would be a class that binds KVs together and is comparable (a KeyValue that is WritableComparable), including sorting with respect to timestamp (if that is what is meant by chronological). So just adding compareTo(KV) using the KeyValueComparator.
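To make the "comparable value sorted by timestamp" idea concrete, the following is a rough sketch only, not a proposed patch. `TimestampedValue` is a hypothetical stand-in written in plain Java; an actual implementation would presumably implement Hadoop's `WritableComparable` and wrap a real `KeyValue`, neither of which is shown here. It sorts oldest first, whereas HBase's own KeyValue ordering places newer timestamps first, so the comparison would need flipping if that order is the one wanted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a timestamp-aware value; not an HBase class. */
final class TimestampedValue implements Comparable<TimestampedValue> {
    final long timestamp; // e.g. the cell timestamp, in milliseconds
    final int value;

    TimestampedValue(long timestamp, int value) {
        this.timestamp = timestamp;
        this.value = value;
    }

    /** Orders oldest first; swap the two arguments for newest first. */
    @Override
    public int compareTo(TimestampedValue other) {
        return Long.compare(timestamp, other.timestamp);
    }

    /** Returns the payloads in chronological order, leaving the input list untouched. */
    static List<Integer> chronological(List<TimestampedValue> values) {
        List<TimestampedValue> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        List<Integer> out = new ArrayList<>();
        for (TimestampedValue v : sorted) {
            out.add(v.value);
        }
        return out;
    }
}
```

Note that being comparable does not by itself make the values arrive sorted at the reducer: MapReduce sorts keys, not the values within a key, so the reducer would either sort as above or the timestamp would have to move into a composite key (the usual secondary-sort arrangement).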
[jira] [Commented] (HBASE-3646) When mapper writes multiple values for a key keep chronological order of values
[ https://issues.apache.org/jira/browse/HBASE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106751#comment-13106751 ]

stack commented on HBASE-3646:

@Jesse I don't think anything in here has changed, so I'd say this patch is still needed.
[jira] Commented: (HBASE-3646) When mapper writes multiple values for a key keep chronological order of values
[ https://issues.apache.org/jira/browse/HBASE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007073#comment-13007073 ]

stack commented on HBASE-3646:

@Bob Is that just a matter of changing the data structure that is at the core of Context?