[
https://issues.apache.org/jira/browse/HBASE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107210#comment-13107210
]
Jesse Yates commented on HBASE-3646:
------------------------------------
@Bob (or @stack) what is meant by 'chronological order' - the timestamp from
HBase or the time that the key is written into the context? Also, in what
context would the index argument be used? Would you be mapping values back to a
row? When passing ImmutableBytesWritable to the reducer, are you just passing
the Row returned from TableRecordReader or your own version? Sending the
KeyValue pairs from the Result seems to make more sense, at least at first
blush, to me.
I'm looking at this issue sight unseen (rather than actually having the problem
myself), so a little in the dark.
I've been thinking about this for a while (and dug into the code a bit), and I
think this may be a straight patch to
org.apache.hadoop.mapreduce.Mapper.Context (est. in the first comment), if we
need to do anything at all. Given that, do we even need to maintain this
ticket? Is it just going to be used to track the change we would need to make
in Hadoop-core?
It seems like if we were going to add something to HBase, it would be a class
that binds KVs together and is comparable (a KeyValue that is
WritableComparable), including sorting wrt timestamp (if that is what is meant
by chronological). So just adding compareTo(KV) using the KeyValueComparator.
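To make that idea concrete, here is a minimal sketch in plain Java with no
Hadoop or HBase dependencies. TimestampedValue and ChronologicalOrderSketch are
hypothetical names, not existing HBase classes; in a real job the class would
presumably implement org.apache.hadoop.io.WritableComparable and delegate to
HBase's own comparator. The sketch also assumes that 'chronological' means
ascending timestamp within a row, which is exactly the open question above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a "comparable KeyValue"; NOT an HBase class.
final class TimestampedValue implements Comparable<TimestampedValue> {
    final byte[] row;
    final long timestamp;
    final int value;

    TimestampedValue(byte[] row, long timestamp, int value) {
        this.row = row.clone();
        this.timestamp = timestamp;
        this.value = value;
    }

    // Order by row bytes first, then by ascending timestamp (an assumption).
    @Override
    public int compareTo(TimestampedValue other) {
        int byRow = compareBytes(row, other.row);
        return byRow != 0 ? byRow : Long.compare(timestamp, other.timestamp);
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}

final class ChronologicalOrderSketch {
    // Returns the int payloads sorted by (row, timestamp), whatever the arrival order.
    static List<Integer> valuesInOrder(List<TimestampedValue> arrived) {
        List<TimestampedValue> sorted = new ArrayList<>(arrived);
        Collections.sort(sorted);
        List<Integer> out = new ArrayList<>();
        for (TimestampedValue tv : sorted) {
            out.add(tv.value);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] row = "row1".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        // Values arrive out of order, as they might at a reducer.
        List<TimestampedValue> arrived = Arrays.asList(
            new TimestampedValue(row, 30L, 3),
            new TimestampedValue(row, 10L, 1),
            new TimestampedValue(row, 20L, 2));
        System.out.println(valuesInOrder(arrived)); // prints [1, 2, 3]
    }
}
```

Carrying the timestamp inside the value lets the reducer recover the order
itself instead of relying on the order in which values arrive. A real
implementation would also need equals() and hashCode() consistent with
compareTo(), plus the Writable serialization methods.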
> When mapper writes multiple values for a key keep chronological order of
> values
> -------------------------------------------------------------------------------
>
> Key: HBASE-3646
> URL: https://issues.apache.org/jira/browse/HBASE-3646
> Project: HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.90.1
> Environment: Cloudera 3.5 VM
> TableMapper<ImmutableBytesWritable,IntWritable>
> TableReducer<ImmutableBytesWritable,IntWritable, ImmutableBytesWritable>
> Reporter: Bob Cummins
> Priority: Minor
>
> When a mapper writes multiple values for a key, the underlying collection
> class maps each of the values to the key, but not always in chronological
> order. If chronological order were guaranteed for the values mapped to the
> key, each of the values could be understood as specific and different
> parameters between the mapper and the reducer.
> I've done little tricks like having the mapper flag one of the values by
> making it a negative number, which the reducer recognizes and can write to
> HBase as a unique column value. This is a kluge workaround which it would be
> nice to not have to do.
> Used to formulate this suggestion:
> TableMapper<ImmutableBytesWritable,IntWritable>
> TableReducer<ImmutableBytesWritable,IntWritable, ImmutableBytesWritable>
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira