I would advise against setting the timestamps yourself; instead, add a reduce step to prune the versions you don't need before inserting into HBase.
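A minimal sketch of that pruning idea, in plain Java with no HBase dependencies (the `row/column` cell-key format and the keep-newest policy are assumptions for illustration): group incoming values by cell, as a reducer would, and keep only the versions you actually intend to write as Puts.

```java
import java.util.*;

public class PruneVersions {
    // Keep only the newest maxVersions values per (row, column) cell,
    // mimicking what a reduce step could do before writing Puts to HBase.
    static Map<String, List<String>> prune(List<String[]> triples, int maxVersions) {
        Map<String, List<String>> byCell = new LinkedHashMap<>();
        for (String[] t : triples) {                  // t = {rowKey, column, value}
            String cell = t[0] + "/" + t[1];
            byCell.computeIfAbsent(cell, k -> new ArrayList<>()).add(t[2]);
        }
        // Trim each cell's list to the last maxVersions entries seen.
        for (List<String> values : byCell.values()) {
            while (values.size() > maxVersions) {
                values.remove(0);
            }
        }
        return byCell;
    }

    public static void main(String[] args) {
        List<String[]> triples = Arrays.asList(
            new String[]{"s1", "p1", "o1"},
            new String[]{"s1", "p1", "o2"},
            new String[]{"s1", "p1", "o3"},
            new String[]{"s2", "p1", "o4"});
        // With maxVersions = 2, s1/p1 keeps only its two newest values.
        System.out.println(prune(triples, 2));
    }
}
```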
J-D

On Sat, Oct 1, 2011 at 11:05 AM, Christopher Dorner <[email protected]> wrote:
> Hi again,
>
> I think I solved my issue.
>
> I simply use the byte offset of the row currently read by the Mapper as the
> timestamp for the Put. This is unique within my input file, which contains
> one triple per row, so the timestamps are unique.
>
> Regards,
> Christopher
>
>
> On 01.10.2011 13:19, Christopher Dorner wrote:
>>
>> Hello,
>>
>> I am reading a file containing RDF triples in a Map job. The RDF triples
>> are then stored in a table whose columns can have many versions, so I
>> need to store many values for one rowKey in the same column.
>>
>> I noticed that reading the file is very fast, so some values are put into
>> the table with the same timestamp, thereby overwriting an existing value.
>>
>> How can I avoid that? The timestamps are not needed for later use.
>>
>> Could I simply use some sort of custom counter?
>>
>> How would that work in fully distributed mode? I am working in
>> pseudo-distributed mode for testing purposes right now.
>>
>> Thank you and regards,
>> Christopher
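The collision Christopher describes happens because a versioned cell is effectively keyed by (row, column, timestamp): two Puts to the same cell with the same timestamp collapse into a single version. A stdlib-only sketch of that effect (the `row/col/ts` key string is an illustrative assumption, not HBase code) shows why unique byte offsets avoid the overwrite:

```java
import java.util.*;

public class TimestampCollision {
    // Model a versioned store keyed by row/column/timestamp, the way
    // version resolution distinguishes writes to the same cell.
    static int distinctVersions(long[] timestamps) {
        Set<String> cells = new HashSet<>();
        for (long ts : timestamps) {
            cells.add("row1/col1/" + ts);   // same row and column for every value
        }
        return cells.size();
    }

    public static void main(String[] args) {
        // Two writes in the same millisecond: only one version survives.
        System.out.println(distinctVersions(new long[]{1317459900000L, 1317459900000L}));
        // Unique byte offsets from the input file: every version survives.
        System.out.println(distinctVersions(new long[]{0L, 42L, 87L}));
    }
}
```

Note that this only illustrates the uniqueness argument; J-D's caution above still applies, since file offsets are not meaningful as wall-clock timestamps.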
