Hi again,

I think I solved my issue.

I simply use the byte offset of the row currently being read by the Mapper as the timestamp for the Put. This offset is unique within my input file, which contains one triple per row, so the timestamps are unique as well.
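For anyone following along: with TextInputFormat, the LongWritable key handed to map() is exactly that byte offset of the line's start, and it is strictly increasing within a file. A minimal, self-contained sketch of why these offsets are collision-free (the HBase call shown in the comment uses a hypothetical column family/qualifier and the 0.90-style Put API):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// In the real Mapper one would pass the key straight through as the version:
//   put.add(FAMILY, QUALIFIER, key.get(), value);  // key = line's byte offset
public class OffsetDemo {

    // Mirrors how TextInputFormat derives keys: the byte offset of each
    // line start within the file.
    static List<Long> lineOffsets(String fileContents) {
        List<Long> offsets = new ArrayList<>();
        long offset = 0;
        for (String line : fileContents.split("\n", -1)) {
            offsets.add(offset);
            offset += line.getBytes().length + 1; // +1 for the '\n'
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Three triples with the same subject -> same rowKey, same column.
        String triples = "<s1> <p> <o1>\n<s1> <p> <o2>\n<s1> <p> <o3>";
        List<Long> offsets = lineOffsets(triples);
        // Every offset is distinct, so no Put overwrites another version.
        if (new HashSet<>(offsets).size() != offsets.size())
            throw new AssertionError("offsets not unique");
        System.out.println(offsets); // [0, 14, 28]
    }
}
```

Note this only holds per file; if several input files feed the same row key and column, two files can produce the same offset, and something like a file-specific base offset would be needed on top.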

Regards,
Christopher


On 01.10.2011 13:19, Christopher Dorner wrote:
Hello,

I am reading a file containing RDF triples in a map job. The triples are
then stored in a table whose columns can have many versions, so I need
to store many values for one row key in the same column.

I have observed that reading the file is very fast, so some values are
put into the table with the same timestamp and therefore overwrite an
existing value.

How can I avoid that? The timestamps are not needed for later usage.

Could I simply use some sort of custom counter?

How would that work in fully distributed mode? I am working in
pseudo-distributed mode for testing purposes right now.

Thank you and regards,
Christopher
