Hello,

I am reading a file containing RDF triples in a map job. The triples are then stored in a table whose columns can hold many versions,
so I need to store many values for one row key in the same column.

I have observed that the file is read so quickly that some values are written to the table with the same timestamp, so a later value overwrites an existing one.

How can I avoid that? I do not need the timestamps for later use.

Could I simply use some sort of custom counter in place of the timestamp?
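
To make the counter idea concrete, here is a rough sketch of what I have in mind. It is plain Java and not taken from any library: the class name VersionCounter, the taskId parameter and the bit layout are only my own assumptions. Each map task would create one instance and pass the returned value as the explicit version of each write, where the current time would otherwise be used.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of any library): produces version numbers
 * that are unique within one task and, assuming every task is given a
 * distinct taskId, do not collide across tasks either.
 */
final class VersionCounter {
    // Assumed layout: upper 23 bits hold the task id, lower 40 bits the counter,
    // so the result always fits into a non-negative long.
    private static final int COUNTER_BITS = 40;
    private static final long MAX_COUNTER = (1L << COUNTER_BITS) - 1;
    private static final int MAX_TASK_ID = (1 << 23) - 1;

    private final long prefix;
    private final AtomicLong counter = new AtomicLong();

    VersionCounter(int taskId) {
        if (taskId < 0 || taskId > MAX_TASK_ID) {
            throw new IllegalArgumentException("taskId out of range: " + taskId);
        }
        this.prefix = ((long) taskId) << COUNTER_BITS;
    }

    /** Returns the next version; never returns the same value twice per instance. */
    long next() {
        long n = counter.getAndIncrement();
        if (n > MAX_COUNTER) {
            throw new IllegalStateException("counter exhausted");
        }
        return prefix | n;
    }
}
```

The task id prefix is my guess at how to keep separate mappers from producing the same number. I am not sure whether this is a sensible approach.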

How would that work in fully distributed mode? I am currently running in pseudo-distributed mode for testing purposes.

Thank you and regards,
Christopher
