Ah! I had picked up some misunderstandings, and it is good to get them corrected.

For

connector.tableOperations().create(String tableName, boolean limitVersion);


Will limitVersion=false disable versioning completely, so that I always
have only one version, or will it give a "no limit", "no removal" policy
for versions?

To be clear, I am looking for "versions never to be removed", a
requirement that made me smile and think "Accumulo can do that
automatically", rather than implementing it at a higher level.
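
To make the requirement concrete, here is a minimal sketch in plain Java
(JDK only, no Accumulo dependency) of the retention behaviour Adam
describes: for one key, only the newest N timestamps survive, and "never
remove versions" corresponds to having no limit at all. The class name,
the method name, and the convention that maxVersions <= 0 means "no limit"
are my own assumptions for illustration; none of them are Accumulo API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of version retention for a single key; not Accumulo code.
class VersionRetentionSketch {

    // Returns the timestamps that survive, newest first.
    // Assumption for this sketch only: maxVersions <= 0 means "keep everything".
    static List<Long> retained(List<Long> timestamps, int maxVersions) {
        List<Long> sorted = new ArrayList<>(timestamps);
        sorted.sort(Collections.reverseOrder()); // highest timestamp first
        if (maxVersions <= 0 || sorted.size() <= maxVersions) {
            return sorted;
        }
        return new ArrayList<>(sorted.subList(0, maxVersions));
    }

    public static void main(String[] args) {
        List<Long> writes = Arrays.asList(100L, 300L, 200L);
        System.out.println(retained(writes, 1)); // default-like policy: newest only
        System.out.println(retained(writes, 0)); // the "never removed" policy I am after
    }
}
```

With a limit of 1 only the entry with the highest timestamp is left; with
no limit all three writes stay visible, newest first.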

Thanks

On Tue, Apr 14, 2020 at 12:55 AM Adam J. Shook <[email protected]> wrote:

> Hi Niclas,
>
> 1. Accumulo uses a VersioningIterator for all tables which ensures that
> you see the latest version of a particular entry, defined as the entry that
> has the highest value for the timestamp.  Older versions of the same key
> (row ID + family + qualifier + visibility) are compacted away by Accumulo
> and will eventually be deleted.  You can set the number of versions you
> want to keep to something other than the default of 1 (see
> https://accumulo.apache.org/1.9/accumulo_user_manual.html#_versioning_iterators_and_timestamps
> ).
>
> 2. Related to #1, Accumulo will update the value to the latest version of
> the entry.  I believe if you keep writing the same entry with the same data
> over and over again, you'll see them if you are keeping more than one
> version of the same entry.  AFAIK there is no "put if absent" behavior
> without reading for every write.  You can, of course, configure an existing
> iterator or write your own to achieve whatever logic you want as far as
> what versions to keep of what columns of your data model.
>
> 3. The "Scanner" will return entries in order.  Related to #1, it will
> only return the latest version of an entry (by default).  If you are
> keeping more versions of the same entry, then you would see the newest
> entry first.  The "BatchScanner" is multi-threaded and communicates to
> several tablets at once, returning entries out of order.  One common
> pattern is to use the WholeRowIterator when scanning.  This iterator
> serializes all entries with the same row into one entry on the server side,
> then you can deserialize the row on the client side to view the entire
> contents of a row at once.  The order of the rows themselves is still
> undefined when using a BatchScanner due to the multi-threaded nature of the
> scanner.
>
> Hope this helps!
> --Adam
>
> On Mon, Apr 13, 2020 at 12:57 AM Niclas Hedhman <[email protected]> wrote:
>
>> Hi,
>> I am brand new to Accumulo, but tasked to put it into what used to be
>> Apache Polygene (now in Attic) as an entity store, one that keeps history.
>>
>> I have a couple of questions:
>> 1. Assuming that I can guarantee that no one executes any explicit
>> deletes, can I rely on the mutation sequences not disappearing over time?
>>
>> 2. As part of storing a row, I have a "metadata" qualifier that contains
>> static information. But since I don't know whether the row exists without
>> reading it first, then IIUC I will fill the "metadata" with the same
>> information over and over again... OR, does Accumulo realize that this is
>> the same byte[] as before and not update the value, or alternatively
>> create a new Key that points to the same Value?  I effectively want a
>> "putIfAbsent()"
>>
>> 3. The Scanner can fetch multiple rows, and can be constrained by CF and
>> qualifier. I think that is quite clear. But what does the iterator()
>> actually return? I presume that it is many key/value pairs, of ALL
>> timestamped values. But what are the ordering guarantees here? I get the
>> impression that within a row->cf->qualifier, the returned values are in
>> timestamp order, newest first. And I think that within a row, I am
>> guaranteed that the order is maintained, i.e. row -> cf -> qualifier (all
>> ascending). But am I also guaranteed that the iterator is "done" with a row
>> when the row has changed? Or can rows be interleaved in the iterator?
>>
>> Thanks in advance
>> Niclas
>>
>
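
For reference, the ordering discussed in point 3 of the quoted reply can be
sketched as a plain Java comparator (JDK only, no Accumulo dependency): row,
family and qualifier ascending, then timestamp descending so the newest
version comes first. The class K and the comparator name are hypothetical
stand-ins, visibility is left out for brevity, and String comparison is used
where Accumulo compares raw bytes, so treat this as a model of the sort
order rather than the real Key class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of the order in which a single-threaded scan yields entries.
class ScanOrderSketch {

    // Stand-in for a key; not org.apache.accumulo.core.data.Key.
    static final class K {
        final String row, family, qualifier;
        final long timestamp;

        K(String row, String family, String qualifier, long timestamp) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return row + "/" + family + "/" + qualifier + "@" + timestamp;
        }
    }

    // Row, family, qualifier ascending; timestamp descending (newest first).
    static final Comparator<K> SCAN_ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = a.family.compareTo(b.family);
        if (c != 0) return c;
        c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) return c;
        return Long.compare(b.timestamp, a.timestamp);
    };

    // Returns the keys rendered as strings in scan order.
    static List<String> inScanOrder(List<K> keys) {
        List<K> copy = new ArrayList<>(keys);
        copy.sort(SCAN_ORDER);
        List<String> out = new ArrayList<>();
        for (K k : copy) {
            out.add(k.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<K> keys = Arrays.asList(
            new K("r2", "cf", "q", 5),
            new K("r1", "cf", "q", 10),
            new K("r1", "cf", "q", 20),
            new K("r1", "cf", "a", 1));
        for (String s : inScanOrder(keys)) {
            System.out.println(s);
        }
    }
}
```

Because the output is fully sorted, all entries of one row are contiguous in
this model; that holds for the ordered Scanner, not for the BatchScanner,
whose row order is undefined as noted above.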
