Re: Noob questions

2020-04-15 Thread Niclas Hedhman
I guess HDFS has overhead, so I don't worry about that. In my case, I had stored a few dozen rows with heaps of columns in each, and values in the 50-100 character range. When doing "scan -t dataTable" I got back a dozen or more pages filled with more than 100 characters per line, and "du"

Re: Noob questions

2020-04-14 Thread Christopher
The `du` command should report sizes in bytes. Keep in mind that Accumulo compresses data in its files. If the number doesn't match what you see for the *.rf files in Hadoop, there may be a bug. Please let us know if you find this to be the case. On Tue, Apr 14, 2020 at 10:30 PM Niclas Hedhman wrote: >

Re: Noob questions

2020-04-14 Thread Niclas Hedhman
Yes, a bit of experimentation and I figured that out. As for the "putIfAbsent", I can actually figure that out from the data being written in this case: it is effectively an event store, and all rows start with a "created" event. One more small question: there is a "du" command, does it really report

Re: Noob questions

2020-04-14 Thread Emilio Lahr-Vivaz
You should be able to use a conditional writer to support 'put if absent': https://accumulo.apache.org/docs/2.x/getting-started/clients#conditionalwriter Generally you would not want to repeatedly write the same key/value, as you will have to scan every single versioned entry when you want to
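
The "put if absent" semantics mentioned in this reply can be illustrated without a cluster. What follows is a minimal in-memory sketch in plain Java, not the ConditionalWriter API itself: the class `PutIfAbsentSketch` and its methods are hypothetical names, and a `ConcurrentHashMap` stands in for the table. It only shows the behaviour being asked for, namely that the first write to a row is applied and later writes to the same row are rejected.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory model; a real client would go through a conditional writer.
final class PutIfAbsentSketch {
    // Maps a row ID to the first value written for it.
    private final ConcurrentMap<String, String> rows = new ConcurrentHashMap<>();

    /** Writes the value only if the row has no entry yet; returns true if the write was applied. */
    boolean putIfAbsent(String row, String value) {
        // ConcurrentMap.putIfAbsent returns null when no previous mapping existed.
        return rows.putIfAbsent(row, value) == null;
    }

    /** Returns the stored value for the row, or null if the row is absent. */
    String get(String row) {
        return rows.get(row);
    }
}
```

For an event store where every row begins with a "created" event, a rejected write in this model corresponds to "the entity already exists".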

Re: Noob questions

2020-04-14 Thread Adam J. Shook
limitVersion = false would *not* set the default VersioningIterator, effectively keeping every entry you write to Accumulo. Sounds like it hits your requirement of "versions never to be removed", though keep in mind that your static "metadata" qualifier would also never be versioned/deleted. On

Re: Noob questions

2020-04-13 Thread Niclas Hedhman
Ah! I had some misunderstandings implanted in me, and it is good to get corrected. For connector.tableOperations().create(String tableName, boolean limitVersion): will limitVersion=false disable versioning completely, so that I will always have only one version, or will it have a "no limit" and "no

Re: Noob questions

2020-04-13 Thread Adam J. Shook
Hi Niclas, 1. Accumulo uses a VersioningIterator for all tables which ensures that you see the latest version of a particular entry, defined as the entry that has the highest value for the timestamp. Older versions of the same key (row ID + family + qualifier + visibility) are compacted away by
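
The rule described in this reply (for entries sharing the same row ID, family, qualifier and visibility, the one with the highest timestamp is the one you see) can be modelled in a few lines of plain Java. This is a sketch under stated assumptions, not Accumulo's implementation: `VersioningSketch`, `Entry` and `limitVersions` are hypothetical names, and the four key parts are flattened into a single string for brevity. Passing a very large `maxVersions` models a table where no version is ever dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of "keep the newest N versions per key".
final class VersioningSketch {
    /** Stand-in for one stored entry; not Accumulo's Key/Value classes. */
    static final class Entry {
        final String key;      // row ID + family + qualifier + visibility, flattened
        final long timestamp;
        final String value;

        Entry(String key, long timestamp, String value) {
            this.key = key;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Returns at most maxVersions entries per key, ordered by key, newest timestamp first. */
    static List<Entry> limitVersions(List<Entry> entries, int maxVersions) {
        List<Entry> sorted = new ArrayList<>(entries);
        // Sort by key ascending, then by timestamp descending.
        sorted.sort((a, b) -> {
            int byKey = a.key.compareTo(b.key);
            return byKey != 0 ? byKey : Long.compare(b.timestamp, a.timestamp);
        });

        List<Entry> kept = new ArrayList<>();
        String currentKey = null;
        int seen = 0;
        for (Entry e : sorted) {
            if (!e.key.equals(currentKey)) {
                // First (newest) entry of a new key: reset the per-key counter.
                currentKey = e.key;
                seen = 0;
            }
            if (seen < maxVersions) {
                kept.add(e);
                seen++;
            }
        }
        return kept;
    }
}
```

Calling `limitVersions(entries, 1)` returns only the latest version of each key, while `limitVersions(entries, Integer.MAX_VALUE)` returns the full history.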

Noob questions

2020-04-12 Thread Niclas Hedhman
Hi, I am brand new to Accumulo, but tasked to put it into what used to be Apache Polygene (now in the Attic) as an entity store, one that keeps history. I have a couple of questions: 1. Assuming that I can guarantee that no one executes any explicit deletes, can I rely on the mutation sequences not