Re: Key Refactoring

2017-06-21 Thread Dylan Hutchison
Hi Sven,

There are other solutions that depend on what your Key schema
transformation is.

If the new schema is order-compatible with the old one, meaning that the
new Keys have the same sort order as the old keys, then you could (1) clone
the table and (2) attach a server-side SortedKeyValueIterator (SKVI) that
performs the transformation on all iterator scopes.  This will change the
schema "on the fly".  Even if your new schema is order-compatible with the
old schema *up to a prefix* (say, up to the Row), you could use this trick
inside your SKVI by (1) gathering all keys within that prefix (e.g.,
WholeRowIterator), (2) transforming each gathered Key, and (3) emitting the
new Keys in sorted order.
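
For a concrete picture, here is a minimal, untested sketch of such a
transforming SKVI, assuming the new schema only rewrites the column qualifier
(so sort order is unchanged) and a hypothetical transform() helper.  A
production version would also need to map seek Ranges expressed in the new
schema back to the old one inside seek().

import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.iterators.WrappingIterator;
import org.apache.hadoop.io.Text;

public class QualifierTransformIterator extends WrappingIterator {
  @Override
  public Key getTopKey() {
    Key k = super.getTopKey();
    // rebuild the Key with the transformed qualifier; everything else unchanged
    return new Key(k.getRow(), k.getColumnFamily(), transform(k.getColumnQualifier()),
        k.getColumnVisibility(), k.getTimestamp());
  }

  // hypothetical, order-preserving rewrite into the new qualifier layout
  private Text transform(Text oldQualifier) {
    return new Text("new|" + oldQualifier.toString());
  }
}

Attached at all scopes on the cloned table, the transformed view gradually
becomes the physical layout as compactions run; attached only at scan scope,
it stays a read-time view.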

If your Key schema transformation non-monotonically changes the Key sort
order, there are fewer built-in Accumulo options.  You might look at the
iterator framework provided by the Graphulo library.  Graphulo is built to do
complex server-side data processing, reading in entries from some number of
tables and writing them out to a new table at the server (see
RemoteWriteIterator).
Disclaimer: I authored Graphulo.

If you decide to go with your original solution, you might consider running
multiple such Accumulo clients in parallel.

Cheers, Dylan

On Wed, Jun 21, 2017 at 1:49 AM, Sven Hodapp  wrote:

> Hi there,
>
> I would like to select a subset of an Accumulo table and refactor the keys
> to create a new table.
> There are about 30M records with a value size about 5-20KB each.
> I'm using Accumulo 1.8.0 and Java accumulo-core client library 1.8.0.
>
> I've written client code like that:
>
>  * create a scanner fetching a specific column in a specific range
>  * transforming the key into the new schema
>  * using a batch writer to write the new generated mutations into the new
> table
>
> scan = createScanner(FROM, auths)
> // range, fetchColumn
> writer = createBatchWriter(TO, configWriter)
> iter = scan.iterator()
> while (iter.hasNext()) {
> entry = iter.next()
> // create mutation with new key schema, but unaltered value
> writer.addMutation(mutation)
> }
> writer.close()
>
> But this is slow and error prone (hiccups, ...).
> Is it possible to use the Accumulo shell for such a task?
> Are there another solutions I can use or some tricks?
>
> Thank you very much for any advice!
>
> Regards,
> Sven
>
> --
> Sven Hodapp, M.Sc.,
> Fraunhofer Institute for Algorithms and Scientific Computing SCAI,
> Department of Bioinformatics
> Schloss Birlinghoven, 53754 Sankt Augustin, Germany
> sven.hod...@scai.fraunhofer.de
> www.scai.fraunhofer.de
>


Re: Teardown and deepCopy

2017-01-04 Thread Dylan Hutchison
During a batch scan, many tablets are scanned in parallel.  If I understand
your scenario correctly, each tablet scan will build a set of column IDs
seen so far, so that each scan can skip IDs that the scan has already seen
rather than re-transmit them.  The goal is to find the unique column IDs
across the whole scan.

In this case, when an iterator is torn down, it drops its set of already
seen IDs and starts from scratch.

This sounds fine, as long as you have the ability to do final
de-duplication at the client.  The same ID might be retrieved from
different tablets.  Check to see if this meets your performance
requirements.

If you need to retrieve the unique column IDs faster, you might consider
storing them in a secondary index table where the column IDs are placed in
the row.  Scanning unique IDs from the row is easy because they are sorted.
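
The client-side merge can stay small.  A rough fragment, assuming a Connector
named connector, Authorizations auths, and that the column ID lives in the
column qualifier:

Set<String> uniqueIds = new HashSet<>();
BatchScanner bs = connector.createBatchScanner("mytable", auths, 10);
bs.setRanges(Collections.singleton(new Range()));  // or your real ranges
for (Map.Entry<Key,Value> e : bs) {
  // the same ID may arrive from several tablets; the Set de-duplicates
  uniqueIds.add(e.getKey().getColumnQualifier().toString());
}
bs.close();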

On Wed, Jan 4, 2017 at 8:42 AM, Roshan Punnoose  wrote:

> I have a tablet with an unsorted list of IDs in the Column Qualifier,
> these IDs can repeat sporadically. So I was hoping to keep a set of these
> IDs around in memory to check if I have seen an ID or not. There is some
> other logic to ensure that the set does not grow unbounded, but just trying
> to figure out if I can keep this ID set around. With the teardown, even
> though I know which was the last Key to return from the new seek Range, I
> don't know if I have seen the upcoming IDs. Not sure if that makes sense...
>
> Was thinking that on teardown, we could use either the deepCopy or init
> method to rollover state from the torn down iterator to the new iterator.
>
> On Wed, Jan 4, 2017 at 11:14 AM Keith Turner  wrote:
>
>> On Wed, Jan 4, 2017 at 10:44 AM, Roshan Punnoose 
>> wrote:
>> > Keith,
>> >
>> > If an iterator has state that it is maintaining, what is the best way to
>> > transfer that state to the new iterator after a tear down?  For example,
>> > MyIterator might have a Boolean flag of some sort. After tear down, is
>> there
>> > a way to copy that state to the new iterator before it starts seeking
>> again?
>>
>> There is nothing currently built in to help with this.
>>
>> What are you trying to accomplish?  Are you interested in maintaining
>> this state for a scan or batch scan?
>>
>>
>> >
>> > Roshan
>> >
>> > On Wed, Jan 4, 2017 at 10:33 AM Keith Turner  wrote:
>> >>
>> >> Josh,
>> >>
>> >> Deepcopy is not called when an iterator is torn down.  It has an
>> >> entirely different use. Deepcopy allows cloning of an iterator during
>> >> init().  The clones allow you to have multiple pointers into a tablets
>> >> data which allows things like server side joins.
>> >>
>> >> Keith
>> >>
>> >> On Wed, Dec 28, 2016 at 12:50 PM, Josh Clum 
>> wrote:
>> >> > Hi,
>> >> >
>> >> > I have a question about iterator teardown. It seems from
>> >> >
>> >> > https://github.com/apache/accumulo/blob/master/docs/src/main/asciidoc/chapters/iterator_design.txt#L383-L390
>> >> > that deepCopy should be called when an iterator is torn down. I'm not
>> >> > seeing
>> >> > that behavior. Below is a test that sets table.scan.max.memory to 1
>> >> > which
>> >> > should force a tear down for each kv returned. I should see deepCopy
>> >> > being
>> >> > called 3 times but when I tail the Tserver logs I'm not seeing it
>> being
>> >> > called. Below is the test and the Tserver output.
>> >> >
>> >> > What am I missing here?
>> >> >
>> >> > Josh
>> >> >
>> >> > ➜  tail -f -n200 ./accumulo/logs/TabletServer_*.out | grep
>> >> > MyIterator
>> >> > MyIterator: init
>> >> > MyIterator: seek
>> >> > MyIterator: hasTop
>> >> > MyIterator: getTopKey
>> >> > MyIterator: getTopValue
>> >> > MyIterator: init
>> >> > MyIterator: seek
>> >> > MyIterator: hasTop
>> >> > MyIterator: getTopKey
>> >> > MyIterator: getTopValue
>> >> > MyIterator: init
>> >> > MyIterator: seek
>> >> > MyIterator: hasTop
>> >> > MyIterator: getTopKey
>> >> > MyIterator: getTopValue
>> >> > MyIterator: init
>> >> > MyIterator: seek
>> >> > MyIterator: hasTop
>> >> >
>> >> > public static class MyIterator implements SortedKeyValueIterator<Key, Value> {
>> >> >
>> >> > private SortedKeyValueIterator<Key, Value> source;
>> >> >
>> >> > public MyIterator() { }
>> >> >
>> >> > @Override
>> >> > public void init(SortedKeyValueIterator<Key, Value> source,
>> >> >  Map<String, String> options,
>> >> >  IteratorEnvironment env) throws IOException {
>> >> > System.out.println("MyIterator: init");
>> >> > this.source = source;
>> >> > }
>> >> >
>> >> > @Override
>> >> > public boolean hasTop() {
>> >> > System.out.println("MyIterator: hasTop");
>> >> > return source.hasTop();
>> >> > }
>> >> >
>> >> > @Override
>> >> > public void next() throws IOException {
>> >> > System.out.println("MyIterator: next");
>> >> > 

Accumulo Tip: Batch your Mutations

2016-11-30 Thread Dylan Hutchison
Hi folks,

I'd like to share a tip that ~doubled BatchWriter ingest performance in my
application.

When inserting multiple entries to the same Accumulo row, put them into the
same Mutation object.  Add that one large Mutation to a BatchWriter rather
than an individual Mutation for each entry.  This reduces the amount of data
transferred, since the row and per-Mutation overhead are serialized once
instead of once per entry.
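
A small illustration of the difference, assuming a BatchWriter named writer
(table, family, and qualifier names are placeholders):

// one Mutation per entry: the row is shipped once per column update
for (int i = 0; i < 3; i++) {
  Mutation single = new Mutation(new Text("row1"));
  single.put(new Text("fam"), new Text("qual" + i), new Value(("val" + i).getBytes()));
  writer.addMutation(single);
}

// one Mutation per row: the row is shipped once for all of its column updates
Mutation batched = new Mutation(new Text("row1"));
for (int i = 0; i < 3; i++) {
  batched.put(new Text("fam"), new Text("qual" + i), new Value(("val" + i).getBytes()));
}
writer.addMutation(batched);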

The tip seems obvious enough, but hey, I used Accumulo for a couple years
without realizing it, so I thought y'all might benefit too.

Enjoy!
Dylan


Re: Running low on memory and Zookeeper Session expired / disconnected

2016-11-11 Thread Dylan Hutchison
Oh no, do keep using the native maps.  I meant to say that the memory and
Zookeeper problems during ingest can be a result of *not* using native
maps.

The Zookeeper session expirations result from the tservers having to spend so much time
garbage collecting that the tservers can't respond to the Zookeeper
heartbeat, so Zookeeper expires their session and the tservers subsequently
kill themselves.

If you are using native maps and having these problems, then something else
may be the source.
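
If you want to double-check from the shell, the relevant property on the 1.x
line can be filtered like this (also make sure the accumulo native-map library
is actually installed on each tablet server):

user@myinstance> config -f tserver.memory.maps.native.enabled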

On Fri, Nov 11, 2016 at 2:24 AM, Mario Pastorelli <
mario.pastore...@teralytics.ch> wrote:

> Yes we are as they are supposed to be faster. Good to know they could be
> the problem but what would be the solution, disabling them? Also, I guess
> there isn't an easy way to see that they are the problem, right?
>
> On Fri, Nov 11, 2016 at 11:21 AM, Dylan Hutchison <
> dhutc...@cs.washington.edu> wrote:
>
>> Mario, are you using native in-memory maps?  I've seen these problems run
>> rampant under Java memory maps.
>>
>> On Fri, Nov 11, 2016 at 2:11 AM, Mario Pastorelli <
>> mario.pastore...@teralytics.ch> wrote:
>>
>>> Hi all,
>>>
>>> I have two recurring errors with Accumulo in my cluster and I would like
>>> to know more about them. The first, usually happening at ingestion time
>>> when I write with the batch writers many records, is the "Running low on
>>> memory". We keep adding memory to Accumulo but this is a blind guess and I
>>> was wondering if there is a way to understand how much memory Accumulo
>>> would need considering the amount of data that will be written. Should we
>>> write slowly to Accumulo to avoid this? What is filling all the memory at
>>> ingestion time?
>>> Secondly, we have these zookeeper session expired and other zookeeper
>>> timeouts. Zookeeper on our cluster works quite well, we have many systems
>>> using it. How can I debug a "zookeeper session expired" in Accumulo?
>>>
>>> Thanks,
>>> Mario
>>>
>>> --
>>> Mario Pastorelli | TERALYTICS
>>>
>>> *software engineer*
>>>
>>> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
>>> phone: +41794381682
>>> email: mario.pastore...@teralytics.ch
>>> www.teralytics.net
>>>
>>> Company registration number: CH-020.3.037.709-7 | Trade register Canton
>>> Zurich
>>> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz,
>>> Yann de Vries
>>>
>>> This e-mail message contains confidential information which is for the
>>> sole attention and use of the intended recipient. Please notify us at once
>>> if you think that it may not be intended for you and delete it immediately.
>>>
>>
>>
>
>
> --
> Mario Pastorelli | TERALYTICS
>
> *software engineer*
>
> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
> phone: +41794381682
> email: mario.pastore...@teralytics.ch
> www.teralytics.net
>
> Company registration number: CH-020.3.037.709-7 | Trade register Canton
> Zurich
> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann
> de Vries
>
> This e-mail message contains confidential information which is for the
> sole attention and use of the intended recipient. Please notify us at once
> if you think that it may not be intended for you and delete it immediately.
>


Re: Running low on memory and Zookeeper Session expired / disconnected

2016-11-11 Thread Dylan Hutchison
Mario, are you using native in-memory maps?  I've seen these problems run
rampant under Java memory maps.

On Fri, Nov 11, 2016 at 2:11 AM, Mario Pastorelli <
mario.pastore...@teralytics.ch> wrote:

> Hi all,
>
> I have two recurring errors with Accumulo in my cluster and I would like
> to know more about them. The first, usually happening at ingestion time
> when I write with the batch writers many records, is the "Running low on
> memory". We keep adding memory to Accumulo but this is a blind guess and I
> was wondering if there is a way to understand how much memory Accumulo
> would need considering the amount of data that will be written. Should we
> write slowly to Accumulo to avoid this? What is filling all the memory at
> ingestion time?
> Secondly, we have these zookeeper session expired and other zookeeper
> timeouts. Zookeeper on our cluster works quite well, we have many systems
> using it. How can I debug a "zookeeper session expired" in Accumulo?
>
> Thanks,
> Mario
>
> --
> Mario Pastorelli | TERALYTICS
>
> *software engineer*
>
> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
> phone: +41794381682
> email: mario.pastore...@teralytics.ch
> www.teralytics.net
>
> Company registration number: CH-020.3.037.709-7 | Trade register Canton
> Zurich
> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann
> de Vries
>
> This e-mail message contains confidential information which is for the
> sole attention and use of the intended recipient. Please notify us at once
> if you think that it may not be intended for you and delete it immediately.
>


Re: Write data to a file from inside of an iterator

2016-11-05 Thread Dylan Hutchison
Ah, the use case of Graphulo's OneTable call.
Internally the OneTable call sets up a special iterator
(RemoteWriteIterator) that does open a BatchWriter.  The main trick that
allows it to write entries safely is pushing row/column filters into the
iterator, so that the iterator controls re-seeking rather than Accumulo.
This allows the iterator to write all its entries and close() without
having to worry about Accumulo tearing it down.  See the docs for a starter.

*cue Josh to warn against the evils of re-purposing tablet servers for
MapReduce cycles* =)

Really, this is advanced stuff.  Graphulo's iterators have been shown to
scale up to 16 nodes for matrix multiply in the last HPEC conference, but
it is possible your use case could break Accumulo, in the worst case
causing deadlock if you don't use it properly.  You're also free to write
your own code using Graphulo's code as a starting point, if you're more
comfortable with that.  You may also decide on another approach such as
launching a MapReduce job against Accumulo's RFiles, which could be better
or worse depending on your use case.

On Sat, Nov 5, 2016 at 10:28 AM, Yamini Joshi  wrote:

> Hello all
>
> As per https://github.com/apache/accumulo/blob/master/docs/src/main/asciidoc/chapters/iterator_design.txt
> "
> Implementations of Iterator might be tempted to open BatchWriters inside
> of an Iterator as a means
> to implement triggers for writing additional data outside of their client
> application. The lifecycle of an Iterator
> is *not* managed in such a way that guarantees that this is safe nor
> efficient. Specifically, there
> is no way to guarantee that the internal ThreadPool inside of the
> BatchWriter is closed (and the thread(s)
> are reaped) without calling the close() method. `close`'ing and recreating
> a `BatchWriter` after every
> Key-Value pair is also prohibitively performance limiting to be considered
> an option."
>
> If I need to write a subset of records generated from an iterator to a
> file/table, I can't use a batch writer inside of an iterator? Is there any
> other way to go about it?
>
> Best regards,
> Yamini Joshi
>


Re: MultiIterator Class

2016-10-21 Thread Dylan Hutchison
The MultiIterator is used internally in Accumulo to merge sorted streams of
data together.  For example, merging sorted data from several RFiles and an
in-memory map.  It does not sort, nor could it without materializing part
or all of the data stream.

Poking inside Accumulo is fun, isn't it?  Do write down your experiences
and thoughts as you explore Accumulo's architecture.  We're always open for
suggestions and contributions.  I was in exactly your place when I worked
on the Graphulo library.

Cheers, Dylan

On Fri, Oct 21, 2016 at 11:56 AM, Yamini Joshi 
wrote:

> Hello All
>
> I just came across this iterator:
> https://github.com/apache/accumulo/blob/e900e67425d950bd4c0c5288a6270d7b362ac458/core/src/main/java/org/apache/accumulo/core/iterators/system/MultiIterator.java
>
> Can someone tell me what exactly can it be used for?
> Can it be used to sort data acquired from batch_scan before passing the
> data to other iterators?
>
> Best regards,
> Yamini Joshi
>


Re: Iterator as a Filter

2016-10-20 Thread Dylan Hutchison
Hi Yamini,

If you have a finite, known list of column families, you can use locality
groups to store them in separate files in Hadoop.  Scans that only reference
the column families within a locality group need not open data in other
locality groups' files.
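
Setting them up is a one-time call per table.  A rough sketch, assuming a
Connector named connector, a table named mytable, and made-up group and
family names:

Map<String, Set<Text>> groups = new HashMap<>();
groups.put("meta", new HashSet<>(Arrays.asList(new Text("title"), new Text("author"))));
groups.put("content", new HashSet<>(Arrays.asList(new Text("body"))));
connector.tableOperations().setLocalityGroups("mytable", groups);
// existing data is regrouped on the next compaction
connector.tableOperations().compact("mytable", null, null, true, false);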

Apart from locality groups, setting "fetch column families and/or
qualifiers" on the scanner sets up a standard Filter iterator on the scan.
If you need to obtain these columns from every row, then the whole table is
scanned and filtered server-side.  (Seeking will occur during the scan if
the selected columns are far apart in the table.)  I guess that is too
inefficient for your use case.  For reference, these iterators are here for
families and here for qualifiers.

If locality groups are not an option and you must filter on families and
columns, then you may want to consider maintaining an index table, in which
the columns are stored as rows, or otherwise moving the columns into the
rows.

Regards, Dylan

On Thu, Oct 20, 2016 at 3:45 PM, Yamini Joshi  wrote:

> Hello all
>
> Is it possible to configure an iterator that works as a filter? As per
> Accumulo docs:
> As such, the `Filter` class functions well for filtering small amounts of
> data, but is
> inefficient for filtering large amounts of data. The decision to use a
> `Filter` strongly
> depends on the use case and distribution of data being filtered.
>
> I have a huge corpus to be filtered with a small amount of data selected.
> I want to select column families from a list of col families. I have a
> rough idea of using 'seek' to bypass cfs that don't exist in the list. I
> was hoping I could exploit the 'seek'ing in iterator and go to the range in
> the list of cf and check if it exists. I am not sure if this will work or
> if it is a good approach. Any feedback is much appreciated.
>
> Best regards,
> Yamini Joshi
>


Re: Count RowIDs with a common Prefix

2016-10-18 Thread Dylan Hutchison
Write an iterator to compute partial sums.  Make sure the iterator does not
return a key outside of the range to which it was seeked.  (You don't have
to modify the keys you return; just sum the entries under each key prefix
into that prefix's first key.)

Collect the partial sums at the client from the batch scan. Compute full
sums at the client.
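
The client side of that is small.  A rough fragment, assuming a hypothetical
PrefixCountIterator (the server-side iterator described above) that emits one
partial count per rowID prefix per tablet, encoded as a decimal string in the
Value, plus a Connector named connector and Authorizations auths in scope:

Map<String, Long> counts = new HashMap<>();
BatchScanner bs = connector.createBatchScanner("mytable", auths, 10);
bs.setRanges(Collections.singleton(new Range()));
bs.addScanIterator(new IteratorSetting(50, "prefixcount", PrefixCountIterator.class));
for (Map.Entry<Key,Value> e : bs) {
  String prefix = e.getKey().getRow().toString().split(":")[0];  // rowID part of rowID:otherID
  counts.merge(prefix, Long.parseLong(e.getValue().toString()), Long::sum);  // combine partial sums
}
bs.close();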

On Oct 17, 2016 11:02 PM, "Yamini Joshi"  wrote:

> Hello all
>
> My keys are of the form rowID:otherID where there are multiple otherIDs
> for a RowID. I want to know the count of all the otherIDs within a rowID.
> What would be the most optimal way to implement this?
>
> Best regards,
> Yamini Joshi
>


Re: Accumulo Equivalent of Mongo Aggr Query

2016-09-25 Thread Dylan Hutchison
Hi Yamini,

Could you further describe the computation you have in mind, for those of
us not familiar with MongoDB's "Aggr" function?  You may want to look at
Accumulo's built-in Combiner iterators.  They seem more relevant than
Filters.
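
For example, if the aggregation you need is a per-key sum of long values, the
built-in SummingCombiner can be attached roughly like this, assuming
string-encoded longs in the Value and a Connector named connector:

IteratorSetting setting = new IteratorSetting(10, "sum", SummingCombiner.class);
SummingCombiner.setEncodingType(setting, LongCombiner.Type.STRING);
SummingCombiner.setCombineAllColumns(setting, true);
connector.tableOperations().attachIterator("mytable", setting);

If you only want it for a single query, pass the same IteratorSetting to
scanner.addScanIterator(...) instead of attaching it to the table.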

I don't know what you mean when you write that your output is not visible
to "the complete Database".

Regards, Dylan

On Sun, Sep 25, 2016 at 11:34 AM, Yamini Joshi 
wrote:

>
> Hello everyone
>
> I wanted to know if there is any equivalent of Mongo Aggr queries in
> Acuumulo. I have a complex query in form of a Mongo aggregate
> (multi-staged) query. I'm trying to model the same in Accumulo. As of now,
> with the limited knowledge that I have, I have created a class extending
> Filter class. My question is: since my queries depend on a input, is there
> any other way of using the iterators/filters only for one query or change
> their input with every single query? As of now, my filter is getting
> attached to the table on 'SCAN' that means the output will be visible to
> the subsequent queries and not the complete Database.
>
> Best regards,
> Yamini Joshi
>
>


Re: Accumulo Seek performance

2016-09-12 Thread Dylan Hutchison
Nice setup Josh.  Thank you for putting together the tests.  A few
questions:

The serial scanner implementation uses 6 threads: one for each thread in
the thread pool.
The batch scanner implementation uses 60 threads: 10 for each thread in the
thread pool, since the BatchScanner was configured with 10 threads and
there are 10 (9?) tablets.

Isn't 60 threads of communication naturally inefficient?  I wonder if we
would see the same performance if we set each BatchScanner to use 1 or 2
threads.

Maybe this would motivate a *MultiTableBatchScanner*, which maintains a
fixed number of threads across any number of concurrent scans, possibly to
the same table.


On Sat, Sep 10, 2016 at 3:01 PM, Josh Elser  wrote:

> Sven, et al:
>
> So, it would appear that I have been able to reproduce this one (better
> late than never, I guess...). tl;dr Serially using Scanners to do point
> lookups instead of a BatchScanner is ~20x faster. This sounds like a pretty
> serious performance issue to me.
>
> Here's a general outline for what I did.
>
> * Accumulo 1.8.0
> * Created a table with 1M rows, each row with 10 columns using YCSB
> (workloada)
> * Split the table into 9 tablets
> * Computed the set of all rows in the table
>
> For a number of iterations:
> * Shuffle this set of rows
> * Choose the first N rows
> * Construct an equivalent set of Ranges from the set of Rows, choosing a
> random column (0-9)
> * Partition the N rows into X collections
> * Submit X tasks to query one partition of the N rows (to a thread pool
> with X fixed threads)
>
> I have two implementations of these tasks. One, where all ranges in a
> partition are executed via one BatchScanner. A second where each range is
> executed in serial using a Scanner. The numbers speak for themselves.
>
> ** BatchScanners **
> 2016-09-10 17:51:38,811 [joshelser.YcsbBatchScanner] INFO : Shuffled all
> rows
> 2016-09-10 17:51:38,843 [joshelser.YcsbBatchScanner] INFO : All ranges
> calculated: 3000 ranges found
> 2016-09-10 17:51:38,846 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 40178 ms
> 2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 42296 ms
> 2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:53:47,414 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 46094 ms
> 2016-09-10 17:53:47,415 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:54:35,118 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 47704 ms
> 2016-09-10 17:54:35,119 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:55:24,339 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 49221 ms
>
> ** Scanners **
> 2016-09-10 17:57:23,867 [joshelser.YcsbBatchScanner] INFO : Shuffled all
> rows
> 2016-09-10 17:57:23,898 [joshelser.YcsbBatchScanner] INFO : All ranges
> calculated: 3000 ranges found
> 2016-09-10 17:57:23,903 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 2833 ms
> 2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 2536 ms
> 2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 2150 ms
> 2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 2061 ms
> 2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:57:35,628 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 2140 ms
>
> Query code is available https://github.com/joshelser/accumulo-range-binning
>
>
> Sven Hodapp wrote:
>
>> Hi Keith,
>>
>> I've tried it with 1, 2 or 10 threads. Unfortunately there where no
>> amazing differences.
>> Maybe it's a problem with the table structure? For example it may happen
>> that one row id (e.g. a sentence) has several thousand column families. Can
>> this affect the seek performance?
>>
>> So for my initial example it has about 3000 row ids to seek, which will
>> return about 500k entries. If I filter for specific column families (e.g. a
>> document without annotations) it will return about 5k entries, but the seek
>> time will only 

Re: Accumulo Seek performance

2016-08-31 Thread Dylan Hutchison
Hi Sven,
  Without locality groups, your filtered scan may be reading nearly the
entire table.  The process looks like this:

   1. For each tablet that has one of the 3000 row ids (assuming sufficient
      tablet servers),
      1. *Seek* to the first column family of the first row id out of the
         target row ids in the tablet.
      2. *Read* that row+cf prefix.
      3. Find the next cf (out of the 5k cf's in your filter).
         1. *Read* the next entry and see if it is in the cf.  If it is,
            then you are lucky and go back to step 2.  Repeat this process for
            10 entries (a heuristic number).
         2. If none of the next 10 entries match the cf (or the next row in
            your target ranges), then *seek* to the next target row+cf, as in
            step 1.
      4. Continue until all target row ids in the tablet are scanned.

In the worst case, if the 5k target cf's in your filter are uniformly
spread out among the 500k total cf's (and each row has all 500k cf's, which
is probably not the case in your document-sentence table), then Accumulo
performs 5k seeks per row id, or 5k * 3k rows = 15M seeks, to be divided
among your tablet servers (assuming no significant skew).  You can adjust
this for the actual distribution of column families in your table to get an
idea of how many seeks Accumulo performs.

(On the other hand in the best case, if the 5k target cf's are all clumped
together, then Accumulo need only seek 3k times, or less if some row ids
are consecutive.)

Perhaps others could extend the model by estimating a "seconds/seek"
figure?  If we can estimate this, it would tell you whether your
BatchScanner times are in the right ballpark.  Or it might be sufficient to
compare the number of seeks.

Cheers, Dylan

On Wed, Aug 31, 2016 at 12:06 AM, Sven Hodapp <
sven.hod...@scai.fraunhofer.de> wrote:

> Hi Keith,
>
> I've tried it with 1, 2 or 10 threads. Unfortunately there where no
> amazing differences.
> Maybe it's a problem with the table structure? For example it may happen
> that one row id (e.g. a sentence) has several thousand column families. Can
> this affect the seek performance?
>
> So for my initial example it has about 3000 row ids to seek, which will
> return about 500k entries. If I filter for specific column families (e.g. a
> document without annotations) it will return about 5k entries, but the seek
> time will only be halved.
> Are there to much column families to seek it fast?
>
> Thanks!
>
> Regards,
> Sven
>
> --
> Sven Hodapp, M.Sc.,
> Fraunhofer Institute for Algorithms and Scientific Computing SCAI,
> Department of Bioinformatics
> Schloss Birlinghoven, 53754 Sankt Augustin, Germany
> sven.hod...@scai.fraunhofer.de
> www.scai.fraunhofer.de
>
> - Ursprüngliche Mail -
> > Von: "Keith Turner" 
> > An: "user" 
> > Gesendet: Montag, 29. August 2016 22:37:32
> > Betreff: Re: Accumulo Seek performance
>
> > On Wed, Aug 24, 2016 at 9:22 AM, Sven Hodapp
> >  wrote:
> >> Hi there,
> >>
> >> currently we're experimenting with a two node Accumulo cluster (two
> tablet
> >> servers) setup for document storage.
> >> This documents are decomposed up to the sentence level.
> >>
> >> Now I'm using a BatchScanner to assemble the full document like this:
> >>
> >> val bscan = instance.createBatchScanner(ARTIFACTS, auths, 10) //
> ARTIFACTS table
> >> currently hosts ~30GB data, ~200M entries on ~45 tablets
> >> bscan.setRanges(ranges)  // there are like 3000 Range.exact's in
> the ranges-list
> >>   for (entry <- bscan.asScala) yield {
> >> val key = entry.getKey()
> >> val value = entry.getValue()
> >> // etc.
> >>   }
> >>
> >> For larger full documents (e.g. 3000 exact ranges), this operation will
> take
> >> about 12 seconds.
> >> But shorter documents are assembled blazing fast...
> >>
> >> Is that to much for a BatchScanner / I'm misusing the BatchScaner?
> >> Is that a normal time for such a (seek) operation?
> >> Can I do something to get a better seek performance?
> >
> > How many threads did you configure the batch scanner with and did you
> > try varying this?
> >
> >>
> >> Note: I have already enabled bloom filtering on that table.
> >>
> >> Thank you for any advice!
> >>
> >> Regards,
> >> Sven
> >>
> >> --
> >> Sven Hodapp, M.Sc.,
> >> Fraunhofer Institute for Algorithms and Scientific Computing SCAI,
> >> Department of Bioinformatics
> >> Schloss Birlinghoven, 53754 Sankt Augustin, Germany
> >> sven.hod...@scai.fraunhofer.de
> > > www.scai.fraunhofer.de
>


Re: Accumulo Limiting Iterator Help

2016-08-12 Thread Dylan Hutchison
Hi Ryan,

I think you could achieve the behavior you described more simply by
overriding hasTop() and returning `false` once your iterator has seen and
emitted N entries.  No need to re-seek the parent iterator to a singleton
range, since that will have the same effect as hasTop() == false.  Also,
you're not guaranteed that the singleton range is valid, if the seek range
has an exclusive end key.
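
A minimal, untested sketch of that idea, with N passed through iterator
options (the option name here is made up):

import java.io.IOException;
import java.util.Map;

import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.IteratorEnvironment;
import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
import org.apache.accumulo.core.iterators.WrappingIterator;

public class FirstNEntriesIterator extends WrappingIterator {
  private long limit = Long.MAX_VALUE;
  private long emitted = 0;

  @Override
  public void init(SortedKeyValueIterator<Key,Value> source, Map<String,String> options,
      IteratorEnvironment env) throws IOException {
    super.init(source, options, env);
    if (options.containsKey("limit"))
      limit = Long.parseLong(options.get("limit"));
  }

  @Override
  public boolean hasTop() {
    // stop advertising entries once N have been returned
    return emitted < limit && super.hasTop();
  }

  @Override
  public void next() throws IOException {
    emitted++;  // next() is called once after each entry is returned
    super.next();
  }
}

Because of the teardown behavior described below, treat N as a per-tablet,
best-effort limit: a re-created iterator starts counting from 0 again.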

I couldn't follow the meaning behind the `numEntriesPerRange` and
`numScans` variables, but hopefully the above advice helps.  I'm also not
sure about the IOException.

Keep in mind that Accumulo can take down your iterator and re-create it at
any point.  When it does so, it re-inits and then re-seeks your iterator to
a position immediately after the last key returned.  If this happens in the
middle of your iterator counting N entries then it will start counting
again from 0.  See the iterator design section in the manual for more info on
when this happens.

Cheers, Dylan

On Fri, Aug 12, 2016 at 3:02 PM, Ryan Cunningham 
wrote:

> Hello,
>
>
>
> I'm trying to write an iterator that gets the top N sorted entries for a
> given range over sharded data. I created a custom iterator that extends
> SkippingIterator and made it so that it will return the first N entries for
> each tablet. After N entries, I have the source iterator seek to the end
> key of the specific range since it shouldn't return any other entries for
> that tablet.
>
>
>
> @Override
>
>   public void init(SortedKeyValueIterator<Key, Value> source,
> Map<String, String> options, IteratorEnvironment env) throws IOException {
>
> super.init(source, options, env);
>
> String o = options.get(NUM_SCANS_STRING_NAME);
>
> numScans = o == null ? 10 : Integer.parseInt(o);
>
> String n = options.get(NUM_ENTRIES_STRING_NAME);
>
> numEntriesPerRange = n == null ? Integer.MAX_VALUE :
> Integer.parseInt(n);
>
> numEntries = 0;
>
>   }
>
>
>
>   // this is only ever called immediately after getting "next" entry
>
>   @Override
>
>   protected void consume() throws IOException {
>
> if (numEntries < numEntriesPerRange) {
>
>++numEntries;
>
>return;
>
> }
>
> int count = 0;
>
> while (getSource().hasTop()) {
>
>if (count < numScans) {
>
>   ++count;
>
> getSource().next(); // scan
>
> } else {
>
> // too many scans, just seek to end of range
>
>Key lastKey = latestRange.getEndKey() == null ? new Key(new
> Text(String.valueOf(Character.MAX_VALUE))) :
> latestRange.getEndKey().followingKey(PartialKey.ROW);
>
>getSource().seek(new Range(lastKey, true, lastKey,
> false), latestColumnFamilies, latestInclusive);
>
> }
>
> }
>
>   }
>
>
>
>   @Override
>
>   public void seek(Range range, Collection<ByteSequence> columnFamilies,
> boolean inclusive) throws IOException {
>
> // save parameters for future internal seeks
>
> latestRange = range;
>
> latestColumnFamilies = columnFamilies;
>
> latestInclusive = inclusive;
>
>
>
> super.seek(range, columnFamilies, inclusive);
>
>
>
> if (getSource().hasTop()) {
>
>   if (range.beforeStartKey(getSource().getTopKey()))
>
> consume();
>
> }
>
>   }
>
>
>
> I did some initial testing and it seems to work as expected, bringing back
> N * number of tablets results. However, when I increase the limit past a
> certain point something seems to be messing up and I get all entries back
> instead of the limited count. I also sometimes see this error but I looked
> online and I'm not sure if it's related:
>
>
>
> 16/08/12 20:54:22 WARN transport.TIOStreamTransport: Error closing output
> stream.
>
> java.io.IOException: The stream is closed
>
> at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputS
> tream.java:118)
>
> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStrea
> m.java:82)
>
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java
> :140)
>
> at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
>
> at org.apache.thrift.transport.TIOStreamTransport.close(TIOStre
> amTransport.java:110)
>
> at org.apache.thrift.transport.TFramedTransport.close(TFramedTr
> ansport.java:89)
>
> at org.apache.accumulo.core.client.impl.ThriftTransportPool$Cac
> hedTTransport.close(ThriftTransportPool.java:312)
>
> at org.apache.accumulo.core.client.impl.ThriftTransportPool.ret
> urnTransport(ThriftTransportPool.java:584)
>
> at org.apache.accumulo.core.util.ThriftUtil.returnClient(Thrift
> Util.java:134)
>
> at org.apache.accumulo.core.client.impl.TabletServerBatchReader
> Iterator.doLookup(TabletServerBatchReaderIterator.java:714)
>
> at org.apache.accumulo.core.client.impl.TabletServerBatchReader
> 

Re: Making a RowCounterIterator

2016-07-15 Thread Dylan Hutchison
Hi Mario,

As you gain more experience with Accumulo, feel free to write or modify
Accumulo's documentation in the places you find it lacking and send a PR.
If you find a topic confusing, probably many others do too.

Cheers, Dylan

On Fri, Jul 15, 2016 at 4:04 PM, Christopher  wrote:

> Ah, I thought you were doing WholeRowIterator -> RowCounterIterator
> I now understand you're doing WholeRowIterator -> SomeCustomFilter (column
> predicate) -> RowCounterIterator
>
> That's okay to do, but it may be better to have an iterator that creates a
> clone of its source at the beginning of each row, advances to do the
> filtering, and then informs the spawning iterator to either accept or
> reject. This is, admittedly, far more complicated than WholeRowIterator,
>> > but it can be safer if you have really big rows which don't fit in memory.
>
> To your question about WholeRowIterator, yes, it's fine. The iterator will
> always see sorted data (unless it's sitting on top of another iterator
> which breaks this... which is possible, but not recommended at all), even
> though the client may not. And yes, rows are never split (but if the query
> range doesn't include the full row, it may return early). Their usage is
> orthogonal, and can be used together or not.
>
> On Fri, Jul 15, 2016 at 6:35 PM Mario Pastorelli <
> mario.pastore...@teralytics.ch> wrote:
>
>> The WholeRowIterator is for filtering: I need all the columns that the
>> filter requires so that the filter can see if the row matches or not the
>> query. That's the only proper way I found to implement logic operators on
>> predicated over columns of the same row.
>>
>> Actually I do have a question about WholeRowIterator, while we are
>> talking about them. Do they make sense when used with a BatchScanner? My
>> guess is yes because while the BatchScanner can return data non-sorted to
>> the client, when it is scanning a single tablet the data is sorted. Because
>> the data of the same rowId is never split (right?) then there is no problem
>> in using a WholeRowIterator with a BatchScanner. Is this correct? I really
>> can't find much documentation for Accumulo and the book doesn't help enough.
>>
>> On Sat, Jul 16, 2016 at 12:29 AM, Christopher 
>> wrote:
>>
>>> It'd be more efficient to use the FirstEntryInRowIterator to just grab
>>> one each, rather than the WholeRowIterator which could use up a lot of
>>> memory unnecessarily.
>>>
>>> On Fri, Jul 15, 2016 at 6:20 PM Mario Pastorelli <
>>> mario.pastore...@teralytics.ch> wrote:
>>>
 I'm actually using this after a wholerowiterator, which is used to
 filter rows with the same rowId.

 On Fri, Jul 15, 2016 at 10:02 PM, William Slacum 
 wrote:

> The iterator in the gist also counts cells/entries/KV pairs, not
> unique rows. You'll want to have some way to skip to the next row value if
> you want the count to be reflective of the number of rows being read.
>
> On Fri, Jul 15, 2016 at 3:34 PM, Shawn Walker <
> accum...@shawn-walker.net> wrote:
>
>> My read is that you're mistaking the sequence of calls Accumulo will
>> be making to your iterator.  The sequence isn't quite the same as a Java
>> iterator (initially positioned "before" the first element), and is more
>> like a C++ iterator:
>>
>> 0. Accumulo calls seek(...)
>> 1. Is there more data? Accumulo calls hasTop(). You return yes.
>> 2. Ok, so there's data.  Accumulo calls getTopKey(), getTopValue() to
>> retrieve the data. You return a key indicating 0 columns seen (since 
>> next()
>> hasn't yet been called)
>> 3. First datum done, Accumulo calls next()
>> ...
>>
>> I imagine that if you pull the second item out of your scan result,
>> it'll have the number you expect.  Alternately, you might consider
>> performing the count computation during an override of the seek(...)
>> method, instead of in the next(...) method.
>>
>> --
>> Shawn Walker
>>
>>
>>
>> On Fri, Jul 15, 2016 at 2:24 PM, Mario Pastorelli <
>> mario.pastore...@teralytics.ch> wrote:
>>
>>> I'm trying to create a RowCounterIterator that counts all the rows
>>> and returns only one key-value with the counter inside. The problem is 
>>> that
>>> I can't get it work. The Scala code is available in the gist
>>> 
>>> together with some pseudo-code of a test. The problem is that if I add 
>>> an
>>> entry to my table, this iterator will return 0 instead of 1 and 
>>> apparently
>>> the reason is that super.hasTop() is always false. I've tried without 
>>> the
>>> iterator and the scanner returns 1 elements. Any idea of what I'm doing
>>> wrong here? Is WrappingIterator the right class to extend for this kind 
>>> of
>>> behaviour?
>>>
>>> Thanks,

Re: Making a RowCounterIterator

2016-07-15 Thread Dylan Hutchison
Hi Mario,
  You can reuse or adapt the RowCountingIterator code here.

The main trick is understanding how each tablet needs to emit a row within
its seek range.  An iterator should not emit an entry whose row lies
outside the seek range of the tablet the iterator is running on.  Instead,
you can emit *partial sums* whose row stays within the seek range.  Each
tablet server communicates one partial sum.  Then sum the partial sums at
the client.  (I am probably mixing up tablet vs. tablet server.)

Cheers, Dylan


On Fri, Jul 15, 2016 at 1:02 PM, William Slacum  wrote:

> The iterator in the gist also counts cells/entries/KV pairs, not unique
> rows. You'll want to have some way to skip to the next row value if you
> want the count to be reflective of the number of rows being read.
>
> On Fri, Jul 15, 2016 at 3:34 PM, Shawn Walker 
> wrote:
>
>> My read is that you're mistaking the sequence of calls Accumulo will be
>> making to your iterator.  The sequence isn't quite the same as a Java
>> iterator (initially positioned "before" the first element), and is more
>> like a C++ iterator:
>>
>> 0. Accumulo calls seek(...)
>> 1. Is there more data? Accumulo calls hasTop(). You return yes.
>> 2. Ok, so there's data.  Accumulo calls getTopKey(), getTopValue() to
>> retrieve the data. You return a key indicating 0 columns seen (since next()
>> hasn't yet been called)
>> 3. First datum done, Accumulo calls next()
>> ...
>>
>> I imagine that if you pull the second item out of your scan result, it'll
>> have the number you expect.  Alternately, you might consider performing the
>> count computation during an override of the seek(...) method, instead of in
>> the next(...) method.
>>
>> --
>> Shawn Walker
>>
>>
>>
>> On Fri, Jul 15, 2016 at 2:24 PM, Mario Pastorelli <
>> mario.pastore...@teralytics.ch> wrote:
>>
>>> I'm trying to create a RowCounterIterator that counts all the rows and
>>> returns only one key-value with the counter inside. The problem is that I
>>> can't get it work. The Scala code is available in the gist
>>> 
>>> together with some pseudo-code of a test. The problem is that if I add an
>>> entry to my table, this iterator will return 0 instead of 1 and apparently
>>> the reason is that super.hasTop() is always false. I've tried without the
>>> iterator and the scanner returns 1 elements. Any idea of what I'm doing
>>> wrong here? Is WrappingIterator the right class to extend for this kind of
>>> behaviour?
>>>
>>> Thanks,
>>> Mario
>>>
>>> --
>>> Mario Pastorelli | TERALYTICS
>>>
>>> *software engineer*
>>>
>>> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
>>> phone: +41794381682
>>> email: mario.pastore...@teralytics.ch
>>> www.teralytics.net
>>>
>>> Company registration number: CH-020.3.037.709-7 | Trade register Canton
>>> Zurich
>>> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz,
>>> Yann de Vries
>>>
>>> This e-mail message contains confidential information which is for the
>>> sole attention and use of the intended recipient. Please notify us at once
>>> if you think that it may not be intended for you and delete it immediately.
>>>
>>
>>
>


Re: Is there a way to keep from writing data that I can't read?

2016-05-26 Thread Dylan Hutchison
Try the VisibilityConstraint =)
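
From the shell it can be enabled per table.  On the 1.x line the class is
org.apache.accumulo.core.security.VisibilityConstraint (the package has moved
in later releases, so check your version):

user@myinstance> constraint -t mytable -a org.apache.accumulo.core.security.VisibilityConstraint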

On Thu, May 26, 2016 at 4:07 PM, Russ Weeks 
wrote:

> I seem to remember some way to configure a check that would prevent me
> from writing mutations that I wouldn't be able to scan with my current
> maximal set of authorizations.
>
> Is that just my fevered imagination, or is it actually a thing?
>
> Thanks!
> -Russ
>


Re: Searching d4m based table

2016-02-09 Thread Dylan Hutchison
Hi Jamie,

Take a look at the examples under directories 3Scaling and 2Apps in the D4M
distribution, or at a demo on baseball data here.  Whether you use Matlab/Octave or
not, these examples demonstrate the d4m schema and queries on the d4m
schema that you can use when talking to Accumulo.

Regards, Dylan

On Tue, Feb 9, 2016 at 2:10 PM, Jamie Johnson  wrote:

> Thanks Jeremy, I will give this a read.  Are there any sample projects
> that demonstrate these types of queries?
>
> On Tue, Feb 9, 2016 at 9:59 AM, Jeremy Kepner  wrote:
>
>> Graphs and graph traversals.
>> Exact match and range queries.
>>
>> If the data set is large and you are concerned about a particular
>> query returning a lot of data then it is important to create a degree
>> table
>> that maintains the count of each unique entry in the d4m table.
>> You can then query the degree table first to get an estimate of how
>> big the results will be prior to actually performing the query.
>>
>> Here are some papers that might be helpful:
>>
>> http://arxiv.org/abs/1407.3859
>> http://arxiv.org/abs/1507.01066
>> http://arxiv.org/abs/1407.6923
>> http://arxiv.org/abs/1406.4923
>>
>>
>> On Tue, Feb 09, 2016 at 07:01:05AM -0500, Jamie Johnson wrote:
>> > Is there documentation describing what types of searches perform well
>> on a
>> > d4m based table?   Any examples?
>>
>
>


Re: how to maintain versioning in D4M schema?

2015-11-30 Thread Dylan Hutchison
>
> 2. Instead of putting each "value" in its own field, you could combine
> them into an ordered set: field|{time1:value1,time2:value2,time3:value3}.
> For this to work well, you'd have to write a custom combining iterator that
> kept only the most recent 3 during scans and compactions, based on time (or
> whatever you use to denote version).
>

If you don't mind writing a custom iterator, then you can write an iterator
for the original schema (where colq is "field|value1") which acts as
follows.  Don't forget that entries include a timestamp field.  Let V be
the total number of versions you want to retain, and let ARR be an array of
values and timestamps of size V.

   1. Save the "field" of the first entry in the column qualifier.  Store
      the value and timestamp as the first entry in ARR.
   2. While the next entry has the same "field", store its value and
      timestamp in the next empty entry in ARR.
      1. If there are no more empty slots in ARR, then remove the entry
         with the least recent timestamp from ARR, and add the new value and
         timestamp to ARR (or don't add the new entry if it has the least
         recent timestamp).
   3. When the next entry does not have the same "field", emit all the
      entries in ARR, clear ARR, and go back to step 1 with new entry.
   4. When there are no more entries, emit ARR and no more (set hasTop() to
      false).

This approach works because a row is guaranteed to be stored on the same
tablet server, and we see all the entries for a "field" consecutively.  Let
us know how it works for you if you choose to go this route.
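
For illustration, here is a rough, untested sketch of steps 1-4 as a
WrappingIterator, assuming the colq layout "field|value", a pipe delimiter,
and the version count passed as an option.  It simply collects each
(row, field) group and keeps the most recent entries by timestamp, which is
equivalent to the ARR bookkeeping above.

import java.io.IOException;
import java.util.*;

import org.apache.accumulo.core.data.*;
import org.apache.accumulo.core.iterators.IteratorEnvironment;
import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
import org.apache.accumulo.core.iterators.WrappingIterator;

public class LatestFieldVersionsIterator extends WrappingIterator {
  private int maxVersions = 3;
  private final List<Map.Entry<Key,Value>> buffer = new ArrayList<>();
  private int pos = 0;

  @Override
  public void init(SortedKeyValueIterator<Key,Value> source, Map<String,String> options,
      IteratorEnvironment env) throws IOException {
    super.init(source, options, env);
    if (options.containsKey("maxVersions"))
      maxVersions = Integer.parseInt(options.get("maxVersions"));
  }

  // "field" from a "field|value" column qualifier
  private static String field(Key k) {
    String cq = k.getColumnQualifier().toString();
    int i = cq.indexOf('|');
    return i < 0 ? cq : cq.substring(0, i);
  }

  // read one (row, field) group from the source and keep the newest entries
  private void fillBuffer() throws IOException {
    buffer.clear();
    pos = 0;
    SortedKeyValueIterator<Key,Value> src = getSource();
    if (!src.hasTop())
      return;
    Key first = new Key(src.getTopKey());
    String f = field(first);
    List<Map.Entry<Key,Value>> group = new ArrayList<>();
    while (src.hasTop() && src.getTopKey().equals(first, PartialKey.ROW_COLFAM)
        && field(src.getTopKey()).equals(f)) {
      group.add(new AbstractMap.SimpleImmutableEntry<>(
          new Key(src.getTopKey()), new Value(src.getTopValue().get())));
      src.next();
    }
    group.sort((a, b) -> Long.compare(b.getKey().getTimestamp(), a.getKey().getTimestamp()));
    buffer.addAll(group.subList(0, Math.min(maxVersions, group.size())));
    buffer.sort((a, b) -> a.getKey().compareTo(b.getKey()));  // emit in key order
  }

  @Override public boolean hasTop() { return pos < buffer.size(); }
  @Override public Key getTopKey() { return buffer.get(pos).getKey(); }
  @Override public Value getTopValue() { return buffer.get(pos).getValue(); }

  @Override
  public void next() throws IOException {
    if (++pos >= buffer.size())
      fillBuffer();
  }

  @Override
  public void seek(Range range, Collection<ByteSequence> cfs, boolean inclusive)
      throws IOException {
    super.seek(range, cfs, inclusive);
    fillBuffer();
  }
}

Note the whole (row, field) group is buffered in memory, so fields with very
many values need extra care.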

Regards, Dylan

On Mon, Nov 30, 2015 at 10:58 AM, Christopher <ctubb...@apache.org> wrote:

> I can think of two options:
>
> 1. Instead of "field|value", use "field|value|version", where version
> behaves similarly to Accumulo's timestamp field, and add a custom iterator
> which achieves the same effect as the VersioningIterator using this part of
> the colq.
>
> 2. Instead of putting each "value" in its own field, you could combine
> them into an ordered set: field|{time1:value1,time2:value2,time3:value3}.
> For this to work well, you'd have to write a custom combining iterator that
> kept only the most recent 3 during scans and compactions, based on time (or
> whatever you use to denote version).
>
> Of the two, I think the second is simpler and fits best within the
> existing D4M schema. At the most, it just adds some structure to the value,
> which can be processed with an additional combining iterator, but doesn't
> fundamentally change the the table structure.
>
> On Sun, Nov 29, 2015 at 11:10 PM shweta.agrawal <shweta.agra...@orkash.com>
> wrote:
>
>> The example which I am working is:
>>
>> rowid    colf    colq            value
>> id               field|value1    1
>> id               field|value2    1
>> id               field|value3    1
>> id               field|value4    1
>> id               field|value5    1
>> id               field|value6    1
>>
>> This is my schema in D4M style. Here one field has multiple values. And
>> I want to keep latest 3 values and I want that automatically other
>> values to be deleted as in case of versioning iterator.
>>
>> So after versioning my table should look like this:
>>
>> rowidcolf  colq  value
>>idfield|value1  1
>>idfield|value2  1
>>idfield|value3  1
>>
>> Thanks
>> Shweta
>>
>> On Friday 27 November 2015 07:15 PM, Jeremy Kepner wrote:
>> > Can you provide a made up specific example?  I think that will
>> > make the discussion easier.
>> >
>> >
>> > On Fri, Nov 27, 2015 at 02:46:33PM +0530, shweta.agrawal wrote:
>> >> Thanks for the answer.
>> >> But I am asking about versioning in D4M style. How can I use the
>> >> versioning iterator in D4M style? In D4M style, the id is stored in the
>> >> rowid and field|value is stored in the ColumnQualifier. Since the value is
>> >> stored in the ColumnQualifier, I cannot maintain versions through the
>> >> versioning iterator. So I am asking how I will maintain versioning
>> >> in D4M style?
>> >>
>> >> Thanks
>> >> Shweta
>> >>
>> >> On Friday 27 November 2015 12:45 PM, Dylan Hutchison wrote:
>> >>> In order to store five versions of a key but return only one of
>> >>> them during a scan, set the minc and ma

Re: how to maintain versioning in D4M schema?

2015-11-26 Thread Dylan Hutchison
Hi Shweta,

You have lots of options.  You could append or prepend a timestamp to the
rowid or column qualifier.  When prepending to the rowid, you may want to
reverse the timestamp in order to better shard your data (that is, prevent
all updates at a particular time from going to a single tablet server), at
the expense of not being able to do range queries on time periods.  You
could also disable or relax the VersioningIterator.  It depends on what you
want to do.

On Thu, Nov 26, 2015 at 1:26 AM, shweta.agrawal 
wrote:

> Hi,
>
> I have my data stored in D4M style. I also want to maintain versions of
> different value on the basis of time.  As in D4M style  data is only in
> rowid and colQualifier only.
>
> Is there any way to achieve versioning in D4M schema?
>
> Thanks
> Shweta
>
>


Re: how to maintain versioning in D4M schema?

2015-11-26 Thread Dylan Hutchison
Suppose your rowid is of the form

node_timestamp

Doing a prefix scan with Range.prefix("node") will cover all the nodes for
any timestamp.  You could choose to put the timestamp somewhere else if you
need to.
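
A tiny fragment of that, assuming the rowid layout node_timestamp, a
zero-padded timestamp so it sorts lexically, and a Scanner named scanner
already created for the table:

long ts = System.currentTimeMillis();
String rowid = "node123_" + String.format("%019d", ts);  // or Long.MAX_VALUE - ts to sort newest first
// ... write rowid with a BatchWriter as usual ...

// later: every version of node123, whatever its timestamp
// (the trailing delimiter keeps "node1234..." out of the "node123" scan)
scanner.setRange(Range.prefix("node123_"));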


On Thu, Nov 26, 2015 at 3:41 AM, shweta.agrawal <shweta.agra...@orkash.com>
wrote:

> If I append the timestamp to the rowid, the id which acts as a node will act
> as a different entity.
> How will this be maintained?
>
> Thanks
> Shweta
>
>
> On Thursday 26 November 2015 04:50 PM, Dylan Hutchison wrote:
>
> Hi Shweta,
>
> You have lots of options.  You could append or prepend a timestamp to the
> rowid or column qualifier.  When prepending to the rowid, you may want to
> reverse the timestamp in order to better shard your data (that is, prevent
> all updates at a particular time from going to a single tablet server), at
> the expense of not being able to do range queries on time periods.  You
> could also disable or relax the VersioningIterator.  It depends on what you
> want to do.
>
> On Thu, Nov 26, 2015 at 1:26 AM, shweta.agrawal <shweta.agra...@orkash.com
> > wrote:
>
>> Hi,
>>
>> I have my data stored in D4M style. I also want to maintain versions of
>> different value on the basis of time.  As in D4M style  data is only in
>> rowid and colQualifier only.
>>
>> Is there any way to achieve versioning in D4M schema?
>>
>> Thanks
>> Shweta
>>
>>
>
>


Re: how to maintain versioning in D4M schema?

2015-11-26 Thread Dylan Hutchison
In order to store five versions of a key but return only one of them during
a scan, set the minc and majc VersioningIterator to 5 and set the scan
VersioningIterator to 1.  You can set scanning iterators on a per-scan
basis if this helps.
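
Adapting the shell commands from the user manual, that configuration looks
like:

user@myinstance mytable> config -t mytable -s table.iterator.minc.vers.opt.maxVersions=5
user@myinstance mytable> config -t mytable -s table.iterator.majc.vers.opt.maxVersions=5
user@myinstance mytable> config -t mytable -s table.iterator.scan.vers.opt.maxVersions=1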

It is not necessary to put the timestamp in the column family if you are
going with the VersioningIterator approach.

There are many ways to achieve versioning in Accumulo.  As the
designer/programmer, you must choose one that fits your application, of
which we do not know the full details.  It sounds like you have narrowed
your choice to (1) putting the timestamp in the column family, or (2) not
putting the timestamp anywhere else but instead changing the
VersioningIterator such that Accumulo stores more versions than the latest
version of a (row,colfam,colqual,colvis) key.



On Thu, Nov 26, 2015 at 8:45 PM, mohit.kaushik 
wrote:

> David,
>
> But this is the case when we store versions based on timestamp field. The
> point is, in D4M schema we can not achieve it by doing this. In this case
> we are considering CF to store timestamp in reverse order as described by
> Dylan. Then how can we configure Accumulo to return only latest version and
> store only 5 versions?
>
> Thanks
> Mohit Kaushik
>
> On 11/27/2015 09:54 AM, David Medinets wrote:
>
> From the user manual:
>
> user@myinstance mytable> config -t mytable -s table.iterator.scan.vers.opt.maxVersions=5
> user@myinstance mytable> config -t mytable -s table.iterator.minc.vers.opt.maxVersions=5
> user@myinstance mytable> config -t mytable -s table.iterator.majc.vers.opt.maxVersions=5
>
>
> On Thu, Nov 26, 2015 at 11:10 PM, shweta.agrawal <
> shweta.agra...@orkash.com> wrote:
>
>> I want to maintain 5 versions only and user can enter any number of
>> versions but I want to keep only 5 latest version.
>>
>>
>> On Friday 27 November 2015 09:38 AM, David Medinets wrote:
>>
>> Do you want five versions of every entry or will the number of versions
>> vary?
>>
>> On Thu, Nov 26, 2015 at 10:53 PM, shweta.agrawal <
>> shweta.agra...@orkash.com> wrote:
>>
>>> Thanks Dylan and David.
>>> I can store version information in column family. But my problem is when
>>> I have many versions of the same key how will I manage that. In Accumulo
>>> versioning I can specify that how many versions I want to manage.
>>>
>>> Suppose I have 10 versions and I only want 5 versions to store, how to
>>> manage this in a big table?
>>>
>>> Thanks
>>> Shweta
>>>
>>> On Thursday 26 November 2015 10:22 PM, David Medinets wrote:
>>>
>>> What are the query patterns? If you are versioning for auditing then
>>> changing the VersioningIterator seems the easiest approach. You could also
>>> store application-specific version information in the column family. One of
>>> the reasons that D4M does not use it is to allow application-specific uses.
>>> Using the CF means that any applications that understand D4M would not need
>>> to change their queries to adjust for the version information.
>>>
>>> On Thu, Nov 26, 2015 at 4:26 AM, shweta.agrawal <
>>> shweta.agra...@orkash.com> wrote:
>>>
 Hi,

 I have my data stored in D4M style. I also want to maintain versions of
 different value on the basis of time.  As in D4M style  data is only in
 rowid and colQualifier only.

 Is there any way to achieve versioning in D4M schema?

 Thanks
 Shweta


>>>
>>>
>>
>>
>
>
> --
>
> * Mohit Kaushik*
> Software Engineer
> A Square,Plot No. 278, Udyog Vihar, Phase 2, Gurgaon 122016, India
> *Tel:* +91 (124) 4969352 | *Fax:* +91 (124) 4033553
>
> interactive social intelligence at
> work...
>
>  ... ensuring Assurance in complexity and
> uncertainty
>
> *This message including the attachments, if any, is a confidential
> business communication. If you are not the intended recipient it may be
> unlawful for you to read, copy, distribute, disclose or otherwise use the
> information in this e-mail. If you have received it in error or are not the
> intended recipient, please destroy it and notify the sender immediately.
> Thank you *
>


Re: Is there a sensible way to do this? Sequential Batch Scanner

2015-10-27 Thread Dylan Hutchison
Hi Rob,

One solution is to use an Accumulo iterator.  Suppose you want to scan a
set of non-overlapping ranges R.  Use a (non-batch) Scanner, with range
spanning the least start key in R to the greatest end key in R, and a
server-side iterator that works as follows:

   - Pass R to the server-side iterator via iterator options.
   - On a call to seek(Range r, ..., ...) in the iterator: let the iterator
   seek its parent for the first range in R that intersects with r.
   - On a call to next(), if the current seek'ed range is finished, seek
   its parent to the next range in R that intersects with r, until no more
   ranges in R intersect with r.  At that point the scan is finished.

The result is that you can scan a number of disjoint ranges with "one
Scanner call" whose results come back in order.  This amounts to "moving seek
control" into the land of iterators.  One word of caution: if the number of
ranges is very large, you might run into ACCUMULO-3710 -- too many range
objects get materialized at the tablet server which results in an out of
memory error.

I have implemented something like this in the Graphulo project under
SeekFilterIterator

and its related classes.  Take a look at that if you want to try this idea,
and feel free to follow up with questions.

Cheers, Dylan




On Tue, Oct 27, 2015 at 3:21 PM, Rob Povey  wrote:

> What I want is something that behaves like a BatchScanner (I.e. Takes a
> collection of Ranges in a single RPC), but preserves the scan ordering.
> I understand this would greatly impact performance, but in my case I can
> manually partition my request on the client, and send one request per
> tablet.
> I can’t use scanners, because in some cases I have 10’s of thousands of
> non-consecutive ranges.
> If I use a single threaded BatchScanner, and only request data from a
> single Tablet, am I guaranteed ordering?
> This appears to work correctly in my small tests (albeit slower than a
> single 1 thread Batch scanner call), but I don’t really want to have to
> rely on it if the semantic isn’t guaranteed.
> If not Is there another “efficient” way to do this.
>
> Thanks
>
> Rob Povey
>
>


Re: Anybody ever used the HDFS NFS Gateway?

2015-10-06 Thread Dylan Hutchison
Hi Russ,
  I'm curious what you have in mind.  Are you looking for a solution more
efficient than running clients that read the CSV files and open
BatchWriters?

Regards, Dylan

On Tue, Oct 6, 2015 at 4:56 PM, Christopher  wrote:

> I haven't tried it, but it sounds like a cool use case. Might be a good
> alternative to distcp, more interoperable with tools which don't speak
> hadoop.
>
> On Tue, Oct 6, 2015, 18:41 Russ Weeks  wrote:
>
>> I hope this isn't too off-topic. Any opinions re. its
>> completeness/quality/reliability?
>>
>> (The use case is, CSV files -> NFS -> HDFS -> Spark -> RFiles ->
>> Accumulo. Relevance established!)
>>
>> Thanks,
>> -Russ
>>
>


Re: Accumulo @ IEEE HPEC this week

2015-09-14 Thread Dylan Hutchison
Sure, how's this?

@IEEE_HPEC Accumulo events! Accumulo BoF user meeting, Graphulo server-side
sparse matrix multiply, Lustre+Hadoop+Accumulo hybrid HPC stack


On Mon, Sep 14, 2015 at 11:33 PM, Josh Elser <josh.el...@gmail.com> wrote:

> Awesome stuff!
>
> If you have a condensed 140-char version, I'd be happy to post this on the
> @ApacheAccumulo Twitter account.
>
> Dylan Hutchison wrote:
>
>> Hi folks,
>>
>> If you're heading to the IEEE HPEC <http://www.ieee-hpec.org/>
>> conference this week, do check out the following events that have
>> Accumulo in their title. Full agenda here
>> <http://abstractbook.ieee-hpec.org/index.htm>.  Disclaimer: one of these
>> is my own =)
>>
>> 1:00-2:40 session, Wednesday Sept 16
>> /Graphulo Implementation of Server-Side Sparse Matrix Multiply in the
>> Accumulo Database /
>> Dylan Hutchison, University of Washington, Jeremy Kepner, Vijay
>> Gadepally, MIT Lincoln Laboratory, Adam Fuchs, Sqrrl
>> [Best Student Paper Finalist]
>>
>> 6:00-7:00, Wednesday Sept 16
>> /Accumulo BoF (tentative) /
>> Chair: Adam Fuchs / Sqrrl
>>
>> 1:00-2:40 session, Thursday Sept 17
>> /Lustre, Hadoop, Accumulo /
>> Jeremy Kepner, William Arcand, David Bestor, Bill Bergeron, Chansup
>> Byun, Lauren Edwards, Vijay Gadepally, Matthew Hubbell, Peter
>>   Michaleas, Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee,
>> Albert Reuther, MIT Lincoln Laboratory
>>
>>
>> Cheers, Dylan
>>
>


Accumulo @ IEEE HPEC this week

2015-09-14 Thread Dylan Hutchison
Hi folks,

If you're heading to the IEEE HPEC <http://www.ieee-hpec.org/> conference
this week, do check out the following events that have Accumulo in their
title.  Full agenda here <http://abstractbook.ieee-hpec.org/index.htm>.
Disclaimer: one of these is my own =)

1:00-2:40 session, Wednesday Sept 16
*Graphulo Implementation of Server-Side Sparse Matrix Multiply in the
Accumulo Database  *
Dylan Hutchison, University of Washington, Jeremy Kepner, Vijay Gadepally,
MIT Lincoln Laboratory, Adam Fuchs, Sqrrl
[Best Student Paper Finalist]

6:00-7:00, Wednesday Sept 16
*Accumulo BoF (tentative)  *
Chair: Adam Fuchs / Sqrrl

1:00-2:40 session, Thursday Sept 17
*Lustre, Hadoop, Accumulo  *
Jeremy Kepner, William Arcand, David Bestor, Bill Bergeron, Chansup Byun,
Lauren Edwards, Vijay Gadepally, Matthew Hubbell, Peter  Michaleas, Julie
Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Albert Reuther, MIT
Lincoln Laboratory


Cheers, Dylan


[Accumulo Contrib Proposal] Graphulo: Server-side Matrix Math library

2015-08-28 Thread Dylan Hutchison
 thought and discussion before
   implementing, since this instrumentation will go everywhere.  It would be
   nice if Graphulo and Accumulo mirror instrumentation strategies, so it
   would be good to have that discussion in the same venue.

   - Rigorous *scale testing*.  Good instrumentation is key.  With
   successful scale testing, we paint a clear picture for which operations
   Graphulo excels to potential adopters, ultimately plotting where Graphulo
   stands in the world of big data software.

   - Explicitly supporting the GraphBLAS http://graphblas.org/ spec, once
   it is agreed upon.  Graphulo was designed from the ground up with the
   GraphBLAS in mind, so this should be an easy task.  Aligning with this
   upcoming industry standard bodes well for ease of developing Graphulo
   algorithms.

Developing more algorithms and applications will follow too, and I imagine
this as an excellent place where newcomer users can get involved.

Some other places Graphulo needs work worth mentioning are creating a
proper release framework (the release here
https://github.com/Accla/graphulo/releases could use improvement,
starting with signed artifacts) and reviewing the way Graphulo runs tests
(currently centered around a critical file called TEST_CONFIG.java which is
great for one developer, whereas a config file probably works better).
Both of these are places more experienced developers could help.  I should
also mention that Graphulo has groundwork in place for operations between
Accumulo instances, but I doubt many users would need that level of control.

Regarding IP, I'm happy to donate my commits to the ASF, which covers 99%
of the Graphulo code base.  I'm sure other issues will arise and we can
sort them out.  Sean Busbey, perhaps I could ask your assistance as someone
more knowledgeable in this area.  Regarding dependencies, effectively every
direct dependency is org.apache, so nothing to worry about here.

I acknowledge that I will lose dictatorial power and gain some bureaucratic
/ discussion overhead by moving from sole developer to an Apache model.
The benefits of a community are well worth it.

If we as a community decide that contrib is the right place for Graphulo,
then there are lots of logistical questions to decide like where the code
will live, where JIRA will live, what mailing lists to use, what URL to
give Graphulo within apache.org, etc.  We can tackle these at our leisure.
Let's discuss Graphulo and Accumulo here first.

Warmly,
Dylan Hutchison


Re: [Accumulo Contrib Proposal] Graphulo: Server-side Matrix Math library

2015-08-28 Thread Dylan Hutchison

 place this in the contrib area or create a sub-project?


Ah ha, I indeed had the two avenues mistakenly equated in my head since
both involve Incubator approval and the same proposal and IP template.

I intend Graphulo as a sub-project of Accumulo.  There are enough use cases
unrelated to Accumulo's core development (algorithms, Graphulo client,
Graphulo-specific iterators) that it makes sense to form a dedicated
project for Graphulo.  That said, Graphulo is coupled to Accumulo by design
and purpose, and there is large opportunity for synergy in that Graphulo
development may help Accumulo development and vice versa.  We're in that
happy middle spot where a sub-project makes sense.  That said, this is a
community decision, and so I'm open to other opinions.

Regards, Dylan

On Fri, Aug 28, 2015 at 8:08 AM, dlmarion dlmar...@comcast.net wrote:

 Dylan,

   I am a little confused about whether you want to place this in the
 contrib area or whether you want to create a sub-project as both are
 mentioned in your proposal. Also, if you intend for this to be a
 sub-project, have you looked at the incubator process? From what I
 understand given that this is a code contribution,it will have to go
 through that process.



  Original message 
 From: Dylan Hutchison dhutc...@uw.edu
 Date: 08/28/2015 2:43 AM (GMT-05:00)
 To: Accumulo Dev List d...@accumulo.apache.org
 Cc: Accumulo User List user@accumulo.apache.org
 Subject: [Accumulo Contrib Proposal] Graphulo: Server-side Matrix Math
 library

 Dear Accumulo community,

 I humbly ask your consideration of Graphulo
 https://github.com/Accla/graphulo as a new contrib project to
 Accumulo.  Let's use this thread to discuss what Graphulo is, how it fits
 into the Accumulo community, where we can take it together as a new
 community, and how you can use it right now.  Please see the README at
 Graphulo's Github, and for a more in-depth look see the docs/ folder or
 the examples.

 https://github.com/Accla/graphulo

 Graphulo is a Java library for the Apache Accumulo database delivering
 server-side sparse matrix math primitives that enable higher-level graph
 algorithms and analytics.

 Pitch: Organizations use Accumulo for high performance indexed and
 distributed data storage.  What do they do after their data is stored?
 Many use cases perform analytics and algorithms on data in Accumulo, which
 aside from simple iterators uses, require scanning data out from Accumulo
 to a computation engine, only to write computation results back to
 Accumulo.  Graphulo enables a class of algorithms to run inside the
 Accumulo server like a stored procedure, especially (but not restricted to)
 those written in the language of graphs and linear algebra.  Take breadth
 first search as a simple use case and PageRank as one more complex.  As a
 stretch goal, imagine analysts and mathematicians executing PageRank and
 other high level algorithms on top of the Graphulo library on top of
 Accumulo at high performance.

 I have developed Graphulo at the MIT Lincoln Laboratory with support from
 the NSF since last March.  I owe thanks to Jeremy Kepner, Vijay Gadepally,
 and Adam Fuchs for high level comments during design and performance
 testing phases.  I proposed a now-obsolete design document last Spring to
 the Accumulo community too which received good feedback.

 The time is ripe for Graphulo to graduate my personal development into
 larger participation.  Beyond myself and beyond the Lincoln Laboratory,
 Graphulo is for the Accumulo community.  Users need a place where they can
 interact, developers need a place where they can look, comment, and debate
 designs and diffs, and both users and developers need a place where they
 can interact and see Graphulo alongside its Accumulo base.

 The following outlines a few reasons why I see contrib to Accumulo as
 Graphulo's proper place:

1. Establishing Graphulo as an Apache (sub-)project is a first step
toward building a community.  The spirit of Apache--its mailing list
discussions, low barrier to interactions between users and developers new
and old, open meritocracy and more--is a surefire way to bring Graphulo to
the people it will help and the people who want to help it in turn.

2. Committing to core Accumulo doesn't seem appropriate for all of
Graphulo, because Graphulo uses Accumulo in a specific way (server-side
computation) in support of algorithms and applications.  Parts of Graphulo
that are useful for all Accumulo users (not just matrix math for
algorithms) could be transferred from Graphulo to Accumulo, such as
ApplyIterator or SmallLargeRowFilter or DynamicIterator.

3. Leaving Graphulo as an external project leaves Graphulo too
decoupled from Accumulo.  Graphulo has potential to drive features in core
Accumulo such as ACCUMULO-3978, ACCUMULO-3710
https://issues.apache.org/jira/browse/ACCUMULO-3710, and
ACCUMULO-3751 https

Re: [Accumulo Contrib Proposal] Graphulo: Server-side Matrix Math library

2015-08-28 Thread Dylan Hutchison
Hi Dave,
  Thank you for directing my attention to comparing contrib and Apache
sub-project status.  They are different paths indeed.

Contrib is a better route for Graphulo as a low-entry-barrier way to
promote a relationship between Accumulo and Graphulo development.  As
outlined in the bullets of the first email, there are a finite number of
tasks remaining before Graphulo may be considered complete software.
I've realized that Graphulo has a niche enough user and developer base that
attempting expansion to as many users and developers as an Apache project
entails is expecting too much.  We can always investigate the option of
becoming a sub-project later should the community support arise.

Regards, Dylan Hutchison


On Fri, Aug 28, 2015 at 12:48 PM, dlmar...@comcast.net wrote:

 [adding dev@ back in, I forgot to reply-all last time]

  I think the process and exit criteria would be different for a code
 contrbution vs a sub-project. [1] talks about projects and sub-projects,
 [2] talks about contributions. I don't know if the exit criteria for a
 sub-project is the same as a top level project; will you be required to
 show community growth, understanding the process, setting up the
 infrastructure, etc. If so, who is going to shepherd Graphulo through this
 process? I'm not an expert in this area. I just wanted to point out that
 they are likely hurdles of different height.

 [1] incubator.apache.org/incubation/Incubation_Policy.html
 [2] http://incubator.apache.org/ip-clearance/

 --
 *From: *Dylan Hutchison dhutc...@uw.edu
 *To: *Accumulo User List user@accumulo.apache.org, Accumulo Dev
 List d...@accumulo.apache.org
 *Sent: *Friday, August 28, 2015 11:25:11 AM
 *Subject: *Re: [Accumulo Contrib Proposal] Graphulo: Server-side Matrix
 Math library


 place this in the contrib area or create a sub-project?


 Ah ha, I indeed had the two avenues mistakenly equated in my head since
 both involve Incubator approval and the same proposal and IP template.

 I intend Graphulo as a sub-project of Accumulo.  There are enough use
 cases unrelated to Accumulo's core development (algorithms, Graphulo
 client, Graphulo-specific iterators) that it makes sense to form a
 dedicated project for Graphulo.  That said, Graphulo is coupled to Accumulo
 by design and purpose, and there is large opportunity for synergy in that
 Graphulo development may help Accumulo development and vice versa.  We're
 in that happy middle spot where a sub-project makes sense.  That said, this
 is a community decision, and so I'm open to other opinions.

 Regards, Dylan

 On Fri, Aug 28, 2015 at 8:08 AM, dlmarion dlmar...@comcast.net wrote:

 Dylan,

   I am a little confused about whether you want to place this in the
 contrib area or whether you want to create a sub-project as both are
 mentioned in your proposal. Also, if you intend for this to be a
 sub-project, have you looked at the incubator process? From what I
 understand given that this is a code contribution,it will have to go
 through that process.




  Original message 
 From: Dylan Hutchison dhutc...@uw.edu
 Date: 08/28/2015 2:43 AM (GMT-05:00)
 To: Accumulo Dev List d...@accumulo.apache.org
 Cc: Accumulo User List user@accumulo.apache.org
 Subject: [Accumulo Contrib Proposal] Graphulo: Server-side Matrix Math
 library

 Dear Accumulo community,

 I humbly ask your consideration of Graphulo
 https://github.com/Accla/graphulo as a new contrib project to
 Accumulo.  Let's use this thread to discuss what Graphulo is, how it fits
 into the Accumulo community, where we can take it together as a new
 community, and how you can use it right now.  Please see the README at
 Graphulo's Github, and for a more in-depth look see the docs/ folder or
 the examples.

 https://github.com/Accla/graphulo

 Graphulo is a Java library for the Apache Accumulo database delivering
 server-side sparse matrix math primitives that enable higher-level graph
 algorithms and analytics.

 Pitch: Organizations use Accumulo for high performance indexed and
 distributed data storage.  What do they do after their data is stored?
 Many use cases perform analytics and algorithms on data in Accumulo, which
 aside from simple iterators uses, require scanning data out from Accumulo
 to a computation engine, only to write computation results back to
 Accumulo.  Graphulo enables a class of algorithms to run inside the
 Accumulo server like a stored procedure, especially (but not restricted to)
 those written in the language of graphs and linear algebra.  Take breadth
 first search as a simple use case and PageRank as one more complex.  As a
 stretch goal, imagine analysts and mathematicians executing PageRank and
 other high level algorithms on top of the Graphulo library on top of
 Accumulo at high performance.

 I have developed Graphulo at the MIT Lincoln Laboratory with support from
 the NSF since last March.  I owe thanks to Jeremy Kepner, Vijay Gadepally,
 and Adam

Re: Abnormal behaviour of custom iterator in getting entries

2015-06-24 Thread Dylan Hutchison
Chiming in on one of Josh's comments

Since you're passing in what are likely multiple, disjoint ranges, I'm not
 sure you're going to get much of a performance optimization out of a custom
 iterator in this case. After each seek, your iterator would need to return
 the entries that it summed in the provided Range (the Iterator framework
 isn't designed to know the overall state of the scan -- you might have more
 data to read or you might be done. You must return the data when the data
 you're reading moves outside of the current range).

 The way that you'd see the real optimization an Iterator provides is if
 you are scanning over a large, contiguous set of rows specified by a single
 Range (you can get the reduction of reading many key/values into a single
 pair returned).


FYI, it is possible to obtain better custom iterator performance in the
case of scanning with multiple, disjoint ranges.  The trick is to call
BatchScanner's setRanges() with an infinite range, causing Accumulo to run
your iterator on every tablet.  Then, pass your desired ranges to the
iterator directly via iterator options, and let the iterator control
seeking itself.  This is kind of advanced and needs more detailed study,
but you can see a prototype of how I do it in the Graphulo
https://github.com/Accla/d4m_api_java library:

https://github.com/Accla/d4m_api_java/blob/master/src/main/java/edu/mit/ll/graphulo/skvi/RemoteSourceIterator.java#L264

or

https://github.com/Accla/d4m_api_java/blob/master/src/main/java/edu/mit/ll/graphulo/skvi/RemoteWriteIterator.java#L360
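
For reference, the client side of this trick looks roughly like the following
(a minimal sketch: connector is assumed to be in scope,
MyRangeControlledIterator stands in for your custom iterator, and the "ranges"
option name and the encodedRanges string are made-up placeholders for whatever
encoding your iterator expects):

BatchScanner bs = connector.createBatchScanner("mytable", Authorizations.EMPTY, 10);
bs.setRanges(Collections.singleton(new Range()));   // one infinite range: run on every tablet

IteratorSetting is = new IteratorSetting(25, "rangeCtl", MyRangeControlledIterator.class);
is.addOption("ranges", encodedRanges);               // the real ranges travel as an iterator option
bs.addScanIterator(is);

for (Map.Entry<Key,Value> entry : bs) {
  // results stream back from each tablet's copy of the iterator
}
bs.close();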


Cheers, Dylan

On Tue, Jun 23, 2015 at 6:53 AM, madhvi madhvi.gu...@orkash.com wrote:

 Thanks Josh. It really worked for me.


 On Wednesday 17 June 2015 08:43 PM, Josh Elser wrote:

 Madhvi,

 Understood. A few more questions..

 How are you passing these IDs to the batch scanner? Are you providing
 individual Ranges for each ID (e.g. `new Range(new Key(row1, "", id1),
 true, new Key(row1, "", id1\x00), false))`)? Or are you providing an
 entire row (or set of rows) and using the fetchColumns(Text,Text) method
 (or similar) on the BatchScanner?

 Are you trying to sum across all rows that you queried? Or is your sum
 per-row? If the former, that is going to cause you problems. The quick
 explanation is that you can't reliably know the tablet boundaries so you
 should try to perform an initial sum, per row. If you want, you can put a
 second iterator above the first and do a summation across all rows to
 reduce the amount of data sent to a client. However, if you use a
 BatchScanner, you will still have to perform a final summation at the
 client.

 Check out
 https://blogs.apache.org/accumulo/entry/thinking_about_reads_over_accumulo
 for more details on that..

 madhvi wrote:

 Hi Josh,

 Sorry, my company policy doesn't allow me to share the full source. What we
 are trying to do is summing over a unique field stored in the column
 qualifier for IDs passed to the batch scanner. Can you suggest how it can be
 done in Accumulo.

 Thanks
 Madhvi
 On Wednesday 17 June 2015 10:32 AM, Josh Elser wrote:

 You put random values in the family and qualifier? Do I misunderstand
 you?

 Also, if you can put up the full source for the iterator, that will be
 much easier if you need help debugging it. It's hard for us to guess
 at why your code might not be working as you expect.

 madhvi wrote:

 Hi Josh,

 I have changed HashMap to TreeMap, which sorts lexicographically, and I
 have inserted random values in the column family and qualifier. The value of
 the TreeMap goes into the Accumulo Value.
 Used scanner and batch scanner but getting results only with scanner.

 Thanks
 Madhvi

 On Tuesday 16 June 2015 08:42 PM, Josh Elser wrote:

 Additionally, you're placing the Value into the ColumnQualifier and
 dropping the ColumnFamily completely. Granted, that may not be a
 problem for the specific data in your table, but it's not going to
 work for any data.

 Christopher wrote:

 You're iterating over a HashMap. That's not sorted.

 --
 Christopher L Tubbs II
 http://gravatar.com/ctubbsii


 On Tue, Jun 16, 2015 at 1:58 AM, madhvimadhvi.gu...@orkash.com
 wrote:

 Hi Josh,
 Thanks for replying. I will enable remote debugger on my Accumulo
 server.

 However I am slightly confused with your statement you are not
 returning
 your data in sorted order. Can you point the part in my iterator
 code which
 seems innapropriate and any possible solution for that?

 Thanks
 Madhvi


 On Tuesday 16 June 2015 11:07 AM, Josh Elser wrote:

 //matched the condition and put values to holder map.








Re: Connection pooling in accumulo

2015-06-24 Thread Dylan Hutchison
Don't forget the MultiTableBatchWriter
http://accumulo.apache.org/1.7/apidocs/org/apache/accumulo/core/client/MultiTableBatchWriter.html
=)
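
For reference, a minimal sketch of how it is used (the table names are
examples, and a connector is assumed to be in hand):

MultiTableBatchWriter mtbw = connector.createMultiTableBatchWriter(new BatchWriterConfig());
try {
  BatchWriter bw1 = mtbw.getBatchWriter("tableA");
  BatchWriter bw2 = mtbw.getBatchWriter("tableB");
  Mutation m = new Mutation(new Text("row1"));
  m.put(new Text("cf"), new Text("cq"), new Value("v".getBytes()));
  bw1.addMutation(m);
  bw2.addMutation(m);
} finally {
  mtbw.close();   // one close() flushes and closes the writers for every table
}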

On Wed, Jun 24, 2015 at 2:58 PM, Josh Elser josh.el...@gmail.com wrote:

 Yep, connections to TabletServers and the Master are automatically
 pooled. There shouldn't be anything you have to do yourself -- it
 should just work out of the box.

 On Wed, Jun 24, 2015 at 2:54 PM, vaibhav thapliyal
 vaibhav.thapliyal...@gmail.com wrote:
  Hi everyone,
 
  I wanted to ask if Accumulo supports connection pooling?
 
  If yes is there something in the JAVA api that can be used to make use of
  it?
 
  Thanks
  Vaibhav



Re: BatchScanner taking too much time to scan rows

2015-05-14 Thread Dylan Hutchison
Sorry, just remembered that my setup was to scan an index table and gather
rowIDs, then scan a main data table using the rowIDs as the BatchScan
ranges.  Effectively it is a join of part of the index table to a main data
table.

The scan rate I achieved is therefore double the value I cited previously:
I showed about 76k entries/second.  Still not the best but it is more
within Accumulo standards.
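
Roughly, the two-step lookup looks like this (a minimal sketch; the table
names, the index layout with rowIDs stored in the column qualifier, and the
in-scope connector are all assumptions):

Scanner index = connector.createScanner("index_table", Authorizations.EMPTY);
index.setRange(new Range(new Text("some_term")));        // index lookup for one term
List<Range> rowRanges = new ArrayList<>();
for (Map.Entry<Key,Value> e : index) {
  // assume the data-table rowID is stored in the index entry's column qualifier
  rowRanges.add(new Range(e.getKey().getColumnQualifier()));
}

BatchScanner data = connector.createBatchScanner("data_table", Authorizations.EMPTY, 10);
data.setRanges(rowRanges);
for (Map.Entry<Key,Value> e : data) {
  // consume the joined entries
}
data.close();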


On Thu, May 14, 2015 at 2:15 PM, Dylan Hutchison dhutc...@mit.edu wrote:

 I didn't have an average query time-- the tablet server crashed.  A quick
 solution is to batch the ranges into groups of 50k (or 500k, I forgot which
 one) and do many BatchScans-- not ideal.  I think I achieved 33k
 entries/second retrieval on a single-node Accumulo.  Accumulo is better for
 sequential lookup than random.

 On Thu, May 14, 2015 at 1:57 PM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 Dylan could you elaborate on the average query time you had?
 Thanks
 Vaibhav
 On 14-May-2015 11:03 pm, Dylan Hutchison dhutc...@mit.edu wrote:

 I think this is the same issue I found for ACCUMULO-3710
 https://issues.apache.org/jira/browse/ACCUMULO-3710, only in my case
 the tserver ran out of memory.  Accumulo doesn't handle large numbers of
 small, disjoint ranges well.  I bet there's room for improvement on both
 the client and tablet server.
 ~Dylan

 On Wed, May 13, 2015 at 3:13 PM, Eric Newton eric.new...@gmail.com
 wrote:

 Yes, hot-spotting does affect accumulo because you have fewer servers
 and caches handling your request.

 Let's say your data is spread out, in a normal distribution from
 0..9.

 What if you have only 1 split?  You would want it at 5, to divide the
 data in half, and you could host the halves on different servers.  But if
 you split at 1, now 10% of your queries go to one tablet, and 90% go to the
 other.

 -Eric


 On Wed, May 13, 2015 at 1:56 PM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 Thank you Eric. I will surely do the same. Should uneven distribution
 across the tablets affect querying in accumulo?  If this case, it is. Is
 this behaviour normal?
 On 13-May-2015 10:58 pm, Eric Newton eric.new...@gmail.com wrote:

 Yes, that's a great way to split the data evenly.

 Also, since the data set is so small, turn on data caching for your
 table:

 shell config -t mytable -s table.cache.block.enable=true

 You may want to increase the size of your tserver JVM, and increase
 the size of the cache:

 shell config -s tserver.cache.data.size=1G

 This will help with repeated random look-ups.

 -Eric

 On Wed, May 13, 2015 at 11:31 AM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 Thank you Eric.

 One thing I would like to know. Does pre-splitting the data play a
 part in querying accumulo?

 Because I managed to somewhat decrease the querying time.
 I did the following steps:
 My table was around 1.47gb so I explicitly set the split parameter to
 256mb instead of the default 1gb.

 So I had just 8 tablets. Now when I carried out the same query, it
 finished in 15s.

 Is it because of the split points are more evenly distributed?

 The previous table on which the query took 50s had entries unevenly
 distributed across the tablets.
 Thanks
 Vaibhav
 On 13-May-2015 7:43 pm, Eric Newton eric.new...@gmail.com wrote:

 This use case is one of the things Accumulo was designed to handle
 well. It's the reason there is a BatchScanner.

 I've created:

 https://issues.apache.org/jira/browse/ACCUMULO-3813

 so we can investigate and track down any problems or improvements.

 Feel free to add any other details to the JIRA ticket.

 -Eric


 On Wed, May 13, 2015 at 10:03 AM, Emilio Lahr-Vivaz 
 elahrvi...@ccri.com wrote:

  It sounds like each of your ranges is an ID, e.g. a single row.
 I've found that scanning lots of non-sequential single-row ranges is 
 pretty
 slow in accumulo. Your best approach is probably to create an index 
 table
 on whatever you are originally trying to query (assuming those 1 
 ids
 came from some other query).

 Thanks,

 Emilio


 On 05/13/2015 09:14 AM, vaibhav thapliyal wrote:

  The rf files per tablet vary between 2 to 5 per tablet. The
 entries returned to me by the batchScanner is 46. The approx. 
 average
 data rate is 0.5 MB/s as seen on the accumulo monitor page.

  A simple scan on the table has an average data rate of about 7-8
 MB/s.

  All the ids exist in the accumulo table.

 On 12 May 2015 at 23:39, Keith Turner ke...@deenlo.com wrote:

 Do you know how much data is being brought back (i.e. 100
 megabytes)? I am wondering what the data rate is in MB/s.  Do you 
 know how
 many files per tablet you have?  Do most of the 10,000 ids you are 
 querying
 for exist?

 On Tue, May 12, 2015 at 1:58 PM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 I have 194 tablets. Currently I am using 20 threads to create
 the batchscanner inside the createBatchScanner method.
  On 12-May-2015 11:19 pm, Keith Turner ke

Re: BatchScanner taking too much time to scan rows

2015-05-14 Thread Dylan Hutchison
I didn't have an average query time-- the tablet server crashed.  A quick
solution is to batch the ranges into groups of 50k (or 500k, I forgot which
one) and do many BatchScans-- not ideal.  I think I achieved 33k
entries/second retrieval on a single-node Accumulo.  Accumulo is better for
sequential lookup than random.

On Thu, May 14, 2015 at 1:57 PM, vaibhav thapliyal 
vaibhav.thapliyal...@gmail.com wrote:

 Dylan could you elaborate on the average query time you had?
 Thanks
 Vaibhav
 On 14-May-2015 11:03 pm, Dylan Hutchison dhutc...@mit.edu wrote:

 I think this is the same issue I found for ACCUMULO-3710
 https://issues.apache.org/jira/browse/ACCUMULO-3710, only in my case
 the tserver ran out of memory.  Accumulo doesn't handle large numbers of
 small, disjoint ranges well.  I bet there's room for improvement on both
 the client and tablet server.
 ~Dylan

 On Wed, May 13, 2015 at 3:13 PM, Eric Newton eric.new...@gmail.com
 wrote:

 Yes, hot-spotting does affect accumulo because you have fewer servers
 and caches handling your request.

 Let's say your data is spread out, in a normal distribution from
 0..9.

 What if you have only 1 split?  You would want it at 5, to divide the
 data in half, and you could host the halves on different servers.  But if
 you split at 1, now 10% of your queries go to one tablet, and 90% go to the
 other.

 -Eric


 On Wed, May 13, 2015 at 1:56 PM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 Thank you Eric. I will surely do the same. Should uneven distribution
 across the tablets affect querying in accumulo?  If this case, it is. Is
 this behaviour normal?
 On 13-May-2015 10:58 pm, Eric Newton eric.new...@gmail.com wrote:

 Yes, that's a great way to split the data evenly.

 Also, since the data set is so small, turn on data caching for your
 table:

 shell config -t mytable -s table.cache.block.enable=true

 You may want to increase the size of your tserver JVM, and increase
 the size of the cache:

 shell config -s tserver.cache.data.size=1G

 This will help with repeated random look-ups.

 -Eric

 On Wed, May 13, 2015 at 11:31 AM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 Thank you Eric.

 One thing I would like to know. Does pre-splitting the data play a
 part in querying accumulo?

 Because I managed to somewhat decrease the querying time.
 I did the following steps:
 My table was around 1.47gb so I explicitly set the split parameter to
 256mb instead of the default 1gb.

 So I had just 8 tablets. Now when I carried out the same query, it
 finished in 15s.

 Is it because of the split points are more evenly distributed?

 The previous table on which the query took 50s had entries unevenly
 distributed across the tablets.
 Thanks
 Vaibhav
 On 13-May-2015 7:43 pm, Eric Newton eric.new...@gmail.com wrote:

 This use case is one of the things Accumulo was designed to handle
 well. It's the reason there is a BatchScanner.

 I've created:

 https://issues.apache.org/jira/browse/ACCUMULO-3813

 so we can investigate and track down any problems or improvements.

 Feel free to add any other details to the JIRA ticket.

 -Eric


 On Wed, May 13, 2015 at 10:03 AM, Emilio Lahr-Vivaz 
 elahrvi...@ccri.com wrote:

  It sounds like each of your ranges is an ID, e.g. a single row.
 I've found that scanning lots of non-sequential single-row ranges is 
 pretty
 slow in accumulo. Your best approach is probably to create an index 
 table
 on whatever you are originally trying to query (assuming those 1 
 ids
 came from some other query).

 Thanks,

 Emilio


 On 05/13/2015 09:14 AM, vaibhav thapliyal wrote:

  The rf files per tablet vary between 2 to 5 per tablet. The
 entries returned to me by the batchScanner is 46. The approx. 
 average
 data rate is 0.5 MB/s as seen on the accumulo monitor page.

  A simple scan on the table has an average data rate of about 7-8
 MB/s.

  All the ids exist in the accumulo table.

 On 12 May 2015 at 23:39, Keith Turner ke...@deenlo.com wrote:

 Do you know how much data is being brought back (i.e. 100
 megabytes)? I am wondering what the data rate is in MB/s.  Do you 
 know how
 many files per tablet you have?  Do most of the 10,000 ids you are 
 querying
 for exist?

 On Tue, May 12, 2015 at 1:58 PM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 I have 194 tablets. Currently I am using 20 threads to create the
 batchscanner inside the createBatchScanner method.
  On 12-May-2015 11:19 pm, Keith Turner ke...@deenlo.com
 wrote:

   How many tablets do you have?  The batch scanner does not
 parallelize operations within a tablet.

  If you give the batch scanner more threads than there are
 tservers, it will make multilple parallel rpc calls to each tserver 
 if the
 tserver has multiple tablets.  Each rpc may include multiple 
 tablets and
 ranges for each tablet.

  If the batch scanner has less threads than tservers, it will
 make one rpc per tserver per thread.  Each rpc call will include

Re: BatchScanner taking too much time to scan rows

2015-05-14 Thread Dylan Hutchison
I think this is the same issue I found for ACCUMULO-3710
https://issues.apache.org/jira/browse/ACCUMULO-3710, only in my case the
tserver ran out of memory.  Accumulo doesn't handle large numbers of small,
disjoint ranges well.  I bet there's room for improvement on both the
client and tablet server.
~Dylan

On Wed, May 13, 2015 at 3:13 PM, Eric Newton eric.new...@gmail.com wrote:

 Yes, hot-spotting does affect accumulo because you have fewer servers and
 caches handling your request.

 Let's say your data is spread out, in a normal distribution from 0..9.

 What if you have only 1 split?  You would want it at 5, to divide the
 data in half, and you could host the halves on different servers.  But if
 you split at 1, now 10% of your queries go to one tablet, and 90% go to the
 other.

 -Eric


 On Wed, May 13, 2015 at 1:56 PM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 Thank you Eric. I will surely do the same. Should uneven distribution
 across the tablets affect querying in accumulo?  If this case, it is. Is
 this behaviour normal?
 On 13-May-2015 10:58 pm, Eric Newton eric.new...@gmail.com wrote:

 Yes, that's a great way to split the data evenly.

 Also, since the data set is so small, turn on data caching for your
 table:

 shell config -t mytable -s table.cache.block.enable=true

 You may want to increase the size of your tserver JVM, and increase the
 size of the cache:

 shell config -s tserver.cache.data.size=1G

 This will help with repeated random look-ups.

 -Eric

 On Wed, May 13, 2015 at 11:31 AM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 Thank you Eric.

 One thing I would like to know. Does pre-splitting the data play a part
 in querying accumulo?

 Because I managed to somewhat decrease the querying time.
 I did the following steps:
 My table was around 1.47gb so I explicitly set the split parameter to
 256mb instead of the default 1gb.

 So I had just 8 tablets. Now when I carried out the same query, it
 finished in 15s.

 Is it because of the split points are more evenly distributed?

 The previous table on which the query took 50s had entries unevenly
 distributed across the tablets.
 Thanks
 Vaibhav
 On 13-May-2015 7:43 pm, Eric Newton eric.new...@gmail.com wrote:

 This use case is one of the things Accumulo was designed to handle
 well. It's the reason there is a BatchScanner.

 I've created:

 https://issues.apache.org/jira/browse/ACCUMULO-3813

 so we can investigate and track down any problems or improvements.

 Feel free to add any other details to the JIRA ticket.

 -Eric


 On Wed, May 13, 2015 at 10:03 AM, Emilio Lahr-Vivaz 
 elahrvi...@ccri.com wrote:

  It sounds like each of your ranges is an ID, e.g. a single row. I've
 found that scanning lots of non-sequential single-row ranges is pretty 
 slow
 in accumulo. Your best approach is probably to create an index table on
 whatever you are originally trying to query (assuming those 1 ids 
 came
 from some other query).

 Thanks,

 Emilio


 On 05/13/2015 09:14 AM, vaibhav thapliyal wrote:

  The rf files per tablet vary between 2 to 5 per tablet. The entries
 returned to me by the batchScanner is 46. The approx. average data 
 rate
 is 0.5 MB/s as seen on the accumulo monitor page.

  A simple scan on the table has an average data rate of about 7-8
 MB/s.

  All the ids exist in the accumulo table.

 On 12 May 2015 at 23:39, Keith Turner ke...@deenlo.com wrote:

 Do you know how much data is being brought back (i.e. 100
 megabytes)? I am wondering what the data rate is in MB/s.  Do you know 
 how
 many files per tablet you have?  Do most of the 10,000 ids you are 
 querying
 for exist?

 On Tue, May 12, 2015 at 1:58 PM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 I have 194 tablets. Currently I am using 20 threads to create the
 batchscanner inside the createBatchScanner method.
  On 12-May-2015 11:19 pm, Keith Turner ke...@deenlo.com wrote:

   How many tablets do you have?  The batch scanner does not
 parallelize operations within a tablet.

  If you give the batch scanner more threads than there are
 tservers, it will make multilple parallel rpc calls to each tserver 
 if the
 tserver has multiple tablets.  Each rpc may include multiple tablets 
 and
 ranges for each tablet.

  If the batch scanner has less threads than tservers, it will make
 one rpc per tserver per thread.  Each rpc call will include all 
 tablets and
 associated ranges for that tserver.

  Keith



 On Tue, May 12, 2015 at 1:39 PM, vaibhav thapliyal 
 vaibhav.thapliyal...@gmail.com wrote:

 Hi,

  I am using BatchScanner to scan rows from a accumulo table. The
 table has around 187m entries and I am using a 3 node cluster which 
 has
 accumulo 1.6.1.

  I have passed 10,000 ids which are stored as row ids in my table
 as a list in the setRanges() method.

  This whole process takes around 50 secs (from adding the ids in
 the list to scanning the whole table using the BatchScanner).

  I tried 

Re: Custom Iterator output

2015-04-17 Thread Dylan Hutchison
Hi Vaibhav,

It sounds like you want to emit a single value that is a function of all
the entries in the parent iterator.  In that case, the following template
should solve your problem, using the example of summing Values interpreted
as Longs:

/**
 * Emit one value that is a function of entries from the parent iterator.
 */
public class SingleOutputIterator extends WrappingIterator {
  private static final TypedValueCombiner.Encoder<Long> encoder = new
LongCombiner.StringEncoder();
  private Key emitKey;
  private Value emitValue;

  @Override
  public void seek(Range range, Collection<ByteSequence>
columnFamilies, boolean inclusive) throws IOException {
super.seek(range, columnFamilies, inclusive);
myFunction();
  }

  /**
   * Reads all entries from the parent iterator, computing the value
you want to emit.
   * Example given is summing the Values of parent entries,
interpreted as Longs.
   */
  private void myFunction() throws IOException {
Long val = 0l;
while (super.hasTop()) {
  val += encoder.decode(super.getTopValue().get());
  super.next();
}
emitKey = new Key(); // replace this with the key you want to emit
emitValue = new Value(encoder.encode(val));
  }

  @Override
  public Key getTopKey() {
return emitKey;
  }

  @Override
  public Value getTopValue() {
return emitValue;
  }

  @Override
  public boolean hasTop() {
return emitKey != null;
  }

  @Override
  public void next() throws IOException {
emitKey = null;
emitValue = null;
  }
}

Regards,
Dylan Hutchison



On Fri, Apr 17, 2015 at 8:05 PM, vaibhav thapliyal 
vaibhav.thapliyal...@gmail.com wrote:

 Hi,

 I also had this query that might be similar to shweta.

 What I want to do is process the key value pairs that I get from
 getTopKey() and getTopValue() methods and I want to output that value.

 Currently I was writing these values to tables from inside the iterators,
 but I read in the new manual that says that doing this isn't a good
 practice.

 For eg:

 If I have these entries in my table:

 1 cf1:cq1 value1
 2 cf2:cq2 value2
 3 cf3:cq3 value3
 And suppose I sum the values(or do any opeation) of the row ids using the
 values that I get from the getTopKey().getRow() function and store this sum
 in a variable called sum.

 So I want to output this variable. How do I go about this?

 Thanks
 Vaibhav
 On 17-Apr-2015 6:40 pm, dlmar...@comcast.net wrote:

 via the getTopKey() and getTopValue() methods. [1] should be a simple
 example.

 [1]
 https://git-wip-us.apache.org/repos/asf?p=accumulo.git;a=blob;f=core/src/main/java/org/apache/accumulo/core/iterators/user/GrepIterator.java;h=043a729a778fc34d2ee87a0227056ffac81b7fe7;hb=refs/heads/master

 --
 *From: *shweta.agrawal shweta.agra...@orkash.com
 *To: *user@accumulo.apache.org
 *Sent: *Friday, April 17, 2015 8:50:26 AM
 *Subject: *Custom Iterator output

 Hi,

 I am working on custom iterator. I want to know, how do i get the output
 from the custom iterators?

 Thanks and Regards
 Shweta




Re: Unexpected aliasing from RFile getTopValue()

2015-04-14 Thread Dylan Hutchison
@Josh, the specific location is core/file/rfile/RFile.java, line 577 on
branch 1.6
https://github.com/apache/accumulo/blob/1.6/core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java#L577.


@Christopher, good point. I've avoided rethinking the entire SKVI interface
because it would be a gargantuan change, but it's worth doing.  And it may
lead to a better outcome than pointing out a bunch of local thoughts on the
iterator design.  Let's see if I can collect some thoughts before the
Accumulo summit.

~Dylan

On Tue, Apr 14, 2015 at 9:42 PM, Josh Elser josh.el...@gmail.com wrote:

 Do you mean to say you see this from the RFile Reader?


 Dylan Hutchison wrote:

 While debugging a custom iterator today to find the source of a logical
 error, I discovered something an iterator developer may not expect.  The
 getTopValue() of RFile returns a reference to the RFile's internal Value
 private variable.  The private variable is modified inside RFile via


 val.readFields(currBlock);


 which means that if an iterator stores the reference from getTopValue(),
 that is, without copying the Value to a new Object, then the value will
 be updated in the iterator when the RFile's next() method is called.

 Here is an example snippet to demonstrate:

 Value v1 = source.getTopValue();

 source.next();   // v1 is modified!


 The following code would not have a problem:

 Value v1 = new Value(source.getTopValue());

 source.next();


 I bet this is done for performance reasons.  Is this expected?

 Regards, Dylan




Local Combiners to pre-sum at BatchWriter

2015-04-04 Thread Dylan Hutchison
I've been thinking about a scenario that seems common among high-ingest
Accumulo users. Suppose we have a combiner-type iterator on a table on
all scopes.  One technique to increase ingest performance is pre-summing:
run the combiner on local entries before they are sent through a
BatchWriter, in order to reduce the number of entries sent to the tablet
server.

One way to do pre-summing is to create a Map<Key,Value> of entries to send
to the server on the local client. This equates to the following client
code, run for each entry to send to Accumulo:

  Key k = nextKeyToSend();
  Value v = nextValueToSend();
  Value vPrev = map.get(k);
  if (vPrev != null)
v = combiner.combine(vPrev, v);
  map.put(k, v);

Each time our map size exceeds a threshold (don't want to run out of memory
on the client),

  BatchWriter bw; // setup previously from connector
  for (Map.Entry<Key,Value> entry : map.entrySet()) {
Key k = entry.getKey();
Mutation m = new Mutation(k.getRow());
m.put(k.getColumnFamily(), k.getColumnQualifier(), entry.getValue());
bw.addMutation(m);
  }

(side note: using one entry change per mutation.  I've never investigated
whether it would be more efficient to put all the updates to a single row
[i.e. chaining multiple columns in the same row] in one mutation instead.)

This solution works, but it duplicates the purpose of the BatchWriter and
adds complexity to the client.  If we have to create a separate cache
collection, track its size and dump to a BatchWriter once it gets too big,
then it seems like we're reimplementing the behavior of the BatchWriter
that provides an internal cache of size set by
BatchWriterConfig.setMaxMemory() (that starts flushing once half the
maximum memory is used), and we're using two caches (user-created map + the
BatchWriter) where one should be sufficient.

I'm wondering whether there is a way to pre-sum mutations added to a
BatchWriter automatically, so that we can add entries to a BatchWriter and
trust that it will apply a combiner function to them before transmitting to
the tablet server. Something to the effect of:

  BatchWriter bw; // setup previously from connector
  Combiner combiner = new SummingCombiner();
  Map<String, String> combinerOptions = new HashMap<>();
  combinerOptions.put(all, true); // or some other column subset option
  bw.addCombiner(combiner);
  // or perhaps more generally/ambitiously: bw.addWriteIterator(combiner);

  // effect: combiner will be applied right before flushing data to server
  // if the combiner throws an exception, then throw a
MutationsRejectedException

Is there a better way to accomplish this, without duplicating BatchWriter's
buffer?  Or would this make a nice addition to the API?  If I understand
the BatchWriter correctly, it already sorts entries before sending to the
tablet server, because the tablet server can process them more efficiently
that way.  If so, the overhead cost seems small to add a combining step
after the sorting phase and before the network transmit phase, especially
if it reduces network traffic anyway.

Regards,
Dylan Hutchison


Scanning with many singleton ranges?

2015-04-02 Thread Dylan Hutchison
A friend of mine has a use case where he wants to scan ~1M individual rows,
scattered across a ~15GB table.  He performed the following:

1. Gather a List of Range objects, each one a singleton range spanning an
entire row.
2. Create a BatchScanner with one read thread.
3. Set the ranges via BatchScanner.setRanges()
4. Start iterating through the scanner.

Performing these steps crashed the TabletServer for my friend (haven't had
time to verify it myself yet). We're using a single-node standalone 1.6.1
Accumulo instance.

Is this a bad way to use Accumulo?  I advised my friend to batch the reads
into groups of ~10k ranges and see if that helps.  I wanted to check with
the community and see if we're doing something weird.  If the behavior
should have worked, I can try to put together a test case reproducing it,
that creates a table with many entries and then scans with many ranges.
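
For reference, the batching workaround I suggested looks roughly like this (a
minimal sketch; allRanges and connector are assumed to be in scope, and the
batch size is illustrative):

int batchSize = 10000;                                   // illustrative; tune as needed
for (int i = 0; i < allRanges.size(); i += batchSize) {
  List<Range> batch = allRanges.subList(i, Math.min(i + batchSize, allRanges.size()));
  BatchScanner bs = connector.createBatchScanner("mytable", Authorizations.EMPTY, 1);
  bs.setRanges(batch);
  for (Map.Entry<Key,Value> entry : bs) {
    // process entry
  }
  bs.close();
}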

Thanks,
Dylan Hutchison


Log4j: Removing Audit messages

2015-03-15 Thread Dylan Hutchison
Hi there,

I'd like to stop appending INFO-level AUDIT log messages to the regular and
debug tserver log files. Here is an example log:

2015-03-15 20:39:26,970 [Audit   ] INFO : operation: permitted; user: root;
client: 127.0.0.1:41256;


I'm confused how log4j is setup (see ACCUMULO-3546
https://issues.apache.org/jira/browse/ACCUMULO-3546).  I tried appending
variants of

log4j.logger.Audit=WARN


to $ACCUMULO_HOME/conf/log4j.properties, but no luck.  There is also the
generic_logger.xml and generic_logger.properties.

This line of code
from org.apache.accumulo.server.security.AuditedSecurityOperation is
relevant:


public static final String AUDITLOG = "Audit";
public static final Logger audit = Logger.getLogger(AUDITLOG);


I want to stop this logger.

Regards,
Dylan Hutchison


Re: Log4j: Removing Audit messages

2015-03-15 Thread Dylan Hutchison
Yes, here are the contents of conf/auditLog.xml:

<appender name="Audit" class="org.apache.log4j.DailyRollingFileAppender">
  <param name="File"
    value="${org.apache.accumulo.core.dir.log}/${org.apache.accumulo.core.ip.localhost.hostname}.audit"/>
  <param name="MaxBackupIndex" value="10"/>
  <param name="DatePattern" value="'.'yyyy-MM-dd"/>
  <layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss,SSS/Z} [%c{2}] %-5p: %m%n"/>
  </layout>
</appender>
<logger name="Audit" additivity="false">
  <appender-ref ref="Audit"/>
  <level value="OFF"/>
</logger>


I don't think this file is parsed because if the level were truly OFF, then
I wouldn't see any audit messages.

On Sun, Mar 15, 2015 at 9:03 PM, Billie Rinaldi billie.rina...@gmail.com
wrote:

 Do you have an auditLog.xml?
 On Mar 15, 2015 8:58 PM, Dylan Hutchison dhutc...@mit.edu wrote:

 Hi there,

 I'd like to stop appending INFO-level AUDIT log messages to the regular
 and debug tserver log files. Here is an example log:

 2015-03-15 20:39:26,970 [Audit   ] INFO : operation: permitted; user:
 root; client: 127.0.0.1:41256;


 I'm confused how log4j is setup (see ACCUMULO-3546
 https://issues.apache.org/jira/browse/ACCUMULO-3546).  I tried
 appending variants of

 log4j.logger.Audit=WARN


 to $ACCUMULO_HOME/conf/log4j.properties, but no luck.  There is also the
 generic_logger.xml and generic_logger.properties.

 This line of code
 from org.apache.accumulo.server.security.AuditedSecurityOperation is
 relevant:


 public static final String AUDITLOG = "Audit";
 public static final Logger audit = Logger.getLogger(AUDITLOG);


 I want to stop this logger.

 Regards,
 Dylan Hutchison




Re: Log4j: Removing Audit messages

2015-03-15 Thread Dylan Hutchison
Thanks, changing generic_logger.xml worked like a charm!  I set Audit to
WARN.
~Dylan

On Sun, Mar 15, 2015 at 9:39 PM, Josh Elser josh.el...@gmail.com wrote:

 On the contrary, if you don't have a *.audit file, it's probably getting
 parsed as the audit file wasn't made.

 log4j.properties is only used for client applications -- not the server
 processes.

 Server processes configure their logging by first looking for a
 %s_logger.xml and then %s_logger.properties file where %s is the name of
 the process (e.g. tserver, gc).

 Generally, there should be the following in your generic_logger.xml:

 <logger name="Audit">
   <level value="OFF"/>
 </logger>

 or in generic_logger.properties (if you remove *_logger.xml):

 log4j.logger.Audit=OFF

 These should probably be in the templates/examples that we provide. It
 doesn't appear that they presently are.

 Dylan Hutchison wrote:

 Yes, here are the contents of conf/auditLog.xml:

 <appender name="Audit" class="org.apache.log4j.DailyRollingFileAppender">
   <param name="File"
     value="${org.apache.accumulo.core.dir.log}/${org.apache.accumulo.core.ip.localhost.hostname}.audit"/>
   <param name="MaxBackupIndex" value="10"/>
   <param name="DatePattern" value="'.'yyyy-MM-dd"/>
   <layout class="org.apache.log4j.PatternLayout">
     <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss,SSS/Z} [%c{2}] %-5p: %m%n"/>
   </layout>
 </appender>
 <logger name="Audit" additivity="false">
   <appender-ref ref="Audit"/>
   <level value="OFF"/>
 </logger>


 I don't think this file is parsed because if the level were truly OFF,
 then I wouldn't see any audit messages.

 On Sun, Mar 15, 2015 at 9:03 PM, Billie Rinaldi
 billie.rina...@gmail.com wrote:

 Do you have an auditLog.xml?

 On Mar 15, 2015 8:58 PM, Dylan Hutchison dhutc...@mit.edu
 wrote:

 Hi there,

 I'd like to stop appending INFO-level AUDIT log messages to the
 regular and debug tserver log files. Here is an example log:

 2015-03-15 20:39:26,970 [Audit   ] INFO : operation:
 permitted; user: root; client: 127.0.0.1:41256;


 I'm confused how log4j is setup (see ACCUMULO-3546
 https://issues.apache.org/jira/browse/ACCUMULO-3546).  I tried
 appending variants of

 log4j.logger.Audit=WARN


 to $ACCUMULO_HOME/conf/log4j.properties, but no luck.  There is
 also the generic_logger.xml and generic_logger.properties.

 This line of code
 from org.apache.accumulo.server.security.AuditedSecurityOperation
 is
 relevant:


 public static final String AUDITLOG = "Audit";
 public static final Logger audit = Logger.getLogger(AUDITLOG);


 I want to stop this logger.

 Regards,
 Dylan Hutchison





Re: Design-for-comment: Accumulo Server-Side Computation: Stored Procedure Tables starring SpGEMM

2015-02-26 Thread Dylan Hutchison
Hi Christopher,  responses after yours:

It's a clever way to leverage existing Accumulo behaviors (full major
 compaction) to act as clients in order to perform a parallel operation to
 populate a new table. Have you tried this method in practice at all, yet?
 What pitfalls have you run into, perhaps regarding client-side static state
 in the JVM, or resource management issues within the tablet servers?


So far I have had success implementing a variant of the design doc:
everything except the RemoteWriteIterator.  I ran two RemoteSourceIterator
https://github.com/Accla/d4m_api_java/blob/master/src/main/java/edu/mit/ll/graphulo/RemoteSourceIterator.javas
and a DotMultIterator
https://github.com/Accla/d4m_api_java/blob/master/src/main/java/edu/mit/ll/graphulo/DotRemoteSourceIterator.java
on a new Accumulo table via a one-time manual major compaction.  It works
as expected.  Test code is here for RemoteSource
https://github.com/Accla/d4m_api_java/blob/master/src/test/java/edu/mit/ll/graphulo/RemoteIteratorTest.java
and here for the Dot
https://github.com/Accla/d4m_api_java/blob/master/src/test/java/edu/mit/ll/graphulo/DotRemoteIteratorTest.java.
That said, the tests are a very small size.  I expect unforeseen issues,
maybe the ones you describe, will arise when we scale up.
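
For reference, the one-time manual major compaction was triggered with
something like the following (a minimal sketch; the table name, iterator
priorities, and option setup are illustrative, and the real iterators still
need their remote-table options configured):

List<IteratorSetting> iters = new ArrayList<>();
iters.add(new IteratorSetting(21, "remoteSrc", RemoteSourceIterator.class));   // options omitted here
iters.add(new IteratorSetting(22, "dotMult", DotMultIterator.class));          // options omitted here
// compact(table, startRow, endRow, iterators, flush, wait)
connector.tableOperations().compact("resultTable", null, null, iters, true, true);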

Later this week I will prototype the design doc as written, using a stored
procedure table that has table splits and iterators on major compaction but
does *not* store any data.

What do you think the similarities and differences are with other parallel
 execution methods that one could use to achieve the same results (like
 Map/Reduce)?


Similar to Yarn, MapReduce is great if we wanted to run computations over
an entire table, i.e. if we valued high throughput over low latency.  The
design doc addresses the reverse use case, when we want to select subsets
of a table and value low latency over high throughput.  This is where
Accumulo lends us strength.  There is also a chance we can stream results
if our computation preserves sorted order-- see the temporary table section
https://github.com/Accla/accumulo_stored_procedure_design#temporary-tables
.

Also, do you have any code available for an example RemoteSourceIterator,
 which we might be able to try? The Transpose one seemed simple enough, but
 any others would be neat to try, also.


See above and the links in the design doc.  Here is a RemoteMergeIterator
https://github.com/Accla/d4m_api_java/blob/master/src/main/java/edu/mit/ll/graphulo/RemoteMergeIterator.java.
There is some mess in the code as I shifted between designs.

Do you have any thoughts on whether there should be some abstract base
 class available in Accumulo (vs. as part of the contrib) to support these
 iterators and handle the boiler-plate stuff of setting up/serializing the
 client configuration when the procedure executes, or utilities to help
 create a stored procedure table?


Not sure yet.  I can report back to the Accumulo community after one pass
implementing the design.  For now, I'd like to see the Accumulo community's
opinion on the design and merit.
  The temporary table concept may be worth Accumulo core consideration.

I wonder if one result of this project is writing a guideline /
best-practices document on *where to place iterators, *or in other words,
where to place computation.

Regards,
Dylan Hutchison


On Thu, Feb 26, 2015 at 2:34 PM, Christopher ctubb...@apache.org wrote:

 Hi Dylan,

 It's a clever way to leverage existing Accumulo behaviors (full major
 compaction) to act as clients in order to perform a parallel operation to
 populate a new table. Have you tried this method in practice at all, yet?
 What pitfalls have you run into, perhaps regarding client-side static state
 in the JVM, or resource management issues within the tablet servers? What
 do you think the similarities and differences are with other parallel
 execution methods that one could use to achieve the same results (like
 Map/Reduce)?

 Also, do you have any code available for an example RemoteSourceIterator,
 which we might be able to try? The Transpose one seemed simple enough, but
 any others would be neat to try, also.

 Do you have any thoughts on whether there should be some abstract base
 class available in Accumulo (vs. as part of the contrib) to support these
 iterators and handle the boiler-plate stuff of setting up/serializing the
 client configuration when the procedure executes, or utilities to help
 create a stored procedure table?


 --
 Christopher L Tubbs II
 http://gravatar.com/ctubbsii

 On Thu, Feb 26, 2015 at 12:42 AM, Dylan Hutchison dhutc...@mit.edu
 wrote:

 Hello all,

 As promised
 https://mail-archives.apache.org/mod_mbox/accumulo-user/201502.mbox/%3CCAPx%3DJkakO3ice7vbH%2BeUo%2B6AP1JPebVbTDu%2Bg71KV8SvQ4J9WA%40mail.gmail.com%3E,
 here is a design doc open for comments on implementing server-side
 computation in Accumulo.

 https://github.com/Accla/accumulo_stored_procedure_design

 Would love to hear

Re: Design-for-comment: Accumulo Server-Side Computation: Stored Procedure Tables starring SpGEMM

2015-02-26 Thread Dylan Hutchison
?

Regards,
Dylan Hutchison

On Thu, Feb 26, 2015 at 3:43 PM, Josh Elser josh.el...@gmail.com wrote:

 Thanks for taking the time to write this up, Dylan.

 I'm a little worried about the RemoteWriteIterator. Using a BatchWriter
 implies that you'll need some sort of resource management - both ensuring
 that the BatchWriter is close()'ed whenever a compaction/procedure ends and
 handling rejected mutations. Have you put any thought into how you would
 address these?

 I'm not familiar enough with the internals anymore, but I remember that I
 had some pains trying to write to another table during compactions when I
 was working on replication. I think as long as it's not triggered off of
 the metadata table, it wouldn't have any deadlock issues.

 Architecturally, it's a little worrisome, because it feels a bit like
 using a wrench as a hammer -- iterators are great for performing some
 passing computation, but not really for doing arbitrary reads/writes.
 It gets back to the Accumulo/HBase comparisons where people try to compare
 Iterators and Coprocessors. They can sometimes do the same thing, but
 they're definitely different features.

 Anyways, I need to stew on it some more and give it a few more reads.
 Thanks again for sharing!

 Dylan Hutchison wrote:

 Hello all,

 As promised
 https://mail-archives.apache.org/mod_mbox/accumulo-user/
 201502.mbox/%3CCAPx%3DJkakO3ice7vbH%2BeUo%2B6AP1JPebVbTDu%
 2Bg71KV8SvQ4J9WA%40mail.gmail.com%3E,
 here is a design doc open for comments on implementing server-side
 computation in Accumulo.

 https://github.com/Accla/accumulo_stored_procedure_design

 Would love to hear your opinion, especially if the proposed design
 pattern matches one of /your use cases/.

 Regards,
 Dylan Hutchison




Scans during Compaction

2015-02-23 Thread Dylan Hutchison
Hello all,

When I initiate a full major compaction (with flushing turned on) manually via
the Accumulo API
https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#compact(java.lang.String,
org.apache.hadoop.io.Text, org.apache.hadoop.io.Text, java.util.List,
boolean, boolean), how does the table appear to

   1. clients that started scanning the table before the major compaction
   began;
   2. clients that start scanning during the major compaction?

I'm interested in the case where there is an iterator attached to the full
major compaction that modifies entries (respecting sorted order of entries).

The best possible answer for my use case, with case #2 more important than
case #1 and *low latency* more important than high throughput, is that

   1. clients that started scanning before the compaction began would not
   see entries altered by the compaction-time iterator;
   2. clients that start scanning during the major compaction stream back
   entries as they finish processing from the major compaction, such that the
   clients *only* see entries that have passed through the compaction-time
   iterator.

How accurate are these descriptions?  If #2 really were as I would like it
to be, then a scan on the range (-inf,+inf) started after compaction would
monitor compaction progress, such that the first entry batch transmits to
the scanner as soon as it is available from the major compaction, and the
scanner finishes (receives all entries) exactly when the compaction
finishes.  If this is not possible, I may make something to that effect by
calling the blocking version of compact().

Bonus: how does cancelCompaction()
https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#cancelCompaction(java.lang.String)
affect clients scanning in case #1 and case #2?

Regards,
Dylan Hutchison


Re: Scans during Compaction

2015-02-23 Thread Dylan Hutchison
Thanks Adam and Keith.

I see the following as a potential solution that achieves (1) low latency
for clients that want to see entries after an iterator and (2) the entries
from that iterator persisting in the Accumulo table.

   1. Start a major compaction in thread T1 of a client with the iterator
   set, blocking until the compaction completes.
   2. Start scanning in thread T2 of the client with the same iterator now
   set at scan-time scope. Use an isolated scanner to make sure we do not read
   the results of the major compaction committing, though this is not
   foolproof due to timing and because the isolated scanner works row-wise.
   3. Eventually, T1 unblocks and signals that the compaction completes.
   T1 interrupts T2.
   4. Thread T2 stops scanning, removes the scan-time iterator, and starts
   scanning again at the point it last left off, now seeing the results of the
   major compaction which already passed through the iterator.

The whole scheme is only necessary if the client wants results faster than
the major compaction completes.  A disadvantage is duplicated work -- the
iterator runs at scan-time and at compaction-time until the compaction
finishes.  This may strain server resources.
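
In code, the scheme sketches out roughly as follows (a minimal client-side
sketch, assuming a final Connector named "conn", a placeholder table
"mytable", and a placeholder iterator class MyTransformIterator; a complete
version would also re-seek the scanner without the scan-time iterator in
step 4):

    // assumes the usual imports: IteratorSetting, IsolatedScanner, Scanner,
    // Authorizations, Collections, AtomicBoolean, Key, Value, Map.Entry
    final IteratorSetting transform =
        new IteratorSetting(30, "xform", MyTransformIterator.class);
    final AtomicBoolean compacted = new AtomicBoolean(false);

    Thread t1 = new Thread(new Runnable() {
      public void run() {
        try {
          // Step 1: blocking full major compaction with the iterator attached
          // (flush = true, wait = true).
          conn.tableOperations().compact("mytable", null, null,
              Collections.singletonList(transform), true, true);
          compacted.set(true);   // Step 3: compaction results are now persisted.
        } catch (Exception e) {
          throw new RuntimeException(e);
        }
      }
    });
    t1.start();

    // Step 2: scan with the same iterator applied at scan time, isolated row-wise.
    Scanner scan = new IsolatedScanner(conn.createScanner("mytable", Authorizations.EMPTY));
    scan.addScanIterator(transform);
    for (Map.Entry<Key,Value> entry : scan) {
      if (compacted.get())
        break;   // Step 4: stop, drop the scan-time iterator, resume from the last key.
      // ... use entry ...
    }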

Will think about other schemes.  If only we could attach an apply-once
scan-time iterator that also persists its results to an Accumulo table in
a streaming fashion.  Or on the flip side, a one-time compaction iterator
that streams results, such that we could scan from them right away instead
of needing to wait for the entire compaction to complete.

Regards,
Dylan Hutchison

On Mon, Feb 23, 2015 at 12:48 PM, Adam Fuchs afu...@apache.org wrote:

 Dylan,

 The effect of a major compaction is never seen in queries before the major
 compaction completes. At the end of the major compaction there is a
 multi-phase commit which eventually replaces all of the old files with the
 new file. At that point the major compaction will have completely processed
 the given tablet's data (although other tablets may not be synchronized).
 For long-running non-isolated queries (more than a second or so) the
 iterator tree is occasionally rebuilt and re-seeked. When it is rebuilt it
 will use whatever is the latest file set, which will include the results of
 a completed major compaction.

 In your case #1 that's a tricky guarantee to make across a whole tablet,
 but it can be made one row at a time by using an isolated iterator.

 To make your case #2 work, you probably will have to implement some
 higher-level logic to only start your query after the major compaction has
 completed, using an external mechanism to track the completion of your
 transformation.

 Adam


 On Mon, Feb 23, 2015 at 12:35 PM, Dylan Hutchison dhutc...@stevens.edu
 wrote:

 Hello all,

 When I initiate a full major compaction (with flushing turned on)
 manually via the Accumulo API
 https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#compact(java.lang.String,%20org.apache.hadoop.io.Text,%20org.apache.hadoop.io.Text,%20java.util.List,%20boolean,%20boolean),
 how does the table appear to

1. clients that started scanning the table before the major
compaction began;
2. clients that start scanning during the major compaction?

 I'm interested in the case where there is an iterator attached to the
 full major compaction that modifies entries (respecting sorted order of
 entries).

 The best possible answer for my use case, with case #2 more important
 than case #1 and *low latency* more important than high throughput, is
 that

1. clients that started scanning before the compaction began would
not see entries altered by the compaction-time iterator;
2. clients that start scanning during the major compaction stream
back entries as they finish processing from the major compaction, such 
 that
the clients *only* see entries that have passed through the
compaction-time iterator.

 How accurate are these descriptions?  If #2 really were as I would like
 it to be, then a scan on the range (-inf,+inf) started after compaction
 would monitor compaction progress, such that the first entry batch
 transmits to the scanner as soon as it is available from the major
 compaction, and the scanner finishes (receives all entries) exactly when
 the compaction finishes.  If this is not possible, I may make something to
 that effect by calling the blocking version of compact().

 Bonus: how does cancelCompaction()
 https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#cancelCompaction(java.lang.String)
 affect clients scanning in case #1 and case #2?

 Regards,
 Dylan Hutchison





-- 
www.cs.stevens.edu/~dhutchis


Re: Scans during Compaction

2015-02-23 Thread Dylan Hutchison
Good suggestion; I will follow up with a design document in the next few
days.

Creating idempotency via indicator entries (in the column family,
timestamp or something else) is one option to work in an iterator that
should run once over a table's entries.  I think we may have the
opportunity to solve a more general problem---scans merging data from
multiple table sources with user-defined merge and compute functions---in
addition to my use case by re-approaching the problem.  Think
selective-scan-transform-join-transform-write-out.  Analytics on the
server.

My specific use case is table-table multiplication, treating table rows,
column qualifiers and values as the components of a matrix.  We view a
table as a *sparse* matrix by treating non-present (row, colQ, value)
entries as zeros in the matrix.  Accumulo offers an advantage when we only
want to run *on selected ranges* from the input tables, as opposed to
running on whole tables where Yarn/Mapreduce may work better.  The
table-table multiplication should run on tables in Accumulo, the result
*persisting* to Accumulo, so that we never need return values to the
client.  We value *low latency* over high throughput, so that we can
perform multiplications interactively.  The user should have a way to
*monitor* multiplication progress, perhaps by a live count of the number of
entries processed or (harder) a live sample of result entries from the
multiplication.  The user should be able to *stop* an operation midway once
he decides enough entries processed.  In addition to interactivity, we may
want to perform multiplications *in series* and parallel.  They form
base *building
blocks* for higher-level algorithms.
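
As a toy illustration of the semantics (using the usual sparse-matrix
convention, not necessarily the exact Graphulo one): if table A holds
entries (r1, c1, 2) and (r1, c2, 3), and table B holds (c1, x1, 5) and
(c2, x1, 7), then the product table gets the single entry
(r1, x1, 2*5 + 3*7 = 31); every non-present entry is treated as zero.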

I promise I will write these details up more formally, including how I made
them work so far and putting them in more general context.  Will post in a
separate thread.

Regards,
Dylan Hutchison

On Mon, Feb 23, 2015 at 2:16 PM, Adam Fuchs afu...@apache.org wrote:

 Dylan,

 I think the way this is generally solved is by using an idempotent
 iterator that can be applied at both full major compaction and query scopes
 to give a consistent view. Aggregation, age-off filtering, and all the
 other standard iterators have the property that you can leave them in
 place and get a consistent answer even if they are applied multiple times.
 Major compaction and query-time iterators are even simpler than the general
 case, since you don't really need to worry about partial views of the
 underlying data. In your case I think you are trying to use an iterator
 that needs to be applied exactly once to a complete stream of data (either
 at query time or major compaction time). What we should probably do is look
 at options for more generally supporting that type of iterator. You could
 help us a ton by describing exactly what you want your iterator to do, and
 we can all propose a few ideas for how this might be implemented. Here are
 a couple off the top of my head:

 1. If you can reform your iterator so that it is idempotent then you can
 apply it liberally. This might be possible using some sort of flag that the
 major compactor puts in the data and the query-time iterator looks for to
 determine if the compaction has already happened. We often use version
 numbers in column families to this effect. Special row keys at the
 beginning of the tablet might also be an option. This would be doable
 without changes to Accumulo.

 2. We could build a mechanism into core accumulo that applies an iterator
 with exactly once semantics, such that the user submits a transformation as
 an iterator and it gets applied similarly to how you described. The
 query-time reading of results of the major compaction might be overkill,
 but that would be a possible optimization that we could think about
 engineering in a second pass.

 Adam



 On Mon, Feb 23, 2015 at 1:42 PM, Dylan Hutchison dhutc...@stevens.edu
 wrote:

 Thanks Adam and Keith.

 I see the following as a potential solution that achieves (1) low latency
 for clients that want to see entries after an iterator and (2) the entries
 from that iterator persisting in the Accumulo table.

1. Start a major compaction in thread T1 of a client with the
iterator set, blocking until the compaction completes.
2. Start scanning in thread T2 of the client with the same iterator
now set at scan-time scope. Use an isolated scanner to make sure we do not
read the results of the major compaction committing, though this is not
foolproof due to timing and because the isolated scanner works row-wise.
3. Eventually, T1 unblocks and signals that the compaction
completes.  T1 interrupts T2.
4. Thread T2 stops scanning, removes the scan-time iterator, and
starts scanning again at the point it last left off, now seeing the 
 results
of the major compaction which already passed through the iterator.

 The whole scheme is only necessary if the client wants results faster
 than the major compaction

Re: Iterators adding data: IteratorEnvironment.registerSideChannel?

2015-02-16 Thread Dylan Hutchison
 why you want to use a side channel instead of implementing the merge in
 your own iterator

Here is a picture showing the difference--

Fig. A: Using a side channel to add a top-level iterator.

    RfileIter1   RfileIter2   InjectIterator   ...
        \            |           /
         +-----------+----------+  (3-way merge)
                     |
            VersioningIterator
                     |
              OtherIterators
                     |
                     v
                    ...


Fig. B: Merging in the data at a later stage

    RfileIter1   RfileIter2   ...
        \            |
         +-----------+  (merge)
                     |
            VersioningIterator
                     |
                     +-----  InjectIterator  (merge)
                     |
              OtherIterators
                     |
                     v
                    ...

(note: we're free to add iterators before the VersioningIterator too)

Unless the order of iterators matters (e.g., the VersioningIterator
position matters if InjectIterator generates an entry with the same row,
colFamily and colQualifier as an entry in the table), the two styles will
give the same results.

This has implications on composibility with other iterators, since
 downstream iterators would not see anything sent to the side channel but
 they would see things merged and returned by a MultiIterator.

If the iterator is at the top level, then every iterator below it will see
output from the top level iterator.  Did you mean composability with other
iterators added at the top level?  If hypothetical iterator
InjectIterator2 needs to see the results of InjectIterator, then we
need to place InjectIterator2 below InjectIterator on the hierarchy,
whether in Fig. A or Fig. B.

For my particular situation, reading from another Accumulo table inside an
iterator, I'm not sure which is better.  I like the idea of adding another
data stream as a top-level source, but Fig. B is possible too.

Regards,
Dylan Hutchison


On Mon, Feb 16, 2015 at 11:34 AM, Adam Fuchs scubafu...@gmail.com wrote:

 Dylan,

 If I recall correctly (which I give about 30% odds), the original purpose
 of the side channel was to split up things like delete tombstone entries
 from regular entries so that other iterators sitting on top of a
 bifurcating iterator wouldn't have to handle the special tombstone
 preservation logic. This worked in theory, but it never really caught on.
 I'm not sure any operational code is calling the registerSideChannel method
 right now, so you're sort of in pioneering territory. That said, this looks
 like it should work as you described it.

 Can you describe why you want to use a side channel instead of
 implementing the merge in your own iterator (e.g. subclassing MultiIterator
 and overriding the init method)? This has implications on composibility
 with other iterators, since downstream iterators would not see anything
 sent to the side channel but they would see things merged and returned by a
 MultiIterator.

 Adam
  On Feb 16, 2015 3:18 AM, Dylan Hutchison dhutc...@stevens.edu wrote:

 If you can do a merge sort insertion, then you can guarantee order and
 it's fine.

 Yep, I guarantee the iterator we add as a side channel will emit tuples
 in sorted order.

 On a suggestion from David Medinets, I modified my testing code to use a
 MiniAccumuloCluster set to 2 tablet servers.  I then set a table split on
 row3 before launching the compaction.  The result looks good.  Here is
 output from a run on a local Accumulo instance.  Note that we write more
 values than we read.

 2015-02-16 02:44:51,125 [tserver.Tablet] DEBUG: Starting MajC k;row3
 (USER) [hdfs://localhost:9000/accumulo/tables/k/t-0g4/F0g5.rf] --
 hdfs://localhost:9000/accumulo/tables/k/t-0g4/A0g7.rf_tmp
  [name:InjectIterator, priority:15,
 class:edu.mit.ll.graphulo.InjectIterator, properties:{}]
 2015-02-16 02:44:51,127 [tserver.Tablet] DEBUG: Starting MajC k;row3
 (USER) [hdfs://localhost:9000/accumulo/tables/k/default_tablet/F0g6.rf]
 -- hdfs://localhost:9000/accumulo/tables/k/default_tablet/A0g8.rf_tmp
  [name:InjectIterator, priority:15,
 class:edu.mit.ll.graphulo.InjectIterator, properties:{}]
 2015-02-16 02:44:51,190 [tserver.Compactor] DEBUG: *Compaction k;row3 2
 read | 4 written* |111 entries/sec |  0.018 secs
 2015-02-16 02:44:51,194 [tserver.Compactor] DEBUG: *Compaction k;row3 1
 read | 4 written* | 43 entries/sec |  0.023 secs


 In addition, output from the DebugIterator looks as expected.  There is a
 re-seek after reading the first tablet to the key after the last entry
 returned in the first tablet.

 DEBUG:
 init(org.apache.accumulo.core.iterators.system.SynchronizedIterator@15085e63,
 {}, org.apache.accumulo.tserver.TabletIteratorEnvironment@586cc05e)
 DEBUG: 0x1C2BFB13 seek((-inf,+inf), [], false)

 ... snipped logs

 DEBUG:
 init(org.apache.accumulo.core.iterators.system.SynchronizedIterator@2b048c59,
 {}, org.apache.accumulo.tserver.TabletIteratorEnvironment@379a3d1f)
 DEBUG: 0x5946E74B seek([row2 colF3:colQ3 [] 9223372036854775807
 false,+inf), [], false)


 It seems the side channel strategy will hold up.  We have opened a new
 world of Accumulo-foo.  Of course, the real test is a multi-node instance
 with more than 10 entries of data

Re: Iterators adding data: IteratorEnvironment.registerSideChannel?

2015-02-16 Thread Dylan Hutchison

 If you can do a merge sort insertion, then you can guarantee order and
 it's fine.

Yep, I guarantee the iterator we add as a side channel will emit tuples in
sorted order.

On a suggestion from David Medinets, I modified my testing code to use a
MiniAccumuloCluster set to 2 tablet servers.  I then set a table split on
row3 before launching the compaction.  The result looks good.  Here is
output from a run on a local Accumulo instance.  Note that we write more
values than we read.

2015-02-16 02:44:51,125 [tserver.Tablet] DEBUG: Starting MajC k;row3
(USER) [hdfs://localhost:9000/accumulo/tables/k/t-0g4/F0g5.rf] --
hdfs://localhost:9000/accumulo/tables/k/t-0g4/A0g7.rf_tmp
 [name:InjectIterator, priority:15,
class:edu.mit.ll.graphulo.InjectIterator, properties:{}]
2015-02-16 02:44:51,127 [tserver.Tablet] DEBUG: Starting MajC k;row3
(USER) [hdfs://localhost:9000/accumulo/tables/k/default_tablet/F0g6.rf]
-- hdfs://localhost:9000/accumulo/tables/k/default_tablet/A0g8.rf_tmp
 [name:InjectIterator, priority:15,
class:edu.mit.ll.graphulo.InjectIterator, properties:{}]
2015-02-16 02:44:51,190 [tserver.Compactor] DEBUG: *Compaction k;row3 2
read | 4 written* |111 entries/sec |  0.018 secs
2015-02-16 02:44:51,194 [tserver.Compactor] DEBUG: *Compaction k;row3 1
read | 4 written* | 43 entries/sec |  0.023 secs


In addition, output from the DebugIterator looks as expected.  There is a
re-seek after reading the first tablet to the key after the last entry
returned in the first tablet.

DEBUG:
init(org.apache.accumulo.core.iterators.system.SynchronizedIterator@15085e63,
{}, org.apache.accumulo.tserver.TabletIteratorEnvironment@586cc05e)
DEBUG: 0x1C2BFB13 seek((-inf,+inf), [], false)

... snipped logs

DEBUG:
init(org.apache.accumulo.core.iterators.system.SynchronizedIterator@2b048c59,
{}, org.apache.accumulo.tserver.TabletIteratorEnvironment@379a3d1f)
DEBUG: 0x5946E74B seek([row2 colF3:colQ3 [] 9223372036854775807
false,+inf), [], false)


It seems the side channel strategy will hold up.  We have opened a new
world of Accumulo-foo.  Of course, the real test is a multi-node instance
with more than 10 entries of data.
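
For completeness, the test setup above looks roughly like this (a minimal
sketch, assuming the usual MiniAccumuloCluster imports, a throwaway
directory, and that InjectIterator is on the cluster's classpath; error
handling and the test entries are omitted):

    MiniAccumuloConfig cfg = new MiniAccumuloConfig(new File("/tmp/mac-test"), "secret");
    cfg.setNumTservers(2);                          // two tablet servers, as above
    MiniAccumuloCluster mac = new MiniAccumuloCluster(cfg);
    mac.start();

    Connector conn = mac.getConnector("root", "secret");
    conn.tableOperations().create("k");
    // ... write a handful of test entries with a BatchWriter ...

    // split the table on "row3" so the compaction runs over two tablets
    SortedSet<Text> splits = new TreeSet<Text>();
    splits.add(new Text("row3"));
    conn.tableOperations().addSplits("k", splits);

    // one-time full major compaction with InjectIterator attached at priority 15
    IteratorSetting inject = new IteratorSetting(15, "InjectIterator", InjectIterator.class);
    conn.tableOperations().compact("k", null, null,
        Collections.singletonList(inject), true, true);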

Regards, Dylan


On Sun, Feb 15, 2015 at 11:17 PM, Andrew Wells awe...@clearedgeit.com
wrote:

 The main issue with adding data in an iterator is order. If you can do a
 merge sort insertion, then you can guarantee order and it's fine. But if
 you are inserting based on input you cannot guarantee order, and it can
 only be a scan-time iterator.
  On Feb 15, 2015 8:03 PM, Dylan Hutchison dhutc...@stevens.edu wrote:

 Hello all,

 I've been toying with the registerSideChannel(iter)
 https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/iterators/IteratorEnvironment.html#registerSideChannel(org.apache.accumulo.core.iterators.SortedKeyValueIterator)
  method
 on the IteratorEnvironment passed to iterators through the init() method.
 From what I can tell, the method allows you to add another iterator as a
 top level source, to be merged in along with other usual top-level sources
 such as the in-memory cache and RFiles.

 Are there any downsides to using registerSideChannel( ) to add new data
 to an iterator chain?  It looks like this is fairly stable, so long as the
 iterator we add as a side channel implements seek() properly so as to only
 return entries whose rows are within a tablet.  I imagine it works like so:

 Suppose we set a custom iterator InjectIterator that registers a side
 channel inside init() at priority 5 as a one-time major compaction
 iterator.  InjectIterator forwards other operations to its parent, as in
 WrappingIterator
 https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/iterators/WrappingIterator.html.
 We start the compaction:

 Tablet 1 (a,g]

1. init() called on InjectIterator.  Creates the side channel
iterator, calls init() on it, and registers it.
2. init() called on VersioningIterator.
3. init() called on top level iterators, including Rfiles, in-memory
cache and the new side channel.
4. seek( (a,g] ) called on InjectIterator.
5. seek( (a,g] ) called on VersioningIterator.
6. seek( (a,g] ) called on top level iterators
7. next() called on InjectIterator. Forwards to parent.
8. next() called on VersioningIterator. Forwards to parent.
9. next() called on top level iterator (a MultiIterator

 https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/iterators/system/MultiIterator.html).
The next value is read from all the top-level iterator sources and the one
with the least key is cached ready to go.
10. ...

 Tablet 2 (g,p)  --- same as tablet 1 except steps 4-6 call seek( (g,p)
 ).  Done in parallel with tablet 1 if on a different tablet server.

 Is this an accurate depiction?  Anything I should treat with caution?  It
 seems to work on my single-node instance, so tips about difficulties going
 to multi-node are good.

 Code available here.
 https

Iterators adding data: IteratorEnvironment.registerSideChannel?

2015-02-15 Thread Dylan Hutchison
Hello all,

I've been toying with the registerSideChannel(iter)
https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/iterators/IteratorEnvironment.html#registerSideChannel(org.apache.accumulo.core.iterators.SortedKeyValueIterator)
method
on the IteratorEnvironment passed to iterators through the init() method.
From what I can tell, the method allows you to add another iterator as a
top level source, to be merged in along with other usual top-level sources
such as the in-memory cache and RFiles.

Are there any downsides to using registerSideChannel( ) to add new data
to an iterator chain?  It looks like this is fairly stable, so long as the
iterator we add as a side channel implements seek() properly so as to only
return entries whose rows are within a tablet.  I imagine it works like so:

Suppose we set a custom iterator InjectIterator that registers a side
channel inside init() at priority 5 as a one-time major compaction
iterator.  InjectIterator forwards other operations to its parent, as in
WrappingIterator
https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/iterators/WrappingIterator.html.
We start the compaction:

Tablet 1 (a,g]

   1. init() called on InjectIterator.  Creates the side channel iterator,
   calls init() on it, and registers it.
   2. init() called on VersioningIterator.
   3. init() called on top level iterators, including Rfiles, in-memory
   cache and the new side channel.
   4. seek( (a,g] ) called on InjectIterator.
   5. seek( (a,g] ) called on VersioningIterator.
   6. seek( (a,g] ) called on top level iterators
   7. next() called on InjectIterator. Forwards to parent.
   8. next() called on VersioningIterator. Forwards to parent.
   9. next() called on top level iterator (a MultiIterator
   
https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/iterators/system/MultiIterator.html).
   The next value is read from all the top-level iterator sources and the one
   with the least key is cached ready to go.
   10. ...

Tablet 2 (g,p)  --- same as tablet 1 except steps 4-6 call seek( (g,p) ).
Done in parallel with tablet 1 if on a different tablet server.

Is this an accurate depiction?  Anything I should treat with caution?  It
seems to work on my single-node instance, so tips about difficulties going
to multi-node are good.

Code available here.
https://github.com/Accla/d4m_api_java/blob/0d8c62164d5c0b59f949ce23c1b85536809764d2/src/main/java/edu/mit/ll/graphulo/InjectIterator.java#L166
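
For a feel of the approach, a stripped-down sketch (not the linked
InjectIterator itself) of registering a side channel in init();
SortedMapIterator respects seek(), so each tablet only merges in entries
whose rows fall inside its own range:

    import java.io.IOException;
    import java.util.Map;
    import java.util.TreeMap;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.iterators.IteratorEnvironment;
    import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
    import org.apache.accumulo.core.iterators.SortedMapIterator;
    import org.apache.accumulo.core.iterators.WrappingIterator;

    public class SideChannelSketch extends WrappingIterator {
      @Override
      public void init(SortedKeyValueIterator<Key,Value> source,
          Map<String,String> options, IteratorEnvironment env) throws IOException {
        super.init(source, options, env);   // otherwise behave as a pass-through
        TreeMap<Key,Value> injected = new TreeMap<Key,Value>();
        injected.put(new Key("row2", "colF3", "colQ3"), new Value("v".getBytes()));
        // registered as an extra top-level source, merged alongside the
        // RFiles and the in-memory map
        env.registerSideChannel(new SortedMapIterator(injected));
      }
    }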

Regards,
Dylan Hutchison

-- 
www.cs.stevens.edu/~dhutchis


Accumulo version at runtime?

2014-10-23 Thread Dylan Hutchison
Easy question Accumulators:

Is there an easy way to grab the version of a running Accumulo instance
programmatically from Java code in a class that connects to the instance?

Something like:

Instance instance = new
ZooKeeperInstance(instanceName,zookeeper_address);
String version = instance.getInstanceVersion();


Thanks, Dylan

-- 
www.cs.stevens.edu/~dhutchis


Re: Accumulo version at runtime?

2014-10-23 Thread Dylan Hutchison
How about a compromise: create *two classes* for the two versions, both
implementing the same interface.  Instantiate the class for the correct
version either from (1) a static configuration file or (2) a runtime hack
that looks up the version on the Monitor.

(1) gives safety at the expense of the user having to specify another
parameter.  (2) looks like it will work at least in the near future going
to 1.7, as well as for past versions.

Thanks for the suggestions!  I like the two classes approach better both as
a developer and as a user; no need to juggle JARs.
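
Something along these lines (a sketch with hypothetical names, not existing
code):

    // One interface, one implementation per supported Accumulo version;
    // the real bodies would wrap the version-specific monitor internals.
    interface MonitorInfoFetcher {
      long numEntries(String tableName);
    }

    class MonitorInfoFetcher15 implements MonitorInfoFetcher {
      public long numEntries(String tableName) { return 0; /* 1.5-specific lookup */ }
    }

    class MonitorInfoFetcher16 implements MonitorInfoFetcher {
      public long numEntries(String tableName) { return 0; /* 1.6-specific lookup */ }
    }

    class MonitorInfoFetchers {
      // Pick the implementation from a config value or a runtime version lookup.
      static MonitorInfoFetcher forVersion(String version) {
        return version.startsWith("1.5")
            ? new MonitorInfoFetcher15() : new MonitorInfoFetcher16();
      }
    }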

~Dylan


On Fri, Oct 24, 2014 at 12:41 AM, Sean Busbey bus...@cloudera.com wrote:


 On Thu, Oct 23, 2014 at 10:38 PM, Dylan Hutchison dhutc...@stevens.edu
 wrote:


 I'm working on a clean way to handle getting Accumulo monitor info for
 different versions of Accumulo, since I used methods to extract that
 information from Accumulo's internals which are version-dependent. As Sean
 wrote, these are not things one should do, but if it's a choice between
 getting the info or not...
 We're thinking of building separate JARs for each 1.x version.


 Why not just take the version of Accumulo you're going to talk to as
 configuration information that's given to you as a part of deploying your
 software?

 It'll make your life much simpler in the long run.


 --
 Sean




-- 
www.cs.stevens.edu/~dhutchis


Re: Determining tablets assigned to table splits, and the number of rows in each tablet

2014-10-06 Thread Dylan Hutchison
Yep, ticket here: ACCUMULO-3206
https://issues.apache.org/jira/browse/ACCUMULO-3206

There is a related movement at ACCUMULO-3005
https://issues.apache.org/jira/browse/ACCUMULO-3005 to make the
information of number of entries, number of bytes per tablet / tablet
server per table, available via a RESTful web server as an extension of the
monitor.  With the extra operations you suggest, number of keys in a range
and median key in a range, we would want to keep that at the API level so
that we can introduce authorizations.  Sounds great!

Could you lay out a list of all the stats that Accumulo tracks already so
that we know what to implement, either here or on JIRA?  This will form the
basis for extending the API.

~Dylan


On Mon, Oct 6, 2014 at 10:31 AM, Adam Fuchs afu...@apache.org wrote:

 A few years ago we hashed out a rough idea of creating a stats API
 that would allow users to ask a variety of questions that leverage
 information that is already present in the system. Those questions
 would include things like:
  * Estimate of number of keys in a range. This would satisfy the key
 count per tablet request, but could also be used for things like
 predicting query result sizes.
  * Find the median key in a range. This is useful for doing things
 like parallelizing processing by ranges and predicting sizes of
 intersections.

 I think these would best be exposed in both the iterator API and as
 client operations. We never got around to building this before, mostly
 due to prioritization with other features. However, it seems to be
 coming up in conversation frequently these days. There are going to be
 a few tricky parts around cell-level security (information leakage)
 and accuracy of estimates. Is somebody working on creating this ticket
 already?

 Adam


 On Sat, Oct 4, 2014 at 9:23 PM, Josh Elser josh.el...@gmail.com wrote:
  I'll re-state it: I'd be happy to work with you to figure out some Java
 APIs
  for clients to consume for these kinds of metrics. A JIRA issue is the
 best
  way to encapsulate this. Would also love to help you provide a patch for
 it,
  too :)
 
  The biggest concern (at least for creating an API for entries in a table
 --
  by tablet/tabletserver/otherwise) is going to be that the number of
 entries
  is an approximation, not definitive. This is not prohibitive, though, as
  long as we're clear that it is an approximation and not an exact metric.
 
  Dylan Hutchison wrote:
 
  It should suffice to list the number of entries for a table, tablet and
  tablet server.  No need to worry about number of unique rows, number of
  unique column families, etc.  By entry I mean number of (key,value)s.
 
  For load balancing, we care about how much physical data is on each
 tablet
  / tablet server.  This is directly proportional to the number of
 entries,
   assuming that the key size and value size in bytes do not
  differ too drastically.  If they do (say for raw documents of vastly
  different sizes), the best measure is the /size of the data in bytes
 /for
  each tablet / tablet server.  I didn't suggest it because it doesn't
 look
  like Accumulo tracks it so it would involve a lot of new implementation
 and
  book-keeping, which could hamper performance.
 
  Accumulo does already track the number of entries for tables, tablets
 and
  tablet server.  It's just hard to get to, relying on the format of the
  metadata table and accessing the non-public Monitor classes.  Bringing
 it to
  the public API just looks like a matter of reworking the API and
 letting the
  client gather the information that the Monitor already does by
 connecting to
  each tablet server.  Does that sound reasonable?
 
  Regards, Dylan
 
  On Sat, Oct 4, 2014 at 4:11 PM, David Medinets 
 david.medin...@gmail.com
  mailto:david.medin...@gmail.com wrote:
 
   Adding this functionality into Accumulo's API would reduce its
  efficiency for users that don't need this level of tracking. Let
  ingest procedures take the performance hit. There are
   synchronization issues that degrade performance. Also what
  would be the appropriate level of tracking - at the row,
  column-family, or every level? Whatever answer you give, someone
  else will ask for something different. And then there are the
  aggregation questions. Not to mention the additional storage
  requirements.
 
 
 
  --
  www.cs.stevens.edu/~dhutchis




-- 
www.cs.stevens.edu/~dhutchis


Re: Determining tablets assigned to table splits, and the number of rows in each tablet

2014-10-04 Thread Dylan Hutchison
David, thanks for the pointer to the articles.  I read them a few months
ago but forgot.  Will need to read the HyperLogLog paper
https://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/40671.pdf
.

*The number of unique rows within a tablet are not explicitly tracked.*


Right Josh, I misspoke.  For load balancing, we're interested in the *number
of entries in each tablet*, not the number of unique rows.  Only counting
the number of unique rows doesn't distinguish between really big rows and
singleton rows, and as David pointed out, we need client-controlled means
of doing unique row counting/estimation.

We can see the number of entries in a Table and the number of entries in a
Table of a particular Tablet Server, because these are listed in the
monitor.
[image: Inline image 2]

David, you may recognize the name of this tablet server.  Just got Accumulo
Vagrant https://github.com/medined/Accumulo_1_5_0_By_Vagrant working last
week, thanks ;)

[image: Inline image 1]

However, there could be multiple Tablets assigned to the same Tablet
Server.  Here is an outline of the procedure I followed to read the
*TabletStats.numEntries*
https://accumulo.apache.org/1.5/apidocs/org/apache/accumulo/core/tabletserver/thrift/TabletStats.html#numEntries
for the correct Tablet that holds a split range.

Given table name,

   - get a list of all tablet servers by connecting to the Master and
     referencing the MasterMonitorInfo
     https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/master/thrift/MasterClientService.Client.html#getMasterStats(org.apache.accumulo.trace.thrift.TInfo,%20org.apache.accumulo.core.security.thrift.TCredentials)

   - get the internal table ID via Tables.getNameToIdMap
     https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/client/impl/Tables.html#getNameToIdMap(org.apache.accumulo.core.client.Instance)

   - connect to each tablet server and get the TabletStats
     https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/tabletserver/thrift/TabletStats.html
     of the tablets on that tablet server under the given internal table ID

   - scan the Metadata table starting at the {tableName converted to internal
     table ID} and ending at {internal table ID}< (the last entry for this
     table in the metadata table)
       - Example row: 1<  (if the internal table ID is 1 and this is the
         last split in the row)

   - look at the column for the previous row:  ~tab:~pr
       - Example row-col-val:  1< ~tab:~pr []  \x00
       - (this table has no table splits -- no end row and no previous row start)

   - create an extent for the value using KeyExtent
     https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/data/KeyExtent.html
     (a shortcut for parsing the metadata table and getting the previous and
     current end rows)

   - among the list of TabletStats, find the one whose previous end row and
     next end row match the result from the Metadata table.

Take that tabletStat.numEntries
https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/tabletserver/thrift/TabletStats.html#numEntries
to get the number of entries in this table split range.
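
The metadata-table part of that procedure looks roughly like this (a hedged
sketch against the 1.6 client API, assuming a Connector named "conn" with
permission to read accumulo.metadata; the thrift calls that fetch the
per-tablet TabletStats are left out):

    // assumes imports for Scanner, Range, Key, Value, Text, Authorizations,
    // Tables, KeyExtent, and Map.Entry
    String tableId = Tables.getNameToIdMap(conn.getInstance()).get("mytable");

    // tablet rows for a table run from "{tableId};..." up to the final "{tableId}<"
    Scanner s = conn.createScanner("accumulo.metadata", Authorizations.EMPTY);
    s.setRange(new Range(new Text(tableId + ";"), true, new Text(tableId + "<"), true));
    s.fetchColumn(new Text("~tab"), new Text("~pr"));   // previous-end-row column

    for (Map.Entry<Key,Value> e : s) {
      // KeyExtent parses the flattened extent (row) and the previous end row (value)
      KeyExtent extent = new KeyExtent(e.getKey().getRow(), e.getValue());
      Text prevEndRow = extent.getPrevEndRow();   // null for the first tablet
      Text endRow = extent.getEndRow();           // null for the last tablet
      // ... match (prevEndRow, endRow) against the TabletStats from each tserver ...
    }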

Later this information is combined into a method that returns an array of
triples

(tablet_split_range, tablet_num_entries, tablet_server_list_for_this_tablet)


I recommend adding the ability to get the number of entries for tables,
tablet servers and tablets to the public API.  It would be nice to
reference any of the data from the Accumulo monitor programmatically; in
this case we cross-reference monitor data with the Metadata table.  Josh,
is JIRA the place to file those kinds of suggestions?

Regards,
Dylan

-- 
www.cs.stevens.edu/~dhutchis


Re: Determining tablets assigned to table splits, and the number of rows in each tablet

2014-10-04 Thread Dylan Hutchison
It should suffice to list the number of entries for a table, tablet and
tablet server.  No need to worry about number of unique rows, number of
unique column families, etc.  By entry I mean number of (key,value)s.

For load balancing, we care about how much physical data is on each tablet
/ tablet server.  This is directly proportional to the number of entries,
assuming that the key size and value size in bytes do not differ too
drastically.  If they do (say for raw documents of vastly different sizes),
the best measure is the *size of the data in bytes *for each tablet /
tablet server.  I didn't suggest it because it doesn't look like Accumulo
tracks it so it would involve a lot of new implementation and book-keeping,
which could hamper performance.

Accumulo does already track the number of entries for tables, tablets and
tablet server.  It's just hard to get to, relying on the format of the
metadata table and accessing the non-public Monitor classes.  Bringing it
to the public API just looks like a matter of reworking the API and letting
the client gather the information that the Monitor already does by
connecting to each tablet server.  Does that sound reasonable?

Regards, Dylan

On Sat, Oct 4, 2014 at 4:11 PM, David Medinets david.medin...@gmail.com
wrote:

  Adding this functionality into Accumulo's API would reduce its efficiency
 for users that don't need this level of tracking. Let ingest procedures
  take the performance hit. There are synchronization issues that degrade
  performance. Also what would be the appropriate level of tracking -
 at the row, column-family, or every level? Whatever answer you give,
 someone else will ask for something different. And then there are the
 aggregation questions. Not to mention the additional storage requirements.



-- 
www.cs.stevens.edu/~dhutchis