Re: Why may "tablet read ahead" take long time? (was: Profile a (batch) scan)

2019-01-15 Thread Adam Fuchs
Hi Maxim,

What you're seeing is an artifact of the threading model that Accumulo
uses. When you launch a query, Accumulo tablet servers will coordinate RPCs
via Thrift in one thread pool (which grows unbounded) and queue up scans
(rfile lookups, decryption/decompression, iterators, etc.) in another
threadpool known as the readahead pool (which has a fixed number of
threads). You're seeing everything that happens in that readahead thread in
one big chunk. You may need to look a bit deeper into profiling/sampling
tablet server CPU to get insights into how to improve your query
performance. If you want to speed up queries in general you might try (in
no particular order):
1. Increase parallelism by bumping up the readahead threads
(tserver.readahead.concurrent.max). This will still be bounded by the
number of parallel scans clients are driving.
2. Increase parallelism driven by clients by querying more, smaller ranges,
or by splitting tablets.
3. Increase scan batch sizes if the readahead thread or thrift coordination
overhead is high.
4. Optimize custom iterators if that is a CPU bottleneck.
5. Increase cache sizes or otherwise modify queries to improve cache hit
rates.
6. Change compression settings if that is a CPU bottleneck. Try snappy
instead of gz.
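
Most of these knobs can be set from the shell while you experiment. For
example (1.x property names; the instance and table names are placeholders,
and I believe the tserver cache sizes only take effect after a tserver
restart):

$ accumulo shell -u root
root@myinstance> config -s tserver.readahead.concurrent.max=32
root@myinstance> config -s tserver.cache.index.size=1G
root@myinstance> config -s tserver.cache.data.size=4G
root@myinstance> config -t mytable -s table.cache.index.enable=true
root@myinstance> config -t mytable -s table.cache.block.enable=true
root@myinstance> config -t mytable -s table.file.compress.type=snappy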

Cheers,
Adam

On Tue, Jan 15, 2019, 10:45 AM Maxim Kolchin wrote:
> Hi all,
>
> I try to trace some scans with Zipkin and see that quite often the trace
> called "tablet read ahead" takes 10x or 100x more time than the other
> similar traces.
>
> Why might this happen? What can be done to reduce the time? I found a
> similar discussion on the list, but it doesn't have an answer. It'd be great
> to have a how-to article listing some steps that could be taken.
>
> Attaching a screenshot of one of the traces having this issue.
>
> Maxim Kolchin
>
> E-mail: kolchin...@gmail.com
> Tel.: +7 (911) 199-55-73
> Homepage: http://kolchinmax.ru
>
> Below you can find a good example of what I'm struggling to understand
>> right now. It's a trace for a simple scan over some columns with a
>> BatchScanner using 75 threads. The scan takes 877 milliseconds and the main
>> contributor is the entry "tablet read ahead 1", which starts at 248 ms.
>> These are the questions that I cannot answer with this trace:
>>
>>    1. Why does this heavy operation start after 248 ms? Summing up the delays
>>    before this operation gives a number which is not even close to 248 ms.
>>    2. What does "tablet read ahead 1" mean? In general, how do you map the
>>    entries of a trace to their meaning? Is there a guide about this?
>>    3. Why does "tablet read ahead 1" take 600 ms? It's clearly not the sum of
>>    the entries under it, yet it's the dominant part of the trace.
>>    4. I may be naive, but... how much data has been read by this scan? How
>>    many entries? That's very important for understanding what's going on.
>>
>> Thanks for the help,
>>
>> Mario
>>
>> 877+ 0 Dice@h01 counts
>> 2+ 7 tserver@h12 startScan
>> 6+ 10 tserver@h15 startScan
>> 5+ 11 tserver@h15 metadata tablets read ahead 4
>> 843+ 34 Dice@h01 batch scanner 74- 1
>> 620+ 230 tserver@h09 startMultiScan
>> 600+ 248 tserver@h09 tablet read ahead 1
>> 22+ 299 tserver@h09 newDFSInputStream
>> 22+ 299 tserver@h09 getBlockLocations
>> 2+ 310 tserver@h09 ClientNamenodeProtocol#getBlockLocations
>> 1+ 321 tserver@h09 getFileInfo
>> 1+ 321 tserver@h09 ClientNamenodeProtocol#getFileInfo
>> 2+ 322 tserver@h09 DFSInputStream#byteArrayRead
>> 1+ 324 tserver@h09 DFSInputStream#byteArrayRead
>> 2+ 831 tserver@h09 DFSInputStream#byteArrayRead
>> 2+ 834 tserver@h09 DFSInputStream#byteArrayRead
>> 1+ 835 tserver@h09 BlockReaderLocal#fillBuffer(1091850413)
>> 1+ 874 tserver@h09 closeMultiScan
>> --
>> Mario Pastorelli | TERALYTICS
>>
>> *software engineer*
>>
>> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
>> phone: +41794381682
>> email: mario.pastore...@teralytics.ch
>> www.teralytics.net
>>
>> Company registration number: CH-020.3.037.709-7 | Trade register Canton
>> Zurich
>> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann
>> de Vries
>>
>> This e-mail message contains confidential information which is for the sole
>> attention and use of the intended recipient. Please notify us at once if
>> you think that it may not be intended for you and delete it immediately.
>>
>>


Re: Major Compactions

2017-12-12 Thread Adam Fuchs
Watch out for ACCUMULO-4578 if you're using --cancel on one of the affected
versions (1.7.2 or 1.8.0 or earlier).

Adam


On Tue, Dec 12, 2017 at 7:57 AM, Mike Walch  wrote:

> There should be a mention of the --cancel option in the docs.  I created a
> PR to add it to the 2.0 docs:
>
> https://github.com/apache/accumulo-website/pull/51
>
> On Tue, Dec 12, 2017 at 1:30 AM Jeff Downton  wrote:
>
>> Thank you for the responses, I didn't see the --cancel option in the docs
>> so good that I asked.  I don't think I'll go as far as trying to delete the
>> fate transaction but it's an option I'll hold in reserve.
>>
>> Cheers!
>> Jeff
>>
>> On Mon, Dec 11, 2017 at 7:29 PM, Keith Turner  wrote:
>>
>>> The command to run in the shell is "compact --cancel -t table"
>>>
>>> On Mon, Dec 11, 2017 at 7:41 PM, Jeff Downton 
>>> wrote:
>>> > Hi All,
>>> >
>>> > Is it possible to stop a major compaction once it has begun (Accumulo
>>> > 1.7.0)?
>>> >
>>> > Manually kicked one off on a table containing ~20k tablets,  which
>>> > subsequently queued up ~20k major compactions (I'm assuming one for
>>> each
>>> > tablet in the table).
>>> >
>>> > It's running slower than I'd like so I'm looking to defer running it
>>> till
>>> > another time.  The purpose for the major compaction is to permanently
>>> remove
>>> > deleted key-value pairs in the table.
>>> >
>>> > Thanks in advance for any help.
>>> >
>>> > -Jeff
>>> >
>>>
>>
>>
>>
>> --
>> Jeff Downton
>> Software Developer
>> *PHEMI Systems*
>> 180-887 Great Northern Way
>> 
>> Vancouver, BC V5T 4T5
>> 
>> 604-726-9433
>> website  twitter 
>>  Linkedin
>> 
>>
>


Re: Key Refactoring

2017-06-21 Thread Adam Fuchs
Sven,

You might consider using a combination of AccumuloInputFormat and
AccumuloFileOutputFormat in a map/reduce job. The job will run in parallel,
speeding up your transformation; the map/reduce framework should help with
hiccups; and the bulk load at the end provides an atomic, eventually
consistent commit. These input/output formats can also be used with other
job frameworks like Spark. See for example:

examples/simple/src/main/java/org/apache/accumulo/examples/simple/mapreduce/TableToFile.java
examples/simple/src/main/java/org/apache/accumulo/examples/simple/mapreduce/bulk/BulkIngestExample.java
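
Roughly, the wiring for your case looks like the sketch below (untested;
instance, credentials, table names, output path and the actual key transform
are placeholders). One caveat: AccumuloFileOutputFormat expects the keys it
receives in sorted order, so a map-only job is only safe when your new keys
preserve the order of the input; otherwise add the reduce phase and
RangePartitioner approach from BulkIngestExample. After the job, bulk-import
the output directory with tableOperations().importDirectory(...).

import java.io.IOException;
import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.mapreduce.AccumuloFileOutputFormat;
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class RefactorKeys {

  // Reads (Key,Value) pairs from the source table and emits them with a
  // rewritten key into RFiles suitable for bulk import.
  public static class RemapMapper extends Mapper<Key, Value, Key, Value> {
    @Override
    protected void map(Key k, Value v, Context ctx) throws IOException, InterruptedException {
      ctx.write(remap(k), v);
    }

    private Key remap(Key k) {
      return k; // your new key schema goes here
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance();
    job.setJarByClass(RefactorKeys.class);

    job.setInputFormatClass(AccumuloInputFormat.class);
    AccumuloInputFormat.setZooKeeperInstance(job, ClientConfiguration.loadDefault()
        .withInstance("myinstance").withZkHosts("zkhost:2181"));
    AccumuloInputFormat.setConnectorInfo(job, "user", new PasswordToken("pass"));
    AccumuloInputFormat.setInputTableName(job, "source_table");
    AccumuloInputFormat.setScanAuthorizations(job, Authorizations.EMPTY);
    // AccumuloInputFormat.setRanges(...) / fetchColumns(...) to narrow the scan

    job.setMapperClass(RemapMapper.class);
    job.setNumReduceTasks(0); // map-only; see the sorting caveat above
    job.setOutputKeyClass(Key.class);
    job.setOutputValueClass(Value.class);

    job.setOutputFormatClass(AccumuloFileOutputFormat.class);
    AccumuloFileOutputFormat.setOutputPath(job, new Path("/tmp/refactored-rfiles"));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}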

Cheers,
Adam



On Wed, Jun 21, 2017 at 1:49 AM, Sven Hodapp  wrote:

> Hi there,
>
> I would like to select a subset of an Accumulo table and refactor the keys
> to create a new table.
> There are about 30M records with a value size about 5-20KB each.
> I'm using Accumulo 1.8.0 and Java accumulo-core client library 1.8.0.
>
> I've written client code like that:
>
>  * create a scanner fetching a specific column in a specific range
>  * transforming the key into the new schema
>  * using a batch writer to write the new generated mutations into the new
> table
>
> scan = createScanner(FROM, auths)
> // range, fetchColumn
> writer = createBatchWriter(TO, configWriter)
> iter = scan.iterator()
> while (iter.hasNext()) {
> entry = iter.next()
> // create mutation with new key schema, but unaltered value
> writer.addMutation(mutation)
> }
> writer.close()
>
> But this is slow and error prone (hiccups, ...).
> Is it possible to use the Accumulo shell for such a task?
> Are there other solutions I can use, or some tricks?
>
> Thank you very much for any advice!
>
> Regards,
> Sven
>
> --
> Sven Hodapp, M.Sc.,
> Fraunhofer Institute for Algorithms and Scientific Computing SCAI,
> Department of Bioinformatics
> Schloss Birlinghoven, 53754 Sankt Augustin, Germany
> sven.hod...@scai.fraunhofer.de
> www.scai.fraunhofer.de
>


Re: Accumulo Seek performance

2016-09-12 Thread Adam Fuchs
Sorry, Monday morning poor reading skills, I guess. :)

So, 3000 ranges in 40 seconds with the BatchScanner. In my past experience
HDFS seeks tend to take something like 10-100ms, and I would expect that
time to dominate here. With 60 client threads your bottleneck should be the
readahead pool, which I believe defaults to 16 threads. If you get perfect
index caching then you should be seeing something like 3000/16*50ms =
9,375ms. That's in the right ballpark, but it assumes no data cache hits.
Do you have any idea of how many files you had per tablet after the ingest?
Do you know what your cache hit rate was?

Adam


On Mon, Sep 12, 2016 at 9:14 AM, Josh Elser <josh.el...@gmail.com> wrote:

> 5 iterations, figured that would be apparent from the log messages :)
>
> The code is already posted in my original message.
>
> Adam Fuchs wrote:
>
>> Josh,
>>
>> Two questions:
>>
>> 1. How many iterations did you do? I would like to see an absolute
>> number of lookups per second to compare against other observations.
>>
>> 2. Can you post your code somewhere so I can run it?
>>
>> Thanks,
>> Adam
>>
>>
>> On Sat, Sep 10, 2016 at 3:01 PM, Josh Elser <josh.el...@gmail.com
>> <mailto:josh.el...@gmail.com>> wrote:
>>
>> Sven, et al:
>>
>> So, it would appear that I have been able to reproduce this one
>> (better late than never, I guess...). tl;dr Serially using Scanners
>> to do point lookups instead of a BatchScanner is ~20x faster. This
>> sounds like a pretty serious performance issue to me.
>>
>> Here's a general outline for what I did.
>>
>> * Accumulo 1.8.0
>> * Created a table with 1M rows, each row with 10 columns using YCSB
>> (workloada)
>> * Split the table into 9 tablets
>> * Computed the set of all rows in the table
>>
>> For a number of iterations:
>> * Shuffle this set of rows
>> * Choose the first N rows
>> * Construct an equivalent set of Ranges from the set of Rows,
>> choosing a random column (0-9)
>> * Partition the N rows into X collections
>> * Submit X tasks to query one partition of the N rows (to a thread
>> pool with X fixed threads)
>>
>> I have two implementations of these tasks. One, where all ranges in
>> a partition are executed via one BatchScanner. A second where each
>> range is executed in serial using a Scanner. The numbers speak for
>> themselves.
>>
>> ** BatchScanners **
>> 2016-09-10 17:51:38,811 [joshelser.YcsbBatchScanner] INFO : Shuffled
>> all rows
>> 2016-09-10 17:51:38,843 [joshelser.YcsbBatchScanner] INFO : All
>> ranges calculated: 3000 ranges found
>> 2016-09-10 17:51:38,846 [joshelser.YcsbBatchScanner] INFO :
>> Executing 6 range partitions using a pool of 6 threads
>> 2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Queries
>> executed in 40178 ms
>> 2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO :
>> Executing 6 range partitions using a pool of 6 threads
>> 2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Queries
>> executed in 42296 ms
>> 2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO :
>> Executing 6 range partitions using a pool of 6 threads
>> 2016-09-10 17:53:47,414 [joshelser.YcsbBatchScanner] INFO : Queries
>> executed in 46094 ms
>> 2016-09-10 17:53:47,415 [joshelser.YcsbBatchScanner] INFO :
>> Executing 6 range partitions using a pool of 6 threads
>> 2016-09-10 17:54:35,118 [joshelser.YcsbBatchScanner] INFO : Queries
>> executed in 47704 ms
>> 2016-09-10 17:54:35,119 [joshelser.YcsbBatchScanner] INFO :
>> Executing 6 range partitions using a pool of 6 threads
>> 2016-09-10 17:55:24,339 [joshelser.YcsbBatchScanner] INFO : Queries
>> executed in 49221 ms
>>
>> ** Scanners **
>> 2016-09-10 17:57:23,867 [joshelser.YcsbBatchScanner] INFO : Shuffled
>> all rows
>> 2016-09-10 17:57:23,898 [joshelser.YcsbBatchScanner] INFO : All
>> ranges calculated: 3000 ranges found
>> 2016-09-10 17:57:23,903 [joshelser.YcsbBatchScanner] INFO :
>> Executing 6 range partitions using a pool of 6 threads
>> 2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Queries
>> executed in 2833 ms
>> 2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO :
>> Executing 6 range partitions using a pool of 6 threads
>> 2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] 

Re: Accumulo Seek performance

2016-09-12 Thread Adam Fuchs
Josh,

Two questions:

1. How many iterations did you do? I would like to see an absolute number
of lookups per second to compare against other observations.

2. Can you post your code somewhere so I can run it?

Thanks,
Adam


On Sat, Sep 10, 2016 at 3:01 PM, Josh Elser  wrote:

> Sven, et al:
>
> So, it would appear that I have been able to reproduce this one (better
> late than never, I guess...). tl;dr Serially using Scanners to do point
> lookups instead of a BatchScanner is ~20x faster. This sounds like a pretty
> serious performance issue to me.
>
> Here's a general outline for what I did.
>
> * Accumulo 1.8.0
> * Created a table with 1M rows, each row with 10 columns using YCSB
> (workloada)
> * Split the table into 9 tablets
> * Computed the set of all rows in the table
>
> For a number of iterations:
> * Shuffle this set of rows
> * Choose the first N rows
> * Construct an equivalent set of Ranges from the set of Rows, choosing a
> random column (0-9)
> * Partition the N rows into X collections
> * Submit X tasks to query one partition of the N rows (to a thread pool
> with X fixed threads)
>
> I have two implementations of these tasks. One, where all ranges in a
> partition are executed via one BatchScanner. A second where each range is
> executed in serial using a Scanner. The numbers speak for themselves.
>
> ** BatchScanners **
> 2016-09-10 17:51:38,811 [joshelser.YcsbBatchScanner] INFO : Shuffled all
> rows
> 2016-09-10 17:51:38,843 [joshelser.YcsbBatchScanner] INFO : All ranges
> calculated: 3000 ranges found
> 2016-09-10 17:51:38,846 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 40178 ms
> 2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 42296 ms
> 2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:53:47,414 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 46094 ms
> 2016-09-10 17:53:47,415 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:54:35,118 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 47704 ms
> 2016-09-10 17:54:35,119 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:55:24,339 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 49221 ms
>
> ** Scanners **
> 2016-09-10 17:57:23,867 [joshelser.YcsbBatchScanner] INFO : Shuffled all
> rows
> 2016-09-10 17:57:23,898 [joshelser.YcsbBatchScanner] INFO : All ranges
> calculated: 3000 ranges found
> 2016-09-10 17:57:23,903 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 2833 ms
> 2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 2536 ms
> 2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 2150 ms
> 2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 2061 ms
> 2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Executing 6
> range partitions using a pool of 6 threads
> 2016-09-10 17:57:35,628 [joshelser.YcsbBatchScanner] INFO : Queries
> executed in 2140 ms
>
> Query code is available https://github.com/joshelser/accumulo-range-binning
>
>
> Sven Hodapp wrote:
>
>> Hi Keith,
>>
>> I've tried it with 1, 2 or 10 threads. Unfortunately there were no
>> amazing differences.
>> Maybe it's a problem with the table structure? For example it may happen
>> that one row id (e.g. a sentence) has several thousand column families. Can
>> this affect the seek performance?
>>
>> So for my initial example it has about 3000 row ids to seek, which will
>> return about 500k entries. If I filter for specific column families (e.g. a
>> document without annotations) it will return about 5k entries, but the seek
>> time will only be halved.
>> Are there too many column families to seek fast?
>>
>> Thanks!
>>
>> Regards,
>> Sven
>>
>>


Re: Adding a second node to a single node installation

2016-05-23 Thread Adam Fuchs
Cyrille,

I think you're going to have to do a few things to get the nodes to act as
a cluster:

1. How would you like your Zookeeper cluster to be set up? If you're
planning on using a one-node Zookeeper instance on the master node, then
you may need to turn zookeeper off on your second node and set the
instance.zookeeper.host property in your accumulo-site.xml file. There are
other ways to get multiple zookeeper nodes to participate in a quorum that
are specified in the zookeeper documentation.

2. I'm guessing that your HDFS is also set up to have two distinct
filesystems on the two nodes. Accumulo requires the same filesystem
perspective on each node, so you'll need to create an HDFS instance that
spans the cluster following the distributed configuration guide in the
Hadoop documentation, then tell Accumulo to use that HDFS instance via the
instance.volumes property in accumulo-site.xml.

3. You generally don't need to run bin/accumulo init to add a node, only to
create a new cluster. However, you may have a tough time migrating existing data
from your single node HDFS setup to a clustered setup. Migrating data is
possible, but if initializing again for the cluster is OK then that will be
an easier option.
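
For points 1 and 2, the relevant accumulo-site.xml entries on both nodes
would look something like the following (hostname, port and path are
placeholders for your setup):

  <property>
    <name>instance.zookeeper.host</name>
    <value>master-node:2181</value>
  </property>
  <property>
    <name>instance.volumes</name>
    <value>hdfs://master-node:8020/accumulo</value>
  </property>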

Cheers,
Adam


On Mon, May 23, 2016 at 7:33 AM, Cyrille Savelief 
wrote:

> Hi,
>
> I have a single node Accumulo installation on Google Compute Engine
> working perfectly fine. I am now trying to add a second node but I am not
> able to make it work in cooperation with the first node. What I did :
>
> *On the first node*
> - stopped accumulo : ./bin/stop-here.sh
> - replaced localhost by the DNS of the current node in /conf/masters
> - replaced localhost by the DNS of the current node in /conf/slaves
> - added the DNS of the new node to /conf/slaves
> - started accumulo again : ./bin/start-here.sh
>
> *Result*
> - Everything is working fine : the monitor shows up and I am able to
> ingest data.
>
> *On the second node*
> - Hadoop, ZooKeeper and Accumulo have been installed
> - Files from the first node /conf directory have been copied to the second
> node /conf directory
> - ./bin/accumulo init has been executed
> - accumulo has been started via ./bin/start-here.sh without errors
>
> *Result*
> - The monitor shows up for the second node with the indication "MASTER IS
> DOWN".
> - The instance-ids are different on the first and second nodes although the
> instance-names are identical.
> - There are no errors in the log files.
>
> Did I miss something ?
>
> Best,
>
>
>
>


Re: Accumulo folks at Hadoop Summit San Jose

2016-05-19 Thread Adam Fuchs
I'll be there.

Adam

On Thu, May 19, 2016 at 11:01 AM, Josh Elser  wrote:

> Out of curiosity, are there going to be any Accumulo-folks at Hadoop
> Summit in San Jose, CA at the end of June?
>
> - Josh
>


Re: Three day Fluo Common Crawl test

2016-01-12 Thread Adam Fuchs
Nice writeup!

Thanks,
Adam

On Tue, Jan 12, 2016 at 11:59 AM, Keith Turner  wrote:

> We just completed a three day test of Fluo using Common Crawl data that
> went pretty well.
>
> http://fluo.io/webindex-long-run/
>
>
>


Re: Trigger for Accumulo table

2015-12-08 Thread Adam Fuchs
I totally agree, Christopher. I have also run into a few situations where
it would have been nice to have something like a mutation listener hook.
Particularly in generating indexing and stats records.
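
For anyone curious what that hook looks like today, here is a rough sketch of
a "listener" constraint that never rejects anything and only observes each
live-ingested mutation (the class name and the notification call are made up,
and note Keith's caveat below about failed writes):

import java.util.List;
import org.apache.accumulo.core.constraints.Constraint;
import org.apache.accumulo.core.data.Mutation;

public class NotifyingConstraint implements Constraint {

  @Override
  public String getViolationDescription(short violationCode) {
    return null; // we never report violations
  }

  @Override
  public List<Short> check(Environment env, Mutation mutation) {
    // Side effect only: hand the changed row off to your own notifier here,
    // e.g. Notifier.rowChanged(mutation.getRow());  // hypothetical call
    return null; // null (or an empty list) means no violations
  }
}

You would register it with
connector.tableOperations().addConstraint("mytable", NotifyingConstraint.class.getName())
and make sure the class is on the tablet servers' classpath.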

Adam


On Tue, Dec 8, 2015 at 5:59 PM, Christopher  wrote:

> In the future, it might be useful to provide a supported API hook here. It
> certainly would've made implementing replication easier, but could also be
> useful as a notification system.
>
> On Tue, Dec 8, 2015 at 4:51 PM Keith Turner  wrote:
>
>> Constraints are checked before data is written.  In the case of failures
>> a constraint may see data that's never successfully written.
>>
>> On Tue, Dec 8, 2015 at 4:18 PM, Christopher  wrote:
>>
>>> Look at org.apache.accumulo.core.constraints.Constraint for a
>>> description and
>>> org.apache.accumulo.core.constraints.DefaultKeySizeConstraint as an example.
>>>
>>> In short, Mutations which are live-ingested into a tablet server are
>>> validated against constraints you specify on the table. That means that all
>>> Mutations written to a table go through this bit of user-provided code at
>>> least once. You could use that fact to your advantage. However, this would
>>> be highly experimental and might have some caveats to consider.
>>>
>>> You can configure a constraint on a table with
>>> connector.tableOperations().addConstraint(...)
>>>
>>>
>>> On Sun, Dec 6, 2015 at 10:49 PM Thai Ngo  wrote:
>>>
 Christopher,

 This is interesting! Could you please give me more details about this?

 Thanks,
 Thai

 On Thu, Dec 3, 2015 at 12:17 PM, Christopher 
 wrote:

> You could also implement a constraint to notify an external system
> when a row is updated.
>
> On Wed, Dec 2, 2015, 22:54 Josh Elser  wrote:
>
>> oops :)
>>
>> [1] http://fluo.io/
>>
>> Josh Elser wrote:
>> > Hi Thai,
>> >
>> > There is no out-of-the-box feature provided with Accumulo that does
>> what
>> > you're asking for. Accumulo doesn't provide any functionality to
>> push
>> > notifications to other systems. You could potentially maintain other
>> > tables/columns in which you maintain the last time a row was
>> updated,
>> > but the onus is on your "other services" to read the table to find
>> out
>> > when a change occurred (which is probably not scalable at "real
>> time").
>> >
>> > There are other systems you could likely leverage to solve this,
>> > depending on the durability and scalability that your application
>> needs.
>> >
>> > For a system "close" to Accumulo, you could take a look at Fluo [1]
>> > which is an implementation of Google's "Percolator" system. This is
>> a
>> > system based on throughput rather than low-latency, so it may not
>> be a
>> > good fit for your needs. There are probably other systems in the
>> Apache
>> > ecosystem (Kafka, Storm, Flink or Spark Streaming maybe?) that may be
>> > helpful to your problem. I'm not enough of an expert on these to recommend
>> > one (nor do I think I understand your entire architecture well enough).
>> >
>> > Thai Ngo wrote:
>> >> Hi list,
>> >>
>> >> I have a use-case when existing rows in a table will be updated by
>> an
>> >> internal service. Data in a row of this table is composed of 2
>> parts:
>> >> 1st part - immutable and the 2nd one - will be updated (filled in)
>> a
>> >> little later.
>> >>
>> >> Currently, I have a need of knowing when and which rows will be
>> updated
>> >> in the table so that other services will be wisely start consuming
>> the
>> >> data. It will make more sense when I need to consume the data in
>> near
>> >> realtime. So developing a notification function or simpler - a
>> trigger
>> >> is what I really want to do now.
>> >>
>> >> I am curious to know if someone has done similar job or there are
>> >> features or APIs or best practices available for Accumulo so far.
>> I'm
>> >> thinking of letting the internal service which updates the data
>> notify
>> >> us whenever it updates the data.
>> >>
>> >> What do you think?
>> >>
>> >> Thanks,
>> >> Thai
>>
>

>>


Re: Can't connect to Accumulo

2015-12-04 Thread Adam Fuchs
Mike,

I suspect if you get rid of the "localhost" line and restart Accumulo then
you will get services listening on the non-loopback IPs. Right now you have
some of your processes accessible outside your VM and others only
accessible from inside, and you probably have two tablet servers when you
should only have one.
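
Something like this on the VM should do it (rough sketch, assuming a standard
tarball layout under $ACCUMULO_HOME):

$ sed -i '/^localhost$/d' $ACCUMULO_HOME/conf/masters $ACCUMULO_HOME/conf/slaves \
      $ACCUMULO_HOME/conf/monitor $ACCUMULO_HOME/conf/gc $ACCUMULO_HOME/conf/tracers
$ $ACCUMULO_HOME/bin/stop-all.sh && $ACCUMULO_HOME/bin/start-all.sh
$ netstat -nlpt | grep 9997   # should now show the VM's non-loopback address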

Cheers,
Adam



On Fri, Dec 4, 2015 at 9:50 AM, Mike Thomsen  wrote:

> I tried adding some read/write examples and ran into a problem. It would
> hang at the first scan or write operation I tried. I checked the master
> port () and it was only listening on 127.0.0.1:. netstat had two
> entries for 9997. This is what conf/masters has for my VM:
>
> # limitations under the License.
>
> localhost
> vagrant-ubuntu-vivid-64
>
> It's the same with all of the other files (slaves, gc, etc.)
>
> Any ideas?
>
> Thanks,
>
> Mike
>
> On Thu, Dec 3, 2015 at 3:54 PM, Mike Thomsen 
> wrote:
>
>> Thanks! That was all that I needed to do.
>>
>> On Thu, Dec 3, 2015 at 3:33 PM, Josh Elser  wrote:
>>
>>> Could be that the Accumulo services are only listening on localhost and
>>> not the "external" interface for your VM. To get a connector, that's a call
>>> to a TabletServer which run on 9997 by default (and you have open).
>>>
>>> Do a `netstat -nape | fgrep 9997 | fgrep LISTEN` in your VM and see what
>>> interface the server is bound to. I'd venture a guess that you just need to
>>> put the FQDN for your VM in $ACCUMULO_CONF_DIR/slaves (and masters,
>>> monitor, gc, tracers, for completeness) instead of localhost.
>>>
>>>
>>> Mike Thomsen wrote:
>>>
 I have Accumulo running in a VM. This Groovy script will connect just
 fine from within the VM, but outside of the VM it hangs at the first
 println statement.

 import org.apache.accumulo.core.client.Connector
 import org.apache.accumulo.core.client.ZooKeeperInstance
 import org.apache.accumulo.core.client.security.tokens.AuthenticationToken
 import org.apache.accumulo.core.client.security.tokens.PasswordToken

 String instance = "test"
 String zkServers = "localhost:2181"
 String principal = "root";
 AuthenticationToken authToken = new PasswordToken("testing1234");

 ZooKeeperInstance inst = new ZooKeeperInstance(instance, zkServers);
 println "Attempting connection"
 Connector conn = inst.getConnector(principal, authToken);
 println "Connected!"

 This is the listing of ports I have opened up in Vagrant:

 config.vm.network "forwarded_port", guest: 2122, host: 2122
config.vm.network "forwarded_port", guest: 2181, host: 2181
config.vm.network "forwarded_port", guest: 2888, host: 2888
config.vm.network "forwarded_port", guest: 3888, host: 3888
config.vm.network "forwarded_port", guest: 4445, host: 4445
config.vm.network "forwarded_port", guest: 4560, host: 4560
config.vm.network "forwarded_port", guest: 6379, host: 6379
config.vm.network "forwarded_port", guest: 8020, host: 8020
config.vm.network "forwarded_port", guest: 8030, host: 8030
config.vm.network "forwarded_port", guest: 8031, host: 8031
config.vm.network "forwarded_port", guest: 8032, host: 8032
config.vm.network "forwarded_port", guest: 8033, host: 8033
config.vm.network "forwarded_port", guest: 8040, host: 8040
config.vm.network "forwarded_port", guest: 8042, host: 8042
config.vm.network "forwarded_port", guest: 8081, host: 8081
config.vm.network "forwarded_port", guest: 8082, host: 8082
config.vm.network "forwarded_port", guest: 8088, host: 8088
config.vm.network "forwarded_port", guest: 9000, host: 9000
config.vm.network "forwarded_port", guest: 9092, host: 9092
config.vm.network "forwarded_port", guest: 9200, host: 9200
config.vm.network "forwarded_port", guest: 9300, host: 9300
config.vm.network "forwarded_port", guest: 9997, host: 9997
config.vm.network "forwarded_port", guest: , host: 
#config.vm.network "forwarded_port", guest: 10001, host: 10001
config.vm.network "forwarded_port", guest: 10002, host: 10002
config.vm.network "forwarded_port", guest: 11224, host: 11224
config.vm.network "forwarded_port", guest: 12234, host: 12234
config.vm.network "forwarded_port", guest: 19888, host: 19888
config.vm.network "forwarded_port", guest: 42424, host: 42424
config.vm.network "forwarded_port", guest: 49707, host: 49707
config.vm.network "forwarded_port", guest: 50010, host: 50010
config.vm.network "forwarded_port", guest: 50020, host: 50020
config.vm.network "forwarded_port", guest: 50070, host: 50070
config.vm.network "forwarded_port", guest: 50075, host: 50075
config.vm.network "forwarded_port", guest: 50090, host: 50090
config.vm.network "forwarded_port", guest: 50091, host: 50091
config.vm.network "forwarded_port", guest: 50095, host: 50095

 Any ideas why it is not letting me connect? It just hangs and never even
 seems to time out.

 Thanks,

 Mike

>>>
>>
>


Re: Quick question re UnknownHostException

2015-11-13 Thread Adam Fuchs
Josef,

If these are intermittent failures, you might consider turning on the
watcher [1] to automatically restart your processes. This should keep your
cluster from atrophying over time. You'll still have to take administrative
action to fix the DNS problem, but your availability should be better.
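
Enabling it is a one-line change in accumulo-env.sh, if I remember right,
followed by restarting the Accumulo processes:

export ACCUMULO_WATCHER="true"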

Cheers,
Adam

[1] http://accumulo.apache.org/1.7/accumulo_user_manual.html#watcher

On Fri, Nov 13, 2015 at 6:57 AM, Josef Roehrl - PHEMI 
wrote:

> Hi Everyone,
>
> Turns out that it was a DNS server issue exactly.  Had to get this
> confirmed by the Data Centre, though.
>
> Thanks!
>
> On Fri, Nov 13, 2015 at 12:25 PM, Josef Roehrl - PHEMI 
> wrote:
>
>> Hi All,
>>
>> 3 times in the past few weeks (twice on 1 system, once on another), the
>> master gets UnknownHostException (s), one by one, for each of the tablet
>> servers.  Then, it wants to stop them. Eventually, all the tablet servers
>> quit.
>>
>> It goes like this for all the tablet servers:
>>
>> 12 08:14:01,0498  tserver:620
>> ERROR
>>
>> error sending update to tserver3:9997: 
>> org.apache.thrift.transport.TTransportException: 
>> java.net.UnknownHostException
>>
>> 12 09:01:53,0352  master:12
>> ERROR
>>
>> org.apache.thrift.transport.TTransportException: 
>> java.net.UnknownHostException
>>
>> 12 16:35:50,0672  master:110
>> ERROR
>>
>> unable to get tablet server status tserver3:9997[250e6cd2c500012] 
>> org.apache.thrift.transport.TTransportException: 
>> java.net.UnknownHostException
>>
>>
>>
>> I've redacted the real host names, of course.
>>
>> This could be a DNS problem, though the system was running fine for days
>> before this happened (same scenario on the 2 systems with really quite
>> different DNS servers).
>>
>> If any one has a hint or seen something like this, I would appreciate any
>> pointers.
>>
>> I have looked at the JIRA issues regarding DNS outages, but nothing seems
>> to fit this pattern.
>>
>> Thanks
>>
>> --
>>
>>
>> Josef Roehrl
>> Senior Software Developer
>> *PHEMI Systems*
>> 180-887 Great Northern Way
>> Vancouver, BC V5T 4T5
>> 604-336-1119
>> Website  Twitter
>>  Linkedin
>> 
>>
>>
>>
>
>
> --
>
>
> Josef Roehrl
> Senior Software Developer
> *PHEMI Systems*
> 180-887 Great Northern Way
> Vancouver, BC V5T 4T5
> 604-336-1119
> Website  Twitter 
>  Linkedin
> 
>
>
>


Re: pre-sorting row keys vs not pre-sorting row keys

2015-10-29 Thread Adam Fuchs
I bet what you're seeing is more efficient batching in the latter case.
BatchWriter goes through a binning phase whenever it fills up half of its
buffer, binning everything in the buffer into tablets. If you give it
sorted data it will probably be binning into a subset of the tablets
instead of all of them, which would be likely in the random case. Fewer
batches translates into fewer RPC calls, and less general overhead.

This generally indicates that if your data starts roughly partitioned it
will load faster, and that becomes more important as you scale up.

Adam
Hi,

We just did a simple test:

- insert 10k batches of columns
- sort the same 10k batch based on row keys and insert

So basically the batch writer in the first test has items in non-sorted
order and in the second one in sorted order. We noticed 50% better
performance in the sorted version! Why is that the case? Is this something
we need to consider doing for live ingest scenarios?

Thanks,
Ara.





This message is for the designated recipient only and may contain
privileged, proprietary, or otherwise confidential information. If you have
received it in error, please notify the sender immediately and delete the
original. Any other use of the e-mail by you is prohibited. Thank you in
advance for your cooperation.




Re: Is there a sensible way to do this? Sequential Batch Scanner

2015-10-28 Thread Adam Fuchs
Rob,

I would use something like an IteratorChain [1] and feed it
Scanner.iterator() objects. If you setReadaheadThreshold(0) on the scanner
then calling Scanner.iterator() is a fairly lightweight operation, and
you'll be able to plop a bunch of iterators into the IteratorChain so that
they are dynamically activated when you're ready for them. If you want
higher throughput you will have to do something tricky with readahead
thresholds, like writing your own iterator chain and reading ahead on only
a few ScannerIterators at a time. You might not need that to get good
enough performance, though.

[1]
https://commons.apache.org/proper/commons-collections/javadocs/api-2.1.1/org/apache/commons/collections/iterators/IteratorChain.html
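
A rough sketch of the simple version, assuming the connector, table name and
ranges are yours (class and method names are made up):

import java.util.List;
import java.util.Map;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.commons.collections.iterators.IteratorChain;

public class ChainedScan {
  static void scanInOrder(Connector conn, String table, List<Range> ranges) throws Exception {
    IteratorChain chain = new IteratorChain();
    for (Range r : ranges) {
      Scanner s = conn.createScanner(table, Authorizations.EMPTY);
      s.setRange(r);
      s.setReadaheadThreshold(0); // keeps iterator() lightweight until consumed
      chain.addIterator(s.iterator());
    }
    while (chain.hasNext()) {
      @SuppressWarnings("unchecked")
      Map.Entry<Key, Value> e = (Map.Entry<Key, Value>) chain.next();
      // process e.getKey() / e.getValue(); entries arrive in range order,
      // and sorted within each range
    }
  }
}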

Adam

On Wed, Oct 28, 2015 at 4:00 PM, Rob Povey  wrote:

> Unfortunately that’s pretty much what I’m doing now, and the results are
> large enough that pulling them back and sorting them causes fairly dramatic
> GC issues.
> If I could get them in sorted order I no longer need to retain them, I can
> just process them and discard them eliminating my GC issues.
> I think the way I’ll end up working around this in the short term is to
> pull pages of data from a batch scanner, sort those, then combine the paged
> results. That should be manageable.
>
> Rob Povey
>
> From: Keith Turner 
> Reply-To: "user@accumulo.apache.org" 
> Date: Wednesday, October 28, 2015 at 8:04 AM
> To: "user@accumulo.apache.org" 
> Subject: Re: Is there a sensible way to do this? Sequential Batch Scanner
>
> Will the results always fit into memory?  If so, you could put the results from
> the batch scanner into an ArrayList and sort it.
>
> On Tue, Oct 27, 2015 at 6:21 PM, Rob Povey  wrote:
>
>> What I want is something that behaves like a BatchScanner (I.e. Takes a
>> collection of Ranges in a single RPC), but preserves the scan ordering.
>> I understand this would greatly impact performance, but in my case I can
>> manually partition my request on the client, and send one request per
>> tablet.
>> I can’t use scanners, because in some cases I have 10’s of thousands of
> non-consecutive ranges.
>> If I use a single threaded BatchScanner, and only request data from a
>> single Tablet, am I guaranteed ordering?
>> This appears to work correctly in my small tests (albeit slower than a
>> single 1 thread Batch scanner call), but I don’t really want to have to
>> rely on it if the semantic isn’t guaranteed.
>> If not Is there another “efficient” way to do this.
>>
>> Thanks
>>
>> Rob Povey
>>
>>
>


Re: Why the Range not find the data

2015-10-14 Thread Adam Fuchs
Try using the Range.exact(...) and Range.prefix(...) helper methods to
generate specific ranges. Key.followingKey(...) might also be helpful.
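
The reason your first range comes back empty is that a Key built from only a
row and a family has an empty column qualifier, which sorts before "0
cf0:cq0"; using it as the inclusive end key therefore excludes your entry,
while "cf00" happens to sort after it. With the helpers, the ranges.add(...)
lines in your snippet below would look like this (untested):

// every column in row "0", family "cf0"
ranges.add(Range.exact(new Text("0"), new Text("cf0")));
// or: every key in row "0" whose column family starts with "cf"
ranges.add(Range.prefix(new Text("0"), new Text("cf")));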

Cheers,
Adam

On Wed, Oct 14, 2015 at 9:59 AM, Lu Qin  wrote:

> In my accumulo cluster ,the table has this data:
> 0 cf0:cq0 []v0
> 1 cf1:cq1 []v1
>
> then I use scan to find it like this:
>
> ranges.add(new Range(new Key(new Text("0"), new Text("cf0")), true,
>  new Key(new Text("0"), new Text("cf0")), true));
>
> but it tells me #results=0.
>
>  If i code like this:
>
> ranges.add(new Range(new Key(new Text("0"), new Text("cf0")), true,
>  new Key(new Text("0"), new Text("cf00")), true));
>
> it works OK.
>
>
> The API shows that if I set true, it will include the start and end key. Why can I
> not find the data?
>
> Thanks.
>


Re: What is the optimal number of tablets for a large table?

2015-10-13 Thread Adam Fuchs
Here are a few other factors to consider:
1. Tablets may not be used uniformly. If there is a temporal element to the
row key then writes and reads may be skewed to go to a portion of the
tablets. If some tables are big but more archival in nature then they will
skew the stats as well. It's usually good to estimate things like CPU and
RAM usage based on active tablets rather than total tablets, so plan
accordingly.
2. Compactions take longer as tablets grow. Accumulo tends to have a nice,
fairly well-bounded write amplification factor (number of times a key/value
pair goes through major compaction), even with large tablets. However, a
compaction of a 200GB+ tablet can take hours, making for difficulty in
predicting performance and availability. It's nice to have background
operations split up into manageably small tasks (incidentally, this is
something LevelDB solves by essentially compacting fixed-size blocks rather
than variable-size files). Assuming you have 20TB disk on a beefy node, you
may want 500-1000 tablets on that machine, which is probably much more than
the number of available cores.
3. Query and indexing patterns, such as the wikisearch-style inverted
index, may drive you closer to one tablet per thread, but with range
queries this becomes less important.

Cheers,
Adam


On Fri, Oct 9, 2015 at 6:53 PM, Jeff Kubina  wrote:

> I read the following from the Accumulo manual on tablet merging:
>
>
>> Over time, a table can get very large, so large that it has hundreds of
>> thousands of split points. Once there are enough tablets to spread a table
>> across the entire cluster, additional splits may not improve performance,
>> and may create unnecessary bookkeeping.
>>
>
> So would the optimal number of tablets for a very large table be close to
> the total tservers times the total cores of the machine (or the worker
> threads the tservers are configured to use--whichever is less)?
>
>


Re: Watching for Changes with Write Ahead Log?

2015-10-01 Thread Adam Fuchs
I would stay away from ThreadLocal -- the threads that run Constraints can
be dynamically generated in a resizable thread pool, and cleaning up after
them could be challenging. Static might work better if you can make it
thread safe, maybe with a resource pool.
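
As a rough sketch of what I mean by a pool -- the pooled object below is just
a stand-in for whatever expensive client or connection your check() needs:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public final class SharedNotifierPool {
  private static final int SIZE = 4;
  private static final BlockingQueue<Object> POOL = new ArrayBlockingQueue<>(SIZE);

  static {
    for (int i = 0; i < SIZE; i++) {
      POOL.add(new Object()); // stand-in for an expensive connection/client
    }
  }

  public static Object borrow() throws InterruptedException {
    return POOL.take(); // blocks while all pooled clients are in use
  }

  public static void release(Object client) {
    POOL.offer(client);
  }
}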

Adam


On Thu, Oct 1, 2015 at 2:39 PM, John Vines <vi...@apache.org> wrote:

> As dirty as it is, that sounds like a case for a static, or maybe thread
> local, object
>
> On Thu, Oct 1, 2015, 7:19 PM Parise, Jonathan <jonathan.par...@gd-ms.com>
> wrote:
>
>> I have a few follow up questions in regard to constraints.
>>
>>
>>
>> What is the lifecycle of a constraint? What I mean by this is are the
>> constraints somehow tied to Accumulo’s lifecycle or are they just
>> instantiated each time a mutation occurs and then disposed?
>>
>>
>>
>> Also, are there multiple instances of the same constraint class at any
>> time or do all mutation on a table go through the exact same constraint?
>>
>>
>>
>> My guess is that  when a mutation comes in a new constraint is made
>> through reflection. Then check() is called, the violation codes are parsed
>> and the object is disposed/finalized.
>>
>>
>>
>> The reason I ask is that what I want to do is update my ElasticSearch
>> index each time I see a mutation on the table. However, I don’t want to
>> have to make a connection, send the data and then tear down the connection
>> each time. That’s a lot of unnecessary overhead and with all that overhead
>> happening on every mutation performance could be badly impacted.
>>
>>
>>
>> Is there some way to cache something like a connection and reuse it
>> between calls to the Constraint’s check() method? How would such a thing be
>> cleaned up if Accumulo is shut down?
>>
>>
>>
>>
>>
>> Thanks again,
>>
>>
>>
>> Jon
>>
>> *From:* Parise, Jonathan [mailto:jonathan.par...@gd-ms.com
>> <jonathan.par...@gd-ms.com>]
>> *Sent:* Wednesday, September 30, 2015 9:21 AM
>> *To:* user@accumulo.apache.org
>> *Subject:* RE: Watching for Changes with Write Ahead Log?
>>
>>
>>
>> In this particular case, I need to update some of my application state
>> when changes made by another system occur.
>>
>>
>>
>> I would need to do a few things to accomplish my goal.
>>
>>
>>
>> 1)  Be notified or see that a table had changed
>>
>> 2)  Checked that against changes I know my system has made
>>
>> 3)  If my system is not the originator of the change, update
>> internal state to reflect the change.
>>
>>
>>
>> Examples of state I may need to update include an ElasticSearch index and
>> also an in memory cache.
>>
>>
>>
>> I’m going to read up on constraints again and see if I can use them for
>> this purpose.
>>
>>
>>
>> Thanks!
>>
>>
>>
>> Jon
>>
>>
>>
>>
>>
>>
>>
>> *From:* Adam Fuchs [mailto:afu...@apache.org <afu...@apache.org>]
>> *Sent:* Tuesday, September 29, 2015 5:46 PM
>> *To:* user@accumulo.apache.org
>> *Subject:* Re: Watching for Changes with Write Ahead Log?
>>
>>
>>
>> Jon,
>>
>>
>>
>> You might think about putting a constraint on your table. I think the API
>> for constraints is flexible enough for your purpose, but I'm not exactly
>> sure how you would want to manage the results / side effects of your
>> observations.
>>
>>
>>
>> Adam
>>
>>
>>
>>
>>
>> On Tue, Sep 29, 2015 at 5:41 PM, Parise, Jonathan <
>> jonathan.par...@gd-ms.com> wrote:
>>
>> Hi,
>>
>>
>>
>> I’m working on a system where generally changes to Accumulo will come
>> through that system. However, in some cases, another system may change data
>> without my system being aware of it.
>>
>>
>>
>> What I would like to do is somehow listen for changes to the tables my
>> system cares about. I know there is a write ahead log that I could
>> potentially listen to for changes, but I don’t know how to use it. I looked
>> around for some documentation about it, and I don’t see much. I get the
>> impression that it isn’t really intended for this type of use case.
>>
>>
>>
>> Does anyone have any suggestions on how to watch a table for changes and
>> then determine if those changes were made by a different system.
>>
>>
>>
>> Is there some documentation about how to use the write ahead log?
>>
>>
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Jon Parise
>>
>>
>>
>


Re: Document Partitioned Indexing

2015-09-30 Thread Adam Fuchs
Hi Tom,

Sqrrl uses a document-distributed indexing strategy extensively. On top of
the reasons you mentioned, we also like the ability to explicitly structure
our index entries in both information content and sort order. This gives us
the ability to do interesting things like build custom indexes and do joins
between graph indexes and term indexes.

Eventually, I'd like to see Accumulo build out explicit support for this
type of indexing in the core as an embedded secondary indexing capability.
That would solve several of the challenges around compatibility with other
Accumulo features and usage patterns.

Cheers,
Adam


On Wed, Sep 30, 2015 at 3:48 AM, Tom D  wrote:

> Hi,
>
> Have been doing a little reading about different distributed (text)
> indexing techniques and picked up on the Document Partitioned Index
> approach on Accumulo.
>
> I am interested in the use-cases people would have for indexing data in
> this way over using a distributed search service (Elastic or SolrCloud).
>
> I can think of a few reasons, but wondered if there's something more
> obvious that I'm missing?
>
> - cell (field level) access controls
>
> - scale - I understand Accumulo will scale to thousands of nodes. I
> believe there are some limitations in Elastic / Solr at about 100 nodes.
>
> - integration with an existing schema or index in Accumulo (not sure about
> this one and what benefits it would have over calling out to a search
> service)
>
> - you want to take advantage of other features in Accumulo, e.g. Combining
> iterators to perform some aggregation alongside your document partitioned
> index (again, can't imagine use cases here, but maybe there are some)
>
> - more control over 'messy data', e.g partial duplicates that need merging
> at ingest
>
> Are there others? Be interesting to hear if people use this indexing
> strategy.
>
> Many thanks.
>
>
>


Re: Watching for Changes with Write Ahead Log?

2015-09-29 Thread Adam Fuchs
Jon,

You might think about putting a constraint on your table. I think the API
for constraints is flexible enough for your purpose, but I'm not exactly
sure how you would want to manage the results / side effects of your
observations.

Adam


On Tue, Sep 29, 2015 at 5:41 PM, Parise, Jonathan  wrote:

> Hi,
>
>
>
> I’m working on a system where generally changes to Accumulo will come
> through that system. However, in some cases, another system may change data
> without my system being aware of it.
>
>
>
> What I would like to do is somehow listen for changes to the tables my
> system cares about. I know there is a write ahead log that I could
> potentially listen to for changes, but I don’t know how to use it. I looked
> around for some documentation about it, and I don’t see much. I get the
> impression that it isn’t really intended for this type of use case.
>
>
>
> Does anyone have any suggestions on how to watch a table for changes and
> then determine if those changes were made by a different system.
>
>
>
> Is there some documentation about how to use the write ahead log?
>
>
>
>
>
> Thanks,
>
>
>
> Jon Parise
>


Re: Presplitting tables for the YCSB workloads

2015-09-18 Thread Adam Fuchs
You could cat the splits to a temp file, then use the -sf option of
createtable, piping the command to the accumulo shell's standard in:

$ echo "createtable ycsb_tablename -sf /tmp/ycsb_splits.txt" | accumulo
shell -u user -p password -z instancename zoohost:2181

Not sure if the row keys are identical in the Accumulo YCSB mapping, but if
they are then you should be able to use the same split generation script
that HBase uses.

Adam


On Thu, Sep 17, 2015 at 10:10 PM, Sean Busbey  wrote:

> YCSB is gearing up for its next monthly release, and I really want to
> add in an Accumulo specific README for running workloads.
>
> This is generally so that folks have an easier time running tests
> themselves. It's also because I keep testing Accumulo for the YCSB
> releases and coupled with a README file we'd get an Accumulo-specific
> convenience binary. Avoiding the bulk of dependencies that get
> included in the generic YCSB distribution artifact is a big win.
>
> The thing I keep getting hung up on is remembering how to properly
> split the Accumulo table for YCSB workloads. The HBase README has a
> great hbase shell snippet for doing this (because users can copy/paste
> it)[1]:
>
> 
> 3. Create a HBase table for testing
>
> For best results, use the pre-splitting strategy recommended in HBASE-4163:
>
> hbase(main):001:0> n_splits = 200 # HBase recommends (10 * number of
> regionservers)
> hbase(main):002:0> create 'usertable', 'family', {SPLITS =>
> (1..n_splits).map {|i| "user#{1000+i*(-1000)/n_splits}"}}
>
> Failing to do so will cause all writes to initially target a single
> region server.
> 
>
> Anyone have a work up of an equivalent for Accumulo that I can include
> under an ASLv2 license? I seem to recall madrob had something done in
> a bash script, but I can't find it anywhere.
>
> [1]: http://s.apache.org/CFe
>
> --
> Sean
>


Re: RowID design and Hive push down

2015-09-14 Thread Adam Fuchs
Hi Roman,

What's the  used for in your previous key design?

As I'm sure you've figured out, it's generally a bad idea to have a fully
unique hash in your key, especially if you're trying to support extensive
secondary indexing. What we've found is that it's not just the size of the
key but also the compressibility that matters. It's often better to use a
one-up counter of some sort, regardless of whether you're using a hex
encoding or a binary encoding. Due to the birthday problem [1] a one-up id
generally takes less than half of the bytes of a uniformly distributed hash
that has low probability of collisions, and it will compress much better.
Twitter did something like that in a distributed fashion that they called
Snowflake [2]. Google also published about high performance timestamp
oracles for transactions in their Percolator paper [3].

Cheers,
Adam

[1] https://en.wikipedia.org/wiki/Birthday_problem
[2] https://github.com/twitter/snowflake
[3] http://research.google.com/pubs/pub36726.html


On Mon, Sep 14, 2015 at 2:47 PM, roman.drap...@baesystems.com <
roman.drap...@baesystems.com> wrote:

> Hi there,
>
>
>
> Our current rowid format is MMdd_payload_sha256(raw data). It works
> nicely as we have a date and uniqueness guaranteed by hash, however
> unfortunately, rowid is around 50-60 bytes per record.
>
>
>
> Requirements are the following:
>
> 1)  Support Hive on top of Accumulo for ad-hoc queries
>
> 2)  Query original table by date range (e.g rowID < ‘20060101’ AND
> rowID >= ‘20060103’) both in code and hive
>
> 3)  Additional queries by ~20 different fields
>
>
>
> Requirement 3) requires secondary indexes and of course because each RowID
> is 50-60 bytes, they become super massive (99% of overall space) and really
> expensive to store.
>
>
>
> What we are looking to do is to reduce index size to a fixed size:
> {unixTime}{logicalSplit}{hash}, where unixTime is 4 bytes unsigned integer,
> logicalSplit – 2 bytes unsigned integer, and hash is 4 bytes – overall 10
> bytes.
>
>
>
> What is unclear to me is how the second requirement can be met in Hive, as to
> my understanding the built-in RowID push-down mechanism won't work with
> unsigned bytes?
>
>
>
> Regards,
>
> Roman
>
>
>
>
>
>
>
>
> Please consider the environment before printing this email. This message
> should be regarded as confidential. If you have received this email in
> error please notify the sender and destroy it immediately. Statements of
> intent shall only become binding when confirmed in hard copy by an
> authorised signatory. The contents of this email may relate to dealings
> with other companies under the control of BAE Systems Applied Intelligence
> Limited, details of which can be found at
> http://www.baesystems.com/Businesses/index.htm.
>


Re: Accumulo: "BigTable" vs. "Document Model"

2015-09-04 Thread Adam Fuchs
Sqrrl uses a hybrid approach. For records that are relatively static we use
a compacted form, but for maintaining aggregates and for making updates to
the compacted form documents we use a more explicit form. This is done
mostly through iterators and a fairly complex type system. The big
trade-off for us was storage footprint. We gain something like 30% more
compression by using the compacted form, and that also translates into
better ingest and query performance. I can tell you it takes a significant
engineering investment to make this work without overspecializing, so make
sure your use case warrants it.

Cheers,
Adam


On Fri, Sep 4, 2015 at 11:42 AM, Michael Moss 
wrote:

> Hello, everyone.
>
> I'd love to hear folks' input on using the "natural" data model of
> Accumulo ("BigTable" style) vs more of a Document Model. I'll try to
> succinctly describe with a contrived example.
>
> Let's say I have one domain object I'd like to model, "SensorReadings". A
> single entry might look something like the following with 4 distinct CF, CQ
> pairs.
>
> RowKey: DeviceID-YYYMMDD-ReadingID (i.e. - 1-20150101-1234)
> CF: "Meta", CQ: "Timestamp", Value: 
> CF: "Sensor", CQ: "Temperature", Value: 80.4
> CF: "Sensor", CQ: "Humidity", Value: 40.2
> CF: "Sensor", CQ: "Barometer", Value: 29.1
>
> I might do queries like "get me all SensorReadings for 2015 for DeviceID =
> 1" and if I wanted to operate on each SensorReading as a single unit (and
> not as the 4 'rows' it returns for each one), I'd either have to aggregate
> the 4 CF, CQ pairs for each RowKey client side, or use something like the
> WholeRowIterator.
>
> In addition, if I wanted to write a query like, "for DeviceID = 1 in 2015,
> return me SensorReadings where Temperature > 90, Humidity < 40, Barometer >
> 31", I'd again have to either use the WholeRowIterator to 'see' each entire
> SensorReading in memory on the server for the compound query, or I could
> take the intersection of the results of 3 parallel, independent queries on
> the client side.
>
> Where I am going with this is, what are the thoughts around creating a
> Java, Protobuf, Avro (etc) object with these 4 CF, CQ pairs as fields and
> storing each SensorReading as a single 'Document'?
>
> RowKey: DeviceID-YYYMMDD
> CF: ReadingID Value: Protobuf(Timestamp=123, Temperature=80.4,
> Humidity=40.2, Barometer = 29.1)
>
> This way you avoid having to use the WholeRowIterator and unless you often
> have queries that only look at a tiny subset of your fields (let's say just
> "Temperature"), the serialization costs seem similar since Value is just
> bytes anyway.
>
> Appreciate folks' experience and wisdom here. Hope this makes sense, happy
> to clarify.
>
> Best.
>
> -Mike
>
>
>
>
>


rya incubator proposal

2015-09-03 Thread Adam Fuchs
Hey Accumulopers,

I thought you might like to know that the Rya project just proposed to join
the incubator. Rya is a mature project that supports RDF on top of
Accumulo. Feel free to join the discussion or show support on the incubator
general list.

Cheers,
Adam


Re: Questions on intersecting iterator and partition ids

2015-07-13 Thread Adam Fuchs
Vaibhav,

I have included some answers below.

Cheers,
Adam

On Mon, Jul 13, 2015 at 11:19 AM, vaibhav thapliyal 
vaibhav.thapliyal...@gmail.com wrote:

 Dear all,

 I have the following questions on intersecting iterator and partition ids
 used in document sharded indexing:

 1. Can we run a boolean AND query using the current intersecting iterator
 on a given range of ids. These ids are a subset of the total ids stored in
 the column qualifier field as per the document sharded indexing format.

The IntersectingIterator is designed to do index intersections, which are
very similar to boolean AND queries. It does require indexes to be built in
a particular fashion. You should play around with the WikiSearch example (
https://accumulo.apache.org/example/wikisearch.html) to get familiar with
its use.
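
For reference, the query side ends up being only a few lines. A sketch,
assuming the usual sharded layout (partition id in the row, term in the
column family, document id in the column qualifier) and a BatchScanner on the
index table; the names are placeholders:

import java.util.Collections;
import java.util.Map;
import org.apache.accumulo.core.client.BatchScanner;
import org.apache.accumulo.core.client.IteratorSetting;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.user.IntersectingIterator;
import org.apache.hadoop.io.Text;

public class AndQuery {
  static void run(BatchScanner bs, String termA, String termB) {
    IteratorSetting ii = new IteratorSetting(20, "ii", IntersectingIterator.class);
    IntersectingIterator.setColumnFamilies(ii, new Text[] {new Text(termA), new Text(termB)});
    bs.addScanIterator(ii);
    bs.setRanges(Collections.singleton(new Range())); // search every partition row
    for (Map.Entry<Key, Value> e : bs) {
      // matching document ids come back in the column qualifier
      System.out.println(e.getKey().getColumnQualifier());
    }
  }
}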

 If it's not possible with current iterator can I tweak the existing one?

If you are indexing documents similar to what the IntersectingIterator
expects then you should be able to get it to work for you. More generally,
any row-local logic can be implemented in an iterator. If you're not
building indexes then you might want to look at the RowFilter as a starting
point.

  2. Is the partitioning suggested in document sharded indexing logical or
  physical? E.g., if I have 30 partition ids, do I have to physically
  presplit the table based on the partition ids for the AND query to run in
  the most efficient way, so that I have 30 tablets in the table?

You don't have to pre-split -- Accumulo will automatically split big rows
into their own tablets. However, there are some performance advantages to
pre-splitting before your tablet gets big enough to split on its own.

  3. Lastly, can anybody suggest the number of partitions for
  document sharded indexing? What should I look for when deciding it?

You have to consider a few factors for this: (a) ingest parallelization,
for which you want approximately as many partitions as you have cores in
your cluster, (b) size of a partition when full, which you want to be under
about 20GB for compaction performance reasons, and (c) query parallelism,
for which you want no more than a small factor of the number of cores in
your cluster to reduce query latency. If you can't find a solution that
works for all of these factors then you will be forced to make trade-offs
(or do something complicated like time-based partitioning).

  Thanks
 Vaibhav



Re: micro compaction

2015-06-09 Thread Adam Fuchs
I think this might be the same concept as in-mapper combining, but applied
to data being sent to a BatchWriter rather than an OutputCollector. See
[1], section 3.1.1. A similar performance analysis and probably a lot of
the same code should apply here.

Cheers,
Adam

[1] http://lintool.github.io/MapReduceAlgorithms/MapReduce-book-final.pdf
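
To make the analogy concrete, here is a rough sketch of that pattern in front
of a BatchWriter (all names are placeholders; it assumes string-encoded long
sums, so match the encoding to whatever combiner you configure on the table):

import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.MutationsRejectedException;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;

// Aggregates counts locally and only turns them into mutations when the map
// gets large, trading a little client memory for far fewer mutations.
public class InMapperCombiner {
  private final Map<String, Long> counts = new HashMap<>();
  private final BatchWriter writer;
  private final int flushSize;

  public InMapperCombiner(BatchWriter writer, int flushSize) {
    this.writer = writer;
    this.flushSize = flushSize;
  }

  public void add(String row, String cf, String cq, long delta) throws MutationsRejectedException {
    counts.merge(row + '\0' + cf + '\0' + cq, delta, Long::sum);
    if (counts.size() >= flushSize) {
      flush();
    }
  }

  public void flush() throws MutationsRejectedException {
    for (Map.Entry<String, Long> e : counts.entrySet()) {
      String[] parts = e.getKey().split("\0", 3);
      Mutation m = new Mutation(parts[0]);
      m.put(parts[1], parts[2],
          new Value(Long.toString(e.getValue()).getBytes(StandardCharsets.UTF_8)));
      writer.addMutation(m);
    }
    counts.clear();
  }
}

If several clients (or mappers) can touch the same cells you still want a
server-side combiner on the table; this just shrinks the number of mutations
each client sends.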

On Tue, Jun 9, 2015 at 1:02 PM, Russ Weeks rwe...@newbrightidea.com wrote:

 Having a combiner stack (more generally an iterator stack) run on the
 client-side seems to be the second most popular request on this list. The
 most popular being, How do I write to Accumulo from inside an iterator?

 Such a thing would be very useful for me, too. I have some cycles to help
 out, if somebody can give me an idea of where to get started and where the
 potential land-mines are.

 -Russ

 On Tue, Jun 9, 2015 at 9:08 AM roman.drap...@baesystems.com 
 roman.drap...@baesystems.com wrote:

 Aggregated output is tiny, so if I do the same calculations in memory
 (instead of sending mutations to Accumulo), I can reduce the overall number of
 mutations by 1000x or so.



 -Original Message-
 From: Josh Elser [mailto:josh.el...@gmail.com]
 Sent: 09 June 2015 16:54
 To: user@accumulo.apache.org
 Subject: Re: micro compaction

 Well, you win the prize for new terminology. I haven't ever heard the
 term micro compaction before.

 Can you clarify though, you say hundreds of millions of mutations that
 result in megabytes of data. Is that an increase or decrease in size?
 Comparing apples to oranges :)

 roman.drap...@baesystems.com wrote:
  Hi guys,
 
  While doing pre-analytics we generate hundreds of millions of
  mutations that result in 1-100 megabytes of useful data after major
  compaction. We ingest into Accumulo using MR from a Mapper job. We
  identified that performance really degrades while increasing the number
 of mutations.
 
  The obvious improvement is to do some calculations in-memory before
  sending mutations to Accumulo.
 
  Of course, at the same time we are looking for a solution to minimize
  development effort.
 
  I guess I am asking about micro compaction/ingest-time iterators on
  the client side (before data is sent to Accumulo).
 
  To my understanding, Accumulo does not support them, is it correct?
  And if so, are there any plans to support this functionality in the
 future?
 
  Thanks
 
  Roman
 




Re: Change column family

2015-05-26 Thread Adam Fuchs
This can also be done with a row-doesn't-fit-into-memory constraint. You
won't need to hold the second column in-memory if your iterator tree deep
copies, filters, transforms and merges. Exhibit A:

[HeapIterator-derivative]
   |_
   | \
[transform-graph1-to-graph2]  \
   |   \
[column-family-graph1] [all-but-column-family-graph1]

With this design, you can subclass the HeapIterator, deep copy the source
in the init method, wrap one in a custom transform iterator, and create an
appropriate seek method. This is probably more on the advanced side of
Accumulo programming, but can be done.
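
As a sketch of just the transform branch (assuming the graph1-to-graph2 rename
preserves sort order; the deep-copied sources and the HeapIterator merge
described above are omitted, as is handling of deletes and deepCopy):

    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.iterators.WrappingIterator;
    import org.apache.hadoop.io.Text;

    // Rewrites the column family "graph1" to "graph2" on the fly. Only safe
    // when the rename does not change the relative sort order of the keys.
    public class RenameColumnFamilyIterator extends WrappingIterator {
      private static final Text FROM = new Text("graph1");
      private static final Text TO = new Text("graph2");

      @Override
      public Key getTopKey() {
        Key k = super.getTopKey();
        if (!k.getColumnFamily().equals(FROM))
          return k;
        return new Key(k.getRow(), TO, k.getColumnQualifier(),
            k.getColumnVisibility(), k.getTimestamp());
      }
    }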

Adam


On Tue, May 26, 2015 at 8:59 AM, Eric Newton eric.new...@gmail.com wrote:

 Short answer: no.

 Long answer: maybe.

 You can write an iterator which will transform:

 row, cf1, cq, vis - value

 into:

 row, cf2, cq, vis - value

 And if you can do this while maintaining sort order, you can get your new
 ColumnFamily transformed during scans and compactions.

 But this bit about maintaining the sort order is more complex than it
 sounds.

 If you have the following:

 row, a, cq, vis - value
 row, aa, cq, vis - value


 And you want to transform cf a into cf b:

 row, aa, cq, vis - value
 row, b, cq, vis - value


 Your iterator needs to hold the second column in memory, after
 transforming the first column.  Tablet server memory for holding Key/Values
 is not infinite.

 -Eric

 On Tue, May 26, 2015 at 8:44 AM, shweta.agrawal shweta.agra...@orkash.com
  wrote:

 Hi,

 I want to ask, is it possible in accumulo to change the column family
 without changing the whole data.

 Suppose my column family is graph1, now i want to rename this column
 family as graph2.
 Is it possible?

 Thanks
 Shweta





Re: Accumulo Summit 2015

2015-05-04 Thread Adam Fuchs
Thanks, Mike. A good time was had by all!

The conference organizers expect videos and slides will be available in
about 2 weeks.

Cheers,
Adam

On Fri, May 1, 2015 at 9:58 AM, Giordano, Michael 
michael.giord...@vistronix.com wrote:

  - Had an awesome time at Accumulo Summit 2015.

 - Met with some great folks (special shout out to Josh Elser and
 Adam Fuchs for their time and patience answering questions).

 - Can’t wait for next year’s summit.



 Any idea when the slides for the presentations will be available?



 Thanks,

 Mike G.




Re: Unexpected aliasing from RFile getTopValue()

2015-04-15 Thread Adam Fuchs
On Wed, Apr 15, 2015 at 10:20 AM, Keith Turner ke...@deenlo.com wrote:


 Random thought on revamp.  Immutable key values with enough primitives to
 make most operations efficient (avoid constant alloc/copy) might be
 something to consider for the iterator API


So, is this a tradeoff in the performance vs. inter-iterator isolation
space? From a performance perspective we would do best if we just passed
around pointers to an underlying byte array (e.g. ByteBuffer-style), but
maximum isolation would require never reusing anything returned from an
iterator's getTopX methods. From a security perspective we need to be
careful with how we reuse data objects (hence the need for the
SynchronizedIterator at the top of the system iterators), but I would say
we can probably relax other isolation concerns in the iterators in favor of
performance.

I think there's probably a bigger project here around minimizing the object
creation, data copying, serialization, and deserialization of keys. We did
some work that Chris McCubbin will be presenting at the upcoming accumulo
summit around pushing key comparisons down to a serialized form of the key,
and that made a huge impact on load performance. I think we could probably
achieve an order of magnitude more throughput in the iterator tree with a
major refactoring. Any thoughts on when we might have the appetite for such
a change? If we're thinking about making key/values immutable then we might
piggyback a bigger redesign on that already breaking change.

Adam


Re: Scans during Compaction

2015-02-23 Thread Adam Fuchs
Dylan,

The effect of a major compaction is never seen in queries before the major
compaction completes. At the end of the major compaction there is a
multi-phase commit which eventually replaces all of the old files with the
new file. At that point the major compaction will have completely processed
the given tablet's data (although other tablets may not be synchronized).
For long-running non-isolated queries (more than a second or so) the
iterator tree is occasionally rebuilt and re-seeked. When it is rebuilt it
will use whatever is the latest file set, which will include the results of
a completed major compaction.

In your case #1 that's a tricky guarantee to make across a whole tablet,
but it can be made one row at a time by using an isolated iterator.
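
For reference, row isolation can be requested on the client side like this
(a sketch; the table name and the Connector "conn" are placeholders):

    import org.apache.accumulo.core.client.IsolatedScanner;
    import org.apache.accumulo.core.client.Scanner;
    import org.apache.accumulo.core.security.Authorizations;

    // Option 1: ask the tablet server for row isolation on an ordinary scanner.
    Scanner s = conn.createScanner("mytable", new Authorizations());
    s.enableIsolation();

    // Option 2: buffer whole rows on the client with the IsolatedScanner wrapper.
    Scanner isolated = new IsolatedScanner(conn.createScanner("mytable", new Authorizations()));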

To make your case #2 work, you probably will have to implement some
higher-level logic to only start your query after the major compaction has
completed, using an external mechanism to track the completion of your
transformation.

Adam


On Mon, Feb 23, 2015 at 12:35 PM, Dylan Hutchison dhutc...@stevens.edu
wrote:

 Hello all,

 When I initiate a full major compaction (with flushing turned on) manually via
 the Accumulo API
 https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#compact(java.lang.String,%20org.apache.hadoop.io.Text,%20org.apache.hadoop.io.Text,%20java.util.List,%20boolean,%20boolean),
 how does the table appear to

1. clients that started scanning the table before the major compaction
began;
2. clients that start scanning during the major compaction?

 I'm interested in the case where there is an iterator attached to the full
 major compaction that modifies entries (respecting sorted order of entries).

 The best possible answer for my use case, with case #2 more important than
 case #1 and *low latency* more important than high throughput, is that

1. clients that started scanning before the compaction began would not
see entries altered by the compaction-time iterator;
2. clients that start scanning during the major compaction stream back
entries as they finish processing from the major compaction, such that the
clients *only* see entries that have passed through the
compaction-time iterator.

 How accurate are these descriptions?  If #2 really were as I would like it
 to be, then a scan on the range (-inf,+inf) started after compaction would
 monitor compaction progress, such that the first entry batch transmits to
 the scanner as soon as it is available from the major compaction, and the
 scanner finishes (receives all entries) exactly when the compaction
 finishes.  If this is not possible, I may make something to that effect by
 calling the blocking version of compact().

 Bonus: how does cancelCompaction()
 https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#cancelCompaction(java.lang.String)
 affect clients scanning in case #1 and case #2?

 Regards,
 Dylan Hutchison



Re: Scans during Compaction

2015-02-23 Thread Adam Fuchs
Dylan,

I think the way this is generally solved is by using an idempotent iterator
that can be applied at both full major compaction and query scopes to give
a consistent view. Aggregation, age-off filtering, and all the other
standard iterators have the property that you can leave them in place and
get a consistent answer even if they are applied multiple times. Major
compaction and query-time iterators are even simpler than the general case,
since you don't really need to worry about partial views of the underlying
data. In your case I think you are trying to use an iterator that needs to
be applied exactly once to a complete stream of data (either at query time
or major compaction time). What we should probably do is look at options
for more generally supporting that type of iterator. You could help us a
ton by describing exactly what you want your iterator to do, and we can all
propose a few ideas for how this might be implemented. Here are a couple
off the top of my head:

1. If you can reform your iterator so that it is idempotent then you can
apply it liberally. This might be possible using some sort of flag that the
major compactor puts in the data and the query-time iterator looks for to
determine if the compaction has already happened. We often use version
numbers in column families to this effect. Special row keys at the
beginning of the tablet might also be an option. This would be doable
without changes to Accumulo.

2. We could build a mechanism into core accumulo that applies an iterator
with exactly once semantics, such that the user submits a transformation as
an iterator and it gets applied similarly to how you described. The
query-time reading of results of the major compaction might be overkill,
but that would be a possible optimization that we could think about
engineering in a second pass.

Adam



On Mon, Feb 23, 2015 at 1:42 PM, Dylan Hutchison dhutc...@stevens.edu
wrote:

 Thanks Adam and Keith.

 I see the following as a potential solution that achieves (1) low latency
 for clients that want to see entries after an iterator and (2) the entries
 from that iterator persisting in the Accumulo table.

1. Start a major compaction in thread T1 of a client with the iterator
set, blocking until the compaction completes.
2. Start scanning in thread T2 of the client with the same iterator
now set at scan-time scope. Use an isolated scanner to make sure we do not
read the results of the major compaction committing, though this is not
foolproof due to timing and because the isolated scanner is row-wise.
3. Eventually, T1 unblocks and signals that the compaction completes.
T1 interrupts T2.
4. Thread T2 stops scanning, removes the scan-time iterator, and
starts scanning again at the point it last left off, now seeing the results
of the major compaction which already passed through the iterator.

 The whole scheme is only necessary if the client wants results faster than
 the major compaction completes.  A disadvantage is duplicated work -- the
 iterator runs at scan-time and at compaction-time until the compaction
 finishes.  This may strain server resources.

 Will think about other schemes.  If only we could attach an apply-once
 scan-time iterator, that also persists its results to an Accumulo table in
 a streaming fashion.  Or on the flip side, a one-time compaction iterator
 that streams results, such that we could scan from them right away instead
 of needing to wait for the entire compaction to complete.

 Regards,
 Dylan Hutchison

 On Mon, Feb 23, 2015 at 12:48 PM, Adam Fuchs afu...@apache.org wrote:

 Dylan,

 The effect of a major compaction is never seen in queries before the
 major compaction completes. At the end of the major compaction there is a
 multi-phase commit which eventually replaces all of the old files with the
 new file. At that point the major compaction will have completely processed
 the given tablet's data (although other tablets may not be synchronized).
 For long-running non-isolated queries (more than a second or so) the
 iterator tree is occasionally rebuilt and re-seeked. When it is rebuilt it
 will use whatever is the latest file set, which will include the results of
 a completed major compaction.

 In your case #1 that's a tricky guarantee to make across a whole tablet,
 but it can be made one row at a time by using an isolated iterator.

 To make your case #2 work, you probably will have to implement some
 higher-level logic to only start your query after the major compaction has
 completed, using an external mechanism to track the completion of your
 transformation.

 Adam


 On Mon, Feb 23, 2015 at 12:35 PM, Dylan Hutchison dhutc...@stevens.edu
 wrote:

 Hello all,

 When I initiate a full major compaction (with flushing turned on)
 manually via the Accumulo API
 https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#compact(java.lang.String

Re: Iterators adding data: IteratorEnvironment.registerSideChannel?

2015-02-16 Thread Adam Fuchs
Dylan,

If I recall correctly (which I give about 30% odds), the original purpose
of the side channel was to split up things like delete tombstone entries
from regular entries so that other iterators sitting on top of a
bifurcating iterator wouldn't have to handle the special tombstone
preservation logic. This worked in theory, but it never really caught on.
I'm not sure any operational code is calling the registerSideChannel method
right now, so you're sort of in pioneering territory. That said, this looks
like it should work as you described it.

Can you describe why you want to use a side channel instead of implementing
the merge in your own iterator (e.g. subclassing MultiIterator and
overriding the init method)? This has implications on composability with
other iterators, since downstream iterators would not see anything sent to
the side channel but they would see things merged and returned by a
MultiIterator.
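
One hedged sketch of the merge-in-your-own-iterator alternative (not code from
this thread): a WrappingIterator that two-way merges an injected, already-sorted
source with the normal stack. How the injected source is built is left abstract,
and deepCopy() and delete handling are omitted.

    import java.io.IOException;
    import java.util.Collection;
    import java.util.Map;
    import org.apache.accumulo.core.data.ByteSequence;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Range;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.iterators.IteratorEnvironment;
    import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
    import org.apache.accumulo.core.iterators.WrappingIterator;

    public abstract class MergingInjectIterator extends WrappingIterator {
      private SortedKeyValueIterator<Key,Value> injected;

      // Subclasses supply the already-sorted source of extra entries.
      protected abstract SortedKeyValueIterator<Key,Value> createInjectedSource(
          Map<String,String> options, IteratorEnvironment env) throws IOException;

      @Override
      public void init(SortedKeyValueIterator<Key,Value> source,
          Map<String,String> options, IteratorEnvironment env) throws IOException {
        super.init(source, options, env);
        injected = createInjectedSource(options, env);
      }

      @Override
      public void seek(Range range, Collection<ByteSequence> families, boolean inclusive)
          throws IOException {
        super.seek(range, families, inclusive);
        injected.seek(range, families, inclusive);
      }

      @Override
      public boolean hasTop() {
        return super.hasTop() || injected.hasTop();
      }

      @Override
      public Key getTopKey() {
        return injectedIsNext() ? injected.getTopKey() : super.getTopKey();
      }

      @Override
      public Value getTopValue() {
        return injectedIsNext() ? injected.getTopValue() : super.getTopValue();
      }

      @Override
      public void next() throws IOException {
        if (injectedIsNext()) injected.next(); else super.next();
      }

      // The injected source wins when it has the smaller current key.
      private boolean injectedIsNext() {
        if (!injected.hasTop()) return false;
        if (!super.hasTop()) return true;
        return injected.getTopKey().compareTo(super.getTopKey()) < 0;
      }
    }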

Adam
 On Feb 16, 2015 3:18 AM, Dylan Hutchison dhutc...@stevens.edu wrote:

 If you can do a merge sort insertion, then you can guarantee order and
 it's fine.

 Yep, I guarantee the iterator we add as a side channel will emit tuples in
 sorted order.

 On a suggestion from David Medinets, I modified my testing code to use a
 MiniAccumuloCluster set to 2 tablet servers.  I then set a table split on
 row3 before launching the compaction.  The result looks good.  Here is
 output from a run on a local Accumulo instance.  Note that we write more
 values than we read.

 2015-02-16 02:44:51,125 [tserver.Tablet] DEBUG: Starting MajC k;row3
 (USER) [hdfs://localhost:9000/accumulo/tables/k/t-0g4/F0g5.rf] --
 hdfs://localhost:9000/accumulo/tables/k/t-0g4/A0g7.rf_tmp
  [name:InjectIterator, priority:15,
 class:edu.mit.ll.graphulo.InjectIterator, properties:{}]
 2015-02-16 02:44:51,127 [tserver.Tablet] DEBUG: Starting MajC k;row3
 (USER) [hdfs://localhost:9000/accumulo/tables/k/default_tablet/F0g6.rf]
 -- hdfs://localhost:9000/accumulo/tables/k/default_tablet/A0g8.rf_tmp
  [name:InjectIterator, priority:15,
 class:edu.mit.ll.graphulo.InjectIterator, properties:{}]
 2015-02-16 02:44:51,190 [tserver.Compactor] DEBUG: *Compaction k;row3 2
 read | 4 written* |111 entries/sec |  0.018 secs
 2015-02-16 02:44:51,194 [tserver.Compactor] DEBUG: *Compaction k;row3 1
 read | 4 written* | 43 entries/sec |  0.023 secs


 In addition, output from the DebugIterator looks as expected.  There is a
 re-seek after reading the first tablet to the key after the last entry
 returned in the first tablet.

 DEBUG:
 init(org.apache.accumulo.core.iterators.system.SynchronizedIterator@15085e63,
 {}, org.apache.accumulo.tserver.TabletIteratorEnvironment@586cc05e)
 DEBUG: 0x1C2BFB13 seek((-inf,+inf), [], false)

 ... snipped logs

 DEBUG:
 init(org.apache.accumulo.core.iterators.system.SynchronizedIterator@2b048c59,
 {}, org.apache.accumulo.tserver.TabletIteratorEnvironment@379a3d1f)
 DEBUG: 0x5946E74B seek([row2 colF3:colQ3 [] 9223372036854775807
 false,+inf), [], false)


 It seems the side channel strategy will hold up.  We have opened a new
 world of Accumulo-foo.  Of course, the real test is a multi-node instance
 with more than 10 entries of data.

 Regards, Dylan


 On Sun, Feb 15, 2015 at 11:17 PM, Andrew Wells awe...@clearedgeit.com
 wrote:

 The main issue with adding data in an iterator is order. If you can do a
 merge-sort insertion, then you can guarantee order and it's fine. But if you
 are inserting based on input you cannot guarantee order, and it can only be
 done in a scan-time iterator.
  On Feb 15, 2015 8:03 PM, Dylan Hutchison dhutc...@stevens.edu wrote:

 Hello all,

 I've been toying with the registerSideChannel(iter)
 https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/iterators/IteratorEnvironment.html#registerSideChannel(org.apache.accumulo.core.iterators.SortedKeyValueIterator)
  method
 on the IteratorEnvironment passed to iterators through the init() method.
 From what I can tell, the method allows you to add another iterator as a
 top level source, to be merged in along with other usual top-level sources
 such as the in-memory cache and RFiles.

 Are there any downsides to using registerSideChannel( ) to add new
 data to an iterator chain?  It looks like this is fairly stable, so long
 as the iterator we add as a side channel implements seek() properly so as
 to only return entries whose rows are within a tablet.  I imagine it works
 like so:

 Suppose we set a custom iterator InjectIterator that registers a side
 channel inside init() at priority 5 as a one-time major compaction
 iterator.  InjectIterator forwards other operations to its parent, as in
 WrappingIterator
 https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/iterators/WrappingIterator.html.
 We start the compaction:

 Tablet 1 (a,g]

1. init() called on InjectIterator.  Creates the side channel
iterator, calls init() on it, and registers it.
2. init() called on 

Re: Iterators adding data: IteratorEnvironment.registerSideChannel?

2015-02-16 Thread Adam Fuchs
The "top-level" in the side channel description is inverted with respect to
your diagram. Fig. A should be more like this:

RfileIter1  RfileIter2
|  /
|_/
Merge
|
VersioningIterator
|
OtherIterators   InjectIterator
|   /
|__/
Merge
|
v

Thus, VersioningIterator and OtherIterators don't see any of the entries
coming from InjectIterator.

Adam


On Mon, Feb 16, 2015 at 1:23 PM, Dylan Hutchison dhutc...@stevens.edu
wrote:


 why you want to use a side channel instead of implementing the merge in
 your own iterator

 Here is a picture showing the difference--

 Fig. A: Using a side channel to add a top-level iterator.

 RfileIter1  RfileIter2  InjectIterator ...
 |  /   /
 |_/   /
 o__(3-way merge)_/
 |
 VersioningIterator
 |
 OtherIterators
 |
 v
 ...


 Fig. B: Merging in the data at a later stage

 RfileIter1  RfileIter2  ...
 |  /
 o_/
 |
 VersioningIterator
 |
 | InjectIterator
 o/
 |
 OtherIterators
 |
 v
 ...

 (note: we're free to add iterators before the VersioningIterator too)

 Unless the order of iterators matters (e.g., the VersioningIterator
 position matters if InjectIterator generates an entry with the same row,
 colFamily and colQualifier as an entry in the table), the two styles will
 give the same results.

 This has implications on composability with other iterators, since
 downstream iterators would not see anything sent to the side channel but
 they would see things merged and returned by a MultiIterator.

 If the iterator is at the top level, then every iterator below it will see
 output from the top level iterator.  Did you mean composability with other
 iterators added at the top level?  If hypothetical iterator
 InjectIterator2 needs to see the results of InjectIterator, then we
 need to place InjectIterator2 below InjectIterator on the hierarchy,
 whether in Fig. A or Fig. B.

 For my particular situation, reading from another Accumulo table inside an
 iterator, I'm not sure which is better.  I like the idea of adding another
 data stream as a top-level source, but Fig. B is possible too.

 Regards,
 Dylan Hutchison


 On Mon, Feb 16, 2015 at 11:34 AM, Adam Fuchs scubafu...@gmail.com wrote:

 Dylan,

 If I recall correctly (which I give about 30% odds), the original purpose
 of the side channel was to split up things like delete tombstone entries
 from regular entries so that other iterators sitting on top of a
 bifurcating iterator wouldn't have to handle the special tombstone
 preservation logic. This worked in theory, but it never really caught on.
 I'm not sure any operational code is calling the registerSideChannel method
 right now, so you're sort of in pioneering territory. That said, this looks
 like it should work as you described it.

 Can you describe why you want to use a side channel instead of
 implementing the merge in your own iterator (e.g. subclassing MultiIterator
 and overriding the init method)? This has implications on composability
 with other iterators, since downstream iterators would not see anything
 sent to the side channel but they would see things merged and returned by a
 MultiIterator.

 Adam
  On Feb 16, 2015 3:18 AM, Dylan Hutchison dhutc...@stevens.edu wrote:

 If you can do a merge sort insertion, then you can guarantee order and
 it's fine.

 Yep, I guarantee the iterator we add as a side channel will emit tuples
 in sorted order.

 On a suggestion from David Medinets, I modified my testing code to use a
 MiniAccumuloCluster set to 2 tablet servers.  I then set a table split on
 row3 before launching the compaction.  The result looks good.  Here is
 output from a run on a local Accumulo instance.  Note that we write more
 values than we read.

 2015-02-16 02:44:51,125 [tserver.Tablet] DEBUG: Starting MajC k;row3
 (USER) [hdfs://localhost:9000/accumulo/tables/k/t-0g4/F0g5.rf] --
 hdfs://localhost:9000/accumulo/tables/k/t-0g4/A0g7.rf_tmp
  [name:InjectIterator, priority:15,
 class:edu.mit.ll.graphulo.InjectIterator, properties:{}]
 2015-02-16 02:44:51,127 [tserver.Tablet] DEBUG: Starting MajC k;row3
 (USER) [hdfs://localhost:9000/accumulo/tables/k/default_tablet/F0g6.rf]
 -- hdfs://localhost:9000/accumulo/tables/k/default_tablet/A0g8.rf_tmp
  [name:InjectIterator, priority:15,
 class:edu.mit.ll.graphulo.InjectIterator, properties:{}]
 2015-02-16 02:44:51,190 [tserver.Compactor] DEBUG: *Compaction k;row3
 2 read | 4 written* |111 entries/sec |  0.018 secs
 2015-02-16 02:44:51,194 [tserver.Compactor] DEBUG: *Compaction k;row3
 1 read | 4 written* | 43 entries/sec |  0.023 secs


 In addition, output from the DebugIterator looks as expected.  There is
 a re-seek after reading the first tablet to the key after the last entry
 returned in the first tablet.

 DEBUG:
 init(org.apache.accumulo.core.iterators.system.SynchronizedIterator@15085e63

Re: Keys with identical timestamps

2015-02-09 Thread Adam Fuchs
Hi Dave,

As long as your combiner is associative and commutative both of the
values should be represented in the combined result. The
non-determinism is really around ordering, which generally doesn't
matter for a combiner.

Adam

On Mon, Feb 9, 2015 at 3:49 PM, Dave Hardcastle
hardcastle.d...@gmail.com wrote:
 Hi,

 Could someone clarify whether the following statement from the manual - If
 two inserts are made into Accumulo with the same rowID, column, and
 timestamp, then the behavior is non-deterministic - applies even if the
 versioning iterator is off? Is the non-determinism the fact that the order
 is undetermined if two identical inserts are made and all versions are kept?

 I have an application where the key corresponds to an object and a time
 range, and the value is properties of the object over that time range. The
 time range is stored in the column qualifier, but I also put the end of the
 time range as the timestamp of the key. I frequently get data late, and so
 create a key and insert that, but that key may already exist in the table.
 When multiple identical versions get put in, the values are aggregated using
 a combiner. This seems to be working fine. But maybe I shouldn't be assuming
 that Accumulo won't silently drop one of the two keys?

 Thanks,

 Dave.


Re: hdfs cpu usage

2015-02-09 Thread Adam Fuchs
Ara,

What kind of query load are you generating within your batch scanners?
Are you using an iterator that seeks around a lot? Are you grabbing
many small batches (only a few keys per range) from the batch scanner?
As a wild guess, this could be the result of lots of seeks with a low
cache hit rate, which would induce CPU load in HDFS fetching blocks
and CPU load in Accumulo decrypting/decompressing those blocks. The
monitor page will show you seek rates and cache hit rates.

Adam


On Sat, Feb 7, 2015 at 8:48 PM, Ara Ebrahimi
ara.ebrah...@argyledata.com wrote:
 2.4.0.2.1.

 Yeah seems like I need to do that. I was hoping I’d get some advice based on
 prior experience with google cloud environment.

 Ara.

 On Feb 7, 2015, at 11:23 AM, Josh Elser josh.el...@gmail.com wrote:

 What version of Hadoop are you using?

 Have you considered hooking up a profiler to the Datanode on GCE to see
 where the time is being spent? That might help shed some light on the
 situation.

 Ara Ebrahimi wrote:

 Hi,

 We’re seeing some weird behavior from the hdfs daemon on google cloud
 environment when we use accumulo Scanner to sequentially scan a table. Top
 reports 200-300% cpu usage for the hdfs daemon. Accumulo is also around
 500%. iostat %util is low. avgrq-sz is low, rMB/s is low, there’s lots of
 free memory. It seems like something causes the hdfs daemon to consume a lot
 of cpu and not to send enough read requests to the disk (ssd actually, so
 disk is super fast and vastly under-utilized). The process which sends scan
 requests to accumulo is 500% active (using 3 query batch threads and
 aggressive scan-batch-size/read-ahead-threashold values). So it seems like
 somehow hdfs is the bottleneck. On another cluster we rarely see hdfs daemon
 going over 10% cpu usage. Any idea what the issue could be?

 Thanks,
 Ara.



 


 


Re: Seeking Iterator

2015-01-12 Thread Adam Fuchs
On Mon, Jan 12, 2015 at 4:10 PM, Josh Elser josh.el...@gmail.com wrote:
 seek()'ing doesn't always imply an increase in performance -- remember that
 RFiles (the files that back Accumulo tables) are composed of multiple
 blocks/sections with an index of them. A seek is comprised of using that
 index to find the block/section of the RFile and then a linear scan forward
 to find the first key for the range you seek()'ed to.

 Thus, if you're repeatedly re-seek()'ing within the same block, you'll waste
 a lot of time re-reading the same data. In your situation, it sounds like the
 cost of re-reading the data after a seek is about the same as naively
 consuming the records.

 You can try altering table.file.compress.blocksize (and then compacting your
 table) to see how this changes.
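
A sketch of that experiment through the Java API (the table name, the value,
and the Connector "conn" are placeholders):

    // Shrink the compressed block size, then rewrite the files so it takes effect.
    conn.tableOperations().setProperty("mytable", "table.file.compress.blocksize", "32K");
    conn.tableOperations().compact("mytable", null, null, true, false); // flush, don't block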


There is actually some fairly well-optimized code in the RFile seek
that minimizes the re-reading of RFile data and index blocks. Seeking
forward by one key adds a couple of key comparisons and function
calls, but that's about it. Incidentally, key comparisons are pretty
high up on my list of things that could use some performance
optimization.

Adam


Re: Accumulo available in Fedora 21

2014-12-15 Thread Adam Fuchs
Neato!

Adam


On Mon, Dec 15, 2014 at 3:25 PM, Christopher ctubb...@apache.org wrote:
 Accumulators,

 Fedora Linux now ships with Accumulo 1.6 packaged and available in its yum
 repositories, as of Fedora 21. Simply run yum install accumulo to get
 started. You can also just install sub-packages, as in yum install
 accumulo-tserver.

 You can get Fedora 21 at its website: https://getfedora.org/

 If you see any bugs that are Fedora-specific, file a bug here:
 https://bugzilla.redhat.com/. There are a few known issues right now.
 Namely, the monitor service is not yet packaged, due to licensing
 restrictions. And, you still have to configure Hadoop and ZooKeeper first.
 If anybody wants to help out with the Fedora packaging, let me know. Right
 now, I'm just trying to do it when I get some spare cycles.

 --
 Christopher L Tubbs II
 http://gravatar.com/ctubbsii


Re: comparing different rfile densities

2014-11-11 Thread Adam Fuchs
Jeff,

Density is an interesting measure here, because RFiles are going to
be sorted such that, even when the file is split between tablets, a
read of the file is going to be (mostly) a sequential scan. I think
instead you might want to look at a few other metrics: network
overhead, name node operation rates, and number of files per tablet.

The network overhead is going to be more an issue of locality than
density, so you'd have to do more than just have separate files per
tablet to optimize that. You'll need some way of specifying or hinting
at where the files should be generated. As an aside, we generally want
to shoot for probabilistic locality so that the aggregate traffic
over top level switches is much smaller than the total data processed
(small enough so that it isn't a bottleneck). This is generally a
little easier than guaranteeing that files are always read from a
local drive. You might be able to measure this by monitoring your
network usage, assuming those sensors are available to you. Accumulo
also prints out information on entries/s and bytes/s for compactions
in the debug logs.

Impact on the namenode is more or less proportional to the number of
blocks+files that you generate. As long as you're not generating a
large number of files that are smaller than your block size (64MB?
128MB?) you're probably going to be close to optimal here. I'm not
sure at what point the number of files+blocks becomes a bottleneck,
but I've seen it happen when generating a very large number of tiny
files. This is something that may cause you problems if you generate
50K files per ingest cycle rather than 500 or 5K. Measure this by
looking at the size of files that are being ingested.

Number of files per tablet has a big effect on performance -- much
more so than number of tablets per file. Query latency and aggregate
scan performance are directly proportional to the number of files per
tablet. Generating one file per tablet or one file per group of
tablets doesn't really change this metric. You can measure this by
scanning the metadata table as Josh suggested.

I'm very interested in this subject, so please let us know what you find.

Cheers,
Adam

On Tue, Nov 11, 2014 at 6:56 AM, Jeff Turner sjtsp2...@gmail.com wrote:
 is there a good way to compare the overall system effect of
 bulk loading different sets of rfiles that have the same data,
 but very different densities?

 i've been working on a way to re-feed a lot of data in to a table,
 and have started to believe that our default scheme for creating
 rfiles - mapred in to ~100-200 splits, sampled from 50k tablets -
 is actually pretty bad.  subjectively, it feels like rfiles that span
 300 or 400 tablets is bad in at least two ways for the tservers -
 until the files are compacted, all of the potential tservers have
 to check the file, right?  and then, during compaction, do portions
 of that rfile get volleyed around the cloud until all tservers
 have grabbed their portion?  (so, there's network overhead, repeatedly
 reading files and skipping most of the data, ...)

 if my new idea works, i will have a lot more control over the density
 of rfiles, and most of them will span just one or two tablets.

 so, is there a way to measure/simulate overall system benefit or cost
 of different approaches to building bulk-load data (destined for an
 established table, across N tservers, ...)?

 i guess that a related question would be are 1000 smaller and denser
 bulk files better than 100 larger bulk files produced under a typical
 getSplits() scheme?

 thanks,
 jeff


Re: Remotely Accumulo

2014-10-06 Thread Adam Fuchs
Accumulo tservers typically listen on a single interface. If you have a
server with multiple interfaces (e.g. loopback and eth0), you might have a
problem in which the tablet servers are not listening on externally
reachable interfaces. Tablet servers will list the interfaces that they are
listening to when they boot, and you can also use tools like lsof to find
them.

If that is indeed the problem, then you might just need to change your
conf/slaves file to use hostname instead of localhost, and then restart.

Adam
On Oct 6, 2014 4:27 PM, Geoffry Roberts threadedb...@gmail.com wrote:


 I have been happily working with Acc, but today things changed.  No errors

 Until now I ran everything server side, which meant the URL was
 localhost:2181, and life was good.  Today tried running some of the same
 code as a remote client, which means host name:2181.  Things hang when
 BatchWriter tries to commit anything and Scan hangs when it tries to
 iterate through a Map.

 Let's focus on the scan part:

 scan.fetchColumnFamily(new Text(colfY)); // This executes then hangs.
 for (Entry<Key,Value> entry : scan) {
     def row = entry.getKey().getRow();
     def value = entry.getValue();
     println "value=" + value;
 }

 This is what appears in the console :

 17:22:39.802 C{0} M DEBUG org.apache.zookeeper.ClientCnxn - Got ping
 response for sessionid: 0x148c6f03388005e after 21ms

 17:22:49.803 C{0} M DEBUG org.apache.zookeeper.ClientCnxn - Got ping
 response for sessionid: 0x148c6f03388005e after 21ms

 and on and on


 The only difference between success and a hang is a URL change, and of
 course being remote.

 I don't believe this is a firewall issue.  I shutdown the firewall.

 Am I missing something?

 Thanks all.

 --
 There are ways and there are ways,

 Geoffry Roberts



Re: Compaction slowing queries

2014-09-11 Thread Adam Fuchs
Paul,

Here are a few suggestions:

1. Reduce the number of concurrent compaction threads
(tserver.compaction.major.concurrent.max, and
tserver.compaction.minor.concurrent.max). You probably want to lean
towards twice as many major compaction threads as minor, but that
somewhat depends on how bursty your ingest rate is. The total number
of threads should leave plenty of cores for query processing (see the sketch below).

2. Look into using a different compression codec. Snappy or LZ4 can
support a much higher throughput than the default of gzip, although
the compression ratio will not be as good.

3. Consider a key choice that limits the number of actively ingesting
tablets. Writing across all ~100k tablets means they will all be
actively compacting, but if you can arrange your keys such that only
~1k tablets are being actively written to then you can significantly
cut your expected write amplification (i.e. number of major
compactions needed). This is because minor compactions will be larger
and you'll spend proportionally more time writing into smaller
tablets.
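
Suggestion 1, as a sketch (the thread counts and the Connector "conn" are
placeholders; these are system-wide properties that can also be set in
accumulo-site.xml or the shell):

    conn.instanceOperations().setProperty("tserver.compaction.major.concurrent.max", "12");
    conn.instanceOperations().setProperty("tserver.compaction.minor.concurrent.max", "6");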

Cheers,
Adam


On Thu, Sep 11, 2014 at 12:06 PM, pdread paul.r...@siginttech.com wrote:

 We have 100+ tablet servers, approx 860 tablets/server, ingest approx 300K+
 docs/day, the problem recently started that queries during a minor or major
 compaction are taking about 100+ seconds as opposed to about 2 seconds when
 no compaction. Everyone on the cluster is affected, mapreduce jobs and batch
 scanners.

 One table has as many as 65K tablets.

 In the hopes of reducing the compactions yesterday we changed on 2 tables
 that appeared to cause most of the compactions:

 compaction.ratio from 3 to 5
 table.file.max from 15 to 45
 split.threshold from 725M to 2G.

 tservers are set to 3G, top shows 6G res and 7G virt for the one I checked.

 The odd things is we expected the number of tablets to change and they did
 not. The only thing that happened was the number of compactions went up but
 the duration of the compactions went down by about half. Queries in off
 times did not seem to change.

 One more thing, we only store docs < 64M in accumulo, otherwise they are
 written directly to hdfs.

 The question would be, is there a way to reduce the compaction frequency and
 or duration?

 Thanks in advance.

 Paul



 --
 View this message in context: 
 http://apache-accumulo.1065345.n5.nabble.com/Compaction-slowing-queries-tp11278.html
 Sent from the Users mailing list archive at Nabble.com.


Re: Compaction slowing queries

2014-09-11 Thread Adam Fuchs
You can change compression codecs at any time on a per-table basis. This
only affects how new files are written. Existing files will still be read
the same way. See the table.file.compress.type parameter.

One caveat is that you need to make sure your codec is supported before
switching to it or compactions will start failing. You might want to try it
on a test table first, testing insert and flush operations after
configuring the codec.
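
A sketch of that test (the table name, dummy data, and the Connector "conn"
are placeholders):

    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.client.BatchWriterConfig;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.io.Text;

    // Switch the codec on a throwaway table, write a little data, and flush so
    // a file is actually written with the new codec.
    conn.tableOperations().setProperty("testtable", "table.file.compress.type", "snappy");
    BatchWriter bw = conn.createBatchWriter("testtable", new BatchWriterConfig());
    Mutation m = new Mutation(new Text("testrow"));
    m.put(new Text("cf"), new Text("cq"), new Value("v".getBytes()));
    bw.addMutation(m);
    bw.close();
    conn.tableOperations().flush("testtable", null, null, true);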

Adam
On Sep 11, 2014 1:00 PM, pdread paul.r...@siginttech.com wrote:

 Adam


 Quick question if I may, your comment about compression util #2, are these
 other compression tools compatible with gzip? We have 10s of millions of
 docs already loaded and of course do not want to reload.

 I will try #1, say 6 threads for minor and 12 threads for major. I checked
 and the servers are 24 CPUs and the average load time is nil.

 Your suggestions #3 will not work since we would have to re-index all of
 our
 docs which is not going to work.

 Thanks

 Paul





 --
 View this message in context:
 http://apache-accumulo.1065345.n5.nabble.com/Compaction-slowing-queries-tp11278p11280.html
 Sent from the Users mailing list archive at Nabble.com.



Re: Advice on increasing ingest rate

2014-04-09 Thread Adam Fuchs
If the average is around 1k per k/v entry, then I would say that 400MB/s is
very good performance for incremental/streaming ingest into Accumulo on
that cluster. However, I suspect that your entries are probably not that
big on average. Do you have a measurement for MB/s ingest?

Adam
On Apr 9, 2014 4:42 PM, Mike Hugo m...@piragua.com wrote:




 On Tue, Apr 8, 2014 at 4:35 PM, Adam Fuchs afu...@apache.org wrote:

 MIke,

 What version of Accumulo are you using, how many tablets do you have, and
 how many threads are you using for minor and major compaction pools? Also,
 how big are the keys and values that you are using?


 1.4.5
 6 threads each for min and major compaction
 Keys and values are not that large, there may be a few outliers but I
 would estimate that most of them are < 1k



 Here are a few settings that may help you:
 1. WAL replication factor (tserver.wal.replication). This defaults to 3
 replicas (the HDFS default), but if you set it to 2 it will give you a
 performance boost without a huge hit to reliability.
 2. Ingest buffer size (tserver.memory.maps.max), also known as the
 in-memory map size. Increasing this generally improves the efficiency of
 minor compactions and reduces the number of major compactions that will be
 required down the line. 4-8 GB is not unreasonable.
 3. Make sure your WAL settings are such that the size of a log
 (tserver.walog.max.size) multiplied by the number of active logs
 (table.compaction.minor.logs.threshold) is greater than the in-memory map
 size. You probably want to accomplish this by bumping up the number of
 active logs.
 4. Increase the buffer size on the BatchWriter that the clients use. This
 can be done with the setBatchWriterOptions method on the
 AccumuloOutputFormat.
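
For reference, item 4 in the 1.5+ client API might look like this sketch (the
Job instance and the numbers are placeholders):

    import java.util.concurrent.TimeUnit;
    import org.apache.accumulo.core.client.BatchWriterConfig;
    import org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat;

    BatchWriterConfig bwConfig = new BatchWriterConfig();
    bwConfig.setMaxMemory(256 * 1024 * 1024L); // bigger client-side buffer
    bwConfig.setMaxWriteThreads(8);            // more send threads per client
    bwConfig.setMaxLatency(2, TimeUnit.MINUTES);
    AccumuloOutputFormat.setBatchWriterOptions(job, bwConfig);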


 Thanks for the tips, I try these out


 Cheers,
 Adam



 On Tue, Apr 8, 2014 at 4:47 PM, Mike Hugo m...@piragua.com wrote:

 Hello,

 We have an ingest process that operates via Map Reduce, processing a
 large set of XML files and  inserting mutations based on that data into a
 set of tables.

 On a 5 node cluster (each node has 64G ram, 20 cores, and ~600GB SSD) I
 get 400k inserts per second with 20 mapper tasks running concurrently.
  Increasing the number of concurrent mapper tasks to 40 doesn't have any
 effect (besides causing a little more backup in compactions).

 I've increased the table.compaction.major.ratio and increased the number
 of concurrent allowed compactions for both minor and major compaction but each
 of those only had negligible impact on ingest rates.

 Any advice on other settings I can tweak to get things to move more
 quickly?  Or is 400k/second a reasonable ingest rate?  Are we at a point
 where we should consider generating r files like the bulk ingest example?

 Thanks in advance for any advice.

 Mike






Re: HDFS caching w/ Accumulo?

2014-02-26 Thread Adam Fuchs
Maybe this could be used to speed up WAL recovery for use cases that demand
really high availability and low latency?

Adam
On Feb 25, 2014 10:50 AM, Donald Miner dmi...@clearedgeit.com wrote:

 HDFS caching is part of the new Hadoop 2.3 release. From what I
 understand, it allows you to mark specific files to be held in memory for
 faster reads.

 Has anyone thought about how Accumulo could leverage this?



Re: WAL - rate limiting factor x4.67

2013-12-04 Thread Adam Fuchs
One thing you can do is reduce the replication factor for the WAL. We have
found that makes a pretty significant difference in write performance. That
can be modified with the tserver.wal.replication property. Setting it to 2
instead of the default (probably 3) should give you some performance
improvement, of course at some cost to durability.
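
As a sketch (the Connector "conn" is assumed; whether a live change takes
effect without restarting tablet servers may vary, and it can also be set in
accumulo-site.xml):

    conn.instanceOperations().setProperty("tserver.wal.replication", "2");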

Adam


On Wed, Dec 4, 2013 at 5:14 AM, Peter Tillotson slatem...@yahoo.co.ukwrote:

 I've been trying to get the most out of streaming data into Accumulo 1.5
 (Hadoop Cloudera CDH4). Having tried a number of settings, re-writing
 client code etc I finally switched off the Write Ahead Log
 (table.walog.enabled=false) and saw a huge leap in ingest performance.

 Ingest with table.walog.enabled= true:   ~6 MB/s
 Ingest with table.walog.enabled= false:  ~28 MB/s

 That is a factor of about x4.67 speed improvement.

 Now my use case could probably live without or work around not having a
 wal, but I wondered if this was a known issue??
 (didn't see anything in jira), wal seem to be a significant rate limiter
 this is either endemic to Accumulo or an HDFS / setup issue. Though given
 everything is in HDFS these days and otherwise IO flies it looks like
 Accumulo WAL is the most likely culprit.

 I don't believe this to be an IO issue on the box; with wal off there is
 significantly more IO (up to 80M/s reported by dstat), with wal on (up to
 12M/s reported by dstat). Testing the box with FIO sequential write is
 160M/s.

 Further info:
 Hadoop 2.00 (Cloudera cdh4)
 Accumulo (1.5.0)
 Zookeeper ( with Netty, minor improvement of 1MB/s  )
 Filesystem ( HDFS is ZFS, compression=on, dedup=on, otherwise ext4 )

 With large imports from scratch now I start off CPU bound and as more
 shuffling is needed this becomes Disk bound later in the import as
 expected. So I know pre-splitting would probably sort it.

 Tnx

 P



Re: Efficient Tablet Merging [SEC=UNOFFICIAL]

2013-10-03 Thread Adam Fuchs
Never underestimate the power of ascii art!

Adam
On Oct 2, 2013 11:28 PM, Eric Newton eric.new...@gmail.com wrote:

 I'll use ASCII graphics to demonstrate the size of a tablet.

 Small: []
 Medium: [ ]
 Large: [  ]

 Think of it like this... if you are running age-off... you probably have
 lots of little buckets of rows at the beginning and larger buckets at the
 end:

 [][][][][][][][][]...[ ][ ][ ][ ][ ][  ][  ][][][][][ ][]

 What you probably want is something like this:

 [   ][   ][   ][   ][   ][   ][   ][
 ]

 Some big bucket at the start, with old data, and some larger buckets for
 everything afterwards.  But... this would probably work:

 [   ][   ][   ][   ][   ][   ][   ][   ][
   ]

 Just a bunch of larger tablets throughout.

 So you need to set your merge size to [  ] (4G), and you can always
 keep creating smaller tablets for future rows with manual splits:

 [   ][   ][   ][   ][   ][   ][   ][   ][  ][  ][  ][  ][  ][  ]


 So increase the split threshold to 4G, and merge on 4G, but continue to
 make manual splits for your current days, as necessary.  Merge them away
 later.
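
Roughly, through the Java API, the conservative path might look like the sketch
below. The table name, dates, and split points are placeholders, and the
size-based merge is only exposed in the shell (merge -s), so this shows a
by-range merge of one old day plus manual splits for the current day:

    import java.util.SortedSet;
    import java.util.TreeSet;
    import org.apache.hadoop.io.Text;

    // Raise the split threshold so merged tablets don't immediately re-split.
    conn.tableOperations().setProperty("mytable", "table.split.threshold", "4G");
    // Merge one old day's range of small tablets back together.
    conn.tableOperations().merge("mytable", new Text("20130901-0000"), new Text("20130901-0840"));
    // Keep adding manual splits for the current day so ingest stays spread out.
    SortedSet<Text> todaysSplits = new TreeSet<Text>();
    for (int i = 0; i <= 840; i += 10)
      todaysSplits.add(new Text(String.format("20131003-%04d", i)));
    conn.tableOperations().addSplits("mytable", todaysSplits);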


 -Eric




 On Wed, Oct 2, 2013 at 6:35 PM, Dickson, Matt MR 
 matt.dick...@defence.gov.au wrote:


 *UNOFFICIAL*
 Thanks Eric,

 If I do the merge with size of 4G does the split threshold need to be
 increased to the 4G also?

  --
 *From:* Eric Newton [mailto:eric.new...@gmail.com]
 *Sent:* Wednesday, 2 October 2013 23:05
 *To:* user@accumulo.apache.org
 *Subject:* Re: Efficient Tablet Merging [SEC=UNOFFICIAL]

  The most efficient way is kind of scary.  If this is a production
 system, I would not recommend it.

 First, find out the size of your 10x tablets.  Let's say it's 10G.  Set
 your split threshold to 10G.  Then merge all old tablets all of them
 into one tablet.  This will dump thousands of files into a single tablet,
 but it will soon split out again into the nice 10G tablets you are looking
 for.  The system will probably be unusable during this operation.

 The more conservative way is to specify the merge in single steps (the
 master will only coordinate a single merge on a table at a time anyhow).
  You can do it by range or by size... I would do it by size, especially if
 you are aging off your old data.

 Compacting the data won't have any effect on the speed of the merge.

 -Eric



 On Tue, Oct 1, 2013 at 11:58 PM, Dickson, Matt MR 
 matt.dick...@defence.gov.au wrote:


 *UNOFFICIAL*
 I have a table for which we create splits of the form yyyymmdd-nnnn, where nnnn
 ranges from 0000 to 0840.  The bulk of our data is loaded for the
 current date with no data loaded for days older than 3 days so from my
 understanding it would be wise to merge splits older than 3 days in order
 to reduce the overall tablet count.  It would still be optimal to
 maintain some distribution of tablets for a day across the cluster so I'm
 looking at merging splits in 10 increments eg, merge -b 20130901- -e
 20130901-0009, therefore reducing 840 splits per day to 84.

 Currently we have 120K tablets (size 1G) on a cluster of 56 nodes and
 our ingest has slowed as the data quantity and tablet count has grown.
 Initially we were achieving 200-300K, now 50-100K.

 My question is, what is the best way to do this merge?  Should we use
 the merge command with the size option set at something like 5G, or maybe
 use the compaction command?

 From my tests this process could take some time so I'm keen to
 understand the most efficient approach.

 Thanks in advance,
 Matt Dickson






Re: My Accumulo 1.5.0 instance has no tablet servers

2013-10-01 Thread Adam Fuchs
To follow up on this, I think maybe the config should be
<name>dfs.datanode.synconclose</name>, not <name>dfs.data.synconclose</name>.
Was that a typo, Eric?

Thanks,
Adam



On Thu, Sep 12, 2013 at 2:31 PM, Eric Newton eric.new...@gmail.com wrote:

 Add:

   <property>
     <name>dfs.support.append</name>
     <value>true</value>
   </property>
   <property>
     <name>dfs.data.synconclose</name>
     <value>true</value>
   </property>

 To hdfs-site.xml in your hadoop configuration.

 -Eric



 On Thu, Sep 12, 2013 at 2:27 PM, Pete Carlson pgcarl...@gmail.com wrote:

 Ok, so now that I have an Accumulo monitor I discovered that my Accumulo
 instance doesn't have any tablet servers.

 Here is what I tried so far to resolve the issue:

 1) Looked in the tserver_localhost.localdomain.log file, and found this
 FATAL message:

 2013-09-12 08:09:42,273 [tabletserver.TabletServer] FATAL: Must set
 dfs.durable.sync OR dfs.support.append to true.  Which one needs to be set
 depends on your version of HDFS.  See ACCUMULO-623.
 HADOOP RELEASE  VERSION   SYNC NAME DEFAULT
 Apache Hadoop   0.20.205  dfs.support.appendfalse
 Apache Hadoop0.23.x   dfs.support.appendtrue
 Apache Hadoop 1.0.x   dfs.support.appendfalse
 Apache Hadoop 1.1.x   dfs.durable.sync  true
 Apache Hadoop  2.0.0-2.0.2dfs.support.appendtrue
 Cloudera CDH 3u0-3u3    true
 Cloudera CDH   3u4dfs.support.appendtrue
 Hortonworks HDP   `1.0dfs.support.appendfalse
 Hortonworks HDP   `1.1dfs.support.appendfalse
 2013-09-12 11:54:00,752 [server.Accumulo] INFO : tserver starting
 2013-09-12 11:54:00,768 [server.Accumulo] INFO : Instance
 d57cdc38-8ceb-4192-9da3-1ce2664df33b
 2013-09-12 11:54:00,771 [server.Accumulo] INFO : Data Version 5
 2013-09-12 11:54:00,771 [server.Accumulo] INFO : Attempting to talk to
 zookeeper
 2013-09-12 11:54:00,952 [server.Accumulo] INFO : Zookeeper connected and
 initialized, attemping to talk to HDFS
 2013-09-12 11:54:00,956 [server.Accumulo] INFO : Connected to HDFS
 2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.cycle.delay = 5m
 2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.cycle.start = 30s
 2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.port.client = 50091
 2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.threads.delete = 16
 2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.trash.ignore = false

 I saw this same FATAL message 8 times in the 
 tserver_localhost.localdomain.log
 between blocks of INFO messages, but no other fatal or warn messages.
 Btw, this FATAL message also appears in my
 tserver_localhost.localdomain.debug.log file.

 When I googled this Fatal message I found this page:

 http://mail-archives.apache.org/mod_mbox/accumulo-user/201304.mbox/%3c515f5518.1090...@gmail.com%3E
  with
 the same WARN: There are no tablet servers: check that zookeeper and
 accumulo are running. message.

 I checked http://127.0.0.1:50095/tservers, and it showed that there were
 no tablet servers online. I looked at http://127.0.0.1:50095/log, and
 saw the following messages:

 FATAL: Must set dfs.durable.sync or dfs.support.append to true. Which one
 needs to be set depends on your version of HDFS. See Accumulo-623.

 WARN: There are no tablet servers: check that zookeeper and accumulo are
 running.

 Using the info from the page I referenced above, I checked my
 $ACCUMULO_HOME path and realized that I hadn't set that in the
 conf/accumulo-env.sh

 So, I set it to the following:

 test -z $ACCUMULO_HOME && export
 ACCUMULO_HOME=/home/accumulo/accumulo-1.5.0

 When I did an echo of $ACCUMULO_HOME it didn't return anything, so I also
 tried setting it in my bash profile to see if that made any difference (it
 didn't).

 I also looked in the lib directory but didn't see any stray jars.

 In my tracer_localhost_localdomain.log I saw the following Exception with
 Zookeeper:

 2013-09-11 16:09:48,649 [impl.ServerClient] WARN : There are no tablet
 servers: check that zookeeper and accumulo are running.
 2013-09-11 18:02:23,385 [zookeeper.ZooCache] WARN : Zookeeper error, will
 retry
 org.apache.zookeeper.KeeperException$SessionExpiredException:
 KeeperErrorCode = Session expired for
 /accumulo/d57cdc38-8ceb-4192-9da3-1ce2664df33b/tservers
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
 at org.apache.accumulo.fate.zookeeper.ZooCache$1.run(ZooCache.java:167)
 at org.apache.accumulo.fate.zookeeper.ZooCache.retry(ZooCache.java:130)
 at
 org.apache.accumulo.fate.zookeeper.ZooCache.getChildren(ZooCache.java:178)
 at
 org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:140)
 at
 

Re: Trouble with IntersectingIterator

2013-10-01 Thread Adam Fuchs
Heath,

In your case, the question that you are effectively asking is "within each
partition, which documents' index entries include all of the given terms?"
Since you have partitions aligned by field and only a single index entry
per field you will not get any matches for queries with more than one term.
You can't ask a question that correlates index entries that cross a
partition boundary with the IntersectingIterator. For example, document
m1 has the index entry for habelson in the sender partition, but the
index entry for mgiordano is in the receiver partition.

Another thing you might try is to partition by field within the document
partitions. You can hack this together by building something like the
following, with p1 = {m1,m2,m3} and p2 = {m4,m5}:

p1 receiver_habelson:m3 []habelson
p1 receiver_jmarcolla:m2 []jmarcolla
p1 receiver_mgiordano:m1 []mgiordano
p1 sender_habelson:m1 []habelson
p1 sender_habelson:m2 []habelson
p1 sender_mgiordano:m3 []mgiordano
p1 sentTime_1380571500:m1 []1380571500
p1 sentTime_1380571502:m2 []1380571502
p1 sentTime_1380571504:m3 []1380571504
p1 subject_Lunch:m1 []Lunch
p1 subject_Lunch:m2 []Lunch
p1 subject_Lunch:m3 []Lunch
p2 receiver_habelson:m5 []habelson
p2 receiver_mcross:m4 []mcross
p2 sender_habelson:m4 []habelson
p2 sender_mcross:m5 []mcross
p2 sentTime_1380571506:m4 []1380571506
p2 sentTime_1380571508:m5 []1380571508
p2 subject_Lunch:m4 []Lunch
p2 subject_Lunch:m5 []Lunch

Here terms are prefixed by field_, and you can do queries for things like
{sender_habelson, receiver_mgiordano}.

Adam




On Tue, Oct 1, 2013 at 4:13 PM, Heath Abelson habel...@netcentricinc.comwrote:

  Looking at this example, the index and record do not occur in the same
 row. This seems to be more related to the IndexedDocIterator.


 If we take my “mail” object as my document, and think of it as being
 partitioned by field name rather than some hash, It seems to me like the
 use of this iterator could still apply.


 *From:* William Slacum [mailto:wilhelm.von.cl...@accumulo.net]
 *Sent:* Tuesday, October 01, 2013 3:48 PM
 *To:* user@accumulo.apache.org
 *Subject:* Re: Trouble with IntersectingIterator


 That iterator is designed to be used with a sharded table format, wherein
 the index and record each occur within the same row. See the Accumulo
 examples page http://accumulo.apache.org/1.4/examples/shard.html


 On Tue, Oct 1, 2013 at 3:35 PM, Heath Abelson habel...@netcentricinc.com
 wrote:

 I am attempting to get a very simple example working with the Intersecting
 Iterator. I made up some dummy objects for me to do this work:

  

 A scan on the “Mail” table looks like this:

  

 m1 mail:body [U(USA)]WTF?

 m1 mail:receiver [U(USA)]mgiordano

 m1 mail:sender [U(USA)]habelson

 m1 mail:sentTime [U(USA)]1380571500

 m1 mail:subject [U(USA)]Lunch

 m2 mail:body [U(USA)]I know right?

 m2 mail:receiver [U(USA)]jmarcolla

 m2 mail:sender [U(USA)]habelson

 m2 mail:sentTime [U(USA)]1380571502

 m2 mail:subject [U(USA)]Lunch

 m3 mail:body [U(USA)]exactly!

 m3 mail:receiver [U(USA)]habelson

 m3 mail:sender [U(USA)]mgiordano

 m3 mail:sentTime [U(USA)]1380571504

 m3 mail:subject [U(USA)]Lunch

 m4 mail:body [U(USA)]Dude!

 m4 mail:receiver [U(USA)]mcross

 m4 mail:sender [U(USA)]habelson

 m4 mail:sentTime [U(USA)]1380571506

 m4 mail:subject [U(USA)]Lunch

 m5 mail:body [U(USA)]Yeah

 m5 mail:receiver [U(USA)]habelson

 m5 mail:sender [U(USA)]mcross

 m5 mail:sentTime [U(USA)]1380571508

 m5 mail:subject [U(USA)]Lunch

  

 A scan on the “MailIndex” table looks like this:

  

 receiver habelson:m3 []habelson

 receiver habelson:m5 []habelson

 receiver jmarcolla:m2 []jmarcolla

 receiver mcross:m4 []mcross

 receiver mgiordano:m1 []mgiordano

 sender habelson:m1 []habelson

 sender habelson:m2 []habelson

 sender habelson:m4 []habelson

 sender mcross:m5 []mcross

 sender mgiordano:m3 []mgiordano

 sentTime 1380571500:m1 []1380571500

 sentTime 1380571502:m2 []1380571502

 sentTime 1380571504:m3 []1380571504

 sentTime 1380571506:m4 []1380571506

 sentTime 1380571508:m5 []1380571508

 subject Lunch:m1 []Lunch

 subject Lunch:m2 []Lunch

 subject Lunch:m3 []Lunch

 subject Lunch:m4 []Lunch

 subject Lunch:m5 []Lunch

  

 If I use an IntersectingIterator with a BatchScanner and pass it the terms
 “habelson”,”mgiordano” (or seemingly any pair of terms) I get zero results.
 If, instead, I use the same value as both terms (i.e.
 “habelson”,”habelson”) I properly get back the 

RE: Assigned and hosted Error [SEC=UNOFFICIAL]

2013-09-30 Thread Adam Fuchs
Matt,

Did you include any patches that have not been committed to the 1.5 branch
in your snapshot?

Adam
On Sep 30, 2013 6:25 PM, Dickson, Matt MR matt.dick...@defence.gov.au
wrote:

 **

 *UNOFFICIAL*
 1.5.1-SNAPSHOT from 20/09/13.

  --
 *From:* Sean Busbey [mailto:bus...@cloudera.com]
 *Sent:* Tuesday, 1 October 2013 08:07
 *To:* Accumulo User List
 *Subject:* Re: Assigned and hosted Error [SEC=UNOFFICIAL]

  Hi Matt!

 What version of Accumulo are you using?

 --
 Sean
 On Sep 30, 2013 2:54 PM, Dickson, Matt MR matt.dick...@defence.gov.au
 wrote:

 **

 *UNOFFICIAL*
 Hi,

 I'm getting a BadLocationStateException stating '3n;*nnn;nnn *is
 both assigned and hosted, which should never happen: 3n;*nnn;nnn
 @*(160.45.45.33:9997[2323423aeb], 160.45.45.33:9997[0],null)'

 The Accumulo console is unable to display any table details.  I have
 tried restarting Accumulo with no success and the logs contain an 'INFO
 : Failed to obtain problem reports' message.

 Has anyone come across this?

 Thanks in advance,
 Matt Dickson




Re: BatchWriter performance on 1.4

2013-09-19 Thread Adam Fuchs
The addMutations method blocks when the client-side buffer fills up, so you
may see a lot of time spent in that method due to a bottleneck downstream.
There are a number of things you could try to speed that up. Here are a few:
1. Increase the BatchWriter's buffer size. This can smooth out the network
utilization and increase efficiency.
2. Increase the number of threads that the BatchWriter uses to process
mutations. This is particularly useful if you have more tablet servers than
ingest clients.
3. Use a more efficient encoding. The more data you put through the
BatchWriter, the longer it will take, even if that data compresses well at
rest.
4. If you are seeing hold time show up on your tablet servers (displayed
through the monitor page) you can increase the memory.maps.max to make
minor compactions more efficient.
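
As a rough sketch of points 1 and 2 above against the 1.4 API (the numbers are illustrative starting points, not recommendations, and the latency argument is assumed to be in milliseconds):

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.Connector;

public class WriterTuning {
  // A larger client-side buffer (point 1) and more send threads (point 2).
  public static BatchWriter createTunedWriter(Connector connector, String table) throws Exception {
    long maxMemory = 100L * 1024 * 1024; // 100 MB client-side buffer
    long maxLatencyMs = 60L * 1000;      // flush at least once per minute
    int writeThreads = 8;                // threads sending mutations to tablet servers
    return connector.createBatchWriter(table, maxMemory, maxLatencyMs, writeThreads);
  }
}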

Cheers,
Adam
On Sep 18, 2013 10:08 PM, Slater, David M. david.sla...@jhuapl.edu
wrote:

 Hi, I’m running a single-threaded ingestion program that takes data from
 an input source, parses it into mutations, and then writes those mutations
 (sequentially) to four different BatchWriters (all on different tables).
 Most of the time (95%) taken is on adding mutations, e.g.
 batchWriter.addMutations(mutations); I am wondering how to reduce the time
 taken by these methods. 


 1) For the method batchWriter.addMutations(Iterable<Mutation>), does it
 matter for performance whether the mutations returned by the iterator are
 sorted in lexicographic order? 


 2) If the Iterable<Mutation> that I pass to the BatchWriter is very large,
 will I need to wait for a number of Batches to be written and flushed
 before it will finish iterating, or does it transfer the elements of the
 Iterable to a different intermediate list?


 3) If that is the case, would it then make sense to spawn off short
 threads for each time I make use of addMutations?


 At a high level, my code looks like this:


 BatchWriter bw1 = connector.createBatchWriter(…)

 BatchWriter bw2 = …

 …

 while(true) {

 String[] data = input.getData();

 List<Mutation> mutations1 = parseData1(data);

 List<Mutation> mutations2 = parseData2(data);

 …

 bw1.addMutations(mutations1);

 bw2.addMutations(mutations2);

 …

 }

 

 Thanks,
 David



Re: Getting the IP Address

2013-08-28 Thread Adam Fuchs
Seems like a question as common and complex as which IP address to listen on
would have a fair amount of precedent in open-source projects that we could
pull from. Are we reinventing the wheel? Does anyone have an example of an
application like ours with the same set of supported platforms that has
already solved this problem and whose solution you like? Are there elements
of what we do that make us better/worse/different than something like the
scripting and networking code built for HBase or HDFS?

Adam



On Wed, Aug 28, 2013 at 1:35 PM, Keith Turner ke...@deenlo.com wrote:




 On Wed, Aug 28, 2013 at 4:26 PM, Christopher ctubb...@apache.org wrote:

 Ah, you're right, of course.

 In that case, I'm also wondering about NAT situations and other
 strange networking situations. For those especially, it seems what we
 need to do is treat the bind address differently from the advertised
 address.

 Perhaps attempting to use $(hostname -i) and falling back to
 $(hostname -I | head -1) would be best?


 I just noticed one wrinkle with hostname -I,  it may return IPV6
 addresses.   When I first looked at the man page, I thought it would
 exclude IPV6.  But on closer inspection I noticed it excludes IPv6
 link-local addresses.  So hostname -I will probably cause problems if the
 first thing it returns is an IPv6 addr.



 --
 Christopher L Tubbs II
 http://gravatar.com/ctubbsii


 On Wed, Aug 28, 2013 at 3:03 PM, John Vines vi...@apache.org wrote:
  Christopher,
 
  It's not a matter of determining which port to bind to. It's for
 recording
  it's location in zookeeper so other nodes can find it.
 
 
  On Wed, Aug 28, 2013 at 3:00 PM, Christopher ctubb...@apache.org
 wrote:
 
  I'm not sure this is even very portable. It relies on a specific
  ifconfig display format intended for human-readability, and I'm not
  sure that's entirely guaranteed to be static over time. It also won't
  work if there are multiple public interfaces. It also don't think it
  works for infiniband or other interface types that have issues in
  ifconfig.
 
  I think we have to make *some* assumptions that things like
  networking is properly configured using standard utilities for
  name-mapping (like DNS or /etc/hosts). I think it's more confusing for
  sysadmins if we have these sorts of automatic behaviors that are
  non-standard and unexpected (like automatically binding to a single,
  arbitrarily chosen, public IP out of the box).
 
  Honestly, though, I'm not sure why we need to be resolving public IP
  addresses *at all*. It should be configured explicitly, and bind to
  either 127.0.0.1 or 0.0.0.0 by default (to satisfy the ease for
  first-time users).
 
 
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
 
 
  On Wed, Aug 28, 2013 at 1:54 PM, John Vines vi...@apache.org wrote:
   We use this similar logic throughout a lot of our scripts for
   determining
   the external facing IP address in a portable manner, it's just that
 the
   init.d scripts are a bit more strict about it. This is the
 equivalent of
   using the name defined in the slaves/masters/tracers/etc. files to
   determine
   which port to report as.
  
   Switching to a system that depends on DNS to succeed will fail for
 all
   first
   time users, which is a penalty that will not be worth it. If someone
 can
   find a better way to determine outward facing IP address I would
 love to
   have it, but unfortunately networks are hard.
  
  
   On Wed, Aug 28, 2013 at 1:44 PM, Billie Rinaldi
   billie.rina...@gmail.com
   wrote:
  
   Good point.  I don't care if the init.d scripts work on a Mac.  I do
   care
   about the other scripts, though.
  
  
   On Wed, Aug 28, 2013 at 10:32 AM, Christopher ctubb...@apache.org
   wrote:
  
   But... it shouldn't be a supported platform for init scripts... I
   imagine.
  
   --
   Christopher L Tubbs II
   http://gravatar.com/ctubbsii
  
  
   On Wed, Aug 28, 2013 at 1:03 PM, Billie Rinaldi
   billie.rina...@gmail.com wrote:
It's a supported development platform.  =)
   
   
On Wed, Aug 28, 2013 at 9:59 AM, Sean Busbey 
 bus...@cloudera.com
wrote:
   
hostname -i does not work on a Mac ( 10.8.4 )
   
Is Mac a supported platform?
   
   
On Wed, Aug 28, 2013 at 11:53 AM, Eric Newton
eric.new...@gmail.com
wrote:
   
Does hostname -i work on a mac?  Not being a mac user, I
 can't
check.
   
-Eric
   
   
   
On Wed, Aug 28, 2013 at 11:38 AM, Ravi Mutyala
r...@hortonworks.com
wrote:
   
Hi,
   
I see from the accumulo-tracer init.d script that IP is
determined
by
this logic.
   
ifconfig | grep inet[^6] | awk '{print $2}' | sed 's/addr://'
 |
grep
-v
0.0.0.0 | grep -v 127.0.0.1 | head -n 1
   
   
Any reason for using this logic instead of a hostname -i and
using
reverse dns lookup? I have a cluster where the order of nics
 on
one
of the
nodes is in a different order and ifconfig returns a IP from a
   

Re: master fails to start

2013-05-21 Thread Adam Fuchs
Chris,

Did you copy the conf/accumulo.policy.example to conf/accumulo.policy? If
so, you may need to make some changes to account for changes to hadoop
security. I suspect the problem is that the codebase
file:${hadoop.home.dir}/lib/* reference doesn't include your CDH3
libraries. You could modify that codebase to include the hadoop libs, or
you could disable the security policy by removing the conf/accumulo.policy
file.

Adam



On Mon, May 20, 2013 at 4:14 PM, Chris Retford chris.retf...@gmail.com wrote:

 I searched the archive before posting and didn't find anything. I have a
 new system with 12 nodes (3 ZK), and a single user in the hadoop group. The
 master fails to start. It looks to me like it is unable to read
 /accumulo/instance_id in HDFS, but I can't think why that would be. Thanks
 in advance for any advice on how to run this down. Here are the contents of
 master.err log:

 Thread master died null
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at org.apache.accumulo.start.Main$1.run(Main.java:89)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.ExceptionInInitializerError
 at
 org.apache.hadoop.security.UserGroupInformation.clinit(UserGroupInformation.java:469)
 at
 org.apache.hadoop.fs.FileSystem$Cache$Key.init(FileSystem.java:1757)
 at
 org.apache.hadoop.fs.FileSystem$Cache$Key.init(FileSystem.java:1750)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1618)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:255)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
 at
 org.apache.accumulo.core.file.FileUtil.getFileSystem(FileUtil.java:554)
 at
 org.apache.accumulo.core.client.ZooKeeperInstance.getInstanceIDFromHdfs(ZooKeeperInstance.java:258)
 at
 org.apache.accumulo.server.conf.ZooConfiguration.getInstance(ZooConfiguration.java:65)
 at
 org.apache.accumulo.server.conf.ServerConfiguration.getZooConfiguration(ServerConfiguration.java:49)
 at
 org.apache.accumulo.server.conf.ServerConfiguration.getSystemConfiguration(ServerConfiguration.java:58)
 at
 org.apache.accumulo.server.client.HdfsZooInstance.init(HdfsZooInstance.java:62)
 at
 org.apache.accumulo.server.client.HdfsZooInstance.getInstance(HdfsZooInstance.java:70)
 at org.apache.accumulo.server.Accumulo.init(Accumulo.java:132)
 at org.apache.accumulo.server.master.Master.init(Master.java:534)
 at org.apache.accumulo.server.master.Master.main(Master.java:2190)
 ... 6 more
 Caused by: java.security.AccessControlException: access denied
 (java.lang.RuntimePermission getenv.HADOOP_JAAS_DEBUG)
 at
 java.security.AccessControlContext.checkPermission(AccessControlContext.java:366)
 at
 java.security.AccessController.checkPermission(AccessController.java:560)
 at
 java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
 at java.lang.System.getenv(System.java:883)
 at
 org.apache.hadoop.security.UserGroupInformation$HadoopConfiguration.clinit(UserGroupInformation.java:392)



Re: [VOTE] 1.5.0-RC3

2013-05-17 Thread Adam Fuchs
Looks like the src part of the distribution is
accumulo-project-1.5.0-src.tar.gz.
For the same reasons that we removed the assemble tag from the bin
package, shouldn't we remove the project tag from the src package? This
also has implications as to whether we can just untar both the bin and src
tars into the same directory and have them form voltron (instead of two
different directories).

Adam



On Thu, May 16, 2013 at 9:05 PM, Christopher ctubb...@apache.org wrote:

 1.5.0-RC3 for consideration.
 https://repository.apache.org/content/repositories/orgapacheaccumulo-007/

 --
 Christopher L Tubbs II
 http://gravatar.com/ctubbsii



 -- Forwarded message --
 From: Nexus Repository Manager ne...@repository.apache.org
 Date: Thu, May 16, 2013 at 9:02 PM
 Subject: Nexus: Staging Completed.
 To: Christopher Tubbs ctubb...@gmail.com


 Description:

 1.5.0-RC3

 Details:

 The following artifacts have been staged to the
 org.apache.accumulo-007 (u:ctubbsii, a:173.66.3.39) repository.

 archetype-catalog.xml
 accumulo-1.5.0-bin.rpm.asc
 accumulo-1.5.0-native.rpm.asc
 accumulo-1.5.0-test.deb
 accumulo-1.5.0-bin.tar.gz.asc
 accumulo-1.5.0.pom
 accumulo-1.5.0-native.rpm
 accumulo-1.5.0.pom.asc
 accumulo-1.5.0-test.rpm
 accumulo-1.5.0-bin.deb
 accumulo-1.5.0-bin.rpm
 accumulo-1.5.0-bin.tar.gz
 accumulo-1.5.0-native.deb
 accumulo-1.5.0-test.rpm.asc
 accumulo-1.5.0-bin.deb.asc
 accumulo-1.5.0-test.deb.asc
 accumulo-1.5.0-native.deb.asc
 accumulo-examples-1.5.0.pom.asc
 accumulo-examples-1.5.0.pom
 accumulo-core-1.5.0.pom.asc
 accumulo-core-1.5.0-javadoc.jar
 accumulo-core-1.5.0-sources.jar
 accumulo-core-1.5.0-javadoc.jar.asc
 accumulo-core-1.5.0.pom
 accumulo-core-1.5.0.jar
 accumulo-core-1.5.0-sources.jar.asc
 accumulo-core-1.5.0.jar.asc
 accumulo-examples-simple-1.5.0.jar
 accumulo-examples-simple-1.5.0.jar.asc
 accumulo-examples-simple-1.5.0-javadoc.jar.asc
 accumulo-examples-simple-1.5.0.pom.asc
 accumulo-examples-simple-1.5.0-sources.jar
 accumulo-examples-simple-1.5.0-javadoc.jar
 accumulo-examples-simple-1.5.0-sources.jar.asc
 accumulo-examples-simple-1.5.0.pom
 accumulo-test-1.5.0-sources.jar.asc
 accumulo-test-1.5.0.pom
 accumulo-test-1.5.0.jar.asc
 accumulo-test-1.5.0.pom.asc
 accumulo-test-1.5.0-javadoc.jar.asc
 accumulo-test-1.5.0-sources.jar
 accumulo-test-1.5.0.jar
 accumulo-test-1.5.0-javadoc.jar
 accumulo-proxy-1.5.0-javadoc.jar.asc
 accumulo-proxy-1.5.0-sources.jar
 accumulo-proxy-1.5.0.pom.asc
 accumulo-proxy-1.5.0-javadoc.jar
 accumulo-proxy-1.5.0.jar.asc
 accumulo-proxy-1.5.0.jar
 accumulo-proxy-1.5.0-sources.jar.asc
 accumulo-proxy-1.5.0.pom
 accumulo-project-1.5.0-src.tar.gz.asc
 accumulo-project-1.5.0.pom
 accumulo-project-1.5.0.pom.asc
 accumulo-project-1.5.0-src.tar.gz
 accumulo-project-1.5.0-site.xml.asc
 accumulo-project-1.5.0-site.xml
 accumulo-trace-1.5.0.jar.asc
 accumulo-trace-1.5.0-javadoc.jar
 accumulo-trace-1.5.0.pom.asc
 accumulo-trace-1.5.0-sources.jar.asc
 accumulo-trace-1.5.0.jar
 accumulo-trace-1.5.0-sources.jar
 accumulo-trace-1.5.0.pom
 accumulo-trace-1.5.0-javadoc.jar.asc
 accumulo-server-1.5.0-sources.jar
 accumulo-server-1.5.0.pom.asc
 accumulo-server-1.5.0-sources.jar.asc
 accumulo-server-1.5.0.jar
 accumulo-server-1.5.0.jar.asc
 accumulo-server-1.5.0-javadoc.jar.asc
 accumulo-server-1.5.0-javadoc.jar
 accumulo-server-1.5.0.pom
 accumulo-fate-1.5.0-sources.jar.asc
 accumulo-fate-1.5.0.pom
 accumulo-fate-1.5.0-javadoc.jar
 accumulo-fate-1.5.0-sources.jar
 accumulo-fate-1.5.0.jar.asc
 accumulo-fate-1.5.0.pom.asc
 accumulo-fate-1.5.0-javadoc.jar.asc
 accumulo-fate-1.5.0.jar
 accumulo-start-1.5.0.pom
 accumulo-start-1.5.0-sources.jar.asc
 accumulo-start-1.5.0-sources.jar
 accumulo-start-1.5.0.jar.asc
 accumulo-start-1.5.0-javadoc.jar.asc
 accumulo-start-1.5.0.jar
 accumulo-start-1.5.0-javadoc.jar
 accumulo-start-1.5.0.pom.asc



Re: [VOTE] 1.5.0-RC3

2013-05-17 Thread Adam Fuchs
Thanks for putting up with us picky people, Chris!

Adam
On May 17, 2013 6:15 PM, Christopher ctubb...@apache.org wrote:

 So,

 I've fixed the problem with the src tarball including binaries, and I
 believe I've satisfied all the concerns regarding the naming
 conventions.
 I'm going to go ahead and include the nativemap source in the
 -bin.tar.gz, to satisfy some interested parties.
 I was also advised to include the assemble/scripts in the -bin.tar.gz
 And, I want release:prepare to run the tests, and release:perform to
 seal the jars (these are mutually exclusive activities)

 For now, I'm going to revert the commits for RC3, but please feel free
 to continue reviewing.
 I've been checking the RPMs/DEBs, but will check again (and check back
 on this thread) before cutting RC4, which I should be able to do this
 weekend.


 --
 Christopher L Tubbs II
 http://gravatar.com/ctubbsii


 On Fri, May 17, 2013 at 1:19 PM, Keith Turner ke...@deenlo.com wrote:
  I took a look at the binary tar, its sig and hashes looked good.   I was
  able to run accumulo from it w/ no problem and the apidocs were there!
  Tried to run proxy from binary tarball, there was no proxy.properties or
  proxy/README.
 
  I diffed the src tarball w/ the tag and saw the following extra files:
 
  $ diff -r accumulo-project-1.5.0 1.5.0-RC3
  Only in accumulo-project-1.5.0: DEPENDENCIES
  Only in accumulo-project-1.5.0/docs: apidocs
  Only in accumulo-project-1.5.0/lib: native
  Only in accumulo-project-1.5.0/server/src/main/c++/nativeMap:
  libNativeMap-Linux-amd64-64.so
 
  Christopher, are you thinking of spinning an RC4?  Has anyone verified
 the
  rpms and debs?  Would be nice to find any problems w/ them.
 
 
 
  On Thu, May 16, 2013 at 9:05 PM, Christopher ctubb...@apache.org
 wrote:
 
  1.5.0-RC3 for consideration.
 
 https://repository.apache.org/content/repositories/orgapacheaccumulo-007/
 
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
 
 
 
  -- Forwarded message --
  From: Nexus Repository Manager ne...@repository.apache.org
  Date: Thu, May 16, 2013 at 9:02 PM
  Subject: Nexus: Staging Completed.
  To: Christopher Tubbs ctubb...@gmail.com
 
 
  Description:
 
  1.5.0-RC3
 
  Details:
 
  The following artifacts have been staged to the
  org.apache.accumulo-007 (u:ctubbsii, a:173.66.3.39) repository.
 
  archetype-catalog.xml
  accumulo-1.5.0-bin.rpm.asc
  accumulo-1.5.0-native.rpm.asc
  accumulo-1.5.0-test.deb
  accumulo-1.5.0-bin.tar.gz.asc
  accumulo-1.5.0.pom
  accumulo-1.5.0-native.rpm
  accumulo-1.5.0.pom.asc
  accumulo-1.5.0-test.rpm
  accumulo-1.5.0-bin.deb
  accumulo-1.5.0-bin.rpm
  accumulo-1.5.0-bin.tar.gz
  accumulo-1.5.0-native.deb
  accumulo-1.5.0-test.rpm.asc
  accumulo-1.5.0-bin.deb.asc
  accumulo-1.5.0-test.deb.asc
  accumulo-1.5.0-native.deb.asc
  accumulo-examples-1.5.0.pom.asc
  accumulo-examples-1.5.0.pom
  accumulo-core-1.5.0.pom.asc
  accumulo-core-1.5.0-javadoc.jar
  accumulo-core-1.5.0-sources.jar
  accumulo-core-1.5.0-javadoc.jar.asc
  accumulo-core-1.5.0.pom
  accumulo-core-1.5.0.jar
  accumulo-core-1.5.0-sources.jar.asc
  accumulo-core-1.5.0.jar.asc
  accumulo-examples-simple-1.5.0.jar
  accumulo-examples-simple-1.5.0.jar.asc
  accumulo-examples-simple-1.5.0-javadoc.jar.asc
  accumulo-examples-simple-1.5.0.pom.asc
  accumulo-examples-simple-1.5.0-sources.jar
  accumulo-examples-simple-1.5.0-javadoc.jar
  accumulo-examples-simple-1.5.0-sources.jar.asc
  accumulo-examples-simple-1.5.0.pom
  accumulo-test-1.5.0-sources.jar.asc
  accumulo-test-1.5.0.pom
  accumulo-test-1.5.0.jar.asc
  accumulo-test-1.5.0.pom.asc
  accumulo-test-1.5.0-javadoc.jar.asc
  accumulo-test-1.5.0-sources.jar
  accumulo-test-1.5.0.jar
  accumulo-test-1.5.0-javadoc.jar
  accumulo-proxy-1.5.0-javadoc.jar.asc
  accumulo-proxy-1.5.0-sources.jar
  accumulo-proxy-1.5.0.pom.asc
  accumulo-proxy-1.5.0-javadoc.jar
  accumulo-proxy-1.5.0.jar.asc
  accumulo-proxy-1.5.0.jar
  accumulo-proxy-1.5.0-sources.jar.asc
  accumulo-proxy-1.5.0.pom
  accumulo-project-1.5.0-src.tar.gz.asc
  accumulo-project-1.5.0.pom
  accumulo-project-1.5.0.pom.asc
  accumulo-project-1.5.0-src.tar.gz
  accumulo-project-1.5.0-site.xml.asc
  accumulo-project-1.5.0-site.xml
  accumulo-trace-1.5.0.jar.asc
  accumulo-trace-1.5.0-javadoc.jar
  accumulo-trace-1.5.0.pom.asc
  accumulo-trace-1.5.0-sources.jar.asc
  accumulo-trace-1.5.0.jar
  accumulo-trace-1.5.0-sources.jar
  accumulo-trace-1.5.0.pom
  accumulo-trace-1.5.0-javadoc.jar.asc
  accumulo-server-1.5.0-sources.jar
  accumulo-server-1.5.0.pom.asc
  accumulo-server-1.5.0-sources.jar.asc
  accumulo-server-1.5.0.jar
  accumulo-server-1.5.0.jar.asc
  accumulo-server-1.5.0-javadoc.jar.asc
  accumulo-server-1.5.0-javadoc.jar
  accumulo-server-1.5.0.pom
  accumulo-fate-1.5.0-sources.jar.asc
  accumulo-fate-1.5.0.pom
  accumulo-fate-1.5.0-javadoc.jar
  accumulo-fate-1.5.0-sources.jar
  accumulo-fate-1.5.0.jar.asc
  accumulo-fate-1.5.0.pom.asc
  accumulo-fate-1.5.0-javadoc.jar.asc
  

Re: Accumulo software and processes owner

2013-04-26 Thread Adam Fuchs
Terry,

To properly secure your Accumulo install it's important that the shared
secret in the Accumulo configs only be shared with the Accumulo processes,
so I would recommend using a separate accumulo user.

In HDFS you can create the directory that Accumulo writes to (/accumulo by
default) and then chown it to accumulo. That ought to get you started. If
trash is enabled in HDFS (fs.trash.interval set to something other than 0,
I believe) then you may also have to create the accumulo home directory in
hdfs and chown that as well.
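
For illustration, a minimal sketch of those two steps through the Hadoop FileSystem API (they are more usually done with the hadoop fs shell as the HDFS superuser); the /accumulo and /user/accumulo paths and the accumulo owner are just the defaults mentioned above and may differ on your cluster:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PrepareAccumuloDirs {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Create the Accumulo data directory and the accumulo user's home
    // directory, then chown them to the accumulo user (group left unchanged).
    for (String dir : new String[] {"/accumulo", "/user/accumulo"}) {
      Path p = new Path(dir);
      if (!fs.exists(p))
        fs.mkdirs(p);
      fs.setOwner(p, "accumulo", null);
    }
  }
}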

Cheers,
Adam
 On Apr 26, 2013 4:36 PM, Terry P. texpi...@gmail.com wrote:

 I just finished setting up an 8-node cluster using Cloudera CDH3u5 and
 Accumulo 1.4.2.  The Cloudera rpm installations created the hdfs Linux user
 and hadoop group (and others).  I initially created an accumulo Linux user
 and set it as the owner of the Accumulo software.

 However, after HDFS was up and running, when I attempted to start Accumulo
 as the accumulo Linux user, I of course got HDFS permission denied when it
 tried to write to HDFS.  Being a newbie, I didn't bother figuring out how
 to grant HDFS permissions to the accumulo account, I just started Accumulo
 as the hdfs user so I could get things rolling.

 As what user does one normally start Accumulo?  hdfs?  Linux root?  The
 Accumulo User Manual never recommends anything about who the Accumulo
 binaries should be owned by or what account it should be run under (e.g.
 root, or an accumulo Linux account).

 Thanks in advance,
 Terry



Re: Suggestions on modeling a composite row key

2013-02-27 Thread Adam Fuchs
At sqrrl, we tend to use a Tuple class that implements List<String>
(List<ByteBuffer> would also work), and has conversions to and from
ByteBuffer. To encode the tuple into a byte buffer, change all the \1s to
\1\2, change all the \0s to \1\1, and put a \0 byte between
elements. \1 is used as an escape character for all of the \1s and
\0s appearing in the unencoded form. To decode, just split on \0
and reverse the escaping. This encoding preserves hierarchical,
lexicographical ordering of tuple elements.
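
For illustration, a minimal sketch of that encoding, using String for readability (byte[]/ByteBuffer would work the same way); the \u0000, \u0001, and \u0002 literals correspond to the \0, \1, and \2 above:

import java.util.ArrayList;
import java.util.List;

public class TupleCodec {
  public static String encode(List<String> elements) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < elements.size(); i++) {
      if (i > 0)
        sb.append('\u0000');                   // \0 separator between elements
      sb.append(elements.get(i)
          .replace("\u0001", "\u0001\u0002")   // escape the escape character first
          .replace("\u0000", "\u0001\u0001")); // then escape the separator
    }
    return sb.toString();
  }

  public static List<String> decode(String encoded) {
    List<String> out = new ArrayList<String>();
    for (String part : encoded.split("\u0000", -1))
      out.add(part.replace("\u0001\u0001", "\u0000")   // restore separators
                  .replace("\u0001\u0002", "\u0001")); // restore escape characters
    return out;
  }
}

The escaping keeps byte-for-byte comparison of encoded tuples consistent with element-by-element comparison, which is where the hierarchical ordering comes from.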

Cheers,
Adam



On Tue, Feb 26, 2013 at 11:51 PM, Mike Hugo m...@piragua.com wrote:

 I need to build up a row key that consists of two parts, the first being a
 URL (e.g. http://foo.com/dir/page%20name.htm) and the second being a
 number (e.g. 12).

 To date we've been using \u to delimit these two pieces of the key,
 but that has some headaches associated with it.

 I'm curious to know how other people have delimited composite row keys.
  Any best practices or suggestions?

 Thanks,

 Mike



Re: Determining the cause of a tablet server failure

2013-02-27 Thread Adam Fuchs
There are a few primary reasons why your tablet server would die:
1. Lost lock in Zookeeper. If the tablet server and zookeeper can't
communicate with each other then the lock will timeout and the tablet
server will kill itself. This should show up as several messages in the
tserver log. If this happens when a tablet server is really busy (lots of
threads doing stuff) then the log message about the lost lock can be pretty
far back in the queue. Java garbage collection can cause long pauses that
inhibit the tserver/zookeeper messages. Zookeeper can also get overwhelmed
and behave poorly if the server it's running on swaps it out.
2. Problems talking with the master. If a tablet server is too slow in
communicating with the master then the master will try to kill it. This
should show up in the master log, and also will be noted in the tserver log.
3. Out of memory. If the tserver JVM runs out of memory it will terminate.
As John mentioned, this will be in the .err or .out files in the log
directory.

Adam



On Wed, Feb 27, 2013 at 12:10 PM, Mike Hugo m...@piragua.com wrote:

 After running an ingest process via map reduce for about an hour or so,
 one of our tserver fails.  It happens pretty consistently, we're able to
 replicate it without too much difficulty.

 I'm looking in the $ACCUMULO_HOME/logs directory for clues as to why the
 tserver fails, but I'm not seeing much that points to a cause of the
 tserver going offline.   One minute it's there, the next it's offline.
  There are some warnings about the swappiness as well as a large row that
 cannot be split, but other than that, not much else to go on.

 Is there anything that could help me figure out *why* the tserver died?
  I'm guessing it's something in our client code or a config that's not
 correct on the server, but it'd be really nice to have a hint before we
 start randomly changing things to see what will fix it.

 Thanks,

 Mike



Re: Determining the cause of a tablet server failure

2013-02-27 Thread Adam Fuchs
So, question for the community: inside bin/accumulo we have:
  -XX:OnOutOfMemoryError=kill -9 %p
Should this also append a log message? Something like:
  -XX:OnOutOfMemoryError=kill -9 %p; echo ran out of memory >> logfilename
Is this necessary, or should the OutOfMemoryException still find its way to
the regular log?

Adam



On Wed, Feb 27, 2013 at 3:17 PM, Mike Hugo m...@piragua.com wrote:

 I'm chalking this up to a mis-configured server.  It looks like during the
 install on this server the accumulo-env.sh file was copied from the
 examples, but rather than editing it to set the JAVA_HOME,
 HADOOP_HOME, and ZOOKEEPER_HOME, the entire file contents were replaced
 with those env variables.

 I'm assuming this caused us to pick up the default (?)  _OPTS settings
 rather than the correct ones we should have been getting based on our
 server memory capacity from the examples.  So we had a bunch of accumulo
 related java processes all running with memory settings that were way out
 of whack from what they should have been.

 To solve it I copied in the files from the conf/examples directory again
 and made sure everything was set up correctly and restarted everything.

 We never did see anything in out log files or .out / .err logs indicating
 the source of the problem, but the above is my best guess as to what was
 going on.

 Thanks again for all the tips and pointers!

 Mike


 On Wed, Feb 27, 2013 at 11:24 AM, Adam Fuchs afu...@apache.org wrote:

 There are a few primary reasons why your tablet server would die:
 1. Lost lock in Zookeeper. If the tablet server and zookeeper can't
 communicate with each other then the lock will timeout and the tablet
 server will kill itself. This should show up as several messages in the
 tserver log. If this happens when a tablet server is really busy (lots of
 threads doing stuff) then the log message about the lost lock can be pretty
 far back in the queue. Java garbage collection can cause long pauses that
 inhibit the tserver/zookeeper messages. Zookeeper can also get overwhelmed
 and behave poorly if the server it's running on swaps it out.
 2. Problems talking with the master. If a tablet server is too slow in
 communicating with the master then the master will try to kill it. This
 should show up in the master log, and also will be noted in the tserver log.
 3. Out of memory. If the tserver JVM runs out of memory it will
 terminate. As John mentioned, this will be in the .err or .out files in the
 log directory.

 Adam



 On Wed, Feb 27, 2013 at 12:10 PM, Mike Hugo m...@piragua.com wrote:

 After running an ingest process via map reduce for about an hour or so,
 one of our tserver fails.  It happens pretty consistently, we're able to
 replicate it without too much difficulty.

 I'm looking in the $ACCUMULO_HOME/logs directory for clues as to why the
 tserver fails, but I'm not seeing much that points to a cause of the
 tserver going offline.   One minute it's there, the next it's offline.
  There are some warnings about the swappiness as well as a large row that
 cannot be split, but other than that, not much else to go on.

 Is there anything that could help me figure out *why* the tserver died?
  I'm guessing it's something in our client code or a config that's not
 correct on the server, but it'd be really nice to have a hint before we
 start randomly changing things to see what will fix it.

 Thanks,

 Mike






Re: NoSuchMethodError: FieldValueMetaData (Conflict between hue-plugins-1.2.0-cdh3u5.har and libthrift-0.6.1.jar)

2013-02-08 Thread Adam Fuchs
Is that related to https://issues.apache.org/jira/browse/ACCUMULO-837? Do
you have a stack trace you can share?

Adam



On Fri, Feb 8, 2013 at 10:34 AM, David Medinets david.medin...@gmail.com wrote:

 I am running a map-reduce job. As soon as my mapper tried to serialize
 a Mutation I run into a NoSuchMethodError in reference to
 FieldValueMetaData. I could simply delete the hue-plugins jar file but
 that seems inelegant to me. When running a mapper can I shift when the
 jar files are loaded? Would HADOOP_USER_CLASSPATH_FIRST help in this
 situation? What about adding the libthift jar file to the
 sun.boot.class.path property?



Re: infinite number of max.versions?

2013-01-28 Thread Adam Fuchs
Mike,

The way to do that is to remove the versioning iterator entirely. Just
delete the configuration parameters for that iterator: something like
config -t tablename -d table.iterator.scan.vers in the accumulo shell,
for each of the six configuration parameters.
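
If you would rather do it programmatically, here is a small sketch of the same change through the Java API; the six property names below are the ones that shell command refers to, and the table name is whatever you are using:

import org.apache.accumulo.core.client.Connector;

public class DisableVersioning {
  // Delete the per-scope settings of the "vers" iterator so every version is kept.
  public static void removeVersioningIterator(Connector connector, String table) throws Exception {
    String[] keys = {
        "table.iterator.scan.vers", "table.iterator.scan.vers.opt.maxVersions",
        "table.iterator.minc.vers", "table.iterator.minc.vers.opt.maxVersions",
        "table.iterator.majc.vers", "table.iterator.majc.vers.opt.maxVersions"};
    for (String key : keys)
      connector.tableOperations().removeProperty(table, key);
  }
}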

Adam



On Mon, Jan 28, 2013 at 11:08 AM, Mike Hugo m...@piragua.com wrote:

 I see I can use a config setting (e.g. 
 table.iterator.minc.vers.opt.maxVersions)
 to keep x number of versions of a record within different scopes (minc,
 majc, etc). Is there a way to set that to always store all versions (never
 delete old versions) for keeping history?

 I can probably just set this to a really large number and it would be fine
 for my purposes, but I'm wondering if there's a better way.

 Thanks,

 Mike



Re: Custom Iterators - behavior when switching tablets

2013-01-23 Thread Adam Fuchs
David,

The core challenge here is to be able to continue scans under failure
conditions. There are several places where we tear down the iterator tree
and rebuild it, including when tablet servers die, when we need to free
resources to support concurrency, and a few others. In order to continue a
scan where we left off, we need to be able to point to some place in the
stream of key/value pairs. If we want to be robust against tablet server
failure we can't just store that scan session information on the tablet
server, so we use the last key that was returned to the client for that
information.

When you add an iterator that transforms keys, you change the meaning of
that pointer so that it points into the transformed stream instead of the
underlying stream. There are two requirements in order to do this sanely:
1. Your iterator should not change the row portion of the key, although it
can change any of the remaining parts.
2. Your iterator's seek method should perform the reverse transformation on
the range when it seeks the underlying iterator(s). This will ensure that
you don't skip ranges of underlying keys when the scanner continues the
scan.
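
As an illustration of requirement 2, here is a rough sketch of a hypothetical iterator that rewrites only the column qualifier; because the row is untouched, its seek can conservatively widen the requested range to whole rows, so the reverse transformation never skips underlying keys:

import java.io.IOException;
import java.util.Collection;
import org.apache.accumulo.core.data.ByteSequence;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.PartialKey;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.iterators.WrappingIterator;
import org.apache.hadoop.io.Text;

public class QualifierPrefixingIterator extends WrappingIterator {
  private static final String PREFIX = "x_"; // illustrative transformation only

  @Override
  public Key getTopKey() {
    // Transform the qualifier of the key coming from the underlying source.
    Key k = super.getTopKey();
    return new Key(k.getRow(), k.getColumnFamily(),
        new Text(PREFIX + k.getColumnQualifier()),
        k.getColumnVisibility(), k.getTimestamp());
  }

  @Override
  public void seek(Range range, Collection<ByteSequence> families, boolean inclusive)
      throws IOException {
    // Reverse transformation: map the requested (transformed-space) range back
    // onto the underlying key space. Widening to row boundaries is safe here
    // because the row never changes, so no source keys can be skipped.
    Key start = range.getStartKey();
    Key end = range.getEndKey();
    Range underlying = new Range(
        start == null ? null : new Key(start.getRow()), true,
        end == null ? null : end.followingKey(PartialKey.ROW), false);
    super.seek(underlying, families, inclusive);
  }
}

A tighter implementation could map the qualifier bounds back exactly instead of widening to the row, at the cost of more careful key manipulation.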

That said, I agree that the behavior of the scanner where it ignores keys
that are returned is probably not optimal, and I'm not sure why it does
that, except maybe to prevent some infinite loops.

Adam



On Tue, Jan 22, 2013 at 11:55 AM, Slater, David M.
david.sla...@jhuapl.edu wrote:

 In designing some of my own custom iterators, I was noticing some
 interesting behavior. Note: my iterator does not return the original key,
 but instead returns a computed value that is not necessarily in
 lexicographic order.


 So far as I can tell, when the Scanner switches between tablets, it checks
 the key that is returned in the new tablet and compares it (I think it
 compares key.row()) with the last key from the previous tablet. If the new
 key is greater than the previous one, then it proceeds normally. If,
 however, the new key is less than or equal to the previous key, then the
 Scanner does not return the value. It does, however, continue to iterate
 through the tablet, continuing to compare until it finds a key greater than
 the last one. Once it finds one, however, it progresses through the rest of
 that tablet without doing a check. (It implicitly assumes that everything
 in a tablet will be correctly ordered). 


 Now if I was to return the original key, it would work fine (since it
 would always be in order), but that also limits the functionality of my
 custom iterator. 


 My primary question is: why would it be designed this way? When switching
 between tablets, are there potential problems that might crop up if this
 check isn’t done?


 Thanks,
 David



Re: scripted way to create users

2013-01-18 Thread Adam Fuchs
Using the Java API through JRuby or Jython would be another option. With
Jython, that would look something like this:

 export
JYTHONPATH=$ACCUMULO_HOME/lib/accumulo-core-1.4.2.jar:$ACCUMULO_HOME/lib/log4j-1.2.16.jar:$ZOOKEEPER_HOME/zookeeper-*.jar:$HADOOP_HOME/hadoop-core-1.0.3.jar:$ACCUMULO_HOME/lib/libthrift-0.6.1.jar:$HADOOP_HOME/lib/commons-logging-1.1.1.jar:$HADOOP_HOME/lib/slf4j-log4j12-1.4.3.jar:$HADOOP_HOME/lib/slf4j-api-1.4.3.jar:$ACCUMULO_HOME/lib/cloudtrace-1.4.2.jar
 jython adduser.jython

contents of adduser.jython:
import java
from org.apache.accumulo.core.client import ZooKeeperInstance
from org.apache.accumulo.core.security import Authorizations
from org.python.core.util import StringUtil
inst = ZooKeeperInstance("instanceName", "localhost")
conn = inst.getConnector("root", "password")
conn.securityOperations().createUser("foo", StringUtil.toBytes("bar"), Authorizations())


Cheers,
Adam



On Tue, Jan 15, 2013 at 1:12 PM, Ask Stack askst...@yahoo.com wrote:

 echo password_file | cbshell -f command_file  seems to work.  any better
 solutions?
 I am not a programmer so I am not going to try the java option but thanks.



 
 From: William Slacum wilhelm.von.cl...@accumulo.net
 To: user@accumulo.apache.org; Ask Stack askst...@yahoo.com
 Sent: Tuesday, January 15, 2013 12:07 PM
 Subject: Re: scripted way to create users


 You could redirect input from a file.


 On Tue, Jan 15, 2013 at 8:20 AM, Ask Stack askst...@yahoo.com wrote:

 Hello
  I like to make an accumulo commands file to create users. Something like
 createuser hello -s low But createuser command does not have an option
 to provide user password. The script will stop and prompt me for the new
 user password. Does anyone know how to solve this problem?
 Thanks.
 



Re: Accumulo Junit Concurrency/Latency issues ( Accumulo 1.3 )

2012-11-29 Thread Adam Fuchs
Sounds like you might need to introduce some synchronization to serialize
your JUnit tests. If you can recreate these symptoms in a small test case
that is representative of your code, maybe you can share that?

Adam



On Thu, Nov 29, 2012 at 11:10 AM, Josh Berk josh.accum...@gmail.com wrote:

 Sorry Adam, I can't give my source code :/.

 and Eric, I'm positive that the timestamp is not the issue. When I said I
  use the same key, I meant only the key and not the entire same Entry.  The
  timestamp and visibility are associated with the entry and not the key,
  so the timestamp is auto-updated. Because, if it weren't auto-updated, then
  the problem I'm experiencing would occur for the same method outside of
 JUnit.

 I think the problem I'm experiencing has something to do with JUnit
  running its test methods concurrently.  Each of my JUnit methods is
  independent from each other. So, no other method could influence the values
  of the objects that I'm storing/accessing with Accumulo in any particular
 method.

 -All JUnit methods are independent of each other.
  -The problem I'm experiencing is that when I update a value and retrieve it,
  I can intermittently receive stale data, which causes my test to fail.
  -The same method run in a normal class (non-JUnit) succeeds every single
 time.

  I know that JUnit concurrently runs its test methods. I'm wondering if
  more than one thread maybe executes for a single test method and some parts
  are getting ahead of others in the same method. I won't say more about that
  theory for fear of being labeled a crazy person. I only know that the
  problem is JUnit-Accumulo specific and am wondering if anyone else has
 experienced the same issues?

 -Josh






  On Thu, Nov 29, 2012 at 10:51 AM, Eric Newton eric.new...@gmail.com wrote:

 I am definitely using the same key to update and retrieve the data.

 At least update the timestamp to the current time (or old timestamp + 1).

 -Eric


 On Thu, Nov 29, 2012 at 10:38 AM, Adam Fuchs afu...@apache.org wrote:

 Josh,

 Can you share your junit test code so I can replicate this behavior?

 Adam



  On Thu, Nov 29, 2012 at 9:59 AM, Joe Berk josh.accum...@gmail.com wrote:

 Good morning all,

 I'm experiencing some weirdness when executing JUnit tests for my
 classes that operate with Accumulo. I can best describe it as latency.
  Basically, when I write my object to Accumulo and then immediately
 retrieve it to inspect the values, the values are not always updated to
 what I just saved them as.

 Problem:
   part 1:
 - I create an object that has some primitive types.
 - I set the primitive variables to acceptable values.
 - I serialize the object (the Value)
 - I write the Value to Accumulo ( Entry )
  - I retrieve the Object from Accumulo and inspect. The primitive
 values are equal to what they were set to.

   part 2:
 - I retrieve the object from Accumulo
 - I set the primitive variables to different values
 - I serialize the object
  - I write the Value to Accumulo ( Entry )
  - I retrieve the Object from Accumulo and inspect. The primitive
 values are *not equal* to what they were just set to

 This only seems to be happening during the JUnit.

 I have a method that performs the above task, in a JUnit test, and when
 I repeatedly run the JUnit test, it will intermittently fail.
 I have the same exact method, but it is in a regular class, and I can
 run it as much as I want, with no failure.

 for the non-JUnit test, MockInstances and  real instances succeed
 every time
 for the JUnit test, MockInstances and real instances both fail
 intermittently.

 sidenotes:
 - I am definitely using the same key to update and retrieve the data. I
 also inspected the entries that I was writing to Accumulo, every time, and
 can confirm that they are being sent/written to Accumulo as I intend
 them to be. In summary, I am positive that I am sending the correct data to
 be written. This is doubly verified by my ability to intermittently succeed
 when JUnit and 100% succeed in a normal class.

 Any assistance would be greatly appreciated.

 Best Regards,

 Josh














Re: [VOTE] accumulo-1.4.2 RC4

2012-11-09 Thread Adam Fuchs
+1

The only problem I have found is that the example policy file is still not
included (ACCUMULO-364), but that has been corrected for the next version
for real this time. The release notes are slightly wrong in that respect,
but I don't think this should delay release.

Checked signatures, hashes, comparison with previous release, and found no
other problems.

Adam


On Thu, Nov 8, 2012 at 3:01 PM, Eric Newton eric.new...@gmail.com wrote:

 Please vote on releasing the following candidate as Apache Accumulo
 version 1.4.2.

 The src tar ball was generated by exporting:

 https://svn.apache.org/repos/asf/accumulo/tags/1.4.2rc4

 To build the dist tar ball from the source run the following command:
src/assemble/build.sh

 Tarballs, checksums, signatures:
   http://people.apache.org/~ecn/1.4.2rc4

 Maven Staged Repository:
   https://repository.apache.org/content/repositories/orgapacheaccumulo-031

 Keys:
   http://www.apache.org/dist/accumulo/KEYS

 Changes:
   https://svn.apache.org/repos/asf/accumulo/tags/1.4.2rc4/CHANGES

 The vote will be held open for the next 72 hours.

 The only change from RC3 was the addition of copyright headers on two
 files.



Re: Accumulo design questions

2012-11-06 Thread Adam Fuchs
 4. In supporting dynamic column families, was there a design trade-off
 with
 respect to the original BigTable or current HBase design?  What might
 be a
 benefit of doing it the other way?

 One trade-off is that pinning locality groups in memory (i.e. making them
ephemeral) would be challenging for Accumulo, while this is something that
Bigtable supports.

Another trade-off is that supporting compactions on only one locality group
at a time would be impossible. Since any one file can hold all locality
groups, it is likely that all data will go through the compaction channels
even if only one locality group is being actively written. The flip side of
this is that Accumulo uses fewer files, putting less load on HDFS.

Adam


Re: Number of partitions for sharded table

2012-10-30 Thread Adam Fuchs
Krishmin,

There are a few extremes to keep in mind when choosing a manual
partitioning strategy:
1. Parallelism and balance at ingest time. You need to find a happy medium
between too few partitions (not enough parallelism) and too many partitions
(tablet server resource contention and inefficient indexes). Probably at
least one partition per tablet server being actively written to is good,
and you'll want to pre-split so they can be distributed evenly. Ten
partitions per tablet server is probably not too many -- I wouldn't expect
to see contention at that point.
2. Parallelism and balance at query time. At query time, you'll be
selecting a subset of all of the partitions that you've ever ingested into.
This subset should be bounded similarly to the concern addressed in #1, but
the bounds could be looser depending on the types of queries you want to
run. Lower latency queries would tend to favor only a few partitions per
node.
3. Growth over time in partition size. Over time, you want partitions to be
bounded to less than about 10-100GB. This has to do with limiting the
maximum amount of time that a major compaction will take, and impacts
availability and performance in the extreme cases. At the same time, you
want partitions to be as large as possible so that their indexes are more
efficient.

One strategy to optimize partition size would be to keep using each
partition until it is full, then make another partition. Another would be
to allocate a certain number of partitions per day, and then only put data
in those partitions during that day. These strategies are also elastic, and
can be tweaked as the cluster grows.
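
For example, a minimal sketch of the pre-splitting mentioned above, assuming shard ids are written as zero-padded strings like "000" through "099" (adjust to however your partitions are actually named):

import java.util.SortedSet;
import java.util.TreeSet;
import org.apache.accumulo.core.client.Connector;
import org.apache.hadoop.io.Text;

public class ShardSplits {
  // Pre-split the partitioned table so its tablets can be spread across
  // tablet servers before any data arrives.
  public static void addShardSplits(Connector connector, String table, int numShards)
      throws Exception {
    SortedSet<Text> splits = new TreeSet<Text>();
    for (int i = 1; i < numShards; i++)
      splits.add(new Text(String.format("%03d", i)));
    connector.tableOperations().addSplits(table, splits);
  }
}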

In all of these cases, you will need a good load balancing strategy. The
default strategy of evening up the number of partitions per tablet server
is probably not sufficient, so you may need to write your own tablet load
balancer that is aware of your partitioning strategy.

Cheers,
Adam



On Tue, Oct 30, 2012 at 3:06 PM, Krishmin Rai kr...@missionfoc.us wrote:

 Hi All,
   We're working with an index table whose row is a shardId (an integer,
 like the wiki-search or IndexedDoc examples). I was just wondering what the
 right strategy is for choosing a number of partitions, particularly given a
 cluster that could potentially grow.

   If I simply set the number of shards equal to the number of slave nodes,
 additional nodes would not improve query performance (at least over the
 data already ingested). But starting with more partitions than slave nodes
 would result in multiple tablets per tablet server… I'm not really sure how
 that would impact performance, particularly given that all queries against
 the table will be batchscanners with an infinite range.

   Just wondering how others have addressed this problem, and if there are
 any performance rules of thumb regarding the ratio of tablets to tablet
 servers.

 Thanks!
 Krishmin


Re: [VOTE] accumulo-1.4.2 RC3

2012-10-26 Thread Adam Fuchs
Oops, looks like Eric and I owe donuts.

Anyone know how to get vim to automatically add license headers? ;-)

Adam



On Fri, Oct 26, 2012 at 11:14 AM, Billie Rinaldi bil...@apache.org wrote:

 -1

 These files don't have licenses:

 src/core/src/test/java/org/apache/accumulo/core/iterators/FirstEntryInRowIteratorTest.java

 src/server/src/main/java/org/apache/accumulo/server/util/FindOfflineTablets.java

 Committers, don't forget to follow the Eclipse Configuration Tips whenever
 you install a new version of Eclipse.  Installing the template will make it
 so that Eclipse automatically adds a license header to new Java files, and
 the codestyle will auto-format Java to our preferred style:
 http://accumulo.apache.org/source.html

 Eric, sorry I didn't find these in time to save you the trouble of
 generating the md5s.  I'm going to keep looking at the release candidate to
 see if I can find any other issues.

 Billie


  On Tue, Oct 23, 2012 at 6:12 AM, Eric Newton eric.new...@gmail.com wrote:

 Please vote on releasing the following candidate as Apache Accumulo
 version 1.4.2.

 The src tar ball was generated by exporting:

 https://svn.apache.org/repos/asf/accumulo/tags/1.4.2rc3

 To build the dist tar ball from the source run the following command:
src/assemble/build.sh

 Tarballs, checksums, signatures:
   http://people.apache.org/~ecn/1.4.2rc3

 Maven Staged Repository:

 https://repository.apache.org/content/repositories/orgapacheaccumulo-159

 Keys:
   http://www.apache.org/dist/accumulo/KEYS

 Changes:
   https://svn.apache.org/repos/asf/accumulo/tags/1.4.2rc3/CHANGES

 The vote will be held open for the next 72 hours.

 The only change from RC2 was ACCUMULO-826.






Re: What is the Communication and Time Complexity for Bulk Inserts?

2012-10-24 Thread Adam Fuchs
For the bulk load of one file, shouldn't it be roughly O(log(n) * log(P) *
p), where n is the size of the file, P is the total number of tablets
(proportional to tablet servers), and p is the number of tablets that get
assigned that file?

For the BatchWriter case, there's a client-side lookup/binning that takes
O(log(p)) per entry, so the latter would be O(n/p * (log(n/p) + log(p)))
for each of p partitions. So, O(n*log(n)) in aggregate. Yes/no?

Adam


On Wed, Oct 24, 2012 at 3:57 PM, Jeff Kubina jeff.kub...@gmail.com wrote:

 @eric, assuming the records are evenly distributed and network bandwidth
 is not an issue, shouldn't that be O(n/p)+O(p) and O(n/p * log (n/p))?


  On Wed, Oct 24, 2012 at 2:45 PM, Eric Newton eric.new...@gmail.com wrote:

 Adding a sorted file to accumulo (bulk loading) is essentially
 constant in the normal case.  It is O(n) + O(p) for the worst case
 where the index must be read, and the file assigned to every tablet
 server.  In this case, the (slow) RPCs will dominate over the (fast)
 read of the index, except for very small clusters or very large
 indexes.

 Inserting with the BatchWriter is eventually dominated by compactions,
 which is a merge sort, or O(n log n).

 -Eric

 On Thu, Oct 18, 2012 at 11:37 AM, Jeff Kubina jeff.kub...@gmail.com
 wrote:
  BatchWriter, but I would be interested in the answer assuming a
  pre-sorted rfile.
 
  On Thu, Oct 18, 2012 at 11:20 AM, Josh Elser josh.el...@gmail.com
 wrote:
  Are you referring to bulk inserts as importing a pre-sorted rfile of
   Key/Values or using a BatchWriter?
 
  On 10/18/12 10:49 AM, Jeff Kubina wrote:
 
  I am deriving the time complexities for an algorithm I implemented in
  Hadoop using Accumulo and need to know the time complexity of bulk
  inserting m records evenly distributed across p nodes into an empty
  table with p tablet servers. Assuming B is the bandwidth of the
  network, would the communication complexity be O(m/B) and the
  computation complexity O(m/p * log(m/p))? If the table contained n
  records would the values be O(m/B) and O(m/p * log(m/p) + n/p)?





Re: Accumulo Between Two Centers (DR - disaster recovery)

2012-09-26 Thread Adam Fuchs
Another way to say this is that cross-data center replication for Accumulo
is left to a layer on top of Accumulo (or the application space). Cassandra
supports a mode in which you can have a bigger write replication than write
quorum, allowing writes to eventually propagate and reads to happen on
stale versions of the data. This increases availability at the cost of
consistency, which is important when dealing with links that are less
reliable or higher latency (but does nothing special for lower bandwidth
links). Cassandra, running in this mode, leaves dealing with eventual
consistency to the application space, which might be only slightly less
challenging than implementing a cross-data center replication scheme.

Adam


On Wed, Sep 26, 2012 at 9:46 AM, Eric Newton eric.new...@gmail.com wrote:

 I think you're talking about 2 different things.

 Accumulo is architected to run on fast connections.  If you add one
 slowly connected computer, generally speaking, it will make everything
 run slowly.

 Replication is typically used to send copies from one data center to
 another, so that each has a local copy.  Typically, the trick uses
 extra latency in updates to the copies to compensate for the
 relatively slow connections between data centers.

 Accumulo does not presently support replication.  See ACCUMULO-378.

 -Eric

 On Wed, Sep 26, 2012 at 8:08 AM, Christopher Tubbs ctubb...@gmail.com
 wrote:
  I believe Accumulo can work across data centers, if the underlying DFS
   spans data centers. I also believe the latency tolerance is
  configurable, and matters for servers holding locks in Zookeeper and
  heartbeat messages to the Master. I'm not sure what the defaults for
  these are, though.
 
  On Wed, Sep 26, 2012 at 8:00 AM, David Medinets
  david.medin...@gmail.com wrote:
  I recall a conversation in which people were pointed to Cassandra for
  its ability to replicate between data centers. I have forgotten what
  Accumulo offers on this topic. And does latency matter? If latency
  matters, what is the highest acceptable latency?



Re: bulk ingested table showing zero entries on the monitor page

2012-09-21 Thread Adam Fuchs
John is referring to the streaming ingest, not the bulk ingest. Dave is
correct on this one. Basically, we don't count the records when you bulk
ingest so that we can get sub-linear runtime on the bulk ingest operation.

Adam


On Fri, Sep 21, 2012 at 4:22 PM, ameet kini ameetk...@gmail.com wrote:


 I was expecting that to be updated, but it doesn't.

 Thanks,
 Ameet


 On Fri, Sep 21, 2012 at 11:30 AM, John Vines vi...@apache.org wrote:

 You should see the stats for entries in memory update on the master
 portion of the monitor page. It may take a few seconds for it to update.

 John


 On Fri, Sep 21, 2012 at 10:53 AM, ameet kini ameetk...@gmail.com wrote:


 Got it, I just forced a compaction and saw the entries show up on the
 monitor.

 Thanks!

 Ameet


 On Fri, Sep 21, 2012 at 10:44 AM, dlmar...@comcast.net wrote:

 The number of entries will show up on the monitor after a compaction.



 Dave

  --

 *From: *ameet kini ameetk...@gmail.com
 *To: *user@accumulo.apache.org
 *Sent: *Friday, September 21, 2012 10:42:32 AM
 *Subject: *bulk ingested table showing zero entries on the monitor page



 I'm ingesting a table using the AccumuloFileOutputFormat similar to the
 bulk ingest example. Scanning the table via the shell, I see that the
 entries are there. But on the monitor page, the table shows up as having
 zero entries. So I went back to running the bulk ingest example itself, and
  indeed, the test_table shows up as also having zero entries. Anyone else
  seen this? I'm guessing/hoping it's just a UI issue and won't affect
 querying the contents of the table.

 Thanks,
 Ameet







RE: Running Accumulo straight from Memory

2012-09-12 Thread Adam Fuchs
Even if you are just using memory, minor and major compactions are
important to get compression, handle deletes, get sequential access (cache
line efficiency), use iterators, and introduce locality groups.

Adam
On Sep 12, 2012 12:33 PM, Moore, Matthew J. matthew.j.mo...@saic.com
wrote:

 Adam,

 It does look like we are the first to try this.  We are trying to keep
  everything in memory and as a result there are no minor compactions, and
  probably no major compactions to make tables larger.  We tried this on SSDs
 using a file system and we were not getting the processing speeds that we
 had wanted.


 Matt



 *From:* user-return-1330-MATTHEW.J.MOORE=saic@accumulo.apache.org[mailto:
 user-return-1330-MATTHEW.J.MOORE=saic@accumulo.apache.org] *On Behalf
 Of *Adam Fuchs
 *Sent:* Tuesday, September 11, 2012 5:30 PM
 *To:* user@accumulo.apache.org
 *Subject:* Re: Running Accumulo straight from Memory


 Matthew,


 I don't know of anyone who has done this, but I believe you could:

 1. mount a RAM disk

 2. point the hdfs core-site.xml fs.default.name property to file:///

 3. point the accumulo-site.xml instance.dfs.dir property to a directory on
 the RAM disk

 4. disable the WAL for all tables by setting the accumulo-site.xml
 table.walog.enabled to false

 5. initialize and start up accumulo as you regularly would and cross your
 fingers

  Of course, the "you may lose data and this is not an officially
  supported configuration" caveats apply. Out of curiosity, what would you be
 trying to accomplish with this configuration?


 Adam



 On Tue, Sep 11, 2012 at 12:02 PM, Moore, Matthew J. 
 matthew.j.mo...@saic.com wrote:

 Has anyone run Accumulo on a single server straight from memory?  Probably
 using something like a Fusion  IO drive.  We are trying to use it without
 using an SSD or any spinning discs.

  

 *Matthew Moore*

 Systems Engineer

 SAIC, ISBU

 Columbia, MD

 410-312-2542

  




Re: Running Accumulo straight from Memory

2012-09-11 Thread Adam Fuchs
Matthew,

I don't know of anyone who has done this, but I believe you could:
1. mount a RAM disk
2. point the hdfs core-site.xml fs.default.name property to file:///
3. point the accumulo-site.xml instance.dfs.dir property to a directory on
the RAM disk
4. disable the WAL for all tables by setting the accumulo-site.xml
table.walog.enabled to false
5. initialize and start up accumulo as you regularly would and cross your
fingers

Of course, the "you may lose data and this is not an officially supported
configuration" caveats apply. Out of curiosity, what would you be trying to
accomplish with this configuration?

Adam


On Tue, Sep 11, 2012 at 12:02 PM, Moore, Matthew J. 
matthew.j.mo...@saic.com wrote:

 Has anyone run Accumulo on a single server straight from memory?  Probably
 using something like a Fusion-io drive.  We are trying to use it without
 using an SSD or any spinning discs.


 *Matthew Moore*

 Systems Engineer

 SAIC, ISBU

 Columbia, MD

 410-312-2542




RE: ColumnQualifierFilter

2012-09-10 Thread Adam Fuchs
fetchColumn is agglomerative, so if you call it multiple times it will
fetch multiple columns.
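
For example, a minimal sketch (assuming an existing Connector named
connector; the table, family, and qualifier names are placeholders, not from
this thread):

// every fetchColumn/fetchColumnFamily call adds to the set of columns returned
Scanner scanner = connector.createScanner("index", new Authorizations());
scanner.fetchColumn(new Text("attr"), new Text("size"));
scanner.fetchColumn(new Text("attr"), new Text("owner"));
scanner.fetchColumnFamily(new Text("content")); // a whole family, also additive
for (Entry<Key,Value> entry : scanner) {
  System.out.println(entry.getKey() + " -> " + entry.getValue());
}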

Adam
On Sep 10, 2012 6:25 PM, bob.thor...@l-3com.com wrote:

 Billie


 That’s what I’m doing at the moment, but I’d like to give the iterator a
 collection of CF/CQ to filter on.  Is that possible?


 *From:* Billie Rinaldi [mailto:bil...@apache.org]
 *Sent:* Monday, September 10, 2012 17:14
 *To:* user@accumulo.apache.org
 *Subject:* Re: ColumnQualifierFilter


 On Mon, Sep 10, 2012 at 2:56 PM, bob.thor...@l-3com.com wrote:

 Anyone have a snippet on how to apply a ColumnQualifierFilter to a

 BatchScanner that has a VersioningIterator (any iterator for that
 matter)?


 You don't have to apply the ColumnQualifierFilter yourself.  Just use the
 fetchColumn(Text colFam, Text colQual) method of the BatchScanner.

 Billie

  


 Bob Thorman
 Engineering Fellow
 L-3 Communications, ComCept
 1700 Science Place
 Rockwall, TX 75032
 (972) 772-7501 work
 bob.thor...@ncct.af.smil.mil
 rdth...@nsa.ic.gov

 




Re: [receivers.SendSpansViaThrift] ERROR: java.net.ConnectException: Connection refused

2012-09-05 Thread Adam Fuchs
Fred,

One tracer is fine, and you can set that to be the same as the master node.
You also need to set the username and password for the tracer in
accumulo-site.xml if you haven't already.

Adam
On Sep 5, 2012 1:22 PM, Fred Wolfinger fred.wolfin...@g2-inc.com wrote:

 Hey Marc,

 I can't tell you how much I appreciate the quick response. I never touched
 the tracers file, so it still just says localhost. If I have a master and 3
 slaves, which IPs go into that file? All, or just the slaves?

 Thanks!

 Fred

 On Wed, Sep 5, 2012 at 1:15 PM, Marc Parisi m...@accumulo.net wrote:

 have you verified that the tracers are running?

 look at /conf/tracers to make sure you have them configured and that they
 are running on those servers.


 On Wed, Sep 5, 2012 at 1:06 PM, Fred Wolfinger fred.wolfin...@g2-inc.com
  wrote:

 I am trying to get data into a single table. I am getting very high
 error counts from all tablet servers and loggers, of which there are 3
 each, all with the same error:

 05 12:58:02,224 [receivers.SendSpansViaThrift] ERROR:
 java.net.ConnectException: Connection refused
 java.net.ConnectException: Connection refused
 at java.net.PlainSocketImpl.socketConnect(Native Method)
 at
 java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
  at
 java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
 at
 java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
 at java.net.Socket.connect(Socket.java:546)
 at java.net.Socket.connect(Socket.java:495)
  at
 org.apache.accumulo.cloudtrace.instrument.receivers.SendSpansViaThrift.createDestination(SendSpansViaThrift.java:53)
 at
 org.apache.accumulo.cloudtrace.instrument.receivers.SendSpansViaThrift.createDestination(SendSpansViaThrift.java:34)
  at
 org.apache.accumulo.cloudtrace.instrument.receivers.AsyncSpanReceiver.sendSpans(AsyncSpanReceiver.java:87)
 at
 org.apache.accumulo.cloudtrace.instrument.receivers.AsyncSpanReceiver$1.run(AsyncSpanReceiver.java:63)
  at java.util.TimerThread.mainLoop(Timer.java:534)
 at java.util.TimerThread.run(Timer.java:484)

 I can ping/ssh/etc. freely across all slaves and master nodes.

 Thoughts?  Thanks a million,

 Fred





 --
 Fred Wolfinger
 Security Research Engineer
 G2, Inc.

 302 Sentinel Drive, Suite 300
 Annapolis Junction, MD 20701
 Office: 301-575-5142
 Cell: 443-655-3322




Re: more questions about IndexedDocIterators

2012-07-16 Thread Adam Fuchs
*SNIP

  3. Compressed reverse-timestamp using Unicode tricks?
  --
 
  I see code in Accumulo like
 
  // We're past the index column family, so return a term that will sort
  // lexicographically last. The last unicode character should suffice
   return new Text("\uFFFD");
 
  which gets me thinking that I can probably pull off an impressively
  compressed, but still lexically ordered, reverse timestamp using Unicode
  trickery to get a gigantic radix. Is there any precedent for this? I'm a
  little worried about running into corner cases with Unicode encoding.
  Otherwise, I think it feels like a simple algorithm that may not eat up
  much CPU in translation and might save disk space at scale.
 
  Or is this optimizing into the noise given compression Accumulo
  already does
  under the covers?

 I would think the compression would take care of this.  If you try it and
 get an improvement, we'd be interested in seeing the results.


I think it is generally a good idea to use encoding techniques whenever
they're quick, effective, and easy. If you know something about your data
then you can usually do better than a general-purpose compression
algorithm. Slide 11 of my table design presentation (
http://people.apache.org/~afuchs/slides/accumulo_table_design.pdf) also
shows a few extra tricks that might help you out. Another possibility is to
use a two's complement representation for a fixed precision number (e.g. a
long or an int), but flip the first bit.
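
A minimal sketch of that last trick (plain Java, nothing Accumulo-specific;
the method name and the reverse-sort note are mine):

// Flipping the sign bit makes the unsigned, big-endian byte order of a long
// match its signed numeric order, so the encoded bytes sort correctly as a row.
static byte[] encodeLong(long v) {
  v ^= 0x8000000000000000L;      // flip the first (sign) bit
  byte[] b = new byte[8];
  for (int i = 7; i >= 0; i--) { // big-endian
    b[i] = (byte) (v & 0xff);
    v >>>= 8;
  }
  return b;
}
// For a reverse timestamp (newest first) you could encode the negated value,
// e.g. encodeLong(-timestamp).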

Cheers,
Adam


Re: getMasterStats problems

2012-07-15 Thread Adam Fuchs
Jim,

The HdfsZooInstance looks for accumulo-site.xml on the classpath to find
the directory in HDFS to look for the instance ID. If accumulo-site.xml is
not on the classpath then it will default to /accumulo, which is probably
different from the directory you are using. accumulo-site.xml also includes
a shared secret which is used to construct the system credentials. This is
the standard way that servers communicate with each other.

Clients, however, typically don't use HdfsZooInstance. You might try
replacing that with ZooKeeperInstance, and then construct your own AuthInfo
object (i.e. new
AuthInfo(user,ByteBuffer.wrap(password.getBytes()),zkInstance.getInstanceID())
) instead of trying to use the system credentials.
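
A sketch of that substitution, following the structure of the GetMasterStats
snippet quoted below (the instance name, ZooKeeper hosts, user, and password
are placeholders):

ZooKeeperInstance zkInstance = new ZooKeeperInstance("myInstance", "zk1,zk2,zk3");
AuthInfo credentials = new AuthInfo("root",
    ByteBuffer.wrap("secret".getBytes()), zkInstance.getInstanceID());

MasterClientService.Iface client = null;
MasterMonitorInfo stats = null;
try {
  client = MasterClient.getConnectionWithRetry(zkInstance);
  stats = client.getMasterStats(null, credentials);
} finally {
  if (client != null)
    MasterClient.close(client);
}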

Cheers,
Adam

On Sun, Jul 15, 2012 at 10:52 PM, Jim Klucar klu...@gmail.com wrote:

 I'm trying to get the master stats in the same way the monitor server
 does, but I keep having issues.

 The code I'm using is from the GetMasterStats.java program:

 MasterClientService.Iface client = null;
 MasterMonitorInfo stats = null;
 try {
   client =
 MasterClient.getConnectionWithRetry(HdfsZooInstance.getInstance());
   stats = client.getMasterStats(null,
 SecurityConstants.getSystemCredentials());
 } finally {
   if (client != null)
 MasterClient.close(client);
 }

 The problem is that the HdfsZooInstance can never find the instance
 data, even though it's there. I can create a ZooKeeperInstance just
 fine, but passing that into the MasterClient fails the same way.

 Caused by: java.lang.RuntimeException: Accumulo not initialized, there
 is no instance id at /accumulo/instance_id
 at
 org.apache.accumulo.core.client.ZooKeeperInstance.getInstanceIDFromHdfs(ZooKeeperInstance.java:263)
 at
 org.apache.accumulo.server.conf.ZooConfiguration.getInstance(ZooConfiguration.java:65)
 at
 org.apache.accumulo.server.conf.ServerConfiguration.getZooConfiguration(ServerConfiguration.java:49)
 at
 org.apache.accumulo.server.conf.ServerConfiguration.getSystemConfiguration(ServerConfiguration.java:58)
 at
 org.apache.accumulo.server.client.HdfsZooInstance.init(HdfsZooInstance.java:62)
 at
 org.apache.accumulo.server.client.HdfsZooInstance.getInstance(HdfsZooInstance.java:70)
 at
 org.apache.accumulo.server.security.SecurityConstants.makeSystemPassword(SecurityConstants.java:58)
 at
 org.apache.accumulo.server.security.SecurityConstants.clinit(SecurityConstants.java:43)
 ... 91 more

 Any ideas as to what I'm missing?



Re: monitor.Monitor - Unable to contact the garbage collector - connection refused

2012-07-12 Thread Adam Fuchs
Sounds like a good upgrade to me. Could even be done as part of that
warning message.

Adam


On Thu, Jul 12, 2012 at 9:28 PM, David Medinets david.medin...@gmail.com wrote:

 I am seeing the following output in my monitor_lasho.log file. Would
 it be possible to display the host and port that is being displayed as
 a DEBUG level message?

 13 01:25:50,760 [monitor.Monitor] WARN :  Unable to contact the
 garbage collector
 org.apache.thrift.transport.TTransportException:
 java.net.ConnectException: Connection refused
 at
 org.apache.accumulo.core.client.impl.ThriftTransportPool.createNewTransport(ThriftTransportPool.java:475)
 at
 org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:464)
 at
 org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:441)
 at
 org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransportWithDefaultTimeout(ThriftTransportPool.java:366)
 at
 org.apache.accumulo.core.util.ThriftUtil.getClient(ThriftUtil.java:54)
 at
 org.apache.accumulo.server.monitor.Monitor.fetchGcStatus(Monitor.java:432)
 at
 org.apache.accumulo.server.monitor.Monitor.fetchData(Monitor.java:299)
 at
 org.apache.accumulo.server.monitor.Monitor$2.run(Monitor.java:502)
 at
 org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.net.ConnectException: Connection refused
 at sun.nio.ch.Net.connect0(Native Method)
 at sun.nio.ch.Net.connect(Net.java:364)
 at sun.nio.ch.Net.connect(Net.java:356)
 at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:623)
 at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:92)
 at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:76)
 at
 org.apache.accumulo.core.util.TTimeoutTransport.create(TTimeoutTransport.java:39)
 at
 org.apache.accumulo.core.client.impl.ThriftTransportPool.createNewTransport(ThriftTransportPool.java:473)



Re: WholeRowIterator, BatchScanner, and fetchColumnFamily don't play well together?

2012-07-09 Thread Adam Fuchs
John,

This was a fun one, but we figured it out. Thanks for providing code --
that helped a lot. The quick workaround is to set the priority of the
WholeRowIterator to 21, above the VersioningIterator. Turns out the two
iterators are not commutative, so order matters.

Solution: when you set up your WholeRowIterator, use priority 21 or greater:
  batchScanner.setScanIterators(21 /* voila */,
WholeRowIterator.class.getName(), UUID.randomUUID().toString());

Here's what's happening:

First, a little background on notation. When you scan for a range including
the row bar, that range can be notated (bear with me):
  [bar : [] 9223372036854775807 false,bar%00; : [] 9223372036854775807
false)
This shows two complete keys, separated by a comma, with empty column
family, column qualifier, column visibility, a MAX_LONG timestamp, and
delete flag set to false. The second key has a row that is the same as the
first, but with a 0 byte value added on (%00;). In this notation, the [
means the left key is inclusive, and the ) means that the right key is
exclusive.

In your query, you added a column family filter, so we (the Accumulo client
library) got tricky and narrowed your range in addition to doing filtering.
Here's what the narrowed range looks like:
  [bar foo: [] 9223372036854775807 false,bar foo%00;: []
9223372036854775807 false)
You can see the column family "foo" specified on the left, and "foo%00;"
specified on the right side. This will select only everything in the "foo"
column family within the row "bar".

When the VersioningIterator seeks to a range it does a couple of
interesting things. First, it widens the range to include all of the
possible versions of a key by setting the left-hand side timestamp to
MAX_LONG. This is done to get an accurate count of the versions so that it
knows which versions to skip. Second, it scans through the versions,
skipping anything after the start of the range it was given. This way, you
can seek directly to the nth version of a key and maintain a consistent
last version. Skipping keys that don't fit in the range works great until
we throw in an iterator that transforms keys, modifying their columns.

Enter the WholeRowIterator. The WholeRowIterator groups and encodes all
key/value pairs in a row into a single key/value pair to guarantee
isolation. This new key looks like:
  bar : [] 9223372036854775807 false
Effectively, we're taking the key in column family "foo" and moving it to
the empty column family "". This breaks the second interesting behavior of the
VersioningIterator, which will skip over everything that is not in the
narrowed range (including this key).

So, the conflict is actually the confluence of the WholeRowIterator, the
VersioningIterator, and setting to a single row range with a column filter
(resulting in a range that is narrower than one row). This is also not
specific to the BatchScanner. If you set the range of your Scanner to (new
Range(new Text("bar"))), just like the BatchScanner, the Scanner will
display the same behavior.

Cheers,
Adam


On Mon, Jul 9, 2012 at 12:42 PM, John Armstrong j...@ccri.com wrote:

 Hi everybody.

 I've run across an unexpected behavior when using WholeRowIterator on a
 BatchScanner.  In case it matters, we're using cloudbase-1.3.4.

 When I tell it to fetchColumnFamily(new Text("foo")) I get no results
 back, though there are definitely records in that column family and in the
 row ranges I'm scanning.  This doesn't happen when I use a scanner on that
 column family, though in that case I'm scanning over the entire table.

 To be more explicit, some constants:

 List<Range> ranges = new ArrayList<Range>();
 ranges.add(new Range(new Text("bar")));
 Text CF = new Text("foo");

 getNewScanner() and getNewBatchScanner() create scanners for the
 appropriate table name, authorization, and number of threads.

 BatchScanner batchScanner = getNewBatchScanner();
 batchScanner.fetchColumnFamily(CF);
 batchScanner.setRanges(ranges);

 returns all the entries in row "bar" and column family "foo".

 BatchScanner batchScanner = getNewBatchScanner();
 batchScanner.fetchColumnFamily(CF);
 batchScanner.setScanIterators(1,
   WholeRowIterator.class.getName(),
   UUID.randomUUID().toString());
 batchScanner.setRanges(ranges);

 returns nothing.

 BatchScanner batchScanner = getNewBatchScanner();
 batchScanner.setScanIterators(1,
   WholeRowIterator.class.getName(),
   UUID.randomUUID().toString());
 batchScanner.setRanges(ranges);

 returns an encoded entry containing all the entries in row "bar".

 Scanner scanner = getNewScanner();
 scanner.fetchColumnFamily(CF);
 scanner.setScanIterators(1,
  WholeRowIterator.class.getName(),
  UUID.randomUUID().toString());

 returns encoded entries containing all the entries in column family "foo",
 one for each row that contains anything in that column family.

 

Re: Recovering Tables from HDFS

2012-07-05 Thread Adam Fuchs
Hi Patrick,

The short answer is yes, but there are a few caveats:
1. As you said, information that is sitting in the in-memory map and in the
write-ahead log will not be in those files. You can periodically call flush
(Connector.getTableOperations().flush(...)) to guarantee that your data has
made it into the RFiles (see the sketch after this list).
2. Old data that has been deleted may reappear. RFiles can span multiple
tablets, which happens when tablets split. Often, one of the tablets
compacts, getting rid of delete keys. However, the file that holds the
original data is still in HDFS because it is referenced by another tablet
(or because it has not yet been garbage collected). If you're using
Accumulo in an append-only fashion, then this will not be a problem.
3. For the same reasons as #2, if you're doing any aggregation you might
run into counts being incorrect.
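
A one-line sketch of the flush call from #1 (assuming an existing Connector
named connector, the tableOperations() accessor, and the 1.4-style flush that
takes a row range and a wait flag; the table name is a placeholder):

// Flush the in-memory map of "mytable" into RFiles and wait for completion;
// null start/end rows mean the whole table.
connector.tableOperations().flush("mytable", null, null, true);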

You might also check out the table cloning feature introduced in 1.4 as a
means for backing up a table:
http://accumulo.apache.org/1.4/user_manual/Table_Configuration.html#Cloning_Tables

Cheers,
Adam


On Thu, Jul 5, 2012 at 9:52 AM, patricklync...@aim.com wrote:

users@accumulo,

  I need help understanding if one could recover or backup tables by
 taking their files stored in HDFS and reattaching them to tablet servers,
 even though this would mean the loss of information from recent mutations
 and write ahead logs. The documentation on recovery is focused on the
 failure of a tablet server, but, in the event of a failure of the master or
 other situation where the tablet servers cannot be utilized, it would be
 beneficial to know whether the files in HDFS can be used for recovery.

  Thanks,

  Patrick Lynch



Re: [VOTE] accumulo-1.3.6 RC1

2012-07-03 Thread Adam Fuchs
+1

Signature looks good
Hashes look good
Installs and runs well (configured, installed, started, attached with
shell, created table, inserted, scanned, flushed, compacted, shutdowned)

Adam


On Tue, Jul 3, 2012 at 1:57 PM, Eric Newton eric.new...@gmail.com wrote:

 I've recreated the build artifacts from the rc1 tag.  Thanks for the
 double-check.

 Maven, and the apache parent process control what goes into repo.  If
 something needs to be changed, let me know.

 I've re-deployed to the repo, too:

 https://repository.apache.org/content/repositories/orgapacheaccumulo-022/

 On Tue, Jul 3, 2012 at 1:10 PM, Eric Newton eric.new...@gmail.com wrote:
  Thanks... my attempt to automate some of this release process has
  probably gotten me into trouble.
 
  I'll update the binaries in a couple of hours.
 
  -Eric
 
  On Tue, Jul 3, 2012 at 12:37 PM, Billie J Rinaldi
  billie.j.rina...@ugov.gov wrote:
  The checksums are bad for accumulo-1.3.6-dist.tar.gz.  Also, were
 accumulo-1.3.6-dist.tar.gz and accumulo-1.3.6-source-release.zip
 intentionally deployed to the maven repo?  (If so, that's kind of cool.)
 
 https://repository.apache.org/content/repositories/orgapacheaccumulo-010/org/apache/accumulo/accumulo/1.3.6/
 
  Billie
 
 
  - Original Message -
  From: Eric Newton eric.new...@gmail.com
  To: user@accumulo.apache.org
  Sent: Monday, July 2, 2012 2:07:02 PM
  Subject: [VOTE] accumulo-1.3.6 RC1
  Please vote on releasing the following candidate as Apache Accumulo
  version 1.3.6.
 
  Note: this is *not* the same as the recent call for a vote on Apache
  Accumulo version 1.4.1.
 
  The src tar ball was generated by exporting :
  https://svn.apache.org/repos/asf/accumulo/tags/1.3.6rc1
 
  To build the dist tar ball from the source run the following command :
  src/assemble/build.sh
 
  Tarballs, checksums, signatures:
  http://people.apache.org/~ecn/1.3.6rc1
 
  Maven Staged Repository:
 
 https://repository.apache.org/content/repositories/orgapacheaccumulo-010
 
  Keys:
  http://www.apache.org/dist/accumulo/KEYS
 
  Changes:
  https://svn.apache.org/repos/asf/accumulo/tags/1.3.6rc1/CHANGES
 
  The vote will be held open for the next 72 hours.
 
  -Eric



Re: querying for relevant rows

2012-06-29 Thread Adam Fuchs
You can't scan backwards in Accumulo, but you probably don't need to. What
you can do instead is use the last timestamp in the range as the key like
this:

key=2  value= {a.1 b.1 c.2 d.2}
key=5  value= {m.3 n.4 o.5}
key=7  value={x.6 y.6 z.7}

As long as your ranges are non-overlapping, you can just stop when you get
to the first key/value pair that starts after your given time range. If
your ranges are overlapping then you will have to do a more complicated
intersection between forward and reverse orderings to efficiently select
ranges, or maybe use some type of hierarchical range intersection index
akin to a binary space partitioning tree.
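
A rough sketch of that scan pattern (the fixed-width encode() helper, the
value-parsing helpers, and the table name are placeholders invented for
illustration, assuming an existing Connector named connector):

// Rows are keyed by the *last* timestamp of each stored span, zero-padded so
// they sort numerically. Start at the query start and stop at the first span
// whose own data begins after the query end.
Scanner s = connector.createScanner("timespans", new Authorizations());
s.setRange(new Range(new Text(encode(queryStart)), null)); // from queryStart to +inf
for (Entry<Key,Value> e : s) {
  long firstTs = firstTimestampIn(e.getValue()); // hypothetical: parse "a.1 b.1 ..."
  if (firstTs > queryEnd)
    break; // later rows only contain later data, so we can stop here
  emitEntriesBetween(e.getValue(), queryStart, queryEnd); // hypothetical filter
}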

Cheers,
Adam


On Fri, Jun 29, 2012 at 2:19 PM, Lam dnae...@gmail.com wrote:

 I'm using a timestamp as a key and the value is all the relevant data
 starting at that timestamp up to the timestamp represented by the key
 of the next row.

 When querying, I'm given a time span, consisting of a start and stop
 time.  I want to return all the relevant data within the time span, so
 I want to retrieve the appropriate rows (then filter the data for the
 given timespan).

 Example:
 In Accumulo:  (the format of the value is  letter.timestamp)
 key=1  value= {a.1 b.1 c.2 d.2}
 key=3  value= {m.3 n.4 o.5}
 key=6  value={x.6 y.6 z.7}

 Query:  timespan=[2 4]  (get all data from timestamp 2 to 4 inclusively)

 Desire result: retrieve key=1 and key=3, then filter out a.1, b.1, and
 o.5, and return the rest

 Problem: How do I know to retrieve key=1 and key=3 without scanning
 all the keys?

 Can I create a scanner that looks for the given start key=2 and go to
 the prior row (i.e. key=1)?

 --
 D. Lam



Re: Incorrectly setting TKey causes NPE (to nobody's surprise)

2012-06-26 Thread Adam Fuchs
The tradeoff would be convenience versus complexity in the API. I would
lean towards having fewer ways to create a Key.

Has this debate played out before?
http://www.wikivs.com/wiki/Python_vs_Ruby#Philosophy

Adam



On Tue, Jun 26, 2012 at 9:17 AM, David Medinets david.medin...@gmail.com wrote:

 I play 'stupid developer' fairly well. I saw something that defines a
 key and started to use it. If I set row, cf, cq, and visibility then
 the iterator works fine.

  Is there any reason why default values of "" should not be provided
  for cf, cq, and visibility?

 On Tue, Jun 26, 2012 at 9:09 AM, Marc P. marc.par...@gmail.com wrote:
  I realized that Mr Slacum and I addressed the concern of using thrift;
  however, perhaps you are doing something internally. Have you tried
  setting the stop key on the TRange just for SGs?
 
  On Tue, Jun 26, 2012 at 9:03 AM, Marc P. marc.par...@gmail.com wrote:
  Why are you using that accepts the thrift key and range? They're
  internal communication objects within accumulo. I haven't looked the
  code directly, but they're likely contracted to be set in a different
  manner.
 
 
  On Tue, Jun 26, 2012 at 8:56 AM, David Medinets
  david.medin...@gmail.com wrote:
  I did this:
 
  TKey tKey = new TKey();
  tKey.setRow(row_id.getBytes());
 
 
  TRange tRange = new TRange();
   tRange.setStart(tKey);
 
  scan.setRange(tRange);
 
  Iterator iterator = scan.iterator();
  iterator.hasNext();
 
  This resulted in an NPE in:
 
  org.apache.accumulo.core.data.Key.rowColumnStringBuilder(Key.java:472)
 
  While I have no real objection to this NPE (my code is clearly
  deficient), I wonder if a more cogent error message is possible.
  Should there be guard statements somewhere to ensure a valid object?



Re: Incorrectly setting TKey causes NPE (to nobody's surprise)

2012-06-26 Thread Adam Fuchs
Mark,

You're right, my answer was totally misdirected.

The real reason that TKey is not very user-friendly is that it is
auto-generated from the code in core/src/main/thrift/data.thrift. We don't
intend for these auto-generated classes to be part of the public API. Also,
we don't hand-code them (anymore) to make them more convenient, because we
are likely to re-generate them when we upgrade Thrift in the future. Bottom
line: don't use classes that are in a *.thrift package in your client code.
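
For contrast, a sketch of the same lookup using the public-API classes instead
of TKey/TRange (assuming an existing Connector named connector; the table and
row values are placeholders):

// Use org.apache.accumulo.core.data.Range and Key, not the generated thrift types.
Scanner scanner = connector.createScanner("mytable", new Authorizations());
scanner.setRange(new Range("row_id")); // scan just this row
for (Entry<Key,Value> entry : scanner) {
  System.out.println(entry.getKey() + " -> " + entry.getValue());
}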

Adam



On Tue, Jun 26, 2012 at 10:31 AM, Marc P. marc.par...@gmail.com wrote:

 I don't agree with that last promotion. From a usability perspective,
 I think it would be better to either require all arguments or allow
 them not to be set, instead of throwing an exception. This will create
 confusion, especially with people unfamiliar with the stack, as
 evidenced by David's question.

 On Tue, Jun 26, 2012 at 10:20 AM, Adam Fuchs afu...@apache.org wrote:
  The tradeoff would be convenience versus complexity in the API. I would
 lean
  towards having fewer ways to create a Key.
 
  Has this debate played out
  before? http://www.wikivs.com/wiki/Python_vs_Ruby#Philosophy
 
  Adam
 
 
 
  On Tue, Jun 26, 2012 at 9:17 AM, David Medinets 
 david.medin...@gmail.com
  wrote:
 
  I play 'stupid developer' fairly well. I saw something that defines a
  key and started to use it. If I set row, cf, cq, and visibility then
  the iterator works fine.
 
   Is there any reason why default values of "" should not be provided
   for cf, cq, and visibility?
 
  On Tue, Jun 26, 2012 at 9:09 AM, Marc P. marc.par...@gmail.com wrote:
   I realized that Mr Slacum and I addressed the concern of using thrift;
   however, perhaps you are doing something internally. Have you tried
   setting the stop key on the TRange just for SGs?
  
   On Tue, Jun 26, 2012 at 9:03 AM, Marc P. marc.par...@gmail.com
 wrote:
   Why are you using that accepts the thrift key and range? They're
   internal communication objects within accumulo. I haven't looked the
   code directly, but they're likely contracted to be set in a different
   manner.
  
  
   On Tue, Jun 26, 2012 at 8:56 AM, David Medinets
   david.medin...@gmail.com wrote:
   I did this:
  
   TKey tKey = new TKey();
   tKey.setRow(row_id.getBytes());
  
  
   TRange tRange = new TRange();
    tRange.setStart(tKey);
  
   scan.setRange(tRange);
  
   Iterator iterator = scan.iterator();
   iterator.hasNext();
  
   This resulted in an NPE in:
  
  
 org.apache.accumulo.core.data.Key.rowColumnStringBuilder(Key.java:472)
  
   While I have no real objection to this NPE (my code is clearly
   deficient), I wonder if a more cogent error message is possible.
   Should there be guard statements somewhere to ensure a valid object?
 
 



Re: Can I connect an InputStream to a Mutation value?

2012-06-19 Thread Adam Fuchs
There's also the concern of elements of the document that are too large by
themselves. A general purpose streaming solution would include support for
any kind of objects passed in, not just XML with small elements. I think
the fact that it is an XML document is probably a red herring in this case.

In the past, what we have done is solve this on the application side by
breaking up large objects into chunks and then using a key structure that
groups and maintains the order of the chunks. This usually means that we
append a sequence number to the column qualifier using an integer encoding.
The filedata example that Billie referred to does this. Accumulo would
benefit from some sort of general purpose fragmentation solution for
streaming large objects, and an InputStream/OutputStream solution might be
good for that. Sounds like a fun project!
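
A sketch of that chunking scheme (the chunk size, column family, and helper
method are my own placeholders, not an Accumulo API):

// Split a large document into fixed-size chunks, one per column qualifier,
// using a zero-padded sequence number so the chunks sort back in order.
static final int CHUNK_SIZE = 1 << 20; // 1 MB, arbitrary

static void writeChunked(BatchWriter writer, String docId, byte[] data)
    throws MutationsRejectedException {
  Mutation m = new Mutation(new Text(docId));
  int seq = 0;
  for (int off = 0; off < data.length; off += CHUNK_SIZE) {
    int len = Math.min(CHUNK_SIZE, data.length - off);
    m.put(new Text("chunk"), new Text(String.format("%08d", seq++)),
        new Value(Arrays.copyOfRange(data, off, off + len)));
  }
  writer.addMutation(m);
}

For very large documents it may be better to spread the chunks across several
mutations on the same row so that no single mutation has to buffer the whole
object.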

Adam


On Mon, Jun 18, 2012 at 2:06 PM, Marc P. marc.par...@gmail.com wrote:

 I'm sorry, I must be missing something.

 Why does the schema matter? If you were to build keys from all
 attributes and elements, you could, at any point, rebuild the XML
  document. You could store the hierarchy, by virtue of your keys.

 If you were to do that, the previous suggestions would be applicable.
 Realistically, if you stored the entire XML file into a given
 key/value pair, your heap elements will be borne upon thrift reception
 ( at the client ), therefore, streaming would only add complexity and
 additional memory overhead. It wouldn't give you what you want.

 Splitting the file amongst keys can maintain hierarchy, allow you to
 rebuild the XML doc, and store large records into the value.

 On Mon, Jun 18, 2012 at 2:00 PM, David Medinets
 david.medin...@gmail.com wrote:
  Thanks for the offer. I'm thinking of a situation where I don't know the
  schema ahead of time. For example, a JMS queue from which I simply want to
  store the XML somewhere and let some other program parse it. This is
  a thought experiment.
 
  On Sun, Jun 17, 2012 at 1:06 PM, Jim Klucar klu...@gmail.com wrote:
  David,
 
  Can you give a taste of the schema of the XML? With that we may be
  able to help break the XML file up into keys and help create an index
  for it. IMHO that's the power you would get from accumulo. If you just
  want it as one big lump, and don't need to search it or only retrieve
  portions of the file, then putting it in accumulo is just adding
  overhead to hdfs.
 
 
  Sent from my iPhone
 
  On Jun 17, 2012, at 9:54 AM, David Medinets david.medin...@gmail.com
 wrote:
 
  Some of the XML records that I work with are over 50M. I was hoping to
  store them inside of Accumulo instead of the text-based HDFS XML super
  file currently being used. However, since they are so large I can't
  create a Value object without running out of memory. Storing values
  this large may simply be using the wrong tool, please let me know.



Re: Can Sort Order Be Reversed?

2012-05-31 Thread Adam Fuchs
Nope, we currently only support one sort order. The closest you can come is
by using an encoding that flips the sort order. In this case, you would take
every byte and subtract it from 255 to get your new row, so:

void convert(byte[] row)
{
  for(int i = 0; i < row.length; i++)
    row[i] = (byte)(255 - row[i]);
}

void foo()
{
  byte[] row = "abcd".getBytes();
  // convert to backwards sort
  convert(row);
  Mutation m = new Mutation(row);
  ...
  BatchWriter bw = ...
  bw.addMutation(m);
  ...

  Scanner s = ...
  for(Entry<Key,Value> e : s)
  {
    // getRow() returns a Text; copy out its bytes before converting back
    Text rowText = e.getKey().getRow();
    byte[] r = new byte[rowText.getLength()];
    System.arraycopy(rowText.getBytes(), 0, r, 0, r.length);
    // convert back
    convert(r);
    System.out.println(new String(r));
  }
}

On Thu, May 31, 2012 at 10:25 AM, David Medinets
david.medin...@gmail.com wrote:

 I don't know why it has taken me so long to ask this basic question.
 Rows are stored in Accumulo in sorted order - high to low. Is there a
 configuration option to flip the sort? My specific use case has dates
 as the record key and I want to see the oldest records first.



Re: Filtering rows by presence of keys

2012-05-25 Thread Adam Fuchs
One of the differences you'll see between WholeRowIterator and RowFilter is
that WholeRowIterator buffers an entire row in memory while RowFilter does
not. Each includes a boolean method that you would override in a subclass
-- acceptRow(...) in RowFilter or filter(...) in WholeRowIterator. In this
case, I think the acceptRow(...) method would be easier for you to
implement, it might be more efficient, and you wouldn't have to worry about
buffering too much in memory. Here's how I would write it:

public class AwesomeIterator extends RowFilter {
  ...
  public boolean acceptRow(SortedKeyValueIterator<Key,Value> rowIterator)
      throws IOException
  {
    // the seek will get clipped to the row in question, so we can use an
    // infinite range and look for anything in the "ACTIVE" column family
    rowIterator.seek(new Range(),
        Collections.singleton((ByteSequence) new ArrayByteSequence("ACTIVE")),
        true);
    return rowIterator.hasTop();
  }
}


Cheers,
Adam


On Tue, May 22, 2012 at 12:56 PM, John Armstrong j...@ccri.com wrote:

 On 05/22/2012 12:46 PM, bob.thor...@l-3com.com wrote:

 IntersectingIterator is designed to reduce a dataset to a common column
 qualifier for a collection of column families.  So I presume your mental
 picture (like mine was for a long time) is inverted relative to the logic of
 that iterator.  You might try another type...like RowFilter.


 Adding a filter to the WholeRowIterator has been suggested, and I'm trying
 that.  I'm also pushing for an upgrade from 1.3.4 to 1.4.x, but that may be
 harder going.



Re: ROW ID Iterator - sanity check

2012-05-19 Thread Adam Fuchs
One issue here is you are mixing Iterator and Iterable in the same object.
Usually, an Iterable will return an iterator at the beginning of some
logical sequence, but your iterable returns the same iterator object over
and over again. This state sharing would make it so that you can really
only iterate over the iterable once. In your iterator() method you might
instead return new RowIdIterator(scanner), and that would properly
separate the state of the different iterators.

To test this, you could construct a unit test that starts with a
MockInstance, adds some data, then checks to see that the row ids come out
as expected with code similar to your main method.
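
A sketch of such a test (assuming Accumulo 1.4's MockInstance, JUnit, and an
iterator() fixed to return a new RowIdIterator; the instance, table, and cell
values are placeholders):

Instance instance = new MockInstance("test");
Connector conn = instance.getConnector("root", "".getBytes());
conn.tableOperations().create("test_row_iterator");

BatchWriter bw = conn.createBatchWriter("test_row_iterator", 1000000L, 1000L, 1);
for (String rowId : new String[] { "R001", "R002", "R003" }) {
  Mutation m = new Mutation(new Text(rowId));
  m.put(new Text("cf"), new Text("cq"), new Value("v".getBytes()));
  bw.addMutation(m);
}
bw.close();

Scanner scanner = conn.createScanner("test_row_iterator", new Authorizations());
List<String> rowIds = new ArrayList<String>();
for (String rowId : new RowIdIterator(scanner)) {
  rowIds.add(rowId);
}
assertEquals(Arrays.asList("R001", "R002", "R003"), rowIds);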

We can also talk about how to make this more efficient with an iterator if
you like.

Cheers,
Adam


On Sat, May 19, 2012 at 6:10 PM, David Medinets david.medin...@gmail.com wrote:

 I wanted a program to display Row Id values in the simplest way
 possible. Please let me know if I have overlooked something. First, i
 wrapped the RowIterator like this;

 package com.codebits.accumulo;

 import java.util.Iterator;
 import java.util.Map.Entry;

 import org.apache.accumulo.core.client.RowIterator;
 import org.apache.accumulo.core.client.Scanner;
 import org.apache.accumulo.core.data.Key;
 import org.apache.accumulo.core.data.Value;

 public class RowIdIterator implements Iterator<String>, Iterable<String> {

Scanner scanner = null;
RowIterator iterator = null;

public RowIdIterator(Scanner scanner) {
super();
this.scanner = scanner;
this.iterator = new RowIterator(scanner);
}

@Override
public boolean hasNext() {
return iterator.hasNext();
}

@Override
public String next() {
Iterator<Entry<Key, Value>> entry = iterator.next();
return entry.next().getKey().getRow().toString();
}

@Override
public void remove() {
}

@Override
public Iterator<String> iterator() {
return this;
}
 }

 And then I used a driver program like this;

 package com.codebits.accumulo;

 import org.apache.accumulo.core.client.AccumuloException;
 import org.apache.accumulo.core.client.AccumuloSecurityException;
 import org.apache.accumulo.core.client.Connector;
 import org.apache.accumulo.core.client.Scanner;
 import org.apache.accumulo.core.client.TableNotFoundException;
 import org.apache.accumulo.core.client.ZooKeeperInstance;
 import org.apache.accumulo.core.security.Authorizations;

 public class RowIdInteratorDriver {

  public static void main(String[] args) throws AccumuloException,
 AccumuloSecurityException, TableNotFoundException {
String instanceName = "development";
String zooKeepers = "localhost";
String user = "root";
byte[] pass = "password".getBytes();
String tableName = "test_row_iterator";
Authorizations authorizations = new Authorizations();

ZooKeeperInstance instance = new
 ZooKeeperInstance(instanceName, zooKeepers);
Connector connector = instance.getConnector(user, pass);
Scanner scanner = connector.createScanner(tableName,
 authorizations);

for (String rowId : new RowIdIterator(scanner)) {
System.out.println("ROW ID: " + rowId);
}
  }

 }

 This code works:

 ROW ID: R001
 ROW ID: R002
 ROW ID: R003

 My concern is the scanner that I am passing into the iterator. How is
 that testable? And, of course, the class name is confusing...



Re: recursive rollup

2012-04-12 Thread Adam Fuchs
Small correction: the branching factor would not have to be exactly 1, but
it would be small on average (close to 1).

Adam


On Thu, Apr 12, 2012 at 12:50 PM, Adam Fuchs adam.p.fu...@ugov.gov wrote:

 This probably won't work, unless all node names are unique at a given
 level. For example, given paths /a/c/d/e and /b/c/d/e you would have two
 conflicting entries for 4:d/e childOf c/d. You might be able to use a
 unique name instead with a similar scheme, but that could possibly
 introduce a bottleneck.

 Another thought on this is if you actually have path depths in the
 thousands then there's probably a compression scheme that could save you a
 lot of space and compute time, as the branching factor would necessarily be
 1 for the majority of those nodes.

 Cheers,
 Adam


 On Wed, Apr 11, 2012 at 10:43 PM, Keith Turner ke...@deenlo.com wrote:

 There is one other refinement to this table structure, the depth could
 be sorted in inverse order.  To do this store (MAX_DEPTH - depth).
 This avoids the binary search in the filesystem example to find the
 max depth.  So the tree would look like the following, assuming max
 depth is 999.

 997:B/D childOf A/B
 997:B/E childOf A/B
 997:C/F childOf A/C
 997:C/G childOf A/C
 998:A/B childOf /A
 998:A/C childOf /A

 Keith

 2012/4/11 Perko, Ralph J ralph.pe...@pnnl.gov:
  Keith, thanks for the quick reply. I see the hierarchy is maintained in
  the row id - would you still
  recommend this approach for deep hierarchies, on the order of hundreds
 and
  perhaps thousands?
 
  Maybe this is an accumulo newbie question - Is this approach better than
  walking the tree?  My current schema is something like
 
  A parentOf:B 1
  A parentOf:C 1
  B childOf:A 1
  C childOf:A 1
  And so on…
 
  Thanks,
  Ralph
 
 
  On 4/10/12 3:23 PM, Keith Turner ke...@deenlo.com wrote:
 
 oooppsss I was not paying close attention,  when it scans level 001 it
 will insert the following
 
   000/A count = 6 (not 5)
 
 On Tue, Apr 10, 2012 at 6:21 PM, Keith Turner ke...@deenlo.com wrote:
  Take a look at the
  org.apache.accumulo.examples.simple.dirlist.FileCount example,  I
  think it does what you are asking.
 
  The way it works it that assume your tree would be stored in accumulo
  as follows.
 
  000/A
  001/A/B
  001/A/C
  002/A/B/D
  002/A/B/E
  002/A/C/F
  002/A/C/G
 
  The number before the path is the depth, therefore all data for a
  particular depth of the tree is stored contiguously.  The example
  program scans each depth, starting with the highest depth, and push
  counts up.  This very efficient because there is not a lot of random
  access, it sequentially reads data for each depth and stream updates
  to the lower depth.
 
  For example when reading level 002, it would do the following two
 inserts :
 
   001/A/B count=2
   001/A/C count=2
 
  Then it would read level 001 and push the following insert :
 
   000/A count = 5
 
  Keith
 
  On Tue, Apr 10, 2012 at 6:02 PM, Perko, Ralph J ralph.pe...@pnnl.gov
 
 wrote:
  Hi, I wish to do a recursive rollup-count and am wondering the best
 way to
  do this.
 
  What I mean is this: in Accumulo there is a table with data that represents
  the nodes and leaves in a tree.  Each node/leaf in Accumulo knows its
  parent and children and whether it is a node or leaf.  I wish to have a
  rollup-count for any given node to know the combined total of all
  descendants.
 
  For example, given the tree:
 
           A
         /   \
        B     C
       / \   / \
      D   E F   G
 
  A would have 6 descendants.
 
  I can use the SummingCombiner iterator to get a child count, e.g. A
 has 2
  children, but I am not sure the best way to recurse down.  The data
 is
  static so I do not necessarily need a dynamic, on-the-fly solution.
 
  Thanks for your help,
  Ralph
 





Re: Newbie Install/Setup Questions

2012-04-10 Thread Adam Fuchs
Sam,

Yes, Accumulo 1.4.0 should be compatible with Hadoop 1.0.1 after you remove
that check. We've run with it some, but mostly we've tested with 0.20.x.
Please let us know if you see any compatibility problems.

There are two possibilities for why your second tablet server did not
start. Either the start-all.sh script did not try to start the other server
because your conf/slaves file doesn't reference the other server, or the
other server has a configuration problem that prevented it from starting.
Does your conf/slaves file contain both machine names of your tablet
servers, and does the start-all.sh script print out something like
"starting tablet server on ..." for both servers? If it does, then check to
see that the configuration is identical on both servers.

Cheers,
Adam


On Tue, Apr 10, 2012 at 3:41 PM, Patel, Sameer sameer.pa...@lmco.com wrote:

 Hello,

 I'm a total newbie to Accumulo and am trying to set up version 1.4.0 (I'm
 assuming this is the latest release) on XUbuntu 11.10.   I'm running on top
 of Apache Hadoop 1.0.1.  I'm having some setup problems and was wondering
 if someone might have some insight?

 Hadoop Setup Environment
 ---
 I'm running in a multi node setup where I have the Job Tracker, Name Node,
 Secondary Name Node, Zookeeper, and 2 slaves all on separate machines
 (virtual machines).  I have verified that the Hadoop install seems ok and
 even performed a distributed file action to test the file system (e.g. made
 a directory).  Additionally, I see the 2 Live Nodes (2 slave nodes) show
 up on the Hadoop status page at http://machine:50070.

 Accumulo Setup Environment
 --
 I have the Accumulo Master/Table Server deployed on one of the Hadoop
 slaves. I also have Accumulo installed on the other Hadoop slave to serve
 as another Accumulo Tablet Server.  So theoretically, I'm expecting one
 Accumulo master and two Accumulo tablet servers (played by my 2 Hadoop
 slave nodes).

 Question 1
 --
 When I initially tried to start Accumulo with the start-all.sh script,
 it did not start and printed: "Accumulo 1.4.0 requires Hadoop version
 0.20.x".  Is this accurate?  Is it really 0.20.x (which is listed as the
 legacy version on the Apache Hadoop site)?  I probably missed this
 requirement somewhere in the documentation but I still can't find this
 spelled out.

 Question 2
 --
 Temporarily, I commented out the check for the particular Hadoop version
 relating to Question 1 to get Accumulo started (in config.sh).  Once I
 did that I was able to go to the Accumulo Status page and see the following
 information as listed below.  Everything looks good except the Tablet
 Servers number.  Should I be seeing 2 there (instead of 1)?  I have the 2
 slave nodes listed in my slaves conf file.  Interestingly it detects 2
 Data Nodes..and when I click on the link in the UI for Live Data Nodes it
 shows my 2 machines listed.

 Accumulo Master
 
 Tables 2
 Tablet Servers 1
 Tablets 4

 NameNode
 -
 Live Data Nodes 2

 JobTracker
 --
 Trackers 2

 ZooKeeper
 -
 Myzookeeper server:2181

 Appreciate any help!

 Thanks,
 Sam