Hi there,
I was going to do some performance tests to evaluate the technology on AWS.
Followed
https://blogs.aws.amazon.com/bigdata/post/Tx15973X6QHUM43/Running-Apache-Accumulo-on-Amazon-EMR
The exact command I am using:
aws emr create-cluster --name Accumulo --no-auto-terminate --bootstrap-ac
Hi guys,
While doing pre-analytics we generate hundreds of millions of mutations that
result in 1-100 megabytes of useful data after major compaction. We ingest into
Accumulo using MapReduce from a Mapper job. We found that performance degrades
significantly as the number of mutations increases.
The o
comparing apples to oranges :)
roman.drap...@baesystems.com wrote:
> Hi guys,
>
> While doing pre-analytics we generate hundreds of millions of
> mutations that result in 1-100 megabytes of useful data after major
> compaction. We ingest into Accumulo using MR from Mapper job. We
> i
2015 at 9:08 AM
roman.drap...@baesystems.com wrote:
Aggregated output is tiny, so if I do the same calculations in memory (instead
of sending mutations to Accumulo), I can reduce the overall number of mutations
by 1000x or
to handle the in-memory
aggregation before giving the data to the BatchWriter. Why would any part of
Accumulo code be responsible for this kind of application-specific data
handling?
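That client-side approach can be sketched as follows. This is a minimal illustration, not code from the thread: the class and method names are mine, and the BatchWriter hand-off is only described in a comment. The idea is simply to collapse the hundreds of millions of per-record mutations into one mutation per distinct key inside the Mapper.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical pre-aggregation buffer: combine values per key in memory so
// only one mutation per distinct key ever reaches the BatchWriter.
public class PreAggregator {
    private final Map<String, Long> totals = new HashMap<>();

    // Called once per input record; merges the value into the running total.
    public void add(String key, long delta) {
        totals.merge(key, delta, Long::sum);
    }

    // In a real Mapper cleanup() you would build one Mutation per entry here
    // and hand it to the BatchWriter, instead of one Mutation per record.
    public Map<String, Long> totals() {
        return totals;
    }
}
```

For very large key spaces you would also need a size cap and periodic flush, but the pattern stays the same: aggregation is application logic that sits in front of the BatchWriter.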
On Tue, Jun 9, 2015 at 3:17 PM,
roman.drap...@baesystems.com wrote:
compaction
On Tue, Jun 9, 2015 at 4:06 PM,
roman.drap...@baesystems.com wrote:
My view is that introducing ingest-time iterators would be quite a useful
feature. Anyway. ☺
Also, could anyone explain exactly w
Thanks a lot, will give a try!
From: Keith Turner [mailto:ke...@deenlo.com]
Sent: 09 June 2015 22:28
To: user@accumulo.apache.org
Subject: Re: micro compaction
On Tue, Jun 9, 2015 at 5:10 PM,
roman.drap...@baesystems.com wrote:
Hi there,
My question is how Accumulo compression works with regard to visibility labels.
Is there any difference between "VeryLargeLargeLarge & AlsoLargeLargeLarge" and
"A&B" expressions? Will it be internally compiled into a compact, low-overhead
structure?
Same question applies to column and qual
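One relevant factor here (separate from whatever the original answer said) is that Accumulo's RFile blocks are compressed on disk, gzip by default, and a label repeated across many key/value pairs compresses extremely well. A stdlib-only illustration of that effect, explicitly not Accumulo code:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;

// Not Accumulo code: shows that a long visibility expression repeated across
// many key/value pairs shrinks dramatically under DEFLATE-style block
// compression, similar in spirit to Accumulo's compressed RFile blocks.
public class LabelCompressionDemo {
    static int deflatedSize(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    public static void main(String[] args) {
        String label = "VeryLargeLargeLarge&AlsoLargeLargeLarge";
        byte[] repeated = label.repeat(10_000).getBytes(StandardCharsets.UTF_8);
        System.out.println("raw bytes:      " + repeated.length);
        System.out.println("deflated bytes: " + deflatedSize(repeated));
    }
}
```

Note this only covers the on-disk side: in memory each Key still carries its visibility bytes, so shorter expressions still save RAM and parsing work at scan time.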
Hi there,
Our current rowid format is MMdd_payload_sha256(raw data). It works nicely,
as we have a date and uniqueness guaranteed by the hash; unfortunately, however,
the rowid is around 50-60 bytes per record.
Requirements are the following:
1) Support Hive on top of Accumulo for ad-hoc querie
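For concreteness, the current scheme as described might look like the sketch below (the underscore separators and hex encoding are my assumptions, not stated in the thread):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Sketch of the rowid layout described above: MMdd date prefix, a payload
// field, and SHA-256 of the raw record. Separators/encoding are assumptions.
public class RowIdBuilder {
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    static String rowId(LocalDate date, String payload, byte[] raw) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return date.format(DateTimeFormatter.ofPattern("MMdd"))
                + "_" + payload + "_" + hex(md.digest(raw));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }
}
```

The 64 hex characters of the digest dominate the 50-60 byte size, which is why truncating or re-encoding the hash is the main lever for shrinking the rowid.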
framework to work with. Referencing some
of the HBaseStorageHandler code might also be worthwhile (as the two are
very similar).
- Josh
roman.drap...@baesystems.com wrote:
> Hi there,
>
> Our current rowid format is MMdd_payload_sha256(raw data). It works
> nicely as we have a date and uniquen
should bridge the gap
https://github.com/apache/hive/blob/release-1.2.1/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/AccumuloRangeGenerator.java#L277
roman.drap...@baesystems.com wrote:
> Hi Josh,
>
> Thanks for response.
>
> Well, I am not an expert in Accumulo (s
ing that repetitious prefix. Are you sure it wasn't the
"payload_sha256" you had as a suffix that was problematic?
Human-readable data (that doesn't sacrifice performance terribly) is always
more pleasant to work with. Just a thought.
roman.drap...@baesystems.com wrote:
> So
'd have to write some custom code to
use the Lexicoders (an extension to the AccumuloRowSerializer).
roman.drap...@baesystems.com wrote:
> Yes, payload + sha256 adds 35 more bytes, so we want to use 4 bytes instead
> of 32 for the hash, but we need second precision (instead of day).
>
>
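A sketch of what that compact layout could look like, 8 bytes in total (the class and method names are mine, and this hand-rolls the byte layout rather than showing any particular Accumulo API):

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical compact rowid: 4 bytes of epoch seconds (second precision)
// followed by the first 4 bytes of SHA-256(raw data) -- 8 bytes instead of
// the 50-60 byte string form. Big-endian positive ints sort correctly as
// bytes until the 32-bit epoch rolls over in 2038.
public class CompactRowId {
    static byte[] rowId(long epochSeconds, byte[] raw) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(raw);
            return ByteBuffer.allocate(8)
                .putInt((int) epochSeconds)  // timestamp, second precision
                .put(hash, 0, 4)             // 4-byte hash prefix
                .array();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If you'd rather not hand-roll sort-order-preserving bytes, the Lexicoders mentioned above are the usual route.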
erformance timestamp oracles for transactions
in their Percolator paper [3].
Cheers,
Adam
[1] https://en.wikipedia.org/wiki/Birthday_problem
[2] https://github.com/twitter/snowflake
[3] http://research.google.com/pubs/pub36726.html
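To make the birthday-problem reference [1] concrete for a 4-byte (32-bit) hash, the standard approximation for the collision probability among n values, 1 - exp(-n(n-1)/(2·2^32)), can be evaluated directly (class and method names are mine):

```java
// Back-of-envelope birthday-problem check for a 32-bit hash: approximate
// probability of at least one collision among n hashed values.
public class BirthdayCheck {
    static double collisionProb(long n) {
        double pairs = (double) n * (n - 1) / 2.0;
        return 1.0 - Math.exp(-pairs / 4294967296.0); // 2^32
    }

    public static void main(String[] args) {
        System.out.printf("n=10000   p=%.4f%n", collisionProb(10_000));
        System.out.printf("n=100000  p=%.4f%n", collisionProb(100_000));
    }
}
```

At ten thousand values per second-precision bucket the collision risk is around a percent; at a hundred thousand it is better than even odds, which is why the hash prefix only works if each timestamp bucket stays small.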
On Mon, Sep 14, 2015 at 2:47 PM,
roman.drap...@baesystems.com wrote:
Hi there,
Trying to set up Accumulo 1.7 on a Kerberized cluster. Only interested in having
the master/tablet servers kerberized (not end-users). Configured everything as
per the manual:
1) Created principals
2) Generated glob keytab
3) Modified accumulo-site.xml providing general.kerberos.keytab
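For reference, the server-side properties from step 3 look roughly like this in accumulo-site.xml (the keytab path and realm below are placeholders, not values from the thread):

```xml
<!-- Sketch of server-side Kerberos settings; adjust paths/realm to taste. -->
<property>
  <name>instance.rpc.sasl.enabled</name>
  <value>true</value>
</property>
<property>
  <name>general.kerberos.keytab</name>
  <value>/etc/security/keytabs/accumulo.service.keytab</value>
</property>
<property>
  <name>general.kerberos.principal</name>
  <value>accumulo/_HOST@EXAMPLE.COM</value>
</property>
```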
collect the output specifying
-Dsun.security.krb5.debug=true in accumulo-env.sh (per the instructions) and
try enabling log4j DEBUG on org.apache.hadoop.security.UserGroupInformation.
- Josh
[1] https://issues.apache.org/jira/browse/ACCUMULO-4069
[2] http://accumulo.apache.org/1.7/accumulo_user_manu
e fix from the JIRA case and rebuild Accumulo yourself, or build
1.7.1-SNAPSHOT from our codebase. I would recommend using 1.7.1-SNAPSHOT, as it
should be the least painful (1.7.1-SNAPSHOT is unlikely to change significantly
from what is ultimately released as 1.7.1).
roman.drap...@baesystem
-
From: roman.drap...@baesystems.com [mailto:roman.drap...@baesystems.com]
Sent: 26 January 2016 19:43
To: user@accumulo.apache.org
Subject: RE: Accumulo and Kerberos
Hi Josh,
Two quick questions.
1) What should I use instead of the HDFS classloader? All examples seem to
use HDFS.
2) Whan
Any thoughts please?
-----Original Message-----
From: Josh Elser [mailto:josh.el...@gmail.com]
Sent: 26 January 2016 20:08
To: user@accumulo.apache.org
Subject: Re: Accumulo and Kerberos
The normal classloader (on the local filesystem) which is configured out of the
box.
roman.drap...@baesystems
stating that the Kerberos login happened
(or didn't). The server should exit if it fails to log in (but I don't know if
I've actively tested that). Do you see this message?
Does it say you successfully logged in (and the principal you logged in as)?
roman.drap...@baesystems.com wrote:
Can you share logs? Try enabling
-Dsun.security.krb5.debug=true in the appropriate environment variable (for the
service you want to turn it on for) in accumulo-env.sh and then start the
services again (hopefully, sharing that too if the problem isn't obvious).
roman.drap...@baesystems.com wr
(or
"token").
If I had to venture a guess, it would be that you have Accumulo configured to
use the wrong Hadoop configuration files, notably core-site.xml and
hdfs-site.xml.
Try the `accumulo classpath` command and verify that the Hadoop
configuration files included th
files into the Accumulo installation
(making upgrades less error prone).
Glad to hear you got it working.
roman.drap...@baesystems.com wrote:
> Hi Josh,
>
> Thanks a lot for your guess. Classpath did not help, however symlinks from
> Hadoop conf directory to Accumulo conf directory worked
classpaths
...elided...
$HADOOP_CONF_DIR,
...elided...
Classpaths that accumulo checks for updates and class
files.
This is all you should need to get the necessary Hadoop configuration files
made available to the Accumulo services.
roman.drap...@baesystems.com wrote