Hello all,
I've got an odd use case that requires me to calculate the midpoint between
two Accumulo splits. I've been searching through the Accumulo source code
for a little bit trying to find where Accumulo automatically calculates a
new split. I am assuming that the new split point is
> You may want to look
> at org.apache.accumulo.server.util.FileUtil#findMidPoint. Note that it
> isn't in the public API, and can change in the future.
>
> -Eric
>
>
> On Thu, Dec 17, 2015 at 1:17 PM, Adam J. Shook <adamjsh...@gmail.com>
> wrote:
>
>> Hello all,
>>
>> I've got an odd use case that requires me to calculate the midpoint
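The midpoint question above can be sketched in plain Java. This is a simplification under the assumption that splits are compared as raw byte arrays; it is not Accumulo's actual `FileUtil#findMidPoint`, which works from RFile index keys and, as noted, is not public API.

```java
import java.math.BigInteger;
import java.util.Arrays;

public class SplitMidpoint {

    // Midpoint of two byte arrays under lexicographic order. Both inputs
    // are right-padded with zero bytes to a common width so the average
    // is exact; trailing zero padding is stripped from the result.
    public static byte[] midpoint(byte[] lo, byte[] hi) {
        int len = Math.max(lo.length, hi.length) + 1;
        BigInteger a = new BigInteger(1, Arrays.copyOf(lo, len));
        BigInteger b = new BigInteger(1, Arrays.copyOf(hi, len));
        byte[] mid = a.add(b).shiftRight(1).toByteArray();

        // Re-align toByteArray() output (which may carry a sign byte or
        // drop leading zeros) to the fixed width.
        byte[] fixed = new byte[len];
        int src = Math.max(0, mid.length - len);
        int dst = Math.max(0, len - mid.length);
        System.arraycopy(mid, src, fixed, dst, mid.length - src);

        // Strip trailing zero padding.
        int end = len;
        while (end > 0 && fixed[end - 1] == 0) {
            end--;
        }
        return Arrays.copyOf(fixed, end);
    }
}
```

For example, the midpoint of splits `a` and `c` comes out as `b`, while the midpoint of adjacent splits such as `a` and `b` gains an extra byte.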
, if any. Hard to find some good examples of people who
are sorting a list of maps.
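Since good examples of sorting a list of maps are hard to find, here is one possible application-level approach in plain Java (a sketch, not an Accumulo API): compare two maps entry by entry in key order, the way their lexicoder-encoded forms would sort. String keys and values are assumed for simplicity.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapListSort {

    // Compare maps by their entries in key order, entry by entry; a map
    // that is a strict prefix of another sorts first.
    static final Comparator<Map<String, String>> BY_ENTRIES = (m1, m2) -> {
        Iterator<Map.Entry<String, String>> i1 = new TreeMap<>(m1).entrySet().iterator();
        Iterator<Map.Entry<String, String>> i2 = new TreeMap<>(m2).entrySet().iterator();
        while (i1.hasNext() && i2.hasNext()) {
            Map.Entry<String, String> e1 = i1.next();
            Map.Entry<String, String> e2 = i2.next();
            int c = e1.getKey().compareTo(e2.getKey());
            if (c != 0) {
                return c;
            }
            c = e1.getValue().compareTo(e2.getValue());
            if (c != 0) {
                return c;
            }
        }
        return Boolean.compare(i1.hasNext(), i2.hasNext());
    };

    public static void sort(List<Map<String, String>> maps) {
        maps.sort(BY_ENTRIES);
    }
}
```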
On Tue, Dec 29, 2015 at 2:47 PM, Keith Turner <ke...@deenlo.com> wrote:
>
>
> On Mon, Dec 28, 2015 at 11:47 AM, Adam J. Shook <adamjsh...@gmail.com>
> wrote:
>
>> Hello all,
>
ot impossible) --
> thought it was worth mentioning to make sure you thought about it.
>
>
> Adam J. Shook wrote:
>
>> Hello all,
>>
>> Any suggestions for using a Map Lexicoder (or implementing one)? I am
>> currently using a new ListLexicoder(new PairLexicoder(
Hello all,
Any suggestions for using a Map Lexicoder (or implementing one)? I am
currently using a new ListLexicoder(new PairLexicoder(some lexicoder, some
lexicoder)), which is working for single maps. However, when one of the
lexicoders in the Pair is itself a Map (and therefore another
it a
createdTime.
--Adam
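To illustrate the nested-map encoding question above, here is a plain-Java sketch of the escaping idea that makes composed encodings sort correctly: reserve 0x00 as a separator and escape it (and the escape byte itself) inside elements. This mirrors the approach behind Accumulo's ListLexicoder/PairLexicoder but is not their exact wire format.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.TreeMap;

public class MapEncodingSketch {

    // Escape 0x00 -> 0x01 0x01 and 0x01 -> 0x01 0x02 so that a bare
    // 0x00 can serve as an unambiguous separator between elements.
    static byte[] escape(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : in) {
            if (b == 0x00) {
                out.write(0x01);
                out.write(0x01);
            } else if (b == 0x01) {
                out.write(0x01);
                out.write(0x02);
            } else {
                out.write(b);
            }
        }
        return out.toByteArray();
    }

    // Encode entries in key order as escaped(key) 0x00 escaped(value) 0x00.
    public static byte[] encode(Map<String, String> map) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<String, String> e : new TreeMap<>(map).entrySet()) {
            byte[] k = escape(e.getKey().getBytes());
            byte[] v = escape(e.getValue().getBytes());
            out.write(k, 0, k.length);
            out.write(0x00);
            out.write(v, 0, v.length);
            out.write(0x00);
        }
        return out.toByteArray();
    }
}
```

Because each element is escaped before the separator is appended, the same escaping can be applied again to a whole encoded map, which is what makes nesting (a map inside a pair inside a list) workable.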
On Fri, Feb 17, 2017 at 1:40 PM, Josh Elser <josh.el...@gmail.com> wrote:
> Hey Adam,
>
> Thanks for sharing this one.
>
> Adam J. Shook wrote:
>
>> Hello folks,
>>
>> One of our clusters has been throwing a handful of replication errors
>
> latency?
> That would help in identifying the changes to make and how best to
> implement them.
>
> - Josh
>
>
> Adam J. Shook wrote:
>
>> I'm currently scoping what it would take to improve the latency in the
>> replication feature of Accumulo. I'm interested in knowing what w
I'm currently scoping what it would take to improve the latency in the
replication feature of Accumulo. I'm interested in knowing what work, if
any, is being done to improve replication latency. If work is being done,
would there be some interest in collaborating on that effort?
If nothing is
As an aside, this is actually pretty relevant to the work I've been doing
for Presto/Accumulo integration. It isn't uncommon to have around a
million exact Ranges (that is, Ranges with a single row ID) spread across
the five Presto worker nodes we use for scanning Accumulo. Right now,
these
We had some corrupt WAL blocks on our stage environment the other day and
opted to delete them. We now have some missing metadata and about 3k files
pending for replication. I've dug into it a bit and noticed that many of
the WALs in the `order` queue of the replication table A) no longer exist
wrote:
>
>
> On 7/24/17 1:44 PM, Adam J. Shook wrote:
>
>> We had some corrupt WAL blocks on our stage environment the other day and
>> opted to delete them. We now have some missing metadata and about 3k files
>> pending for replication. I've dug into it a bit
Hello all,
Has there ever been discussion of having the Garbage Collector skip the
HDFS trash when deleting WALs and old RFiles as a configurable feature
(assuming it isn't already -- I couldn't find it)? Outside of the risks
involved in having the files immediately deleted, what'd be the
Howdy folks,
Anyone have any experience running Accumulo on IPv6-only hosts?
Specifically the MiniAccumuloCluster?
There is an open issue in the Presto-Accumulo connector (see [1] and [2])
saying the MAC doesn't work in an IPv6-only environment, and the PR comment
thread has some suggestions to
Anyone run into the below error? We're upgrading from 1.7.3 to 1.8.1 on
Hadoop 2.6.0, ZooKeeper 3.4.6, and JDK 8u121. The monitor is continuously
complaining about the log4j socket appender and it eventually crashes. It
is accepting connections, but they are promptly closed with the
wrote:
>>
>>> This is strange. I've only ever seen this when HDFS has reported
>>> problems, such as missing blocks, or another obvious failure. What is your
>>> durability settings (were WALs turned on)?
>>>
>>> On Fri, May 11, 2018 at 12:45 PM Adam J.
Hello all,
On one of our clusters, there are a good number of missing RFiles from
HDFS, however HDFS is not/has not reported any missing blocks. We were
experiencing issues with HDFS; some flapping DataNode processes that needed
more heap.
I don't anticipate I can do much besides create a bunch
I
> had better answers for you.
>
>
> On Wed, May 16, 2018 at 11:25 AM Adam J. Shook <adamjsh...@gmail.com>
> wrote:
>
>> I tried building a timeline but the logs are just not there. We weren't
>> sending the debug logs to Splunk due to the verbosity, but we may be
>
failure.
>
> Mike
>
> On Sat, May 12, 2018 at 5:26 PM Adam J. Shook <adamjsh...@gmail.com>
> wrote:
>
>> WALs are turned on. Durability is set to flush for all tables except for
>> root and metadata which are sync. The current rfile names on HDFS and
>
The WAL is from 1.9.1.
On Mon, Jun 11, 2018 at 6:33 PM, Christopher wrote:
> That's what I was thinking it was related to. Do you know if the
> particular WAL file was created from a previous version, from before you
> upgraded?
>
> On Mon, Jun 11, 2018 at 6:00 PM Adam J.
Sorry, I had the error backwards. There is an OPEN for the WAL and then
immediately a COMPACTION_FINISH entry. This would cause the error.
On Wed, Jun 13, 2018 at 11:34 AM, Adam J. Shook
wrote:
> Looking at the log I see that the last two entries are COMPACTION_START of
> one
Hey all,
The root tablet on one of our dev systems isn't loading due to an illegal
state exception -- COMPACTION_FINISH preceding COMPACTION_START. What'd be
the best way to mitigate this issue? This was likely caused due to both of
our NameNodes failing.
Thank you,
--Adam
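The invariant behind the illegal state exception above can be shown with a toy validator in plain Java. This is only an illustration of the ordering rule (a COMPACTION_FINISH must follow a matching COMPACTION_START in an opened log), not Accumulo's actual recovery code.

```java
import java.util.List;

public class WalOrderCheck {

    public enum Event { OPEN, COMPACTION_START, COMPACTION_FINISH }

    // A COMPACTION_FINISH is only legal after a matching COMPACTION_START,
    // and compactions only make sense once the log has an OPEN entry.
    public static boolean isValid(List<Event> log) {
        boolean open = false;
        int started = 0;
        for (Event e : log) {
            if (e == Event.OPEN) {
                open = true;
            } else if (e == Event.COMPACTION_START) {
                if (!open) {
                    return false;
                }
                started++;
            } else if (e == Event.COMPACTION_FINISH) {
                if (started == 0) {
                    return false;
                }
                started--;
            }
        }
        return true;
    }
}
```

An OPEN followed immediately by COMPACTION_FINISH, as described in the thread, fails this check.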
Hello all,
Has anyone come across an issue with a TabletServer occupying a large
number of ports in a CLOSED_WAIT state? The 'normal' number of used ports on a
12-node cluster is around 12,000 to 20,000. In one instance, there
were over 68k and it was affecting other applications from
er
> than to check for OS updates. What JVM are you running?
>
> It's possible it's not a leak... and these are just getting cleaned up too
> slowly. That might be something that can be tuned with sysctl.
>
> On Thu, Jan 25, 2018 at 11:27 AM Adam J. Shook <adamjsh...@gmail.com>
>
>> cross-referencing the PID with `jps -ml` (or similar)? Are you able to
>> confirm based on the port number that these were Thrift connections or
>> could they be ZooKeeper or Hadoop connections? Do you have any special
>> non-default Accumulo RPC configuration (SSL or SASL)?
>>
>
Yes, it does use RPC to talk to HDFS. You will need to update the value of
instance.volumes in accumulo-site.xml to reference this address,
haz0-m:8020, instead of the default localhost:9000.
--Adam
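A sketch of the accumulo-site.xml change described above; the hostname and port (haz0-m:8020) come from the thread, while the /accumulo path is an illustrative assumption:

```xml
<property>
  <name>instance.volumes</name>
  <value>hdfs://haz0-m:8020/accumulo</value>
</property>
```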
On Wed, Jan 31, 2018 at 4:45 PM, Geoffry Roberts
wrote:
> I have a
> success with this workaround strategy? I am also
> experiencing this issue.
>
> On 2018/06/13 16:30:22, "Adam J. Shook" wrote:
> > Sorry, I had the error backwards. There is an OPEN for the WAL and
> > then immediately a COMPACTION_FINISH entry. This would cause
It is possible to do a rolling upgrade from 1.8.1 to 1.9.2 -- no need to
shut down the whole cluster. The 1.9 series is effectively a continuation
of the 1.8 line; however, some client methods were deprecated, which caused
it to be a 1.9 release per the laws of semantic versioning. If your code
is
1.9.x is effectively the continuation of the 1.8.x bug fix releases. I've
also upgraded several clusters from 1.8 to 1.9. There were some issues
identified and fixed in the interim 1.9.x versions that you may experience,
so I would recommend upgrading directly to the latest 1.9.3.
On Mon, Jun
I've used the Accumulo Tracer API before to help identify bottlenecks in my
scans. You can find the most recent traces in the Accumulo Monitor UI, and
there are also some tools you can use to view the contents of the trace
table. See section 18.10.4 "Viewing Collected Traces" at
Hi Niclas,
1. Accumulo uses a VersioningIterator for all tables which ensures that you
see the latest version of a particular entry, defined as the entry that has
the highest value for the timestamp. Older versions of the same key (row
ID + family + qualifier + visibility) are compacted away by
> remember "Accumulo can do that
> automatically", rather than implement that at a higher level.
>
> Thanks
>
> On Tue, Apr 14, 2020 at 12:55 AM Adam J. Shook
> wrote:
>
>> Hi Niclas,
>>
>> 1. Accumulo uses a VersioningIterator for all tables which e
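The versioning semantics described in point 1 can be sketched in plain Java: for each key, only the entry with the highest timestamp survives. This models what a maxVersions=1 policy presents to a scan; it is not the VersioningIterator itself, and the Entry type here is a made-up stand-in for a real Accumulo key.

```java
import java.util.HashMap;
import java.util.Map;

public class LatestVersion {

    public static class Entry {
        public final String key;
        public final long timestamp;
        public final String value;

        public Entry(String key, long timestamp, String value) {
            this.key = key;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Keep only the highest-timestamp entry per key -- the view that a
    // maxVersions=1 versioning policy presents to a scan.
    public static Map<String, Entry> latest(Iterable<Entry> entries) {
        Map<String, Entry> out = new HashMap<>();
        for (Entry e : entries) {
            Entry cur = out.get(e.key);
            if (cur == null || e.timestamp > cur.timestamp) {
                out.put(e.key, e);
            }
        }
        return out;
    }
}
```

Older versions remain in RFiles until compaction physically removes them; this sketch only shows the logical read-time view.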
I have a few tables to replicate, so I am thinking I will set all the other
> properties using the shell config command.
>
>
>
> To test this, I just insert a value using the shell, right? Or do I need to
> flush or compact on the table to see those values on the other side?
>
>
>
> -S
Your configurations look correct to me, and it sounds like it is partially
working as you are seeing files that need to be replicated in the Accumulo
Monitor. I do have the replication.name and all replication.peer.*
properties defined in accumulo-site.xml. Do you have all these properties
defined
> correct status, and once in a while I see
> In-Progress Replication section flashing by. But don’t see any new data in
> the target table. ☹
>
>
>
> -S
>
>
>
> *From:* Adam J. Shook
> *Sent:* Thursday, September 23, 2021 12:10 PM
> *To:* user@accumulo.apache.org
>
This is certainly anecdotal, but we've seen this "ERROR: Read a frame size
of (large number)" before on our Accumulo cluster that would show up at a
regular and predictable frequency. The root cause was due to a routine scan
done by the security team looking for vulnerabilities across the entire