Midpoint between two splits

2015-12-17 Thread Adam J. Shook
Hello all, I've got an odd use case that requires me to calculate the midpoint between two Accumulo splits. I've been searching through the Accumulo source code for a little bit trying to find where Accumulo automatically calculates a new split. I am assuming that the new split point is
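For readers with the same use case, here is a minimal sketch of one way to pick a point between two split keys. It is not taken from the Accumulo source and makes no claim about how Accumulo chooses its own split points; it assumes the splits can be compared as unsigned big-endian byte strings. The class and method names (`SplitMidpoint`, `midpoint`) are hypothetical.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper, not part of Accumulo.
final class SplitMidpoint {

    // Returns a byte string that sorts between lo and hi when both are
    // compared as unsigned, big-endian byte strings (assumes lo sorts before hi).
    static byte[] midpoint(byte[] lo, byte[] hi) {
        // Right-pad both keys with zero bytes to a common length, plus one
        // extra byte so that adjacent keys still yield a distinct midpoint.
        int len = Math.max(lo.length, hi.length) + 1;
        BigInteger a = new BigInteger(1, Arrays.copyOf(lo, len));
        BigInteger b = new BigInteger(1, Arrays.copyOf(hi, len));

        // Arithmetic mean of the two padded values.
        BigInteger mid = a.add(b).shiftRight(1);

        // toByteArray() may add a leading sign byte or drop leading zeros,
        // so right-align the result into a fixed-length buffer.
        byte[] raw = mid.toByteArray();
        byte[] out = new byte[len];
        int n = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - n, out, len - n, n);
        return out;
    }
}
```

The result keeps the padded length, so it can carry trailing zero bytes; whether to trim them depends on how the split is used.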

Re: Midpoint between two splits

2015-12-17 Thread Adam J. Shook
t to look > at org.apache.accumulo.server.util.FileUtil#findMidPoint. Note that it > isn't in the public API, and can change in the future. > > -Eric > > > On Thu, Dec 17, 2015 at 1:17 PM, Adam J. Shook <adamjsh...@gmail.com> > wrote: > >> Hello all, >> >> I've got an odd use ca

Re: Map Lexicoder

2015-12-29 Thread Adam J. Shook
, if any. Hard to find some good examples of people who are sorting a list of maps. On Tue, Dec 29, 2015 at 2:47 PM, Keith Turner <ke...@deenlo.com> wrote: > > > On Mon, Dec 28, 2015 at 11:47 AM, Adam J. Shook <adamjsh...@gmail.com> > wrote: > >> Hello all, >

Re: Map Lexicoder

2015-12-28 Thread Adam J. Shook
ot impossible) -- > thought it was worth mentioning to make sure you thought about it. > > > Adam J. Shook wrote: > >> Hello all, >> >> Any suggestions for using a Map Lexicoder (or implementing one)? I am >> currently using a new ListLexicoder(new PairLexicoder(

Map Lexicoder

2015-12-28 Thread Adam J. Shook
Hello all, Any suggestions for using a Map Lexicoder (or implementing one)? I am currently using a new ListLexicoder(new PairLexicoder(some lexicoder, some lexicoder)), which works for single maps. However, when one of the lexicoders in the Pair is itself a Map (and therefore another

Re: Status record lacked createdTime

2017-02-17 Thread Adam J. Shook
it a createdTime. --Adam On Fri, Feb 17, 2017 at 1:40 PM, Josh Elser <josh.el...@gmail.com> wrote: > Hey Adam, > > Thanks for sharing this one. > > Adam J. Shook wrote: > >> Hello folks, >> >> One of our clusters has been throwing a handful of replication errors >>

Re: Improving Accumulo Replication Latency

2017-02-15 Thread Adam J. Shook
tency? > That would help in identifying the changes to make and how best to > implement them. > > - Josh > > > Adam J. Shook wrote: > >> I'm currently scoping what it would take to improve the latency in the >> replication feature of Accumulo. I'm interested in knowing what w

Improving Accumulo Replication Latency

2017-02-15 Thread Adam J. Shook
I'm currently scoping what it would take to improve the latency in the replication feature of Accumulo. I'm interested in knowing what work, if any, is being done to improve replication latency? If work is being done, would there be some interest in collaborating on that effort? If nothing is

Re: Accumulo Seek performance

2016-09-12 Thread Adam J. Shook
As an aside, this is actually pretty relevant to the work I've been doing for Presto/Accumulo integration. It isn't uncommon to have around a million exact Ranges (that is, Ranges with a single row ID) spread across the five Presto worker nodes we use for scanning Accumulo. Right now, these

Missing replication metadata

2017-07-24 Thread Adam J. Shook
We had some corrupt WAL blocks on our stage environment the other day and opted to delete them. We now have some missing metadata and about 3k files pending for replication. I've dug into it a bit and noticed that many of the WALs in the `order` queue of the replication table A) no longer exist

Re: Missing replication metadata

2017-07-24 Thread Adam J. Shook
wrote: > > > On 7/24/17 1:44 PM, Adam J. Shook wrote: > >> We had some corrupt WAL blocks on our stage environment the other day and >> opted to delete them. We now have some missing metadata and about 3k files >> pending for replication. I've dug into it a bit

Skip trash on delete

2017-08-09 Thread Adam J. Shook
Hello all, Has there ever been discussion of having the Garbage Collector skip the HDFS trash when deleting WALs and old RFiles as a configurable feature (assuming it isn't already -- I couldn't find it)? Outside of the risks involved in having the files immediately deleted, what'd be the

IPv6-only hosts for MAC

2017-08-29 Thread Adam J. Shook
Howdy folks, Anyone have any experience running Accumulo on IPv6-only hosts? Specifically the MiniAccumuloCluster? There is an open issue in the Presto-Accumulo connector (see [1] and [2]) saying the MAC doesn't work in an IPv6-only environment, and the PR comment thread has some suggestions to

log4j SocketNode error on upgrading to 1.8.1

2017-10-24 Thread Adam J. Shook
Anyone run into the below error? We're upgrading from 1.7.3 to 1.8.1 on Hadoop 2.6.0, ZooKeeper 3.4.6, and JDK 8u121. The monitor is continuously complaining about the log4j socket appender and it eventually crashes. It is accepting connections, but they are promptly closed with the

Re: Question on missing RFiles

2018-05-12 Thread Adam J. Shook
rote: >> >>> This is strange. I've only ever seen this when HDFS has reported >>> problems, such as missing blocks, or another obvious failure. What is your >>> durability settings (were WALs turned on)? >>> >>> On Fri, May 11, 2018 at 12:45 PM Adam J.

Question on missing RFiles

2018-05-11 Thread Adam J. Shook
Hello all, On one of our clusters, a good number of RFiles are missing from HDFS; however, HDFS is not reporting, and has not reported, any missing blocks. We were experiencing issues with HDFS; some flapping DataNode processes that needed more heap. I don't anticipate I can do much besides create a bunch

Re: Question on missing RFiles

2018-05-16 Thread Adam J. Shook
I > had better answers for you. > > > On Wed, May 16, 2018 at 11:25 AM Adam J. Shook <adamjsh...@gmail.com> > wrote: > >> I tried building a timeline but the logs are just not there. We weren't >> sending the debug logs to Splunk due to the verbosity, but we may be >

Re: Question on missing RFiles

2018-05-16 Thread Adam J. Shook
failure. > > Mike > > On Sat, May 12, 2018 at 5:26 PM Adam J. Shook <adamjsh...@gmail.com> > wrote: > >> WALs are turned on. Durability is set to flush for all tables except for >> root and metadata which are sync. The current rfile names on HDFS and >

Re: Corrupt WAL

2018-06-11 Thread Adam J. Shook
The WAL is from 1.9.1. On Mon, Jun 11, 2018 at 6:33 PM, Christopher wrote: > That's what I was thinking it was related to. Do you know if the > particular WAL file was created from a previous version, from before you > upgraded? > > On Mon, Jun 11, 2018 at 6:00 PM Adam J.

Re: Corrupt WAL

2018-06-13 Thread Adam J. Shook
Sorry, I had the error backwards. There is an OPEN for the WAL and then immediately a COMPACTION_FINISH entry. This would cause the error. On Wed, Jun 13, 2018 at 11:34 AM, Adam J. Shook wrote: > Looking at the log I see that the last two entries are COMPACTION_START of > one

Corrupt WAL

2018-06-11 Thread Adam J. Shook
Hey all, The root tablet on one of our dev systems isn't loading due to an illegal state exception -- COMPACTION_FINISH preceding COMPACTION_START. What'd be the best way to mitigate this issue? This was likely caused by both of our NameNodes failing. Thank you, --Adam

Large number of used ports from tserver

2018-01-24 Thread Adam J. Shook
Hello all, Has anyone come across an issue with a TabletServer occupying a large number of ports in a CLOSE_WAIT state? The 'normal' number of used ports on a 12-node cluster is around 12,000 to 20,000. In one instance, there were over 68k and it was affecting other applications from

Re: Large number of used ports from tserver

2018-01-26 Thread Adam J. Shook
er > than to check for OS updates. What JVM are you running? > > It's possible it's not a leak... and these are just getting cleaned up too > slowly. That might be something that can be tuned with sysctl. > > On Thu, Jan 25, 2018 at 11:27 AM Adam J. Shook <adamjsh...@gmail.com> >

Re: Large number of used ports from tserver

2018-01-25 Thread Adam J. Shook
s-referencing the PID with `jps -ml` (or similar)? Are you able to >> confirm based on the port number that these were Thrift connections or >> could they be ZooKeeper or Hadoop connections? Do you have any special >> non-default Accumulo RPC configuration (SSL or SASL)? >> >

Re: Question on how Accumulo binds to Hadoop

2018-01-31 Thread Adam J. Shook
Yes, it does use RPC to talk to HDFS. You will need to update the value of instance.volumes in accumulo-site.xml to reference this address, haz0-m:8020, instead of the default localhost:9000. --Adam On Wed, Jan 31, 2018 at 4:45 PM, Geoffry Roberts wrote: > I have a

Re: Corrupt WAL

2018-08-22 Thread Adam J. Shook
uccess with this workaround strategy? I am also > experiencing this issue. > > On 2018/06/13 16:30:22, "Adam J. Shook" wrote: > > Sorry, I had the error backwards. There is an OPEN for the WAL and > > then immediately a COMPACTION_FINISH entry. This would cause

Re: upgrading from Accumulo 1.8.1 to 1.9.2

2019-03-18 Thread Adam J. Shook
It is possible to do a rolling upgrade from 1.8.1 to 1.9.2 -- no need to shut down the whole cluster. The 1.9 series is effectively a continuation of the 1.8 line; however, some client methods were deprecated, which made it a 1.9 release per the rules of semantic versioning. If your code is

Re: upgrading from 1.8.x to 1.9.x

2019-06-10 Thread Adam J. Shook
1.9.x is effectively the continuation of the 1.8.x bug fix releases. I've also upgraded several clusters from 1.8 to 1.9. There were some issues identified and fixed in the interim 1.9.x versions that you may experience, so I would recommend upgrading directly to the latest 1.9.3. On Mon, Jun

Re: Accumulo Tracer?

2020-02-28 Thread Adam J. Shook
I've used the Accumulo Tracer API before to help identify bottlenecks in my scans. You can find the most recent traces in the Accumulo Monitor UI, and there are also some tools you can use to view the contents of the trace table. See section 18.10.4 "Viewing Collected Traces" at

Re: Noob questions

2020-04-13 Thread Adam J. Shook
Hi Niclas, 1. Accumulo uses a VersioningIterator for all tables which ensures that you see the latest version of a particular entry, defined as the entry that has the highest value for the timestamp. Older versions of the same key (row ID + family + qualifier + visibility) are compacted away by
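The "highest timestamp wins" rule described above can be modelled in a few lines of plain Java. This is only an illustrative sketch of the rule, not the VersioningIterator implementation and not the Accumulo API: `LatestVersionSketch` and `Cell` are hypothetical names, the key is flattened to a single string for brevity, and the behaviour on equal timestamps (first one seen is kept) is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of "keep only the newest version per key".
final class LatestVersionSketch {

    // Stand-in for an entry: key = row ID + family + qualifier + visibility.
    static final class Cell {
        final String key;
        final long timestamp;
        final String value;

        Cell(String key, long timestamp, String value) {
            this.key = key;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // For each key, retain the cell with the highest timestamp.
    static Map<String, Cell> latest(List<Cell> cells) {
        Map<String, Cell> newest = new HashMap<>();
        for (Cell c : cells) {
            // Replace the stored cell only when the new one is strictly newer.
            newest.merge(c.key, c,
                (old, cur) -> cur.timestamp > old.timestamp ? cur : old);
        }
        return newest;
    }
}
```

Writing the same key twice with timestamps 10 and 20 leaves only the timestamp-20 value visible in this model, which matches the behaviour described in the reply.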

Re: Noob questions

2020-04-14 Thread Adam J. Shook
emember "Accumulo can do that > automatically", rather than implement that at a higher level. > > Thanks > > On Tue, Apr 14, 2020 at 12:55 AM Adam J. Shook > wrote: > >> Hi Niclas, >> >> 1. Accumulo uses a VersioningIterator for all tables which e

Re: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Adam J. Shook
I have few tables to replicate so I am thinking I will set all others > properties using shell config command > > > > To test this, I just insert value using shell right? Or do I need to flush > or compact on the table to see those values on the other side? > > > > -S

Re: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Adam J. Shook
Your configurations look correct to me, and it sounds like it is partially working as you are seeing files that need to be replicated in the Accumulo Monitor. I do have the replication.name and all replication.peer.* properties defined in accumulo-site.xml. Do you have all these properties defined

Re: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Adam J. Shook
rrect status, and once in while I see > In-Progress Replication section flashing by. But don’t see any new data in > the target table. ☹ > > > > -S > > > > *From:* Adam J. Shook > *Sent:* Thursday, September 23, 2021 12:10 PM > *To:* user@accumulo.apache.org >

Re: [External] Re: odd issue with accumulo 1.10.0 starting up

2022-03-16 Thread Adam J. Shook
This is certainly anecdotal, but we've seen this "ERROR: Read a frame size of (large number)" before on our Accumulo cluster, showing up at a regular and predictable frequency. The root cause was a routine scan done by the security team looking for vulnerabilities across the entire