Thanks,

I think we are having the same or a similar issue with a virus/security scan.
However, that shouldn't be able to bring down the master, should it?

I am still digging through the logs.

-S

From: Adam J. Shook <adamjsh...@gmail.com>
Sent: Wednesday, March 16, 2022 2:46 PM
To: user@accumulo.apache.org
Subject: Re: [External] Re: odd issue with accumulo 1.10.0 starting up

This is certainly anecdotal, but we've seen this "ERROR: Read a frame size of 
(large number)" before on our Accumulo cluster; it would show up at a regular 
and predictable frequency. The root cause was a routine scan done by the 
security team looking for vulnerabilities across the entire enterprise (nothing 
Accumulo-specific). I don't have any additional information about the specifics 
of the scan. From all we can tell, it has no impact on our Accumulo cluster 
outside of these error messages.
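
For what it's worth, the specific number in that message supports the scanner
theory: read as four big-endian ASCII bytes, 1195725856 is 0x47455420, i.e.
"GET " -- the first four bytes of a plain HTTP request hitting the Thrift
port and being misread as a frame length. A minimal Java sketch of the
decoding (the class name is just for illustration):

    // Interpret the reported Thrift "frame size" as four ASCII bytes.
    // 1195725856 == 0x47455420 == 'G','E','T',' ' -- the start of an
    // HTTP GET request read as a big-endian 4-byte frame length.
    import java.nio.charset.StandardCharsets;

    public class FrameSizeDecoder {
        public static void main(String[] args) {
            int frameSize = 1195725856; // value from the error message
            byte[] bytes = {
                (byte) (frameSize >>> 24), (byte) (frameSize >>> 16),
                (byte) (frameSize >>> 8), (byte) frameSize
            };
            // Prints: "GET " (trailing space included)
            System.out.println(new String(bytes, StandardCharsets.US_ASCII));
        }
    }

Scanners commonly probe with plain HTTP, which is why this exact number shows
up so often in Thrift error reports.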

--Adam

On Wed, Mar 16, 2022 at 8:35 AM Christopher <ctubb...@apache.org> wrote:
Since that error message is coming from the libthrift library, and not Accumulo 
code, we would need a lot more context to even begin helping you troubleshoot 
it. For example, the complete stack trace showing the Accumulo code that 
called into the Thrift library would be extremely helpful.

It's a bit concerning that, according to that number, you're trying to send a 
single buffer over Thrift that's over a gigabyte in size. You've said before 
that you use live ingest. Are you trying to send a 1GB mutation to a tablet 
server? Or are you using replication, and the stack trace looks like it's 
sending 1GB of replication data?

On Wed, Mar 16, 2022 at 7:14 AM Ligade, Shailesh [USA] 
<ligade_shail...@bah.com> wrote:
Well, I re-initialized Accumulo, but I still see

ERROR: Read a frame size of 1195725856, which is bigger than the maximum 
allowable buffer size for ALL connections.

Is there a setting that I can increase to get past it?

-S


________________________________
From: Ligade, Shailesh [USA] <ligade_shail...@bah.com>
Sent: Tuesday, March 15, 2022 12:47 PM
To: user@accumulo.apache.org
Subject: Re: [External] Re: odd issue with accumulo 1.10.0 starting up

Not daily, but over the weekend.
________________________________
From: Mike Miller <mmil...@apache.org>
Sent: Tuesday, March 15, 2022 10:39 AM
To: user@accumulo.apache.org
Subject: Re: [External] Re: odd issue with accumulo 1.10.0 starting up

Why are you bringing the cluster down every night? That is not ideal.

On Tue, Mar 15, 2022 at 9:24 AM Ligade, Shailesh [USA] 
<ligade_shail...@bah.com> wrote:
Thanks Mike,

We bring the servers down nightly; these are on AWS. This worked yesterday 
(Monday), but today (Tuesday) I went to check on it and it was down. I guess I 
didn't actually check yesterday; I assume it was up, since no one complained, 
but it was up and kicking last week for sure.

So I'm not exactly sure when it happened or what caused it. All the services 
(tserver, master) are up, so the processes are not crashing themselves.

I guess worst case, I can re-initialize and recreate the tables from HDFS.. :-(

-S
________________________________
From: Mike Miller <mmil...@apache.org>
Sent: Tuesday, March 15, 2022 9:16 AM
To: user@accumulo.apache.org
Subject: Re: [External] Re: odd issue with accumulo 1.10.0 starting up

What was going on in the tserver before you saw that error? Did it finish 
recovering after the restart? If it is still recovering, I don't think you will 
be able to do any scans.

On Tue, Mar 15, 2022 at 8:56 AM Ligade, Shailesh [USA] 
<ligade_shail...@bah.com> wrote:
Thanks Mike,

That was my first reaction, but the instance is managed by Puppet and no 
configuration was updated (I double-checked, and ran Puppet manually as well as 
automatically after the restart). Since the system was operational yesterday, 
I think I can rule that out.

As for the other error, I did find the exact message in these threads:
https://lists.apache.org/thread/bobn2vhkswl6c0pkzpy8n13z087z1s6j
https://github.com/RENCI-NRIG/COMET-Accumulo/issues/14
https://markmail.org/message/bc7ijdsgqmod5p2h
but those are for much older Accumulo versions, and the server didn't go out 
of memory, so I think that issue must have been fixed.

-S
________________________________
From: Mike Miller <mmil...@apache.org>
Sent: Tuesday, March 15, 2022 8:47 AM
To: user@accumulo.apache.org <user@accumulo.apache.org>
Subject: [External] Re: odd issue with accumulo 1.10.0 starting up

Check your configuration. The log message indicates that there is a problem 
with the internal system user performing operations. The internal system user 
uses credentials derived from the configuration (such as the instance.secret 
field). Make sure your configuration is identical across all nodes in your 
cluster.
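
If it helps to check that quickly, here is a minimal sketch (the default conf
path is an assumption; adjust it to your installation) that prints a checksum
of accumulo-site.xml so the hashes can be compared across nodes:

    // Hypothetical helper: print an MD5 checksum of the local
    // accumulo-site.xml so hashes can be compared across nodes.
    import java.nio.file.*;
    import java.security.MessageDigest;

    public class ConfigChecksum {
        public static void main(String[] args) throws Exception {
            // Default path is an assumption; pass the real one as an argument.
            Path conf = Paths.get(args.length > 0 ? args[0]
                    : "/opt/accumulo/conf/accumulo-site.xml");
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(Files.readAllBytes(conf));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            System.out.println(hex + "  " + conf);
        }
    }

Run it (or just md5sum) on every node; if any hash differs, the 
instance.secret or some other property has drifted.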
 
On Tue, Mar 15, 2022 at 8:34 AM Ligade, Shailesh [USA] 
<ligade_shail...@bah.com> wrote:
Hello,

I am running into an odd issue with accumulo starting up.

On the tserver I am seeing:

[tserver.TabletServer] ERROR: Caller doesn't have permission to get active scans
ThriftSecurityException(user:!SYSTEM, code:BAD_CREDENTIALS)

In the master log I am seeing:

ERROR: Read a frame size of 1195725856, which is bigger than the maximum 
allowable buffer size for ALL connections.

From the shell I can list all the tables but cannot scan any. The monitor is 
showing a tablet count of 0 and 1 unassigned tablet.

HDFS fsck is all healthy.

Any suggestions?

Thanks

-S
