RE: [External] Re: accumulo 1.10.0 unassigned tablets issue

Ligade, Shailesh [USA] Fri, 04 Mar 2022 05:36:29 -0800

Thanks Chris,

Appreciate your support!

Not sure why instance.volumes.replacements was set, especially since we have an 
HA namenode and that’s the only HDFS targeted. The replacement was set to the 
same URL though, e.g. nameservice/accumulo, nameservice:8020/accumulo.

Regardless, when the tserver went down, even though we had set 
table.suspend.duration=15m, I was seeing volume replacement messages in the 
master log for every hosted tablet, and that takes a long time (hours for 
33k tablets/tserver). So how best to remove these volume replacements? There is 
no delete-volumes; I see only add-volumes under accumulo init. Is there anything 
I need to do after I remove the entire instance.volumes.replacements section 
from accumulo-site.xml?
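
For reference, a minimal sketch of how one might check accumulo-site.xml for 
replacement pairs that do nothing, like the identical from/to URLs in the 
master log. It assumes the property value is a comma-separated list of pairs, 
each pair being two URIs separated by a space, and it treats a missing port as 
8020, which is only an assumption for this cluster. The function names are 
hypothetical and not part of Accumulo.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

PROP = "instance.volumes.replacements"


def read_property(site_xml_text, name):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(site_xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def parse_pairs(value):
    """Split 'src dst,src dst' into (src, dst) tuples; malformed chunks are skipped."""
    pairs = []
    for chunk in value.split(","):
        parts = chunk.split()
        if len(parts) == 2:
            pairs.append((parts[0], parts[1]))
    return pairs


def same_volume(a, b, default_port=8020):
    """Compare two URIs by scheme, host, port, and path.

    Assumption: a URI without an explicit port means default_port.
    """
    def key(uri):
        u = urlsplit(uri)
        return (u.scheme, u.hostname, u.port or default_port, u.path.rstrip("/"))
    return key(a) == key(b)


def find_noop_pairs(pairs):
    """Return the pairs whose source and replacement point at the same volume."""
    return [(src, dst) for src, dst in pairs if same_volume(src, dst)]


def check_site_file(path):
    """Read a site file from disk and return its no-op replacement pairs."""
    with open(path, encoding="utf-8") as fh:
        value = read_property(fh.read(), PROP)
    return find_noop_pairs(parse_pairs(value)) if value else []
```

Calling check_site_file with the path to accumulo-site.xml returns the pairs 
that look redundant. It only inspects the file and changes nothing.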

I will have to look at each and every property to make sure it makes sense.

Thanks

-S

-----Original Message-----
From: Christopher <[email protected]> 
Sent: Wednesday, March 2, 2022 3:09 PM
To: accumulo-user <[email protected]>
Subject: Re: [External] Re: accumulo 1.10.0 unassigned tablets issue

On Wed, Mar 2, 2022 at 1:51 PM Ligade, Shailesh [USA] <[email protected]> 
wrote:
>
> Thanks Chris[topher],
>
> I do have instance.volumes.replacements overridden
>
> Does that mean it will not work with table.suspend.duration property?

No. It's just that's where the RecoveryManager message is coming from.

>
> Thinking about it, I am not sure why we set that, as we have only one HDFS 
> and fewer than 10 beefy nodes...
>
> Maybe I can remove this property after I set table.suspend.duration and 
> stop/reboot the tserver. After I am done, I can restore the property. Please 
> advise.

I have no idea why you would set that if you're not replacing one volume with 
another. I think you would probably benefit from reviewing all of your 
configuration. Please check the documentation for an explanation of each 
property. If you have a specific question regarding them, you can ask here, but 
I would start by reviewing your configs against the docs.

>
> Thanks
>
> -S
>
>
> ________________________________
> From: Christopher <[email protected]>
> Sent: Wednesday, March 2, 2022 1:32 PM
> To: accumulo-user <[email protected]>
> Subject: [External] Re: accumulo 1.10.0 unassigned tablets issue
>
> The replacements message should only appear if you have 
> instance.volumes.replacements set in your configuration.
>
> On Wed, Mar 2, 2022 at 11:02 AM Ligade, Shailesh [USA] 
> <[email protected]> wrote:
> >
> > Hello,
> >
> > I need to reboot a tserver with 34k hosted tablets.
> >
> > I set table.suspend.duration to 15 min, stopped the tserver, and rebooted 
> > the machine.
> >
> > As soon as the tablet server came online, its hosted tablet count went 
> > from 0 to 34k. However, on the master I see 34k unassigned tablets; 
> > although the count is going down, it is taking hours.
> > Not sure why the master is reporting unassigned tablets when the tablet 
> > server has the correct hosted tablet count?
> >
> > Also in the master log i see
> >
> > recovery.RecoveryManager INFO: Volume replaced hdfs://xxxx -> hdfs://xxxx
> >
> > The issue is that both the from and to HDFS URLs are identical, so why is 
> > the master trying to do that?
> >
> > Is the cluster safe to use? Can I reboot another tablet server before this 
> > unassigned tablet count goes to 0? I can reboot the entire cluster if I 
> > have to; will that help?
> >
> > Thanks in advance.
> >
> > -S
