> Do you have any warnings/errors in the new server's logs?

On smaller cluster where I try to reproduce the problem - no
On big cluster, unfortunately, there are no local logs as the tserver
logs were sent to the monitor :(
At the moment I cannot add a new tserver there to collect new logs as
clients are using the cluster.

On 1/13/15, Eric Newton <[email protected]> wrote:
> The fact that the tablets are being taken offline means that the master is
> actively trying to balance.
>
> The master will periodically ask the new server to host the tablets. Do
> you have any warnings/errors in the new server's logs?
>
> -Eric
>
>
> On Tue, Jan 13, 2015 at 11:48 AM, Denis <[email protected]> wrote:
>
>> > If you jstack your new tablet server, does it show a deadlock?
>>
>> No
>>
>> On 1/13/15, Eric Newton <[email protected]> wrote:
>> > This may be a result of ACCUMULO-3372. If you jstack your new tablet
>> > server, does it show a deadlock?
>> >
>> > $ jps -m
>> > 12345 Main tserver --address host:9997
>> >
>> > $ jstack 12345 | grep -i deadlock
>> > Deadlock detected
>> >
>> > This particular bug only happens at start-up. There's a trivial patch
>> > (which you can find through the bug report), which will be in accumulo
>> > 1.6.2.
>> >
>> > -Eric
>> >
>> >
>> > On Mon, Jan 12, 2015 at 4:06 PM, Denis <[email protected]> wrote:
>> >
>> >> I have not tried yet anything newer than 1.6.1
>> >>
>> >> On 1/12/15, Josh Elser <[email protected]> wrote:
>> >> > Denis wrote:
>> >> >> created https://issues.apache.org/jira/browse/ACCUMULO-3471
>> >> >
>> >> > Thanks a bunch!
>> >> >
>> >> >> BTW, In 1.6.1 also balancing may get stuck until the master server
>> >> >> is restarted.
>> >> >
>> >> > Is this a known issue in 1.6.1 that's been since fixed or is it
>> >> > still outstanding?
>> >> >
>> >> >> But then, after the master restart, balancing works very
>> >> >> "aggressively", putting many tablets offline for quite long time
>> >> >> (minutes)
>> >> >>
>> >> >> On 1/11/15, Denis<[email protected]> wrote:
>> >> >>> Sometimes it left unbalanced with new tserver hosts zero tablets
>> >> >>> or much less than others.
>> >> >>> So I had to restart master to initiate the balancing process.
>> >> >>> Then balancing was performed slowly without putting thousands of
>> >> >>> tablets offline.
>> >> >>>
>> >> >>> On 1/11/15, John Vines<[email protected]> wrote:
>> >> >>>> I have a hunch that the 1.4 version being used possibly had one
>> >> >>>> or more of the many bugs regarding balancing getting 'stuck',
>> >> >>>> which was typically resolved via bouncing the master. Denis, in
>> >> >>>> 1.4 when you brought your tserver back online, did you find that
>> >> >>>> things were then balanced or did you just have a tserver up and
>> >> >>>> things were left unbalanced?
>> >> >>>>
>> >> >>>> On Sun, Jan 11, 2015 at 11:30 AM, Denis<[email protected]> wrote:
>> >> >>>>
>> >> >>>>> yes, per server
>> >> >>>>>
>> >> >>>>> On 1/11/15, Sean Busbey<[email protected]> wrote:
>> >> >>>>>> On Sat, Jan 10, 2015 at 3:42 PM, Denis<[email protected]> wrote:
>> >> >>>>>> On 1/10/15, Christopher<[email protected]> wrote:
>> >> >>>>>>
>> >> >>>>>>> ...
>> >> >>>>>>> 3) how many tablets do you have per server?....
>> >> >>>>>> 3. about 6000
>> >> >>>>>>
>> >> >>>>>> Just to confirm, this is 6000 tablets per-server and not 6000
>> >> >>>>>> tablets per-table or overall, right?
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> --
>> >> >>>>>> Sean
>> >> >>>>>>
>> >> >
>> >>
>> >
>> >
