[GitHub] ctubbsii commented on issue #357: ACCUMULO-4611 Deprecate commons config in api

2018-01-12 Thread GitBox
ctubbsii commented on issue #357: ACCUMULO-4611 Deprecate commons config in api
URL: https://github.com/apache/accumulo/pull/357#issuecomment-357380247
 
 
   > Well, I did a thing on gitbox.a.o. Maybe that will do it?
   
   I think so. It seems to have added you to the list. It should also give you 
more control over the repos on GitHub and related things (like being able to 
close issues or merge PRs from the UI, add/remove labels, update milestones, 
and trigger rebuilds and clear caches on Travis CI).
   
   In any case, I would appreciate a review of this if/when you get a chance. I 
understand how important it is to fix the commons-config exposure in our API 
for Hadoop 3, and I want to make sure this change is in a direction we can all 
agree on.
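   
   The PR diff itself is not quoted here, but as a rough illustration of the 
deprecation pattern under discussion (the class and method names below are 
assumptions for the sketch, not the actual Accumulo API):
   
   // Sketch: deprecate an API method that leaks a commons-config type and
   // point callers at a replacement that exposes only JDK types.
   public class ClientConfigExample {
     /**
      * @deprecated exposes org.apache.commons.configuration types in the
      *             public API; use {@link #getProperties()} instead.
      */
     @Deprecated
     public org.apache.commons.configuration.Configuration getConfiguration() {
       throw new UnsupportedOperationException("illustrative sketch only");
     }
   
     public java.util.Properties getProperties() {
       return new java.util.Properties();
     }
   }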


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (ACCUMULO-4782) With many threads scanning seeing lock contention on SessionManager

2018-01-12 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4782:
--

 Summary: With many threads scanning seeing lock contention on 
SessionManager
 Key: ACCUMULO-4782
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4782
 Project: Accumulo
  Issue Type: Bug
Affects Versions: 1.8.1, 1.7.3
Reporter: Keith Turner


While profiling a tablet server with many threads doing small scans against 
Accumulo, lock contention on the tablet server's SessionManager was high.
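
No fix is attached to the ticket yet.  Purely as a sketch of one common remedy 
(the class and names below are assumptions, not taken from the Accumulo 
source), contention like this is often reduced by replacing a globally locked 
map with a ConcurrentHashMap, so that scan threads do not serialize on one 
monitor:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative session registry: lock-free reads and striped locking on
// writes, instead of synchronizing every operation on a single monitor.
class ScanSessionRegistry {
  private final ConcurrentHashMap<Long, Object> sessions = new ConcurrentHashMap<>();

  long createSession(Object session) {
    long id = ThreadLocalRandom.current().nextLong(Long.MAX_VALUE);
    while (sessions.putIfAbsent(id, session) != null) { // retry rare collisions
      id = ThreadLocalRandom.current().nextLong(Long.MAX_VALUE);
    }
    return id;
  }

  Object getSession(long id) {
    return sessions.get(id); // no global lock on the hot read path
  }

  Object removeSession(long id) {
    return sessions.remove(id);
  }
}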



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] keith-turner opened a new pull request #359: ACCUMULO-4781 fixed logging performance issue

2018-01-12 Thread GitBox
keith-turner opened a new pull request #359: ACCUMULO-4781 fixed logging 
performance issue
URL: https://github.com/apache/accumulo/pull/359
 
 
   




[jira] [Commented] (ACCUMULO-4781) Per scan logging is expensive

2018-01-12 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324543#comment-16324543
 ] 

Keith Turner commented on ACCUMULO-4781:


For 2.0.0 I am thinking of bumping this logging from debug to trace.  I am not 
sure about doing the same for 1.7 and 1.8.
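
A minimal SLF4J sketch of what the bump would look like (the message text and 
fields here are placeholders, not the actual Accumulo log statement):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class ScanLoggingExample {
  private static final Logger log = LoggerFactory.getLogger(ScanLoggingExample.class);

  void logScan(long sessionId, int entries, long elapsedMs) {
    // trace instead of debug, guarded so the varargs array and boxing
    // are skipped entirely when trace is disabled
    if (log.isTraceEnabled()) {
      log.trace("ScanSess tid {} returned {} entries in {} ms", sessionId, entries, elapsedMs);
    }
  }
}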

> Per scan logging is expensive
> -
>
> Key: ACCUMULO-4781
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4781
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>
> While profiling Accumulo in a situation where many threads were doing small 
> scans, it was noticed that per-scan logging was expensive.





[jira] [Created] (ACCUMULO-4781) Per scan logging is expensive

2018-01-12 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4781:
--

 Summary: Per scan logging is expensive
 Key: ACCUMULO-4781
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4781
 Project: Accumulo
  Issue Type: Bug
Affects Versions: 1.8.1, 1.7.3
Reporter: Keith Turner


While profiling Accumulo in a situation where many threads were doing small 
scans, it was noticed that per-scan logging was expensive.





[GitHub] keith-turner opened a new pull request #358: ACCUMULO-4779 fixed classpath context config performance issue

2018-01-12 Thread GitBox
keith-turner opened a new pull request #358: ACCUMULO-4779 fixed classpath 
context config performance issue
URL: https://github.com/apache/accumulo/pull/358
 
 
   




[GitHub] keith-turner commented on a change in pull request #356: ACCUMULO-4777 Removed the unused sequence generator.

2018-01-12 Thread GitBox
keith-turner commented on a change in pull request #356: ACCUMULO-4777 Removed 
the unused sequence generator.
URL: https://github.com/apache/accumulo/pull/356#discussion_r161266750
 
 

 ##
 File path: core/src/main/java/org/apache/accumulo/core/conf/Property.java
 ##
 @@ -263,8 +263,11 @@
   "The maximum size for each write-ahead log. See comment for property 
tserver.memory.maps.max"),
   TSERV_WALOG_MAX_AGE("tserver.walog.max.age", "24h", 
PropertyType.TIMEDURATION, "The maximum age for each write-ahead log."),
   
TSERV_WALOG_TOLERATED_CREATION_FAILURES("tserver.walog.tolerated.creation.failures",
 "50", PropertyType.COUNT,
-  "The maximum number of failures tolerated when creating a new WAL file 
within the period specified by tserver.walog.failures.period."
-  + " Exceeding this number of failures in the period causes the 
TabletServer to exit."),
+  "The maximum number of failures tolerated when creating a new WAL file."
+  + " Exceeding this number of failures consecutively trying to create 
a new WAL causes the TabletServer to exit."),
+  
TSERV_WALOG_TOLERATED_WRITING_FAILURES("tserver.walog.tolerated.writing.failures",
 "1000", PropertyType.COUNT,
 
 Review comment:
   This would be a separate issue, but I think it would be more user-friendly 
to make this a time-based config.  For example, retry for up to 30 minutes.  It 
is also nice to have the option to retry forever.
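   
   A minimal sketch of the time-based idea (the class name and the hypothetical 
property it implies are assumptions, not an existing Accumulo config):
   
   import java.util.concurrent.TimeUnit;
   
   // Sketch: retry until a wall-clock budget is spent; a budget <= 0 means
   // retry forever, covering both options mentioned above.
   class TimeBasedRetry {
     private final boolean forever;
     private final long deadlineNanos;
   
     TimeBasedRetry(long budgetMillis) {
       this.forever = budgetMillis <= 0;
       this.deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(budgetMillis);
     }
   
     boolean canRetry() {
       return forever || System.nanoTime() - deadlineNanos < 0; // overflow-safe compare
     }
   }
   
   A hypothetical property such as tserver.walog.tolerated.creation.time=30m 
would then replace the failure count.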
   
   If DFS is grumpy for a bit, do we really want all tservers to exit?  This 
would also be a separate issue, but it would be nice if the behavior were 
different when all tservers are having problems versus when only a few are.  
Maybe this is best handled outside of Accumulo, with tservers providing enough 
info for an external system to make decisions about killing tservers.  
Basically the health check concept: an external system can query all tservers 
for health info and make decisions about killing individual ones.




[GitHub] joshelser commented on issue #357: ACCUMULO-4611 Deprecate commons config in api

2018-01-12 Thread GitBox
joshelser commented on issue #357: ACCUMULO-4611 Deprecate commons config in api
URL: https://github.com/apache/accumulo/pull/357#issuecomment-357282602
 
 
   Well, I did a thing on gitbox.a.o. Maybe that will do it?




[GitHub] joshelser commented on issue #357: ACCUMULO-4611 Deprecate commons config in api

2018-01-12 Thread GitBox
joshelser commented on issue #357: ACCUMULO-4611 Deprecate commons config in api
URL: https://github.com/apache/accumulo/pull/357#issuecomment-357282354
 
 
   > Probably b/c you haven't sync'd with GitBox to be added to the repo? 
   
   Thanks for the ping. I have no clue what that even means :)




Accumulo-Pull-Requests - Build # 960 - Fixed

2018-01-12 Thread Apache Jenkins Server
The Apache Jenkins build system has built Accumulo-Pull-Requests (build #960)

Status: Fixed

Check console output at 
https://builds.apache.org/job/Accumulo-Pull-Requests/960/ to view the results.

[jira] [Commented] (ACCUMULO-4777) Root tablet got spammed with 1.8 million log entries

2018-01-12 Thread Ivan Bella (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324162#comment-16324162
 ] 

Ivan Bella commented on ACCUMULO-4777:
--

So if we do an overflow check on that sequence, what would we do?  Depending 
on a continuous sequence anywhere seems destined to eventually fail.

> Root tablet got spammed with 1.8 million log entries
> 
>
> Key: ACCUMULO-4777
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4777
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.8.1
>Reporter: Ivan Bella
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We had a tserver that was handling accumulo.metadata tablets that somehow got 
> into a loop where it created over 22K empty WAL logs.  There were around 70 
> metadata tablets, and this resulted in around 1.8 million log entries being 
> added to the accumulo.root table.  The only reason it stopped creating WAL 
> logs is because it ran out of open file handles.  This took us many hours and 
> cups of coffee to clean up.
> The log contained the following messages in a tight loop:
> log.TabletServerLogger INFO : Using next log hdfs://...
> tserver.TabletServer INFO : Writing log marker for hdfs://...
> tserver.TabletServer INFO : Marking hdfs://... closed
> log.DfsLogger INFO : Slow sync cost ...
> ...
> Unfortunately we did not have DEBUG turned on, so we have no debug messages.
> Tracking through the code, there are three places where the 
> TabletServerLogger.close method is called:
> 1) via resetLoggers in the TabletServerLogger, but nothing calls this method, 
> so this is ruled out
> 2) when the log gets too large or too old, but neither of those checks should 
> have been hitting here
> 3) in a loop that is executed (while (!success)) in the 
> TabletServerLogger.write method.  In this case, when we unsuccessfully write 
> something to the WAL, that one is closed and a new one is created.  This 
> loop will go on forever until we successfully write out the entry.  A 
> DfsLogger.LogClosedException seems the most logical reason.  This is most 
> likely because a ClosedChannelException was thrown from the DfsLogger.write 
> methods (around line 609 in DfsLogger).
> So the root cause was most likely Hadoop-related.  However, in Accumulo we 
> probably should not be doing a tight retry loop around a Hadoop failure.  I 
> recommend at a minimum doing some sort of exponential backoff, and perhaps 
> setting a limit on the number of retries, resulting in a critical tserver 
> failure.





[jira] [Commented] (ACCUMULO-4777) Root tablet got spammed with 1.8 million log entries

2018-01-12 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324157#comment-16324157
 ] 

Keith Turner commented on ACCUMULO-4777:


bq.  I believe the sequence you are referring to is pulled from 
CommitSession.getWALogSeq()

Ok, I see now, thanks for the pointer.  My memories are slowly coming back on 
this.  I think each mutation batch used to get a separate seq # in the log.  
The seq # logic was moved to the tablet (for a reason I cannot remember) and it 
was incremented only on minor compactions.  During sorting, the seq number is 
only needed to determine if a mutation was before or after a compaction.  This 
vestigial code was left behind when CommitSession was created.

This makes me realize that the seq # in commit session has no overflow check.  
If a tablet does over 1 billion minor compactions on the same tablet server, it 
could have strange recovery problems.  I think it increments by 2 because one 
seq # is for minor compaction and the other is for mutations.
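
A hypothetical guard for that, with field and method names assumed rather than 
taken from the Accumulo source:

import java.util.concurrent.atomic.AtomicInteger;

class CommitSeqExample {
  private final AtomicInteger seq = new AtomicInteger(0);

  // Advances by 2 per minor compaction: one seq # for the compaction and
  // one for the mutations that follow it.
  int nextSeq() {
    int s = seq.addAndGet(2);
    if (s < 0) {
      // the int wrapped after ~1 billion minor compactions on one tserver;
      // fail loudly rather than risk a misordered log recovery
      throw new IllegalStateException("Commit session sequence overflowed");
    }
    return s;
  }
}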



Accumulo-Pull-Requests - Build # 959 - Unstable

2018-01-12 Thread Apache Jenkins Server
The Apache Jenkins build system has built Accumulo-Pull-Requests (build #959)

Status: Unstable

Check console output at 
https://builds.apache.org/job/Accumulo-Pull-Requests/959/ to view the results.

[jira] [Commented] (ACCUMULO-4777) Root tablet got spammed with 1.8 million log entries

2018-01-12 Thread Ivan Bella (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324086#comment-16324086
 ] 

Ivan Bella commented on ACCUMULO-4777:
--

[~kturner] I believe the sequence you are referring to is pulled from 
CommitSession.getWALogSeq(), which is populated from the nextSeq int in 
TabletMemory.



[jira] [Commented] (ACCUMULO-4777) Root tablet got spammed with 1.8 million log entries

2018-01-12 Thread Ivan Bella (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324069#comment-16324069
 ] 

Ivan Bella commented on ACCUMULO-4777:
--

BTW, an option here is to implement the backoff mechanism under a separate 
ticket, so that we can get the unused sequence generation mechanism removed 
immediately.



[GitHub] ivakegg commented on a change in pull request #356: ACCUMULO-4777 Removed the unused sequence generator.

2018-01-12 Thread GitBox
ivakegg commented on a change in pull request #356: ACCUMULO-4777 Removed the 
unused sequence generator.
URL: https://github.com/apache/accumulo/pull/356#discussion_r161241237
 
 

 ##
 File path: 
server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java
 ##
 @@ -400,24 +405,34 @@ private int write(final Collection<CommitSession> 
sessions, boolean mincFinish,
 if (currentLogId == logId.get()) {
 
   // write the mutation to the logs
-  seq = seqGen.incrementAndGet();
-  if (seq < 0)
-throw new RuntimeException("Logger sequence generator wrapped!  
Onos!!!11!eleven");
-  LoggerOperation lop = writer.write(copy, seq);
+  LoggerOperation lop = writer.write(copy);
   lop.await();
 
   // double-check: did the log set change?
   success = (currentLogId == logId.get());
 }
   } catch (DfsLogger.LogClosedException ex) {
-log.debug("Logs closed while writing, retrying " + attempt);
+log.debug("Logs closed while writing, retrying " + 
writeRetry.retriesCompleted());
   } catch (Exception t) {
-if (attempt != 1) {
-  log.error("Unexpected error writing to log, retrying attempt " + 
attempt, t);
+// We have more retries or we exceeded the maximum number of accepted 
failures
+if (writeRetry.canRetry()) {
+  // Use the createRetry and record the time in which we did so
+  writeRetry.useRetry();
+
+  try {
+// Backoff
+writeRetry.waitForNextAttempt();
+  } catch (InterruptedException e) {
+Thread.currentThread().interrupt();
+throw new RuntimeException(e);
+  }
+} else {
+  log.error("Repeatedly failed to write WAL. Going to exit 
tabletserver.", t);
 
 Review comment:
   Is termination reasonable in this circumstance, or should we retry forever 
as we were doing before?




[jira] [Commented] (ACCUMULO-4777) Root tablet got spammed with 1.8 million log entries

2018-01-12 Thread Ivan Bella (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324050#comment-16324050
 ] 

Ivan Bella commented on ACCUMULO-4777:
--

I updated the pull request with a backoff mechanism, and with termination 
criteria for when writing to the WALs keeps failing.  I used a mechanism 
parallel to the WAL creation backoff process.
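
The PR diff is not repeated here, but the general shape of backoff with a 
termination criterion is, as a generic illustration only (constants and names 
are assumptions, not the PR's):

// Sketch: exponential backoff around a WAL write attempt, giving up after
// maxRetries so the caller can treat it as a critical tserver failure.
class BackoffExample {
  static void writeWithBackoff(Runnable attempt, int maxRetries) throws InterruptedException {
    long waitMs = 100;
    for (int tries = 0;; tries++) {
      try {
        attempt.run();
        return;
      } catch (RuntimeException e) {
        if (tries >= maxRetries) {
          throw e; // retries exhausted: escalate instead of looping forever
        }
        Thread.sleep(waitMs);
        waitMs = Math.min(waitMs * 2, 60_000); // double, capped at one minute
      }
    }
  }
}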



[GitHub] jkrdev commented on issue #341: ACCUMULO-3902 Ensure [Batch]Scanners are closed in ITs

2018-01-12 Thread GitBox
jkrdev commented on issue #341: ACCUMULO-3902 Ensure [Batch]Scanners are closed 
in ITs
URL: https://github.com/apache/accumulo/pull/341#issuecomment-357256342
 
 
   Awesome! Thanks for merging! And ha, just let me know and I will try to get 
it done.

