[GitHub] [accumulo-website] ctubbsii commented on issue #190: Update new PMC members on people page
ctubbsii commented on issue #190: Update new PMC members on people page URL: https://github.com/apache/accumulo-website/issues/190#issuecomment-520120467 @alerman Even without linking your Apache ID to your GitHub account, you can still push changes directly to: `https://gitbox.apache.org/repos/asf/accumulo{,-website, etc...}` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo-website] alerman commented on issue #190: Update new PMC members on people page
alerman commented on issue #190: Update new PMC members on people page URL: https://github.com/apache/accumulo-website/issues/190#issuecomment-520110896 I hadn't linked my github account to my Apache Id. That is done now so once that updates I will push the change for me. Thanks @ctubbsii This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo-website] keith-turner commented on issue #191: Move entry to contributors
keith-turner commented on issue #191: Move entry to contributors URL: https://github.com/apache/accumulo-website/pull/191#issuecomment-520060156 The asf-site branch is automatically generated from Jekyll. Need to edit the markdown page at https://github.com/apache/accumulo-website/blob/master/pages/people.md in the master branch. The website readme as some instructions for running jekyll to update the asf-site branch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo-website] keith-turner edited a comment on issue #191: Move entry to contributors
keith-turner edited a comment on issue #191: Move entry to contributors URL: https://github.com/apache/accumulo-website/pull/191#issuecomment-520060156 The asf-site branch is automatically generated from Jekyll. Need to edit the markdown page at https://github.com/apache/accumulo-website/blob/master/pages/people.md in the master branch. The website readme has some instructions for running jekyll to update the asf-site branch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo-website] hkeebler opened a new pull request #191: Move entry to contributors
hkeebler opened a new pull request #191: Move entry to contributors URL: https://github.com/apache/accumulo-website/pull/191 Practice - ticket #190 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo-website] ctubbsii opened a new issue #190: Update new PMC members on people page
ctubbsii opened a new issue #190: Update new PMC members on people page URL: https://github.com/apache/accumulo-website/issues/190 @hkeebler and @alerman were recently voted as committers/PMC members to Accumulo. I'm creating this issue to encourage them to use their new privileges to commit directly to an Accumulo repository, and the easiest first commit is normally to each move themselves from the contributors section to the PMC section of the people page on the website. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312584634 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogs.java ## @@ -90,20 +89,13 @@ * @param useTrash * true to move files to trash rather than delete them */ - GarbageCollectWriteAheadLogs(final AccumuloServerContext context, VolumeManager fs, - boolean useTrash) throws IOException { + GarbageCollectWriteAheadLogs(final AccumuloServerContext context, final VolumeManager fs, + final LiveTServerSet liveServers, boolean useTrash) throws IOException { this.context = context; this.fs = fs; this.useTrash = useTrash; -this.liveServers = new LiveTServerSet(context, new Listener() { - @Override - public void update(LiveTServerSet current, Set deleted, - Set added) { -log.debug("New tablet servers noticed: " + added); -log.debug("Tablet servers removed: " + deleted); - } -}); -liveServers.startListeningForTabletServerChanges(); +this.liveServers = liveServers; Review comment: The choice is to decide whether `startListeningForTabletServerChanges()` in a single background task every 5 seconds is better, or calling `scanServers()` immediately before using the set (once each gc cycle) is better. I don't know that one is significantly easier to reason about than the other. In both scenarios we are getting a relatively recent view of the current live tservers (within 5 seconds). But, I do like avoiding the background task if we don't actually need it. SimpleTimer is dubious. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312584634 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogs.java ## @@ -90,20 +89,13 @@ * @param useTrash * true to move files to trash rather than delete them */ - GarbageCollectWriteAheadLogs(final AccumuloServerContext context, VolumeManager fs, - boolean useTrash) throws IOException { + GarbageCollectWriteAheadLogs(final AccumuloServerContext context, final VolumeManager fs, + final LiveTServerSet liveServers, boolean useTrash) throws IOException { this.context = context; this.fs = fs; this.useTrash = useTrash; -this.liveServers = new LiveTServerSet(context, new Listener() { - @Override - public void update(LiveTServerSet current, Set deleted, - Set added) { -log.debug("New tablet servers noticed: " + added); -log.debug("Tablet servers removed: " + deleted); - } -}); -liveServers.startListeningForTabletServerChanges(); +this.liveServers = liveServers; Review comment: The choice is to decide whether `startListeningForTabletServerChanges()` in a single background task every 5 seconds is better, or calling `scanServers()` immediately before using the set (once each gc cycle) is better. I don't know that one is significantly easier to reason about than the other. In both scenarios we are getting a relatively recent view of the current live tservers. But, I do like avoiding the background task if we don't actually need it. SimpleTimer is dubious. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312584634 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogs.java ## @@ -90,20 +89,13 @@ * @param useTrash * true to move files to trash rather than delete them */ - GarbageCollectWriteAheadLogs(final AccumuloServerContext context, VolumeManager fs, - boolean useTrash) throws IOException { + GarbageCollectWriteAheadLogs(final AccumuloServerContext context, final VolumeManager fs, + final LiveTServerSet liveServers, boolean useTrash) throws IOException { this.context = context; this.fs = fs; this.useTrash = useTrash; -this.liveServers = new LiveTServerSet(context, new Listener() { - @Override - public void update(LiveTServerSet current, Set deleted, - Set added) { -log.debug("New tablet servers noticed: " + added); -log.debug("Tablet servers removed: " + deleted); - } -}); -liveServers.startListeningForTabletServerChanges(); +this.liveServers = liveServers; Review comment: The choice is to decide whether `startListeningForTabletServerChanges()` in a single background task every 5 seconds is better, or calling `scanServers()` immediately before using the set is better. I don't know that one is significantly easier to reason about than the other. In both scenarios we are getting a relatively recent view of the current live tservers. But, I do like avoiding the background task if we don't actually need it. SimpleTimer is dubious. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312582344 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/SimpleGarbageCollector.java ## @@ -564,6 +566,19 @@ private void run() { ProbabilitySampler sampler = new ProbabilitySampler(getConfiguration().getFraction(Property.GC_TRACE_PERCENT)); +// This is created outside of the run loop and passed to the walogCollector so that +// only a single timed task is created (internal to LiveTServerSet using SimpleTimer. +final LiveTServerSet liveTServerSet = new LiveTServerSet(this, new LiveTServerSet.Listener() { + @Override + public void update(LiveTServerSet current, Set deleted, + Set added) { +if (!deleted.isEmpty() || !added.isEmpty()) { + log.debug("Number of current servers {}, tservers added {}, removed {}", + current == null ? -1 : current.size(), added, deleted); +} + } +}); + Review comment: I agree. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Resolved] (ACCUMULO-4867) Java Collections Improvements in DefaultLoadBalancer
[ https://issues.apache.org/jira/browse/ACCUMULO-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs resolved ACCUMULO-4867. - Resolution: Duplicate Closed as duplicate of the linked GitHub PR. We no longer use JIRA to track new issues. The GitHub PR is sufficient on its own. > Java Collections Improvements in DefaultLoadBalancer > > > Key: ACCUMULO-4867 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4867 > Project: Accumulo > Issue Type: Improvement > Components: server-base >Reporter: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [accumulo] ctubbsii edited a comment on issue #1225: Use fewer ZooKeeper watchers
ctubbsii edited a comment on issue #1225: Use fewer ZooKeeper watchers URL: https://github.com/apache/accumulo/issues/1225#issuecomment-520001967 The (potential) update to ZK 3.5 is unrelated to this issue. It was mentioned here merely for the possibility of using local session feature, which is one of many possible ways to reduce the number of watchers we use. If we want to use that feature (which is only one possible way to reduce the number of watchers... and not necessarily even the best way to do so), then upgrading is a prerequisite, but otherwise, it is not related to this issue. Please create a separate issue if you want to pursue upgrading ZK to 3.5. @jzgithub1 The build error you were getting is because we have strict dependency checks in the build. That error indicates we need to explicitly name a direct dependency, if we are using code that is actually located in a transitive dependency... or that we should remove a declared dependency if we are not directly using code from that dependency. As for the origin of zookeeper-jute... I don't know what that repo is that you found, but I found the zookeeper-jute code in ZooKeeper's repository here: https://github.com/apache/zookeeper/tree/master/zookeeper-jute quite easily... it's just another component of ZooKeeper's source code, split into a sub-module, as @nkalmar says. @nkalmar If we update to require ZK 3.5, rest assured, we will update to the latest patch version. I'm not sure how we're using curator, but it's very limited, and I'm not sure the code that uses it is current at all. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] ctubbsii commented on a change in pull request #1319: ACCUMULO-4867: Java Collections Improvements in DefaultLoadBalancer
ctubbsii commented on a change in pull request #1319: ACCUMULO-4867: Java Collections Improvements in DefaultLoadBalancer URL: https://github.com/apache/accumulo/pull/1319#discussion_r312580397 ## File path: server/base/src/main/java/org/apache/accumulo/server/master/balancer/DefaultLoadBalancer.java ## @@ -26,6 +26,7 @@ import java.util.Objects; import java.util.Set; import java.util.SortedMap; +import java.util.concurrent.ThreadLocalRandom; Review comment: Don't need this import. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] ctubbsii commented on issue #1225: Use fewer ZooKeeper watchers
ctubbsii commented on issue #1225: Use fewer ZooKeeper watchers URL: https://github.com/apache/accumulo/issues/1225#issuecomment-520001967 The (potential) update to ZK 3.5 is unrelated to this issue. It was mentioned here merely for the possibility of using local session feature, which is one of many possible ways to reduce the number of watchers we use. If we want to use that feature (which is only one possible way to reduce the number of watchers... and not necessarily even the best way to do so), then upgrading is a prerequisite, but otherwise, it is not related to this issue. Please create a separate issue if you want to pursue upgrading ZK to 3.5. @jzgithub1 The build error you were getting is because we have strict dependency checks in the build. That error indicates we need to explicitly name a direct dependency, if we are using code that is actually located in a transitive dependency... or that we should remove a declared dependency if we are not directly using code from that dependency. As for the origin of zookeeper-jute... I don't know what that repo is that you found, but I found the zookeeper-jute code in ZooKeeper's repository here: https://github.com/apache/zookeeper/tree/master/zookeeper-jute quite easily... it's just another component of ZooKeeper's source code, split into a sub-module, as @nkalmar says. @nkalmar If we update to require ZK 3.5, rest assured, we will update to the latest patch version. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
keith-turner commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312573630 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogs.java ## @@ -90,20 +89,13 @@ * @param useTrash * true to move files to trash rather than delete them */ - GarbageCollectWriteAheadLogs(final AccumuloServerContext context, VolumeManager fs, - boolean useTrash) throws IOException { + GarbageCollectWriteAheadLogs(final AccumuloServerContext context, final VolumeManager fs, + final LiveTServerSet liveServers, boolean useTrash) throws IOException { this.context = context; this.fs = fs; this.useTrash = useTrash; -this.liveServers = new LiveTServerSet(context, new Listener() { - @Override - public void update(LiveTServerSet current, Set deleted, - Set added) { -log.debug("New tablet servers noticed: " + added); -log.debug("Tablet servers removed: " + deleted); - } -}); -liveServers.startListeningForTabletServerChanges(); +this.liveServers = liveServers; Review comment: The existing code was always calling `scanServers()` because `startListeningForTabletServerChanges()` always called it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
keith-turner commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312572995 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogs.java ## @@ -90,20 +89,13 @@ * @param useTrash * true to move files to trash rather than delete them */ - GarbageCollectWriteAheadLogs(final AccumuloServerContext context, VolumeManager fs, - boolean useTrash) throws IOException { + GarbageCollectWriteAheadLogs(final AccumuloServerContext context, final VolumeManager fs, + final LiveTServerSet liveServers, boolean useTrash) throws IOException { this.context = context; this.fs = fs; this.useTrash = useTrash; -this.liveServers = new LiveTServerSet(context, new Listener() { - @Override - public void update(LiveTServerSet current, Set deleted, - Set added) { -log.debug("New tablet servers noticed: " + added); -log.debug("Tablet servers removed: " + deleted); - } -}); -liveServers.startListeningForTabletServerChanges(); +this.liveServers = liveServers; Review comment: I think its best to separate the issue of logging tserver information and reasoning about GC correctness. The timer seems to add unnecessary concurrency, that while it seems correct does make reasoning about the GC algorithm more complex. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312571500 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogs.java ## @@ -90,20 +89,13 @@ * @param useTrash * true to move files to trash rather than delete them */ - GarbageCollectWriteAheadLogs(final AccumuloServerContext context, VolumeManager fs, - boolean useTrash) throws IOException { + GarbageCollectWriteAheadLogs(final AccumuloServerContext context, final VolumeManager fs, + final LiveTServerSet liveServers, boolean useTrash) throws IOException { this.context = context; this.fs = fs; this.useTrash = useTrash; -this.liveServers = new LiveTServerSet(context, new Listener() { - @Override - public void update(LiveTServerSet current, Set deleted, - Set added) { -log.debug("New tablet servers noticed: " + added); -log.debug("Tablet servers removed: " + deleted); - } -}); -liveServers.startListeningForTabletServerChanges(); +this.liveServers = liveServers; Review comment: One benefit for the timer is that it keeps the tserver set up-to-date so we don't have to wait to scan them in the GC loop. I'm not sure that's a big deal, though. Another is that it lets us know about the changes in the logs. However, using `scanServers()` directly here would still let us know about changes in the logs... so long as the LiveTServersSet object is kept as a singleton in the SimpleGarbageCollector. The problem with the first "option 2" mentioned in #1314 that used `scanServers()` directly is that a new LiveTServersSet was being created over and over again, so the logs would have always shown an "updated" list of tservers being the entire set of current tservers, before being reset... not very useful. But the `scanServers()` option should work fine now with respect to logging changes that the LiveTServersSet object is being passed in and reused. We could even remove the guard, so it always logs on each cycle, instead of skipping logging in the "no changes" scenario (which was desired with the listener, to reduce spam). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
keith-turner commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312571698 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/SimpleGarbageCollector.java ## @@ -564,6 +566,19 @@ private void run() { ProbabilitySampler sampler = new ProbabilitySampler(getConfiguration().getFraction(Property.GC_TRACE_PERCENT)); +// This is created outside of the run loop and passed to the walogCollector so that +// only a single timed task is created (internal to LiveTServerSet using SimpleTimer. +final LiveTServerSet liveTServerSet = new LiveTServerSet(this, new LiveTServerSet.Listener() { + @Override + public void update(LiveTServerSet current, Set deleted, + Set added) { +if (!deleted.isEmpty() || !added.isEmpty()) { + log.debug("Number of current servers {}, tservers added {}, removed {}", + current == null ? -1 : current.size(), added, deleted); +} + } +}); + Review comment: ok, I think I like calling `scanServers()` explicitly anyway because it seems easier to reason about when things happen during the GC cycle. Otherwise there is some timer that runs concurrently at random points during the cycle. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312571500 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogs.java ## @@ -90,20 +89,13 @@ * @param useTrash * true to move files to trash rather than delete them */ - GarbageCollectWriteAheadLogs(final AccumuloServerContext context, VolumeManager fs, - boolean useTrash) throws IOException { + GarbageCollectWriteAheadLogs(final AccumuloServerContext context, final VolumeManager fs, + final LiveTServerSet liveServers, boolean useTrash) throws IOException { this.context = context; this.fs = fs; this.useTrash = useTrash; -this.liveServers = new LiveTServerSet(context, new Listener() { - @Override - public void update(LiveTServerSet current, Set deleted, - Set added) { -log.debug("New tablet servers noticed: " + added); -log.debug("Tablet servers removed: " + deleted); - } -}); -liveServers.startListeningForTabletServerChanges(); +this.liveServers = liveServers; Review comment: One benefit for the timer is that it keeps the tserver set up-to-date so we don't have to wait to scan them in the GC loop. I'm not sure that's a big deal, though. Another is that it lets us know about the changes in the logs. However, using `scanServers()` directly here would still let us know about changes in the logs... so long as the LiveTServersSet object is kept as a singleton in the SimpleGarbageCollector. The problem with the first "option 2" mentioned in #1314 that used `scanServers()` directly is that a new LiveTServersSet was being created over and over again, so the logs would have always shown an "updated" list of tservers being the entire set of current tservers, before being reset... not very useful. But the `scanServers()` option should work fine now that the LiveTServersSet object is being passed in and reused. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
ctubbsii commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312569249 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/SimpleGarbageCollector.java ## @@ -564,6 +566,19 @@ private void run() { ProbabilitySampler sampler = new ProbabilitySampler(getConfiguration().getFraction(Property.GC_TRACE_PERCENT)); +// This is created outside of the run loop and passed to the walogCollector so that +// only a single timed task is created (internal to LiveTServerSet using SimpleTimer. +final LiveTServerSet liveTServerSet = new LiveTServerSet(this, new LiveTServerSet.Listener() { + @Override + public void update(LiveTServerSet current, Set deleted, + Set added) { +if (!deleted.isEmpty() || !added.isEmpty()) { + log.debug("Number of current servers {}, tservers added {}, removed {}", + current == null ? -1 : current.size(), added, deleted); +} + } +}); + Review comment: I discussed this solution with @EdColeman yesterday. I think the intent was to call `liveServers.startListeningForTabletServerChanges()` here. I had thought it was present when I approved the PR, but now I can't find it in the commits. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner edited a comment on issue #1320: Review code and test for Accumulo's zookeeper atomic mutation function
keith-turner edited a comment on issue #1320: Review code and test for Accumulo's zookeeper atomic mutation function URL: https://github.com/apache/accumulo/issues/1320#issuecomment-519991726 Also, its very important the `mutate()` function not retry when session expired. This is an indication that an Accumulo server process has lost its lock in ZK and therefore should not be updating anything in ZK. #1086 is related. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner commented on issue #1320: Review code and test for Accumulo's zookeeper atomic mutation function
keith-turner commented on issue #1320: Review code and test for Accumulo's zookeeper atomic mutation function URL: https://github.com/apache/accumulo/issues/1320#issuecomment-519991726 Also, its very important the `mutate()` function not retry when session expired. This is an indication that an Accumulo server process has lost its lock in ZK and therefore should not be updating anything in ZK. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
keith-turner commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312555392 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogs.java ## @@ -90,20 +89,13 @@ * @param useTrash * true to move files to trash rather than delete them */ - GarbageCollectWriteAheadLogs(final AccumuloServerContext context, VolumeManager fs, - boolean useTrash) throws IOException { + GarbageCollectWriteAheadLogs(final AccumuloServerContext context, final VolumeManager fs, + final LiveTServerSet liveServers, boolean useTrash) throws IOException { this.context = context; this.fs = fs; this.useTrash = useTrash; -this.liveServers = new LiveTServerSet(context, new Listener() { - @Override - public void update(LiveTServerSet current, Set deleted, - Set added) { -log.debug("New tablet servers noticed: " + added); -log.debug("Tablet servers removed: " + deleted); - } -}); -liveServers.startListeningForTabletServerChanges(); +this.liveServers = liveServers; Review comment: Could call `liveServers.scanServers()` here making it more functionally equivalent to the old code. The old code used to call `liveServers.startListeningForTabletServerChanges()` which internally called `scanServers()` and registered a timer to call `scanServers()`. I don't think registering the timer is needed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance
keith-turner commented on a change in pull request #1318: Fix #1314 fixes gc thread issue by creating one timed task instance URL: https://github.com/apache/accumulo/pull/1318#discussion_r312556108 ## File path: server/gc/src/main/java/org/apache/accumulo/gc/SimpleGarbageCollector.java ## @@ -564,6 +566,19 @@ private void run() { ProbabilitySampler sampler = new ProbabilitySampler(getConfiguration().getFraction(Property.GC_TRACE_PERCENT)); +// This is created outside of the run loop and passed to the walogCollector so that +// only a single timed task is created (internal to LiveTServerSet using SimpleTimer. +final LiveTServerSet liveTServerSet = new LiveTServerSet(this, new LiveTServerSet.Listener() { + @Override + public void update(LiveTServerSet current, Set deleted, + Set added) { +if (!deleted.isEmpty() || !added.isEmpty()) { + log.debug("Number of current servers {}, tservers added {}, removed {}", + current == null ? -1 : current.size(), added, deleted); +} + } +}); + Review comment: Why not call here `liveServers.startListeningForTabletServerChanges()` here? I don't think its needed if my other suggestion to call `scanServers` is made. However I am curious if there was reason for not calling it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner opened a new issue #1321: Review how Ample reads root tablet metadata from zookeeper
keith-turner opened a new issue #1321: Review how Ample reads root tablet metadata from zookeeper URL: https://github.com/apache/accumulo/issues/1321 Currently Ample reads root tablet metadata using ZooCache. I am not sure this is always appropriate. Its certainly appropriate and desirable in some cases, but I am not sure about all. May want to add a consistency level to the Ample API and make all code choose if it wants eventual or immediate consistency. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner commented on issue #817: Refactor tablet loading code to use TabletMetadata
keith-turner commented on issue #817: Refactor tablet loading code to use TabletMetadata URL: https://github.com/apache/accumulo/issues/817#issuecomment-519972273 This was done by e357d8ab5a29f28794593d31ca2981619266499c and #1302 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner closed issue #817: Refactor tablet loading code to use TabletMetadata
keith-turner closed issue #817: Refactor tablet loading code to use TabletMetadata URL: https://github.com/apache/accumulo/issues/817 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner commented on issue #1086: ZooLock uses ZooReader which may retry on session expired
keith-turner commented on issue #1086: ZooLock uses ZooReader which may retry on session expired URL: https://github.com/apache/accumulo/issues/1086#issuecomment-519970692 I added this a blocker because I think it would be good to investigate before 1.9.4. Since its a long standing issue not introduced in 1.9.3, It need not block 1.9.4 if not investigated in time. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner opened a new issue #1320: Review code and test for Accumulo's zookeeper atomic mutation function
keith-turner opened a new issue #1320: Review code and test for Accumulo's zookeeper atomic mutation function URL: https://github.com/apache/accumulo/issues/1320 Changes like #1313 heavily depend on accumulo's `ZooReaderWriter.mutate()` to make atomic changes to zookeeper using optimistic locking. Would be good to thoroughly review this code and its test. Ideally there would be a multi-threaded stress test of some sort. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] belugabehr opened a new pull request #1319: ACCUMULO-4867: Java Collections Improvements in DefaultLoadBalancer
belugabehr opened a new pull request #1319: ACCUMULO-4867: Java Collections Improvements in DefaultLoadBalancer URL: https://github.com/apache/accumulo/pull/1319 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (ACCUMULO-4867) Java Collections Improvements in DefaultLoadBalancer
[ https://issues.apache.org/jira/browse/ACCUMULO-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4867: - Labels: pull-request-available (was: ) > Java Collections Improvements in DefaultLoadBalancer > > > Key: ACCUMULO-4867 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4867 > Project: Accumulo > Issue Type: Improvement > Components: server-base >Reporter: David Mollitor >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (ACCUMULO-4867) Java Collections Improvements in DefaultLoadBalancer
David Mollitor created ACCUMULO-4867: Summary: Java Collections Improvements in DefaultLoadBalancer Key: ACCUMULO-4867 URL: https://issues.apache.org/jira/browse/ACCUMULO-4867 Project: Accumulo Issue Type: Improvement Components: server-base Reporter: David Mollitor -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [accumulo] keith-turner commented on issue #1043: Support stable ~del split points and avoid ~del hotspots
keith-turner commented on issue #1043: Support stable ~del split points and avoid ~del hotspots URL: https://github.com/apache/accumulo/issues/1043#issuecomment-519962464 Another reason to resolve in upgrade is that over time handling more legacy persisted formats becomes harder and harder to test. For example assume in the future there are 5 different legacy data formats for the metadata table, will we ever test different combination of the legacy formats? No, that testing will likely not be done. Doing a transformation in the upgrade step and testing that transformation independently seems more maintainable. The current code only needs to worry about testing the current format. The testing is much simpler and can better cover the data expected. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] keith-turner commented on issue #1043: Support stable ~del split points and avoid ~del hotspots
keith-turner commented on issue #1043: Support stable ~del split points and avoid ~del hotspots URL: https://github.com/apache/accumulo/issues/1043#issuecomment-519959871 > what did anyone think of the solution in my previous comment that would not require an upgrade? I think its a clever solution and I think that utilizing the values is a good way to go if we do not resolve this in upgrade. I am leaning towards resolving this in upgrade because I think it keeps the existing code simpler, but I am not 100% sure this is the best path to take. The `makeRelative()` function you asked about earlier has some code in it to handle legacy paths that makes it more complicated. We should improve `makeRelative()` to clearly document parts of it that are handling legacy paths. That code is confusing now because of the legacy support. If think if you pursue not resolving this in upgrade that it would be best to clearly document and isolate the code that handles legacy delete candidates. If you are going to pursue the upgrade path, the following is one possible way to do it. * First submit a PR that changes the code to use hashes without any concern for upgrade. So it only works with a newyly initialized system at first. Just add something that throws an exception in the upgrade code. * Second submit a PR for the upgrade code. I will try to get #1313 in ASAP so you can build on it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] hkeebler commented on issue #1043: Support stable ~del split points and avoid ~del hotspots
hkeebler commented on issue #1043: Support stable ~del split points and avoid ~del hotspots URL: https://github.com/apache/accumulo/issues/1043#issuecomment-519937890 So we should say that this is dependent on #1313 being completed. I will start taking a look at the upgrade code necessary but what did anyone think of the solution in my previous comment that would not require an upgrade? - Pros - no upgrade code to be written, no conditional checking for old vs new deletes, supports multi-sized hash - Cons - more complex process for stripping delete prefix (just a little), key field not really being used for intended purpose BUT maybe if value was used (which is also "") then field intent is maintained. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] nkalmar commented on issue #1225: Use fewer ZooKeeper watchers
nkalmar commented on issue #1225: Use fewer ZooKeeper watchers URL: https://github.com/apache/accumulo/issues/1225#issuecomment-519931138 Since Accumulo uses curator 2.x, I'm not sure if it won't cause any problems. Only 4.x supports 3.5.x as stated on their website. If you wan't, I can check the ZooKeeper update to 3.5.5, maybe even create a seperate ticket for it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] nkalmar edited a comment on issue #1225: Use fewer ZooKeeper watchers
nkalmar edited a comment on issue #1225: Use fewer ZooKeeper watchers URL: https://github.com/apache/accumulo/issues/1225#issuecomment-519920891 No, that's not it! ZooKeeper has been migrated to maven (from ant based build) starting from version 3.5.5. We have seperated the modules, and now we have zookeeper and also zookeeper-jute as a separate module (artifact). See here: https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper-jute/3.5.5 So there is an extra artifact zookeeper:zookeeper-jute that zookeeper now depends on. The reason for separate artifact is there is a plan to replace jute with a standard serialization library (with protobuff or avro for example) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] nkalmar edited a comment on issue #1225: Use fewer ZooKeeper watchers
nkalmar edited a comment on issue #1225: Use fewer ZooKeeper watchers URL: https://github.com/apache/accumulo/issues/1225#issuecomment-519920891 No, that's not it! ZooKeeper has been migrated to maven starting from version 3.5.5. We have seperated the modules, and now we have zookeeper and also zookeeper-jute as a separate module (artifact). See here: https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper-jute/3.5.5 So there is an extra artifact zookeeper:zookeeper-jute that zookeeper now depends on. The reason for separate artifact is there is a plan to replace jute with a standard serialization library (with protobuff or avro for example) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] nkalmar commented on issue #1225: Use fewer ZooKeeper watchers
nkalmar commented on issue #1225: Use fewer ZooKeeper watchers URL: https://github.com/apache/accumulo/issues/1225#issuecomment-519920891 No, that's not it! ZooKeeper has been migrated to maven from 3.5.5. We have seperated the modules, and now we have zookeeper and also zookeeper-jute as a separate module (artifact). See here: https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper-jute/3.5.5 So there is an extra artifact zookeeper:zookeeper-jute that zookeeper now depends on. The reason for separate artifact is there is a plan to replace jute with a standard serialization library (with protobuff or avro for example) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers URL: https://github.com/apache/accumulo/issues/1225#issuecomment-519900033 @ctubbsii , Zookeeper 3.5.5 seems to depend on this project that is not Apache Foundation but the source code at quick glance does not seem harmless but it has Chinese origin. https://github.com/wangfucai/zookeeper-jute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers URL: https://github.com/apache/accumulo/issues/1225#issuecomment-519895280 @nkalmar, I get this warning when I try to build Accumulo with zookeeper 3.5.5. [WARNING] Used undeclared dependencies found: [WARNING]org.apache.zookeeper:zookeeper-jute:jar:3.5.5:compile It ultimately causes a failure in the build of Accumulo Core. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] nkalmar edited a comment on issue #1225: Use fewer ZooKeeper watchers
nkalmar edited a comment on issue #1225: Use fewer ZooKeeper watchers URL: https://github.com/apache/accumulo/issues/1225#issuecomment-519889242 ZooKeeper 3.5.0 is an early alpha release. First stable release of 3.5 branch was released 1-2 month ago. Please use that (3.5.5) version! edit: there was even minor API changes between 3.5.0 and 3.5.5. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo] nkalmar commented on issue #1225: Use fewer ZooKeeper watchers
nkalmar commented on issue #1225: Use fewer ZooKeeper watchers URL: https://github.com/apache/accumulo/issues/1225#issuecomment-519889242 ZooKeeper 3.5.0 is an early alpha release. First stable release of 3.5 branch was released 1-2 month ago. Please use that (3.5.5) version! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo-examples] pattanib commented on issue #57: sealing violation: package org.apache.accumulo.core.client is sealed
pattanib commented on issue #57: sealing violation: package org.apache.accumulo.core.client is sealed URL: https://github.com/apache/accumulo-examples/issues/57#issuecomment-519841963 issue is resolved for me. I just have done following items. - Removed accumulo client jar from $SPARK_HOME/jars directory. - Build accumulo-spark-shaded.jar It works fine for me. Thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [accumulo-examples] pattanib closed issue #57: sealing violation: package org.apache.accumulo.core.client is sealed
pattanib closed issue #57: sealing violation: package org.apache.accumulo.core.client is sealed URL: https://github.com/apache/accumulo-examples/issues/57 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services