[GitHub] [accumulo] EdColeman merged pull request #3160: remove upgrade code for versions before 2.1
EdColeman merged PR #3160: URL: https://github.com/apache/accumulo/pull/3160 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: notifications-unsubscr...@accumulo.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [accumulo] dlmarion merged pull request #3180: Enable users to provide per-volume Hadoop Filesystem overrides
dlmarion merged PR #3180: URL: https://github.com/apache/accumulo/pull/3180
[GitHub] [accumulo] dlmarion commented on a diff in pull request #3180: Enable users to provide per-volume Hadoop Filesystem overrides
dlmarion commented on code in PR #3180:
URL: https://github.com/apache/accumulo/pull/3180#discussion_r1100625103

## core/src/main/java/org/apache/accumulo/core/conf/Property.java: ##
@@ -120,11 +120,11 @@ public enum Property {
           + " a comma or other reserved characters in a URI use standard URI hex"
           + " encoding. For example replace commas with %2C.",
       "1.6.0"),
-  INSTANCE_VOLUMES_CONFIG("instance.volumes.config.", null, PropertyType.PREFIX,
+  INSTANCE_VOLUMES_CONFIG("instance.volume.config.", null, PropertyType.PREFIX,

Review Comment:
Changes applied in 55fe4f3

## server/base/src/main/java/org/apache/accumulo/server/fs/VolumeManagerImpl.java: ##
@@ -382,44 +382,37 @@ public short getDefaultReplication(Path path) {
    */
   private static Configuration getVolumeManagerConfiguration(AccumuloConfiguration conf,
       final Configuration hadoopConf, final String filesystemURI) {
+
     final Configuration volumeConfig = new Configuration(hadoopConf);
-    final Map<String,String> customProps =
-        conf.getAllPropertiesWithPrefixStripped(Property.INSTANCE_VOLUMES_CONFIG);
-    customProps.forEach((key, value) -> {
-      if (key.startsWith(filesystemURI)) {
-        String property = key.substring(filesystemURI.length() + 1);
-        log.info("Overriding property {} to {} for volume {}", property, value, filesystemURI);
-        volumeConfig.set(property, value);
-      }
-    });
-    return volumeConfig;
-  }

-  private static void warnVolumeOverridesMissingVolume(AccumuloConfiguration conf,
-      Set<String> definedVolumes) {
-    final Map<String,String> overrideProperties = new ConcurrentHashMap<>(
-        conf.getAllPropertiesWithPrefixStripped(Property.INSTANCE_VOLUMES_CONFIG));
+    conf.getAllPropertiesWithPrefixStripped(Property.INSTANCE_VOLUMES_CONFIG).entrySet().stream()
+        .filter(e -> e.getKey().startsWith(filesystemURI + ".")).forEach(e -> {
+          String key = e.getKey().substring(filesystemURI.length() + 1);
+          String value = e.getValue();
+          log.info("Overriding property {} for volume {}", key, value, filesystemURI);
+          volumeConfig.set(key, value);
+        });

-    definedVolumes.forEach(vol -> {
-      log.debug("Looking for defined volume: {}", vol);
-      overrideProperties.keySet().forEach(override -> {
-        if (override.startsWith(vol)) {
-          log.debug("Found volume {}, removing property {}", vol, override);
-          overrideProperties.remove(override);
-        }
-      });
-    });
+    return volumeConfig;
+  }

-    overrideProperties.forEach((k, v) -> log
-        .warn("Found no matching volume for volume config override property {} = {}", k, v));
+  protected static List<Entry<String,String>>
+      findVolumeOverridesMissingVolume(AccumuloConfiguration conf, Set<String> definedVolumes) {
+    return conf.getAllPropertiesWithPrefixStripped(Property.INSTANCE_VOLUMES_CONFIG).entrySet()
+        .stream()
+        // log only configs where none of the volumes (with a dot) prefix its key
+        .filter(e -> definedVolumes.stream().noneMatch(vol -> e.getKey().startsWith(vol + ".")))
+        .collect(Collectors.toList());

Review Comment:
Changes applied in 55fe4f3
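The diff above changes the match from `startsWith(filesystemURI)` to `startsWith(filesystemURI + ".")`. A minimal sketch of why the trailing dot matters when one volume URI is a prefix of another follows. It uses a plain `Map` in place of the Accumulo configuration; the class name, method name, and sample volume URIs are hypothetical and not part of the PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class VolumeOverrideSketch {

  // Returns the overrides that apply to one volume, keyed by the Hadoop property name.
  // Keys in 'stripped' are assumed to look like "<volumeURI>.<hadoopProperty>".
  public static Map<String,String> overridesFor(Map<String,String> stripped, String volumeUri) {
    Map<String,String> result = new TreeMap<>();
    // Including the dot prevents "hdfs://ns1/accumulo" from matching "hdfs://ns1/accumulo2...."
    String prefix = volumeUri + ".";
    stripped.forEach((key, value) -> {
      if (key.startsWith(prefix)) {
        result.put(key.substring(prefix.length()), value);
      }
    });
    return result;
  }

  public static void main(String[] args) {
    Map<String,String> stripped = new LinkedHashMap<>();
    // Hypothetical volumes; the second URI starts with the first one.
    stripped.put("hdfs://ns1/accumulo.dfs.client.hedged.read.threadpool.size", "10");
    stripped.put("hdfs://ns1/accumulo2.dfs.client.hedged.read.threadpool.size", "20");
    // Only the first entry belongs to hdfs://ns1/accumulo.
    System.out.println(overridesFor(stripped, "hdfs://ns1/accumulo"));
  }
}
```

Without the dot, the second key would also match, and `substring(filesystemURI.length() + 1)` would produce a property name beginning with a stray `.`.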
[GitHub] [accumulo] EdColeman merged pull request #3191: Enforce sort order of data version for upgraders
EdColeman merged PR #3191: URL: https://github.com/apache/accumulo/pull/3191
[GitHub] [accumulo] ctubbsii merged pull request #3176: fix parameter check and minor variable rename
ctubbsii merged PR #3176: URL: https://github.com/apache/accumulo/pull/3176
[GitHub] [accumulo] EdColeman opened a new pull request, #3191: Enforce sort order of data version for upgraders
EdColeman opened a new pull request, #3191: URL: https://github.com/apache/accumulo/pull/3191

The map of data versions was using Map.of; use TreeMap to guarantee the sort order by data version. Noticed by @ctubbsii while working on #3160
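The point of the change can be sketched as follows: `Map.of` makes no promise about iteration order, while a `TreeMap` iterates in ascending key order. This is an illustration only, not the PR's code; the class name, the version numbers, and the upgrader names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class UpgraderOrderSketch {

  // Copy an unordered map into a TreeMap so iteration follows ascending data version.
  public static SortedMap<Integer,String> sortedUpgraders(Map<Integer,String> unordered) {
    return new TreeMap<>(unordered);
  }

  // Names of the upgraders to run, in order, starting at currentVersion.
  public static List<String> upgradePath(SortedMap<Integer,String> upgraders, int currentVersion) {
    // tailMap returns the entries whose key is >= currentVersion, still sorted.
    return new ArrayList<>(upgraders.tailMap(currentVersion).values());
  }

  public static void main(String[] args) {
    // Map.of has an unspecified iteration order, so it cannot drive an ordered upgrade.
    Map<Integer,String> unordered =
        Map.of(10, "upgrader10to11", 8, "upgrader8to9", 9, "upgrader9to10");
    SortedMap<Integer,String> upgraders = sortedUpgraders(unordered);
    System.out.println("oldest supported data version: " + upgraders.firstKey());
    System.out.println("path from 9: " + upgradePath(upgraders, 9));
  }
}
```

A sorted map also makes the first key meaningful as the oldest data version that still has an upgrader.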
[GitHub] [accumulo] dlmarion commented on pull request #3189: Add group argument to server processes
dlmarion commented on PR #3189: URL: https://github.com/apache/accumulo/pull/3189#issuecomment-1422704728

Considering the information about the configuration, I think I'll work on a PR to get rid of server process arguments, and replace them with properties. Then, once that is merged in, I can update this PR which should just be the ServiceLockData changes.
[GitHub] [accumulo-proxy] DomGarguilo closed issue #73: Missing TableOperations method implementations
DomGarguilo closed issue #73: Missing TableOperations method implementations URL: https://github.com/apache/accumulo-proxy/issues/73
[GitHub] [accumulo-proxy] DomGarguilo commented on issue #73: Missing TableOperations method implementations
DomGarguilo commented on issue #73: URL: https://github.com/apache/accumulo-proxy/issues/73#issuecomment-1422673246

> I don't think they were omitted purposefully... more likely they were not added because there was no demand/interest to add them. Somebody has to think "I want that" first, and then somebody has to say "I'll do that" second. As far as I can tell, that first step simply never happened for these, and that's fine. We could certainly add more if they are of interest to somebody. However, I do not think it's necessary to add everything. And we definitely shouldn't add the deprecated ones.

Makes sense, I'll close this for now. Other tickets can be opened if there are requests for any of these to be implemented.
[GitHub] [accumulo] EdColeman commented on a diff in pull request #3160: remove upgrade code for versions before 2.1
EdColeman commented on code in PR #3160:
URL: https://github.com/apache/accumulo/pull/3160#discussion_r1100200971

## server/manager/src/main/java/org/apache/accumulo/manager/upgrade/UpgradeCoordinator.java: ##
@@ -142,11 +145,17 @@ public synchronized void upgradeZookeeper(ServerContext context,
         "Not currently in a suitable state to do zookeeper upgrade %s", status);

     try {
-      int cv = context.getServerDirs()
-          .getAccumuloPersistentVersion(context.getVolumeManager().getFirst());
-      ServerContext.ensureDataVersionCompatible(cv);
+      int cv = AccumuloDataVersion.getCurrentVersion(context);
       this.currentVersion = cv;

+      int oldestVersion = upgraders.entrySet().iterator().next().getKey();
+      if (cv < oldestVersion) {
+        String oldRelease = dataVersionToReleaseName(oldestVersion);
+        throw new UnsupportedOperationException("Upgrading from a version less than " + oldRelease
+            + " data version (" + oldestVersion + ") is not supported. Upgrade to at least "
+            + oldRelease + " before upgrading to " + Constants.VERSION);
+      }
+

Review Comment:
With the changes - the error message now reads:

```
2023-02-08T14:15:02,313 [start.Main] ERROR: Thread 'manager' died.
java.lang.IllegalStateException: This version of accumulo (3.0.0-SNAPSHOT) is not compatible with files stored using data version 8. Please upgrade from 2.1.0 or later.
	at org.apache.accumulo.server.ServerContext.ensureDataVersionCompatible(ServerContext.java:304) ~[accumulo-server-base-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
	at org.apache.accumulo.server.ServerContext.init(ServerContext.java:373) ~[accumulo-server-base-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
	at org.apache.accumulo.server.AbstractServer.<init>(AbstractServer.java:50) ~[accumulo-server-base-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
	at org.apache.accumulo.manager.Manager.<init>(Manager.java:412) ~[accumulo-manager-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
	at org.apache.accumulo.manager.Manager.main(Manager.java:406) ~[accumulo-manager-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
	at org.apache.accumulo.manager.ManagerExecutable.execute(ManagerExecutable.java:45) ~[accumulo-manager-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
	at org.apache.accumulo.start.Main.lambda$execKeyword$0(Main.java:122) ~[accumulo-start-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
```
[GitHub] [accumulo] ctubbsii commented on pull request #3187: WIP: Make scan initial wait configurable
ctubbsii commented on PR #3187: URL: https://github.com/apache/accumulo/pull/3187#issuecomment-1422338787

@dlmarion wrote:
> This was something that a user noticed their client waiting on, and not performing scans, when the user thought scans should have been running.

Aside from their surprise at the observation, what kind of impact did this cause? Do we know how long the user's scans were delayed? Was it reasonable? Was it a problem? I'm asking because I don't want to add a bunch of complexity and configuration knobs if it's not really a problem, and it's not clear from the discussion how much, if at all, it is a problem. The current behavior seems "safe", and I'm also reluctant to add knobs to allow the user to shoot themselves in the foot and perform operations that aren't really safe.

Also, if the user doesn't care about consistency, can't they query using a ScanServer instead? If so, it's probably fine if this is what users get when they access a read/write server (tserver).
[jira] [Resolved] (ACCUMULO-2431) Allow localization of errors and shell
[ https://issues.apache.org/jira/browse/ACCUMULO-2431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-2431.
-----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> Allow localization of errors and shell
> --------------------------------------
>
> Key: ACCUMULO-2431
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2431
> Project: Accumulo
> Issue Type: Sub-task
> Components: shell
> Reporter: Sean Busbey
> Priority: Minor
> Labels: i18n, localization
>
> It would be nice to allow localization of commands and error messages for
> non-english speakers.
> Initial implementation for this ticket just needs to:
> * add an articulation point in the code for localization
> * move existing text into an EN-US implementation
> Future Implementer, please try to use an existing i18n framework rather than
> rolling our own.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ACCUMULO-483) Create purge locality utility
[ https://issues.apache.org/jira/browse/ACCUMULO-483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-483.
----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> Create purge locality utility
> -----------------------------
>
> Key: ACCUMULO-483
> URL: https://issues.apache.org/jira/browse/ACCUMULO-483
> Project: Accumulo
> Issue Type: New Feature
> Components: client
> Reporter: John Vines
> Priority: Minor
> Labels: newbie
>
> For some high capacity ingest, the desired path is to do some pre-splits,
> bulk import, and then let it naturally split down the rest of the way. If all
> of the pre-split tablets split evenly, then the system will have
> continuous ranges bundled together on tservers. This poor distribution can
> impact performance depending on the operations performed. This could be
> handled in the load balancer, but it could be tricky. You don't want to
> randomly reassign tablets with any sort of frequency. Rather, you want to do
> a one-time operation in doing so. Given the initial assignment code is a bit
> random (needs to be validated), this could easily be done by offlining a
> table, purging all location records for it from the !METADATA table, and
> bringing it back online. The balancer will assign the table randomly, at
> which point the user could force a major compaction to reestablish locality
> (as well as some permanence in tablet assignment).
[jira] [Resolved] (ACCUMULO-594) Show per table plots
[ https://issues.apache.org/jira/browse/ACCUMULO-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-594.
----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> Show per table plots
> --------------------
>
> Key: ACCUMULO-594
> URL: https://issues.apache.org/jira/browse/ACCUMULO-594
> Project: Accumulo
> Issue Type: New Feature
> Components: monitor
> Reporter: Keith Turner
> Priority: Trivial
> Labels: gsoc2013, mentor
>
> The monitor overview page plots accumulo performance data over time for all
> tables. I would like to be able to see this data over time per table (and
> maybe per tablet server). Could add this to the monitor page. Eric
> suggested maybe using OpenTSDB to collect and view this data.
[jira] [Resolved] (ACCUMULO-2797) Localization for non-english
[ https://issues.apache.org/jira/browse/ACCUMULO-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-2797.
-----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> Localization for non-english
> ----------------------------
>
> Key: ACCUMULO-2797
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2797
> Project: Accumulo
> Issue Type: Improvement
> Components: client, docs
> Reporter: Sean Busbey
> Priority: Minor
> Labels: i18n
>
> This ticket will probably need subtasks. I'd like to improve our support for
> non-English.
> * Docs would be a big step. English. Before I start pushing to get
> translations done, it would be nice if our build process already had the
> means to handle multiple languages.
> * Client API messages should have pluggable message languages
> * Server log messages as well
> Those last two will probably require finding some kind of library to build
> on. Or we could do something simple with Properties and the Java Services API.
[jira] [Resolved] (ACCUMULO-3418) create a microbenchmark for tablet splitting
[ https://issues.apache.org/jira/browse/ACCUMULO-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-3418.
-----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> create a microbenchmark for tablet splitting
> --------------------------------------------
>
> Key: ACCUMULO-3418
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3418
> Project: Accumulo
> Issue Type: Improvement
> Components: tserver
> Affects Versions: 1.7.0
> Reporter: Sean Busbey
> Priority: Minor
>
> It would be nice if we had a targeted benchmark that could easily be run to
> see the impact of optimization choices on the time it takes to complete
> tablet splits.
[jira] [Resolved] (ACCUMULO-3943) volumn definition agreement with default settings
[ https://issues.apache.org/jira/browse/ACCUMULO-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-3943.
-----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> volumn definition agreement with default settings
> -------------------------------------------------
>
> Key: ACCUMULO-3943
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3943
> Project: Accumulo
> Issue Type: Bug
> Components: gc, master, tserver
> Reporter: Eric C. Newton
> Priority: Minor
>
> I was helping a new user trying to use Accumulo. They managed to set up HDFS,
> running on hdfs://localhost:8020. But they didn't set it up with specific
> settings, and just used the default port. Accumulo worked initially, but
> would not allow a bulk import.
> During the bulk import process, the servers need to move the files into the
> accumulo volumes, but keeping the volume the same. This makes the move
> efficient, since nothing is copied between namespaces. In this case it
> refused the import because it could not find the correct volume.
> Accumulo needs to be more nuanced when comparing hdfs://localhost:8020, and
> hdfs://localhost.
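The comparison the reporter asks for could look roughly like the sketch below, which treats a missing port as the default before comparing. It is only an illustration using `java.net.URI`, not how Accumulo or Hadoop resolve volumes. The class and method names are hypothetical, the default port is hard-coded to the 8020 mentioned in the ticket as an assumption, and only scheme, host, and port are compared (paths are ignored).

```java
import java.net.URI;

public class VolumeUriSketch {

  // Assumption: a URI without an explicit port means the default port named in the ticket.
  static final int ASSUMED_DEFAULT_PORT = 8020;

  // True when both URIs point at the same filesystem once the default port is filled in.
  public static boolean sameFileSystem(String a, String b) {
    URI ua = URI.create(a);
    URI ub = URI.create(b);
    return equalsIgnoreCase(ua.getScheme(), ub.getScheme())
        && equalsIgnoreCase(ua.getHost(), ub.getHost())
        && effectivePort(ua) == effectivePort(ub);
  }

  // URI.getPort() returns -1 when the port is not specified.
  static int effectivePort(URI uri) {
    return uri.getPort() == -1 ? ASSUMED_DEFAULT_PORT : uri.getPort();
  }

  static boolean equalsIgnoreCase(String x, String y) {
    return x == null ? y == null : x.equalsIgnoreCase(y);
  }

  public static void main(String[] args) {
    System.out.println(sameFileSystem("hdfs://localhost:8020", "hdfs://localhost"));
    System.out.println(sameFileSystem("hdfs://localhost:9000", "hdfs://localhost"));
  }
}
```

A plain string comparison of the two URIs in the ticket fails even though they name the same filesystem, which matches the refused bulk import described above.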
[jira] [Resolved] (ACCUMULO-4552) Create abstract class to encapsulate ZK lock logic
[ https://issues.apache.org/jira/browse/ACCUMULO-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-4552.
-----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo
(Note: I think this exists now, as `ServiceLock`)

> Create abstract class to encapsulate ZK lock logic
> --------------------------------------------------
>
> Key: ACCUMULO-4552
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4552
> Project: Accumulo
> Issue Type: Sub-task
> Components: monitor
> Reporter: Luis Tavarez
> Priority: Minor
>
> [~elserj]
> [commented|https://github.com/apache/accumulo/pull/195#issuecomment-270422597]
> on Github that we could create a common abstract class which encapsulates
> the logic of acquiring a ZK lock. This would reduce the duplication of
> obtaining a lock before the services start.
[jira] [Resolved] (ACCUMULO-400) continuous random walk execution
[ https://issues.apache.org/jira/browse/ACCUMULO-400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-400.
----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> continuous random walk execution
> --------------------------------
>
> Key: ACCUMULO-400
> URL: https://issues.apache.org/jira/browse/ACCUMULO-400
> Project: Accumulo
> Issue Type: Improvement
> Reporter: Adam Fuchs
> Priority: Major
>
> Random walk is finding bugs like a boss, but we can anticipate future usage
> in which the current setup will be limiting. In particular, with a larger
> development team knocking off bugs and writing new tests we might get to the
> point where the most obvious bug is the only one that we find in a given run
> of all of the random walkers. Consider hundreds of random walkers walking
> over all of the tests. Many of these tests will find bugs
> non-deterministically. If we add one test that finds one bug with high
> probability, all of the walkers will find that bug and halt. None of the
> other bugs will be found until the one bug is fixed or the test is removed.
> Here are some things we could do to improve this situation and migrate to
> more of a continual random walk setup:
> 1. Stop executing a test after some number of walkers have found a bug when
> running it.
> 2. Store the random walk graph in a database and have the walkers re-query it
> with some regularity. This will let us add new tests to running walkers.
> 3. Have the walkers snapshot the relevant parts of the overall system when
> they find a bug. We currently rely on the walkers halting to preserve the
> state of the system so that we can manually extract all of the relevant
> details that may have led to the bug. Dynamically snapshotting the system
> makes it possible to continue to run tests without rolling over logs and
> forensic information. Exactly what information needs to be kept is TBD.
[jira] [Resolved] (ACCUMULO-425) tservers should fail into read only mode
[ https://issues.apache.org/jira/browse/ACCUMULO-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-425.
----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo
(Note: this could be done with a user-defined Constraint)

> tservers should fail into read only mode
> ----------------------------------------
>
> Key: ACCUMULO-425
> URL: https://issues.apache.org/jira/browse/ACCUMULO-425
> Project: Accumulo
> Issue Type: Improvement
> Components: tserver
> Reporter: John Vines
> Priority: Major
>
> Aaron came up with the thought to have tservers fail into a read only mode.
> Perhaps once hdfs reaches X percent full, all writes are blocked until space
> improves. There's potential here, but it may involve some work to be optimal.
[jira] [Resolved] (ACCUMULO-549) Create infrastructure that supports debugging.
[ https://issues.apache.org/jira/browse/ACCUMULO-549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-549.
----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> Create infrastructure that supports debugging.
> ----------------------------------------------
>
> Key: ACCUMULO-549
> URL: https://issues.apache.org/jira/browse/ACCUMULO-549
> Project: Accumulo
> Issue Type: New Feature
> Reporter: Keith Turner
> Priority: Major
>
> Before a release an extensive amount of testing is done that usually involves
> running tests like continuous ingest and random walk on a cluster for extended
> periods of time. When bugs are found by these tests it can sometimes take a lot
> of time to track the issue down. In order to make tracking these issues
> down easier the write ahead logs are archived. These walog archives make it
> possible to answer questions about a tablet's history because everything ever
> written to the metadata table is there. It would be nice to always have this
> capability on an accumulo system, and have it be easy to use. Spelunking
> around in the write ahead logs is not an easy task.
> It would be nice if accumulo could answer questions like the following.
>
> * Where has a tablet been assigned
> * What compactions has a tablet done
> * What split or merge created a tablet
> These questions can currently be answered with walogs and log4j logs, but it's
> painful.
[jira] [Resolved] (ACCUMULO-1737) MultiTableBatchScanner
[ https://issues.apache.org/jira/browse/ACCUMULO-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-1737.
-----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> MultiTableBatchScanner
> ----------------------
>
> Key: ACCUMULO-1737
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1737
> Project: Accumulo
> Issue Type: Improvement
> Components: client
> Reporter: John Vines
> Priority: Major
>
> We currently have the BatchScanner, which does not guarantee in order results
> with the benefit of higher parallelization. Unfortunately, if you want to
> query multiple tables, the only option is to chain BatchScanners together,
> but you lose something of the optimizations under the hood for both using
> system resources well as well as getting data back in order. We have another
> ticket for multi-table accumulo input format for mapreduce, but I would like
> to see a multi table version of a batch scanner for regular client use.
[jira] [Resolved] (ACCUMULO-902) Have a common resource pool for minor and major compactions
[ https://issues.apache.org/jira/browse/ACCUMULO-902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-902.
----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo
With compaction strategies and external compactions and compaction queues, this feature may not make sense anymore.

> Have a common resource pool for minor and major compactions
> -----------------------------------------------------------
>
> Key: ACCUMULO-902
> URL: https://issues.apache.org/jira/browse/ACCUMULO-902
> Project: Accumulo
> Issue Type: Improvement
> Components: tserver
> Reporter: John Vines
> Priority: Major
>
> Currently we have a defined threadpool for minor and major compactions,
> independent of one another. However, there are situations where a system may
> be minor compaction heavy with no major, or vice versa. I would like to see a
> common threadpool which is accessible to both operations for work to be done,
> with guarantees for certain resources to be available to the other type of
> work. That is, it should be a defined pool size with a (configurable) minimum
> of resources maintained for the other to maintain a certain QoS. Of course,
> major is heavier than minor, so some weighting of operations needs to be done
> to keep workloads reasonable.
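The "shared pool with a guaranteed minimum for the other kind of work" idea in this ticket can be sketched as simple slot accounting. This is an illustration of the proposal only and not anything Accumulo implements; the class, enum, and method names are hypothetical, and the weighting of major versus minor compactions mentioned in the ticket is left out.

```java
public class SharedCompactionPoolSketch {

  public enum Kind { MINOR, MAJOR }

  private final int total;          // size of the shared pool
  private final int reservedMinor;  // slots majors may never take from minors
  private final int reservedMajor;  // slots minors may never take from majors
  private int runningMinor;
  private int runningMajor;

  public SharedCompactionPoolSketch(int total, int reservedMinor, int reservedMajor) {
    if (reservedMinor < 0 || reservedMajor < 0 || reservedMinor + reservedMajor > total) {
      throw new IllegalArgumentException("reservations must fit inside the pool");
    }
    this.total = total;
    this.reservedMinor = reservedMinor;
    this.reservedMajor = reservedMajor;
  }

  // Claims a slot, or returns false if that would eat into the other kind's guarantee.
  public synchronized boolean tryStart(Kind kind) {
    if (kind == Kind.MINOR) {
      // Slots majors already hold, or are guaranteed, are off limits to minors.
      if (runningMinor + 1 > total - Math.max(reservedMajor, runningMajor)) {
        return false;
      }
      runningMinor++;
    } else {
      if (runningMajor + 1 > total - Math.max(reservedMinor, runningMinor)) {
        return false;
      }
      runningMajor++;
    }
    return true;
  }

  // Releases a slot previously claimed with tryStart.
  public synchronized void finish(Kind kind) {
    if (kind == Kind.MINOR && runningMinor > 0) {
      runningMinor--;
    } else if (kind == Kind.MAJOR && runningMajor > 0) {
      runningMajor--;
    }
  }

  public static void main(String[] args) {
    SharedCompactionPoolSketch pool = new SharedCompactionPoolSketch(4, 1, 1);
    int majors = 0;
    while (pool.tryStart(Kind.MAJOR)) {
      majors++;
    }
    System.out.println("majors started: " + majors);
    System.out.println("minor can still start: " + pool.tryStart(Kind.MINOR));
  }
}
```

In a major-heavy workload the majors can grow into the unreserved slots, yet one slot always remains for a minor compaction, which is the QoS guarantee the ticket describes.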
[jira] [Resolved] (ACCUMULO-2147) Create test to verify mappers are running locally
[ https://issues.apache.org/jira/browse/ACCUMULO-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-2147.
-----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> Create test to verify mappers are running locally
> -------------------------------------------------
>
> Key: ACCUMULO-2147
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2147
> Project: Accumulo
> Issue Type: Improvement
> Reporter: Keith Turner
> Priority: Major
>
> Map tasks that read from an Accumulo tablet should run on the node where that
> tablet is hosted. We need a test that verifies that this is working. One
> possible way of doing this is to add something to the continuous ingest
> verify step.
[jira] [Resolved] (ACCUMULO-255) Improve Iterator Debugging
[ https://issues.apache.org/jira/browse/ACCUMULO-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-255.
----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> Improve Iterator Debugging
> --------------------------
>
> Key: ACCUMULO-255
> URL: https://issues.apache.org/jira/browse/ACCUMULO-255
> Project: Accumulo
> Issue Type: Improvement
> Components: tserver
> Reporter: John Vines
> Priority: Major
>
> There is currently little to no information provided to end users when there
> is a failure in custom iterators. We do not want to provide the full stack
> trace for security purposes, but that doesn't mean we can't provide some
> error information like error name (particularly ClassNotFoundException) or
> any exception which stems directly from the written iterator. Or we could
> have it return the partial stack trace from just within the iterator itself.
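One way to read "the error name plus the partial stack trace from just within the iterator" is to keep only the frames whose class lives in the user's own package. The sketch below shows that filtering with standard `Throwable` and `StackTraceElement` calls. It is an assumption about how such a summary might be built, not an Accumulo feature; the class name, method name, and package prefix are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class IteratorErrorSketch {

  // Builds a short summary: the exception class name plus only those frames
  // whose declaring class starts with the user's package prefix.
  public static String summarize(Throwable t, String userPackagePrefix) {
    List<String> userFrames = Arrays.stream(t.getStackTrace())
        .filter(frame -> frame.getClassName().startsWith(userPackagePrefix))
        .map(StackTraceElement::toString)
        .collect(Collectors.toList());
    String name = t.getClass().getName();
    return userFrames.isEmpty() ? name : name + " at " + String.join(" <- ", userFrames);
  }

  public static void main(String[] args) {
    Exception failure = new IllegalStateException("bad iterator option");
    // Hypothetical frames: one from a user iterator, one from server internals.
    failure.setStackTrace(new StackTraceElement[] {
        new StackTraceElement("com.example.iterators.MyFilter", "init", "MyFilter.java", 42),
        new StackTraceElement("org.example.server.ScanTask", "run", "ScanTask.java", 7)});
    System.out.println(summarize(failure, "com.example.iterators."));
  }
}
```

The server-internal frame is dropped, so the client sees where its own iterator failed without the full server-side trace.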
[jira] [Resolved] (ACCUMULO-272) Prefer local tablet server for client operations
[ https://issues.apache.org/jira/browse/ACCUMULO-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-272.
----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> Prefer local tablet server for client operations
> ------------------------------------------------
>
> Key: ACCUMULO-272
> URL: https://issues.apache.org/jira/browse/ACCUMULO-272
> Project: Accumulo
> Issue Type: Improvement
> Components: client
> Reporter: Keith Turner
> Priority: Major
>
> Accumulo client code will pick a random tablet server for certain operations
> (like security operations such as checking permissions). This code should
> choose a local tablet server if there is one, otherwise choose a random tablet
> server.
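The selection rule described in this ticket (use a local tablet server if one exists, otherwise pick at random) can be sketched as below. It is a sketch under assumptions, not client code from Accumulo; the class and method names are hypothetical, servers are represented as plain host-name strings, and "local" is decided by a simple host-name comparison.

```java
import java.util.List;
import java.util.Random;

public class LocalFirstSketch {

  // Returns the local host if it runs a tablet server, otherwise a random server.
  public static String chooseServer(List<String> tserverHosts, String localHost, Random random) {
    if (tserverHosts.isEmpty()) {
      throw new IllegalArgumentException("no tablet servers available");
    }
    for (String host : tserverHosts) {
      if (host.equalsIgnoreCase(localHost)) {
        return host;
      }
    }
    // No local server: fall back to the existing behavior of picking one at random.
    return tserverHosts.get(random.nextInt(tserverHosts.size()));
  }

  public static void main(String[] args) {
    List<String> hosts = List.of("node1", "node2", "node3");
    System.out.println(chooseServer(hosts, "node2", new Random()));
    System.out.println(chooseServer(hosts, "client-box", new Random()));
  }
}
```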
[jira] [Resolved] (ACCUMULO-3374) master should take action if tablets are not loaded in a reasonable time frame
[ https://issues.apache.org/jira/browse/ACCUMULO-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-3374.
-----------------------------------------
Resolution: Abandoned

Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo

> master should take action if tablets are not loaded in a reasonable time frame
> ------------------------------------------------------------------------------
>
> Key: ACCUMULO-3374
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3374
> Project: Accumulo
> Issue Type: Improvement
> Components: master
> Reporter: Eric C. Newton
> Priority: Major
>
> The master should warn if a tablet assignment is not picked up by the tablet
> server in a reasonable timeframe. The monitor does display offline tablets
> as a "red" condition, but if an assignment isn't processed in a reasonable
> time frame, a bigger warning, and identification of the tablet and server
> would be helpful.
[jira] [Resolved] (ACCUMULO-4107) Need metrics on conditional mutations
[ https://issues.apache.org/jira/browse/ACCUMULO-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs resolved ACCUMULO-4107. Resolution: Abandoned Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo > Need metrics on conditional mutations > > > Key: ACCUMULO-4107 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4107 > Project: Accumulo > Issue Type: Improvement > Reporter: Keith Turner > Priority: Major > > Tracking down uneven load in Fluo applications would be much easier if > Accumulo reported some per tablet server conditional mutation metrics.
[jira] [Resolved] (ACCUMULO-2614) take advantage of block placement to reduce MTTR
[ https://issues.apache.org/jira/browse/ACCUMULO-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs resolved ACCUMULO-2614. Resolution: Abandoned Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo > take advantage of block placement to reduce MTTR > > > Key: ACCUMULO-2614 > URL: https://issues.apache.org/jira/browse/ACCUMULO-2614 > Project: Accumulo > Issue Type: Improvement > Components: logger, tserver > Reporter: Sean Busbey > Priority: Major > Labels: recovery > > We should look at the block placement for WALs that need to be recovered and > then attempt to have the recovery task run on a tserver that is a minimum > distance from the set.
[jira] [Resolved] (ACCUMULO-577) Allow option for ignoring repeatedly failed iterators during compaction
[ https://issues.apache.org/jira/browse/ACCUMULO-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs resolved ACCUMULO-577. Resolution: Abandoned Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo > Allow option for ignoring repeatedly failed iterators during compaction > > > Key: ACCUMULO-577 > URL: https://issues.apache.org/jira/browse/ACCUMULO-577 > Project: Accumulo > Issue Type: Improvement > Components: tserver > Affects Versions: 1.4.0, 1.4.1 > Reporter: John Vines > Priority: Major > > Currently when an iterator is causing errors, either stemming from the > iterator configuration or from instantiating, it will endlessly try over and > over again. I think we should provide a table configuration to allow eventual > fail over for these circumstances. I think in these circumstances all user > configured iterators (including versioning) should be disabled to ensure the > data is written (or rewritten) and once the configuration is fixed, then the > next major compaction will sort it all out.
[jira] [Resolved] (ACCUMULO-4492) operations tool to migrate existing users when enabling Kerberos
[ https://issues.apache.org/jira/browse/ACCUMULO-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs resolved ACCUMULO-4492. Resolution: Abandoned Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo > operations tool to migrate existing users when enabling Kerberos > > > Key: ACCUMULO-4492 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4492 > Project: Accumulo > Issue Type: New Feature > Reporter: Sean Busbey > Priority: Major > > When converting an existing cluster to use Kerberos, existing user > permissions aren't much use unless the user names happen to be formatted like > Kerberos principals. > An offline tool that folks migrating can use to map existing user names to > principals would be super useful. > Essentially something like: > {code} > $ accumulo kerberos-migration --include-users=* --exclude-users=root > --no-instance --realm=EXAMPLE.COM > Migrating users matching '*' and not matching 'root'. > User principals will not have an instance component. > User principals will be in the realm 'EXAMPLE.COM' > Found user 'auser', converted to 'au...@example.com' > Found user 'another_user', converted to 'another_u...@example.com' > Found user 'hpnewton', converted to 'hpnew...@example.com' > Found user 'root', skipped due to exclusion rule > {code}
[jira] [Resolved] (ACCUMULO-4491) operations tool to enable the administrative user in kerberos installations w/o security reset
[ https://issues.apache.org/jira/browse/ACCUMULO-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Tubbs resolved ACCUMULO-4491. Resolution: Abandoned Closing this old issue. If this is still a problem, please open a new issue or PR at https://github.com/apache/accumulo > operations tool to enable the administrative user in kerberos installations > w/o security reset > > > Key: ACCUMULO-4491 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4491 > Project: Accumulo > Issue Type: Improvement > Affects Versions: 1.7.1, 1.7.2, 1.8.0 > Reporter: Sean Busbey > Priority: Critical > > Right now converting an existing cluster to use Kerberos requires using the > {{accumulo init --reset-security}} tool in order to replace the "root" user > with an appropriate administrative user principal. This has the side effect > of dumping existing user permission information. > That means downstream folks have to use the config dumping tools or the like > in order to save existing permissions if they want to refer to them later > while setting up kerberos users. > It'd be preferable for us to have a cli tool to set the administrative user.
[GitHub] [accumulo] ctubbsii closed pull request #2827: refactor GarbageCollectionLogger so one instance per context
ctubbsii closed pull request #2827: refactor GarbageCollectionLogger so one instance per context URL: https://github.com/apache/accumulo/pull/2827 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: notifications-unsubscr...@accumulo.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [accumulo] ctubbsii commented on pull request #2827: refactor GarbageCollectionLogger so one instance per context
ctubbsii commented on PR #2827: URL: https://github.com/apache/accumulo/pull/2827#issuecomment-1422279637 > This PR will be superseded by the changes with #3161 In that case, I'm closing this, so we can focus on reviewing that one.
[GitHub] [accumulo] ctubbsii commented on a diff in pull request #3136: WIP - remove VFS and only search file system for jars for context classloading
ctubbsii commented on code in PR #3136: URL: https://github.com/apache/accumulo/pull/3136#discussion_r1099852723 ## core/src/main/java/org/apache/accumulo/core/classloader/DefaultContextClassLoaderFactory.java: ## @@ -48,55 +46,49 @@ public class DefaultContextClassLoaderFactory implements ContextClassLoaderFacto private static final Logger LOG = LoggerFactory.getLogger(DefaultContextClassLoaderFactory.class); private static final String className = DefaultContextClassLoaderFactory.class.getName(); - @SuppressWarnings("removal") - private static final Property VFS_CONTEXT_CLASSPATH_PROPERTY = - Property.VFS_CONTEXT_CLASSPATH_PROPERTY; + // Do we set a max size here and how long until we expire the classloader? + private final Cache contexts = + Caffeine.newBuilder().maximumSize(100).expireAfterAccess(1, TimeUnit.DAYS).build(); - public DefaultContextClassLoaderFactory(final AccumuloConfiguration accConf) { + public DefaultContextClassLoaderFactory() { if (!isInstantiated.compareAndSet(false, true)) { throw new IllegalStateException("Can only instantiate " + className + " once"); } -Supplier> contextConfigSupplier = -() -> accConf.getAllPropertiesWithPrefix(VFS_CONTEXT_CLASSPATH_PROPERTY); -setContextConfig(contextConfigSupplier); -LOG.debug("ContextManager configuration set"); -startCleanupThread(accConf, contextConfigSupplier); } - @SuppressWarnings("deprecation") - private static void setContextConfig(Supplier> contextConfigSupplier) { -org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader -.setContextConfig(contextConfigSupplier); - } + @Override + public ClassLoader getClassLoader(String contextName) { Review Comment: I was just calling it "context". The semantics of the context, whether it's a URL, a path, a name, etc., depends on the implementation. But generically, it's just a context, and the factory's job is to map a context to a ClassLoader, effectively equivalent to `Function`. You could call it `context` or `contextString` or similar. 
I agree it doesn't make sense to call it a `name`, though it could be a name for some implementations. Another thing about naming: this should not be called `DefaultContextClassLoaderFactory`. It should be named for how it behaves... not its current status as the default. It will be very confusing (and I've seen this happen) if the default is changed, and it's no longer the default. So, for example, the default could change from `DefaultThing` to `MyThing`, and so then when people say `default thing`, you don't know if they mean the default `MyThing` or the non-default `DefaultThing`. In this case, it could just be `URLClassLoaderFactory` or `URLContextClassLoaderFactory`, and return instances of `URLClassLoader` objects, given a context that represents a list of URLs.
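The review comment above describes the factory as a mapping from a context string to a `ClassLoader`, returning `URLClassLoader` instances for a context that represents a list of URLs. A minimal sketch of that idea follows. It is not the code from PR #3136: the class name `URLContextClassLoaderSketch`, the comma-separated context format, and the unbounded cache are all assumptions made for illustration (the PR diff itself shows a Caffeine cache with a size limit and expiry).

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: maps a context string to a URLClassLoader.
// Assumption: the context is a comma-separated list of URLs.
class URLContextClassLoaderSketch implements Function<String, ClassLoader> {

  // One loader per distinct context string; this sketch never evicts entries.
  private final Map<String, URLClassLoader> loaders = new ConcurrentHashMap<>();

  // Split the context into URLs, skipping empty segments.
  static URL[] parseContext(String context) {
    List<URL> urls = new ArrayList<>();
    for (String part : context.split(",")) {
      String trimmed = part.trim();
      if (trimmed.isEmpty()) {
        continue;
      }
      try {
        urls.add(URI.create(trimmed).toURL());
      } catch (MalformedURLException e) {
        throw new IllegalArgumentException("Bad URL in context: " + trimmed, e);
      }
    }
    return urls.toArray(new URL[0]);
  }

  @Override
  public ClassLoader apply(String context) {
    // Reuse the loader for a context seen before; otherwise build a new one
    // whose parent is the loader that loaded this class.
    return loaders.computeIfAbsent(context,
        c -> new URLClassLoader(parseContext(c), getClass().getClassLoader()));
  }
}
```

Naming the class for its behavior (URL-based loading) rather than for its status as a default follows the naming point made in the comment.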
[GitHub] [accumulo-proxy] ctubbsii commented on issue #73: Missing TableOperations method implementations
ctubbsii commented on issue #73: URL: https://github.com/apache/accumulo-proxy/issues/73#issuecomment-1422262953 I don't think they were omitted purposefully... more likely they were not added because there was no demand/interest to add them. Somebody has to think "I want that" first, and then somebody has to say "I'll do that" second. As far as I can tell, that first step simply never happened for these, and that's fine. We could certainly add more if they are of interest to somebody. However, I do not think it's necessary to add everything. And we definitely shouldn't add the deprecated ones.
[GitHub] [accumulo] ctubbsii commented on pull request #3189: Add group argument to server processes
ctubbsii commented on PR #3189: URL: https://github.com/apache/accumulo/pull/3189#issuecomment-1422252698 > > I like the changes to the ServiceLockData abstraction. I think that's a useful change on its own. After that change is done, I think the changes to add the service group are probably relatively minimal, so it's probably okay (meaning, I think I'd be in favor), but I'd like to see what this PR looks like after the ServiceLockData abstraction is done first, without the addition of the group. Would you be willing to do that? > > Yes, I would be willing to do that. However, if the ServiceLockData object doesn't support the [group](https://github.com/apache/accumulo/blob/main/server/tserver/src/main/java/org/apache/accumulo/tserver/ScanServer.java#L340), then that will break the [client](https://github.com/apache/accumulo/blob/main/core/src/main/java/org/apache/accumulo/core/clientImpl/ClientContext.java#L409) and likely break the build. I am imagining the ScanServer's current `-g` option would go in the second PR, along with adding support for groups for tserver as well, so nothing is broken. I just think it makes sense to apply the "ServiceLockData abstraction/serialization" change feature first, then the "replace scan server group option with group added to ServiceLockData" feature after that, because they seem like they are discrete changes. I don't think the build needs to break to separate those out.
[GitHub] [accumulo] ctubbsii commented on a diff in pull request #3189: Add group argument to server processes
ctubbsii commented on code in PR #3189: URL: https://github.com/apache/accumulo/pull/3189#discussion_r1099831059 ## server/base/src/main/java/org/apache/accumulo/server/ServerOpts.java: ## @@ -27,6 +28,17 @@ public class ServerOpts extends ConfigOpts { @Parameter(names = {"-a", "--address"}, description = "address to bind to") private String address = null; + @Parameter(required = false, names = {"-g", "--group"}, Review Comment: My comment above regarding using our normal configuration property would make this a non-issue. Only server types where it makes sense would have that configuration property.
[GitHub] [accumulo] ctubbsii commented on a diff in pull request #3189: Add group argument to server processes
ctubbsii commented on code in PR #3189: URL: https://github.com/apache/accumulo/pull/3189#discussion_r1099829455 ## server/base/src/main/java/org/apache/accumulo/server/ServerOpts.java: ## @@ -27,6 +28,17 @@ public class ServerOpts extends ConfigOpts { @Parameter(names = {"-a", "--address"}, description = "address to bind to") private String address = null; + @Parameter(required = false, names = {"-g", "--group"}, + description = "Optional group name that will be made available to the client (e.g. ScanServerSelector) " + + "and server (e.g. Balancers) plugins. If not specified will be set to '" + + ServiceLockData.ServiceDescriptor.DEFAULT_GROUP_NAME + + "'. Assigning servers to groups support dedicating server resources for specific purposes, where supported.") + private String groupName = ServiceLockData.ServiceDescriptor.DEFAULT_GROUP_NAME; Review Comment: We use commons-configuration2, which does support property interpolation, but not exactly like your example. You can do something like `tserver.address.listen=${env:HOSTNAME}` and in your environment, you can do `HOSTNAME=$(hostname -i)` (for example, in `accumulo-env.sh` or `.bashrc`). We also support setting any of our configuration properties on the command-line to override what's in the config file. So, instead of `-a $(hostname -i)`, you could do `-o tserver.address.listen=$(hostname -i)`, so you don't need to have the address hard-coded in the config file, and can set it dynamically on the command-line, even though it's a configuration file property. This feature was specifically added in 2.0 to support container deployments. For the address, that would depend on us actually having a configuration property for the listen/bind address to replace `-a`, which we currently do not have (`tserver.address.listen` was merely an example). 
But, that's the direction I think we should go; for the purposes of this issue, I think the addition of the group should follow the direction of making it a configuration property, instead of the old precedent of `-a`. We have a complete configuration mechanism with a lot of flexibility already, so we shouldn't need to add one-off command-line options outside that configuration mechanism, like `-g` (and we should eventually get rid of the other one-off command-line options like `-a`, but that's for another ticket).
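The two mechanisms described in the comment above, `${env:NAME}` interpolation in a property value and a command-line `-o key=value` override taking precedence over the config file, can be illustrated with a small standalone sketch. This is neither commons-configuration2 code nor Accumulo's implementation; the class `ConfigOverrideSketch` and its methods are hypothetical and use only the Java standard library. The property `tserver.address.listen` is the same illustrative name used in the comment, which notes that no such property currently exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of "config file value, overridden by -o, with
// ${env:NAME} references resolved from the environment".
class ConfigOverrideSketch {

  private static final Pattern ENV_REF =
      Pattern.compile("\\$\\{env:([A-Za-z_][A-Za-z0-9_]*)\\}");

  // Replace each ${env:NAME} with env.get(NAME); unknown names are left as-is.
  static String interpolate(String value, Map<String, String> env) {
    Matcher m = ENV_REF.matcher(value);
    StringBuilder out = new StringBuilder();
    while (m.find()) {
      String resolved = env.get(m.group(1));
      m.appendReplacement(out,
          Matcher.quoteReplacement(resolved == null ? m.group() : resolved));
    }
    m.appendTail(out);
    return out.toString();
  }

  // Command-line overrides win over file properties; then interpolate.
  static Map<String, String> effective(Map<String, String> fileProps,
      Map<String, String> overrides, Map<String, String> env) {
    Map<String, String> merged = new HashMap<>(fileProps);
    merged.putAll(overrides);
    merged.replaceAll((key, value) -> interpolate(value, env));
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> file = Map.of("tserver.address.listen", "${env:HOSTNAME}");
    // Pass System.getenv() as the environment for a real lookup.
    System.out.println(effective(file, Map.of(), System.getenv()));
  }
}
```

Under this model, a group setting expressed as an ordinary property would get both behaviors without a dedicated `-g` flag, which is the direction the comment argues for.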